[jira] [Updated] (FLINK-22541) add json format filter params
[ https://issues.apache.org/jira/browse/FLINK-22541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sandy du updated FLINK-22541:
-
Description:
In my case, one Kafka topic stores data for multiple tables, for example:

{"id":"121","source":"users","content":{"name":"test01","age":20,"addr":"addr1"}}
{"id":"122","source":"users","content":{"name":"test02","age":23,"addr":"addr2"}}
{"id":"124","source":"users","content":{"name":"test03","age":34,"addr":"addr3"}}
{"id":"124","source":"order","content":{"orderId":"11","price":34,"addr":"addr1231"}}
{"id":"125","source":"order","content":{"orderId":"12","price":34,"addr":"addr1232"}}
{"id":"126","source":"order","content":{"orderId":"13","price":34,"addr":"addr1233"}}

I just want to consume the data from table order; the Flink SQL DDL would look like this:

CREATE TABLE order (
  orderId STRING,
  age INT,
  addr STRING
) WITH (
  'connector'='kafka',
  'topic'='kafkatopic',
  'properties.bootstrap.servers'='localhost:9092',
  'properties.group.id'='testGroup',
  'scan.startup.mode'='earliest-offset',
  'format'='json',
  'path-filter'='$[?(@.source=="order")]',
  'path-data'='$.content'
);

path-filter and path-data could use JsonPath syntax ([https://github.com/json-path/JsonPath])

was: (previous revision of the description, identical apart from minor wording)

> add json format filter params
> --
>
> Key: FLINK-22541
> URL: https://issues.apache.org/jira/browse/FLINK-22541
> Project: Flink
> Issue Type: Improvement
> Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
> Affects Versions: 1.11.0, 1.12.0
> Reporter: sandy du
> Priority: Minor
>
-- This message was sent by Atlassian Jira (v8.3.4#803005)
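The behavior proposed for the two options (filter records with a JsonPath predicate, then project a sub-path of each matching record) can be sketched in plain Python. This is an illustrative sketch of the proposal's semantics, not an existing Flink feature; it hard-codes the meaning of the two example expressions instead of using a JsonPath library:

```python
import json

# Sample records as they would arrive from the Kafka topic (taken from the issue).
raw_records = [
    '{"id":"121","source":"users","content":{"name":"test01","age":20,"addr":"addr1"}}',
    '{"id":"124","source":"order","content":{"orderId":"11","price":34,"addr":"addr1231"}}',
    '{"id":"125","source":"order","content":{"orderId":"12","price":34,"addr":"addr1232"}}',
]

def filter_and_project(records, source_value):
    """Keep records whose top-level 'source' equals source_value, then emit only
    the nested 'content' object -- i.e. what the proposed filter expression
    $[?(@.source=="order")] combined with the data path $.content would do."""
    out = []
    for line in records:
        record = json.loads(line)
        if record.get("source") == source_value:  # the filter step
            out.append(record["content"])         # the data-path projection
    return out

orders = filter_and_project(raw_records, "order")
```

After the call, `orders` contains only the `content` payloads of the two `order` records, which is exactly the shape the proposed DDL's columns would be resolved against.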
[jira] [Created] (FLINK-22541) add json format filter params
sandy du created FLINK-22541:
Summary: add json format filter params
Key: FLINK-22541
URL: https://issues.apache.org/jira/browse/FLINK-22541
Project: Flink
Issue Type: Improvement
Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
Affects Versions: 1.12.0, 1.11.0
Reporter: sandy du

In my case, one Kafka topic stores data for multiple tables, for example:

{"id":"121","source":"users","content":{"name":"test01","age":20,"addr":"addr1"}}
{"id":"122","source":"users","content":{"name":"test02","age":23,"addr":"addr2"}}
{"id":"124","source":"users","content":{"name":"test03","age":34,"addr":"addr3"}}
{"id":"124","source":"order","content":{"orderId":"11","price":34,"addr":"addr1231"}}
{"id":"125","source":"order","content":{"orderId":"12","price":34,"addr":"addr1232"}}
{"id":"126","source":"order","content":{"orderId":"13","price":34,"addr":"addr1233"}}

I just want to consume the data from table order; the Flink SQL DDL would look like this:

CREATE TABLE order (
  orderId STRING,
  age INT,
  addr STRING
) WITH (
  'connector'='kafka',
  'topic'='kafkatopic',
  'properties.bootstrap.servers'='localhost:9092',
  'properties.group.id'='testGroup',
  'scan.startup.mode'='earliest-offset',
  'format'='json',
  'path-filter'='$[?(@.source=="order")]',
  'path-data'='$.content'
);

path-filter and path-data could use JsonPath syntax (https://github.com/json-path/JsonPath)

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot edited a comment on pull request #15821: [FLINK-21911][table] Add built-in greatest/least functions support
flinkbot edited a comment on pull request #15821:
URL: https://github.com/apache/flink/pull/15821#issuecomment-830427889

## CI report:

* 701b87adda8ea30306cf1d99e7aea84f1ded716a Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17478)

Bot commands
The @flinkbot bot supports the following commands:
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] fangyuefy commented on pull request #15817: [FLINK-14393][webui] Add an option to enable/disable cancel job in we…
fangyuefy commented on pull request #15817:
URL: https://github.com/apache/flink/pull/15817#issuecomment-830530227

> Thanks for creating this PR @fangyuefy. I've tested it, and it works. Nicely done. I will address my comments while merging this PR.

Thanks @tillrohrmann for reviewing the code; I fully agree with the modifications and suggestions. This is my first community contribution. What else do I need to do to get this PR merged?

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #15821: [FLINK-21911][table] Add built-in greatest/least functions support
flinkbot edited a comment on pull request #15821:
URL: https://github.com/apache/flink/pull/15821#issuecomment-830427889

## CI report:

* 701b87adda8ea30306cf1d99e7aea84f1ded716a Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17478)

Bot commands
The @flinkbot bot supports the following commands:
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot commented on pull request #15821: [FLINK-21911][table] Add built-in greatest/least functions support
flinkbot commented on pull request #15821:
URL: https://github.com/apache/flink/pull/15821#issuecomment-830427889

## CI report:

* 701b87adda8ea30306cf1d99e7aea84f1ded716a UNKNOWN

Bot commands
The @flinkbot bot supports the following commands:
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot commented on pull request #15821: [FLINK-21911][table] Add built-in greatest/least functions support
flinkbot commented on pull request #15821:
URL: https://github.com/apache/flink/pull/15821#issuecomment-830424683

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review.

## Automated Checks
Last check on commit 701b87adda8ea30306cf1d99e7aea84f1ded716a (Fri Apr 30 22:04:07 UTC 2021)

**Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up to date!

Mention the bot in a comment to re-run the automated checks.

## Review Progress
* ❓ 1. The [description] looks good.
* ❓ 2. There is [consensus] that the contribution should go into Flink.
* ❓ 3. Needs [attention] from.
* ❓ 4. The change fits into the overall [architecture].
* ❓ 5. Overall code [quality] is good.

Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required.

Bot commands
The @flinkbot bot supports the following commands:
- `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until `architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] snuyanzin opened a new pull request #15821: [FLINK-21911][table] Add built-in greatest/least functions support
snuyanzin opened a new pull request #15821:
URL: https://github.com/apache/flink/pull/15821

## What is the purpose of the change
The PR implements two functions from the list mentioned in FLINK-21911: `GREATEST` and `LEAST`.

## Brief change log
Implementation, tests and docs for `GREATEST` and `LEAST`.

## Verifying this change
- `MiscFunctionsITCase`

## Does this pull request potentially affect one of the following parts:
- Dependencies (does it add or upgrade a dependency): no
- The public API, i.e., is any changed class annotated with `@Public(Evolving)`: yes
- The serializers: no
- The runtime per-record code paths (performance sensitive): no
- Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: no
- The S3 file system connector: no

## Documentation
- Does this pull request introduce a new feature? yes
- If yes, how is the feature documented? docs / JavaDocs

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
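As a quick illustration of the intended behavior of the two functions, here is a Python sketch. The SQL-style NULL propagation (return NULL if any argument is NULL) is an assumption common to several SQL dialects and should be verified against the PR's documentation:

```python
def greatest(*args):
    """Largest of the arguments; returns None (SQL NULL) if any argument is None."""
    if any(a is None for a in args):
        return None
    return max(args)

def least(*args):
    """Smallest of the arguments; returns None (SQL NULL) if any argument is None."""
    if any(a is None for a in args):
        return None
    return min(args)
```

Note that this differs from aggregate MIN/MAX, which reduce over rows; GREATEST/LEAST are scalar functions over the arguments of a single row, which is why the Stack Overflow thread linked in FLINK-21911 comes up at all.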
[jira] [Updated] (FLINK-21911) Support GREATEST and LEAST functions in SQL
[ https://issues.apache.org/jira/browse/FLINK-21911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-21911: --- Labels: pull-request-available (was: ) > Support GREATEST and LEAST functions in SQL > --- > > Key: FLINK-21911 > URL: https://issues.apache.org/jira/browse/FLINK-21911 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / API >Reporter: Timo Walther >Assignee: Sergey Nuyanzin >Priority: Major > Labels: pull-request-available > > We should discuss if we want to support math MIN / MAX in Flink SQL. It seems > also other vendors do not support it out of the box: > [https://stackoverflow.com/questions/124417/is-there-a-max-function-in-sql-server-that-takes-two-values-like-math-max-in-ne] > But it might be quite useful and a common operation in JVM languages. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] fdalsotto commented on a change in pull request #15813: [FLINK-22519][flink-python] support tar.gz python archives
fdalsotto commented on a change in pull request #15813:
URL: https://github.com/apache/flink/pull/15813#discussion_r624232556

## File path: flink-python/src/main/java/org/apache/flink/python/env/beam/ProcessPythonEnvironmentManager.java

## @@ -326,9 +327,21 @@ private void constructArchivesDirectory(Map env) throws IOException

```java
// extract archives to archives directory
for (Map.Entry entry : dependencyInfo.getArchives().entrySet()) {
-    ZipUtils.extractZipFileWithPermissions(
-            entry.getKey(),
-            String.join(File.separator, archivesDirectory, entry.getValue()));
+    String filePath = entry.getKey();
+    if (filePath.endsWith(".zip") || filePath.endsWith(".jar")) {
```

Review comment: That'd be extra effort only to support Windows installations for a very small number of users; I'm not sure I'd support this.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #15768: [FLINK-22451][table] Support (*) as parameter of UDFs in Table API
flinkbot edited a comment on pull request #15768:
URL: https://github.com/apache/flink/pull/15768#issuecomment-826735938

## CI report:

* dced81c2ffc8c59f9d9311346e71309129aa73cf Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17473)

Bot commands
The @flinkbot bot supports the following commands:
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #15820: [FLINK-22535][runtime] CleanUp is invoked for task even when the task…
flinkbot edited a comment on pull request #15820:
URL: https://github.com/apache/flink/pull/15820#issuecomment-830052015

## CI report:

* c11e6cb0c0817a53ef456462f610e3d079f590f9 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17472)

Bot commands
The @flinkbot bot supports the following commands:
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Comment Edited] (FLINK-10165) JarRunHandler/JarRunRequestBody should allow to pass program arguments as escaped json list
[ https://issues.apache.org/jira/browse/FLINK-10165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17202057#comment-17202057 ] Matthias edited comment on FLINK-10165 at 4/30/21, 7:39 PM: This issue was visited as part of the Engine team's backlog grooming on Sep 25, 2020. The relevant code change is [JarRunHandler#106|https://github.com/apache/flink/commit/fefe866bad47b1c4a2f92eded19bc7a5059f1277#diff-f83b0a6d9d1ab75ac867244111bb6b15R106]. It was moved into [JarHandlerUtils#207|https://github.com/apache/flink/blob/7381304930a964098df52d9fa79a55241538b301/flink-runtime-web/src/main/java/org/apache/flink/runtime/webmonitor/handlers/utils/JarHandlerUtils.java#L207] was (Author: mapohl): This issue was visited as part of the Engine team's backlog grooming on Sep 25, 2020. The relevant code change is [JarRunHandler#106https://github.com/apache/flink/commit/fefe866bad47b1c4a2f92eded19bc7a5059f1277#diff-f83b0a6d9d1ab75ac867244111bb6b15R106]. It was moved into [JarHandlerUtils#207|https://github.com/apache/flink/blob/7381304930a964098df52d9fa79a55241538b301/flink-runtime-web/src/main/java/org/apache/flink/runtime/webmonitor/handlers/utils/JarHandlerUtils.java#L207] > JarRunHandler/JarRunRequestBody should allow to pass program arguments as > escaped json list > --- > > Key: FLINK-10165 > URL: https://issues.apache.org/jira/browse/FLINK-10165 > Project: Flink > Issue Type: Improvement > Components: Runtime / REST >Affects Versions: 1.6.0, 1.10.2, 1.11.2 >Reporter: Maciej Prochniak >Priority: Minor > Labels: auto-deprioritized-major > > Currently program arguments are parsed from plain string: > [https://github.com/apache/flink/blob/master/flink-runtime-web/src/main/java/org/apache/flink/runtime/webmonitor/handlers/JarRunHandler.java#L106] > It doesn't allow to put quotes or new lines in arguments - in particular it's > difficult to pass json as argument. I think it would be good to pass > arguments as json list - then jackson would handle escaping. 
> It'd be a bit more problematic for query string parameters... WDYT [~Zentol]?
-- This message was sent by Atlassian Jira (v8.3.4#803005)
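The escaping problem the ticket describes can be demonstrated in a few lines of Python: flattening arguments into one plain string and splitting on whitespace mangles any argument that contains spaces or JSON, while a JSON-encoded list round-trips every argument exactly (the argument values here are made up for illustration):

```python
import json

args = ["--config", '{"key": "value with spaces"}']

# A plain argument string split on whitespace (roughly what parsing a flat
# string of program arguments amounts to) mangles the JSON argument:
naive = " ".join(args).split()

# A JSON-encoded list preserves each argument exactly, quotes and all:
payload = json.dumps(args)
decoded = json.loads(payload)
```

This is the essence of the proposal: let the JSON library (Jackson on the Java side) handle escaping instead of re-parsing a flat string.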
[GitHub] [flink] flinkbot edited a comment on pull request #15768: [FLINK-22451][table] Support (*) as parameter of UDFs in Table API
flinkbot edited a comment on pull request #15768:
URL: https://github.com/apache/flink/pull/15768#issuecomment-826735938

## CI report:

* 11787dc2e0c372007bddc148e0d4aec2ed9a275f Azure: [CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17470)
* dced81c2ffc8c59f9d9311346e71309129aa73cf Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17473)

Bot commands
The @flinkbot bot supports the following commands:
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #15789: [FLINK-21181][runtime] Wait for Invokable cancellation before releasing network resources
flinkbot edited a comment on pull request #15789:
URL: https://github.com/apache/flink/pull/15789#issuecomment-827996694

## CI report:

* 4c5180310bf76e96f2665bf53531eccb1fa86421 UNKNOWN
* 46e1be2c4832080bf1cb48c509b77cd88872d024 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17468)

Bot commands
The @flinkbot bot supports the following commands:
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-19164) Release scripts break other dependency versions unintentionally
[ https://issues.apache.org/jira/browse/FLINK-19164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Serhat Soydan updated FLINK-19164:
--
Labels: (was: stale-assigned)

> Release scripts break other dependency versions unintentionally
> ---
>
> Key: FLINK-19164
> URL: https://issues.apache.org/jira/browse/FLINK-19164
> Project: Flink
> Issue Type: Bug
> Components: Deployment / Scripts, Release System
> Reporter: Serhat Soydan
> Assignee: Serhat Soydan
> Priority: Minor
>
> All the scripts below have a line that changes the old version to the new version in pom files:
> [https://github.com/apache/flink/blob/master/tools/change-version.sh#L31]
> [https://github.com/apache/flink/blob/master/tools/releasing/create_release_branch.sh#L60]
> [https://github.com/apache/flink/blob/master/tools/releasing/update_branch_version.sh#L52]
>
> This works like find & replace, so it is prone to unintentional errors: any dependency whose version equals the "old version" may be changed to the "new version" as well. See below for how to reproduce such a case.
>
> +How to reproduce the bug:+
> * Clone/fork the Flink repo and, for example, check out version v*1.11.1*
> * Apply any changes you need
> * Run the "create_release_branch.sh" script with OLD_VERSION=*1.11.1* and NEW_VERSION={color:#de350b}*1.12.0*{color}
> ** In the parent pom.xml, the maven-dependency-analyzer version is then changed automatically and *unintentionally*, which breaks the build:
>
> <dependency>
>   <groupId>org.apache.maven.shared</groupId>
>   <artifactId>maven-dependency-analyzer</artifactId>
>   <version>*1.11.1*</version>
> </dependency>
>
> becomes
>
> <dependency>
>   <groupId>org.apache.maven.shared</groupId>
>   <artifactId>maven-dependency-analyzer</artifactId>
>   <version>{color:#de350b}*1.12.0*{color}</version>
> </dependency>
>
-- This message was sent by Atlassian Jira (v8.3.4#803005)
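A minimal reproduction of the hazard, using Python's `str.replace` as a stand-in for the scripts' find & replace over the pom file (the pom fragment is abbreviated for illustration):

```python
POM = """\
<version>1.11.1</version>
<dependency>
  <groupId>org.apache.maven.shared</groupId>
  <artifactId>maven-dependency-analyzer</artifactId>
  <version>1.11.1</version>
</dependency>
"""

# A blind find & replace, like the one in the release scripts, bumps the
# dependency's version along with the project version:
bumped = POM.replace("1.11.1", "1.12.0")
```

After the replace, no occurrence of `1.11.1` survives: the dependency version was rewritten together with the project version, which is exactly the unintentional change the ticket reports. A fix would have to anchor the replacement to the project's own `<version>` element rather than to the bare version string.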
[jira] [Comment Edited] (FLINK-22506) YARN job cluster stuck in retrying creating JobManager if savepoint is corrupted
[ https://issues.apache.org/jira/browse/FLINK-22506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17337517#comment-17337517 ] Paul Lin edited comment on FLINK-22506 at 4/30/21, 5:09 PM:

[~trohrmann] Thanks a lot for the input. Now I'm suspecting that the value of `yarn.application-attempt-failures-validity-interval` is too low (I'm using the default 10s), given that in my case a retry may take 1 min. I'll investigate further, and close the issue if it's a configuration problem. Thanks again!

was (Author: paul lin): [~trohrmann] Thanks a lot for the input. Now I'm suspecting maybe the value of `yarn.application-attempt-failures-validity-interval` is too low (I'm using the default), given that in my case a retry may take 1 min. I'll investigate further, and close the issue if it's a configuration problem. Thanks again!

> YARN job cluster stuck in retrying creating JobManager if savepoint is corrupted
>
> Key: FLINK-22506
> URL: https://issues.apache.org/jira/browse/FLINK-22506
> Project: Flink
> Issue Type: Improvement
> Components: Deployment / YARN
> Affects Versions: 1.11.3
> Reporter: Paul Lin
> Priority: Major
> Attachments: corrupted_savepoint.log, yarn application attempts.png
>
> If a non-retryable error (e.g. the savepoint is corrupted or inaccessible) occurs during the initiation of the job manager, the job cluster exits with an error code. But since it does not mark the attempt as failed, it won't be counted as a failed attempt, and YARN will keep retrying forever.
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (FLINK-22506) YARN job cluster stuck in retrying creating JobManager if savepoint is corrupted
[ https://issues.apache.org/jira/browse/FLINK-22506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17337517#comment-17337517 ] Paul Lin edited comment on FLINK-22506 at 4/30/21, 5:08 PM:

[~trohrmann] Thanks a lot for the input. Now I'm suspecting maybe the value of `yarn.application-attempt-failures-validity-interval` is too low (I'm using the default), given that in my case a retry may take 1 min. I'll investigate further, and close the issue if it's a configuration problem. Thanks again!

was (Author: paul lin): [~trohrmann] Thanks a lot for the input. Now I'm suspecting maybe the value of `yarn.application-attempt-failures-validity-interval` is low (I'm using the default), given that in my case a retry may take 1 min. I'll investigate further, and close the issue if it's a configuration problem. Thanks again!

> YARN job cluster stuck in retrying creating JobManager if savepoint is corrupted
>
> Key: FLINK-22506
> URL: https://issues.apache.org/jira/browse/FLINK-22506
> Project: Flink
> Issue Type: Improvement
> Components: Deployment / YARN
> Affects Versions: 1.11.3
> Reporter: Paul Lin
> Priority: Major
> Attachments: corrupted_savepoint.log, yarn application attempts.png
>
> If a non-retryable error (e.g. the savepoint is corrupted or inaccessible) occurs during the initiation of the job manager, the job cluster exits with an error code. But since it does not mark the attempt as failed, it won't be counted as a failed attempt, and YARN will keep retrying forever.
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-22506) YARN job cluster stuck in retrying creating JobManager if savepoint is corrupted
[ https://issues.apache.org/jira/browse/FLINK-22506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17337517#comment-17337517 ] Paul Lin commented on FLINK-22506:
--

[~trohrmann] Thanks a lot for the input. Now I'm suspecting maybe the value of `yarn.application-attempt-failures-validity-interval` is low (I'm using the default), given that in my case a retry may take 1 min. I'll investigate further, and close the issue if it's a configuration problem. Thanks again!

> YARN job cluster stuck in retrying creating JobManager if savepoint is corrupted
>
> Key: FLINK-22506
> URL: https://issues.apache.org/jira/browse/FLINK-22506
> Project: Flink
> Issue Type: Improvement
> Components: Deployment / YARN
> Affects Versions: 1.11.3
> Reporter: Paul Lin
> Priority: Major
> Attachments: corrupted_savepoint.log, yarn application attempts.png
>
> If a non-retryable error (e.g. the savepoint is corrupted or inaccessible) occurs during the initiation of the job manager, the job cluster exits with an error code. But since it does not mark the attempt as failed, it won't be counted as a failed attempt, and YARN will keep retrying forever.
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-10644) Batch Job: Speculative execution
[ https://issues.apache.org/jira/browse/FLINK-10644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17337515#comment-17337515 ] Till Rohrmann commented on FLINK-10644:
---

Sounds good to me [~wangwj]. Thanks for the update.

> Batch Job: Speculative execution
>
> Key: FLINK-10644
> URL: https://issues.apache.org/jira/browse/FLINK-10644
> Project: Flink
> Issue Type: New Feature
> Components: Runtime / Coordination
> Reporter: JIN SUN
> Assignee: BoWang
> Priority: Major
> Labels: stale-assigned
>
> Stragglers/outliers are tasks that run slower than most of the tasks in a batch job; this impacts job latency, as such a straggler will likely be on the critical path of the job and become the bottleneck.
> Tasks may be slow for various reasons, including hardware degradation, software misconfiguration, or noisy neighbors. It's hard for the JM to predict the runtime.
> To reduce the overhead of stragglers, other systems such as Hadoop/Tez and Spark have *_speculative execution_*. Speculative execution is a health-check procedure that checks for tasks to be speculated, i.e. tasks running slower in an ExecutionJobVertex than the median of all successfully completed tasks in that EJV. Such slow tasks will be re-submitted to another TM. It will not stop the slow tasks, but run a new copy in parallel, and will kill the others once one of them completes.
> This JIRA is an umbrella to apply this kind of idea in Flink. Details will be appended later.
-- This message was sent by Atlassian Jira (v8.3.4#803005)
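The median-based health check described in the ticket can be sketched as follows. The 1.5x slowness multiplier is an illustrative knob, not a value specified anywhere in the ticket:

```python
from statistics import median

def find_stragglers(running_seconds, completed_seconds, multiplier=1.5):
    """Return indices of running tasks whose elapsed time exceeds `multiplier`
    times the median runtime of the successfully completed tasks in the same
    ExecutionJobVertex -- the candidates for a speculative second attempt."""
    if not completed_seconds:
        return []  # nothing has finished yet, so there is no baseline
    threshold = multiplier * median(completed_seconds)
    return [i for i, elapsed in enumerate(running_seconds) if elapsed > threshold]
```

Each returned index would get a duplicate attempt on another TaskManager; whichever copy finishes first wins and the others are killed, as the ticket describes.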
[jira] [Commented] (FLINK-22506) YARN job cluster stuck in retrying creating JobManager if savepoint is corrupted
[ https://issues.apache.org/jira/browse/FLINK-22506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17337510#comment-17337510 ] Till Rohrmann commented on FLINK-22506:
---

Ok, then I have misunderstood the ticket a bit. I thought that any application master failure would be handled as a failed attempt and count towards {{yarn.application-attempts}}. I don't think that we ever mark a Yarn attempt explicitly as failed. Hence, I thought that it should work with {{yarn.application-attempts}} and {{yarn.application-attempt-failures-validity-interval}}.

> YARN job cluster stuck in retrying creating JobManager if savepoint is corrupted
>
> Key: FLINK-22506
> URL: https://issues.apache.org/jira/browse/FLINK-22506
> Project: Flink
> Issue Type: Improvement
> Components: Deployment / YARN
> Affects Versions: 1.11.3
> Reporter: Paul Lin
> Priority: Major
> Attachments: corrupted_savepoint.log, yarn application attempts.png
>
> If a non-retryable error (e.g. the savepoint is corrupted or inaccessible) occurs during the initiation of the job manager, the job cluster exits with an error code. But since it does not mark the attempt as failed, it won't be counted as a failed attempt, and YARN will keep retrying forever.
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot edited a comment on pull request #15768: [FLINK-22451][table] Support (*) as parameter of UDFs in Table API
flinkbot edited a comment on pull request #15768:
URL: https://github.com/apache/flink/pull/15768#issuecomment-826735938

## CI report:

* 48814bd4d49167579110479e3f3cecabff6d443a Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17307)
* 11787dc2e0c372007bddc148e0d4aec2ed9a275f Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17470)
* dced81c2ffc8c59f9d9311346e71309129aa73cf Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17473)

Bot commands
The @flinkbot bot supports the following commands:
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Comment Edited] (FLINK-10644) Batch Job: Speculative execution
[ https://issues.apache.org/jira/browse/FLINK-10644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17337504#comment-17337504 ] wangwj edited comment on FLINK-10644 at 4/30/21, 4:53 PM: -- Hi, [~trohrmann] [~wind_ljy]
The closest released version of Flink to the Blink version I built this feature on is probably 1.7 or 1.8. Although that seems far from the latest version, after reading the code of both Blink and Flink (master branch) I found that the code I want to modify has not changed much, so I am confident I can contribute this issue.
By "multi-threading in the ExecutionGraph" I mean two executions of the same ExecutionVertex finishing at the same time: ExecutionVertex.executionFinished() may be called by two different executions concurrently. Calling it "multi-threading" is perhaps not very accurate.
How does the speculative execution play together with other sinks? Does it only work for file-based sinks?
Speculative execution can also support sinks to key-value databases such as Hologres, HBase, etc. In our scenario, batch jobs usually write data into Hologres (similar to HBase) or Pangu (similar to HDFS).
How does the blacklisting mechanism work? Does it also work for the K8s and Mesos integrations, or only for the YARN integration?
The blacklist module is a thread that maintains the blacklisted machines of a job and removes expired entries periodically. Each entry contains an IP and a timestamp; the timestamp is used to decide whether the entry has expired. My code only supports the YARN integration, but as far as I know, node affinity or pod affinity could achieve the same goal as the YARN PlacementConstraint in the K8s integration. Since the Mesos integration will be deprecated in Flink 1.13, I did not consider it.
How much of the change is encapsulated by the SchedulerNG interface?
I agree that the SchedulerNG interface is more or less self-contained, and I will consider your proposal carefully in the FLIP and in the coding. As the next step, I'll figure out what changes are needed in Flink (master branch) and write a POC. Then I will send an e-mail to d...@flink.apache.org to discuss this feature, write a FLIP, and start a vote on it. Thanks
> Batch Job: Speculative execution
>
> Key: FLINK-10644
> URL: https://issues.apache.org/jira/browse/FLINK-10644
> Project: Flink
> Issue Type: New Feature
> Components: Runtime / Coordination
> Reporter: JIN SUN
> Assignee: BoWang
> Priority: Major
> Labels: stale-assigned
>
> Stragglers/outliers are tasks that run slower than most of the tasks in a
> batch job. This impacts job latency, since the straggler will typically be
> on the critical path of the job and become the bottleneck. Tasks may be slow
> for various reasons, including hardware degradation, software
> mis-configuration, or noisy neighbors, and it is hard for the JM to predict
> the runtime.
> To reduce the overhead of stragglers, other systems such as Hadoop/Tez and
> Spark have *_speculative execution_*. Speculative execution is a
> health-check procedure that looks for tasks to speculate, i.e. tasks running
> slower in an ExecutionJobVertex than the median of all successfully
> completed tasks in that EJV. Such slow tasks are re-submitted to another TM:
> the slow task is not stopped, but a new copy runs in parallel, and the
> others are killed once one of them completes.
> This JIRA is an umbrella to apply this idea in Flink. Details will be
> appended later. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (FLINK-22506) YARN job cluster stuck in retrying creating JobManager if savepoint is corrupted
[ https://issues.apache.org/jira/browse/FLINK-22506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17337499#comment-17337499 ] Paul Lin edited comment on FLINK-22506 at 4/30/21, 4:42 PM: [~knaufk] I've attached the JM logs and a screenshot of the YARN application web UI. Please take a look. I first reported the issue as a bug because I think the max number of attempts (which is set to 2) is not respected in this case, but I'm fine with making it an improvement.
> YARN job cluster stuck in retrying creating JobManager if savepoint is
> corrupted
>
> Key: FLINK-22506
> URL: https://issues.apache.org/jira/browse/FLINK-22506
> Project: Flink
> Issue Type: Improvement
> Components: Deployment / YARN
> Affects Versions: 1.11.3
> Reporter: Paul Lin
> Priority: Major
> Attachments: corrupted_savepoint.log, yarn application attempts.png
>
> If a non-retryable error (e.g. the savepoint is corrupted or inaccessible)
> occurs during the initialization of the job manager, the job cluster exits
> with an error code. But since it does not mark the attempt as failed, it
> won't be counted as a failed attempt, and YARN will keep retrying forever. -- This message was sent by Atlassian Jira (v8.3.4#803005)
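The behavior the issue asks for, counting a non-retryable startup failure (such as a corrupted savepoint) against the attempt limit instead of resubmitting forever, can be sketched generically. This is not Flink or YARN code; the class, interface, and the use of IllegalStateException as the "non-retryable" marker are illustrative assumptions:

```java
// Illustrative sketch: bound retries and fail fast on non-retryable errors,
// so a corrupted savepoint exhausts the attempt budget instead of looping.
public class AttemptGuard {

    /** Hypothetical stand-in for one JobManager startup attempt. */
    public interface Attempt {
        void run();
    }

    /** Returns true if an attempt succeeded, false if attempts were exhausted. */
    public static boolean runWithBoundedAttempts(Attempt attempt, int maxAttempts) {
        for (int i = 1; i <= maxAttempts; i++) {
            try {
                attempt.run();
                return true;
            } catch (IllegalStateException nonRetryable) {
                // e.g. corrupted/inaccessible savepoint: mark this attempt
                // failed and stop retrying immediately.
                throw nonRetryable;
            } catch (RuntimeException retryable) {
                // transient error: the attempt counts, retry until the limit.
            }
        }
        return false;
    }
}
```

The key design point, as the reporter notes, is that the cluster must report the attempt as failed to the resource manager; otherwise YARN's own `max attempts` accounting never triggers.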
[jira] [Updated] (FLINK-22506) YARN job cluster stuck in retrying creating JobManager if savepoint is corrupted
[ https://issues.apache.org/jira/browse/FLINK-22506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Lin updated FLINK-22506: - Attachment: yarn application attempts.png -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-22506) YARN job cluster stuck in retrying creating JobManager if savepoint is corrupted
[ https://issues.apache.org/jira/browse/FLINK-22506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Lin updated FLINK-22506: - Attachment: corrupted_savepoint.log -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot edited a comment on pull request #15820: [FLINK-22535][runtime] CleanUp is invoked for task even when the task…
flinkbot edited a comment on pull request #15820: URL: https://github.com/apache/flink/pull/15820#issuecomment-830052015 ## CI report: * c548fc5c79450d898eebc1c6523be98472fe1cdd Azure: [CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17467) * c11e6cb0c0817a53ef456462f610e3d079f590f9 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17472)
[GitHub] [flink] flinkbot edited a comment on pull request #15768: [FLINK-22451][table] Support (*) as parameter of UDFs in Table API
flinkbot edited a comment on pull request #15768: URL: https://github.com/apache/flink/pull/15768#issuecomment-826735938 ## CI report: * 48814bd4d49167579110479e3f3cecabff6d443a Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17307) * 11787dc2e0c372007bddc148e0d4aec2ed9a275f Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17470) * dced81c2ffc8c59f9d9311346e71309129aa73cf UNKNOWN
[GitHub] [flink] flinkbot edited a comment on pull request #15131: [FLINK-21700][security] Add an option to disable credential retrieval on a secure cluster
flinkbot edited a comment on pull request #15131: URL: https://github.com/apache/flink/pull/15131#issuecomment-794957907 ## CI report: * 8cb12deac019898bb69f7a5faf7f803dd27d71d0 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17465)
[jira] [Comment Edited] (FLINK-10644) Batch Job: Speculative execution
[ https://issues.apache.org/jira/browse/FLINK-10644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17334427#comment-17334427 ] wangwj edited comment on FLINK-10644 at 4/30/21, 4:15 PM: -- [~trohrmann] Hi Till. I am from the search and recommendation department of Alibaba in China. Happy to share and discuss my work here.
Our big data processing platform uses Flink batch to process extremely large amounts of data every day. Many long-tail tasks are produced every day, and we had to kill these processes manually, which led to a poor user experience. So I tried to solve this problem.
I think speculative execution means two executions of an ExecutionVertex running at the same time, whereas failover means two executions running at two different times. Based on this, speculative execution is theoretically achievable. So I implemented speculative execution for batch jobs based on Blink, and it had a significant effect in our production cluster. I did the following:
(1) Which kind of ExecutionJobVertex is suitable for enabling speculative execution in a batch job? The feature correlates with the implementation details of region failover, so an ExecutionJobVertex enables speculative execution only if all of its input edges and output edges are blocking (Condition A).
(2) How to distinguish long-tail tasks? I distinguish long-tail tasks based on the interval between the current time and the execution's first create/deploy time before failover. When an ExecutionJobVertex meets Condition A and a configurable percentage of its executions have finished, the speculative execution thread starts to really work. Within the ExecutionJobVertex, when the running time of an execution exceeds a configurable multiple of the median running time of the other finished executions, that execution is defined as a long-tail execution (Condition B).
(3) How to make the speculation algorithm more precise? Based on the algorithm of Condition B, I solved the long-tail problem in our production cluster. As a next step, we may add task throughput to the algorithm via the TaskManagers' heartbeat to the JobManager.
(4) How to schedule another execution in the same ExecutionVertex? We changed currentExecution in ExecutionVertex to a list, so there can be multiple executions in an ExecutionVertex at the same time, and we reuse the current scheduling logic to schedule the speculative execution.
(5) How to make the speculative task run on a different machine from the original execution? We implemented a machine-dimension blacklist per job. A machine's IP is added to the blacklist when an execution on it is recognized as long-tail, and removed when the entry expires. When executions are scheduled, we add the blacklist information to the YARN PlacementConstraint, which ensures that the YARN container is not placed on a blacklisted machine.
(6) How to avoid errors when multiple executions finish at the same time in an ExecutionVertex? In the ExecutionVertex.executionFinished() method, multi-thread synchronization ensures that only one execution finishes successfully in an ExecutionVertex; all the other executions go through the cancellation logic.
(7) How to deal with multiple sink files in one ExecutionVertex when the job sinks to files? When a batch job sinks to files, we add an executionAttemptID suffix to the file name and finally delete or rename these files in finalizeOnMaster(). Here we should pay attention to the case of a Flink streaming job processing bounded data sets.
(8) In a batch job with an all-to-all shuffle, how do the downstream original and speculative executions select the ResultSubPartition of the upstream executions? Two executions of an upstream ExecutionVertex produce two ResultPartitions. When the upstream ExecutionVertex finishes, we update the input channel of the downstream execution to point at the fastest finished upstream execution. Note that when a downstream execution hits a DataConsumptionException, it restarts against the upstream execution that has already finished.
(9) How to display information about speculative tasks on the Flink web UI? After implementing this feature, when the speculative execution runs faster than the original execution, the Flink UI shows that the original task has been cancelled. The result of the job is correct, which fully matches our expectations. I don't know much about the web UI, so I will ask my colleague for help.
[~trohrmann] My implementation has played a big role in our production cluster in
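The long-tail detection rule described in (2) above, speculate only once enough executions have finished, and only for executions running longer than a configurable multiple of the finished executions' median running time, could be sketched like this. The class and parameter names are illustrative, not the author's actual code:

```java
import java.util.List;

// Hypothetical sketch of "Condition B": an execution is a long-tail
// candidate once a configurable fraction of the vertex's executions have
// finished AND its running time exceeds a configurable multiple of the
// median running time of the finished executions.
public class LongTailDetector {
    private final double finishedFraction; // e.g. 0.75: 75% must have finished
    private final double medianMultiple;   // e.g. 1.5: 1.5x the median

    public LongTailDetector(double finishedFraction, double medianMultiple) {
        this.finishedFraction = finishedFraction;
        this.medianMultiple = medianMultiple;
    }

    public boolean isLongTail(
            long runningMillis, List<Long> finishedRunningMillis, int totalExecutions) {
        if (finishedRunningMillis.size() < finishedFraction * totalExecutions) {
            return false; // too few finished executions to judge yet
        }
        long[] sorted =
                finishedRunningMillis.stream().mapToLong(Long::longValue).sorted().toArray();
        long median = sorted[sorted.length / 2];
        return runningMillis > medianMultiple * median;
    }
}
```

A detector like this would be polled periodically by the speculation thread; executions it flags get a second copy scheduled on a non-blacklisted machine, as described in (4) and (5).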
[jira] [Comment Edited] (FLINK-22509) ./bin/flink run -m yarn-cluster -d submission leads to IllegalStateException
[ https://issues.apache.org/jira/browse/FLINK-22509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17337484#comment-17337484 ] Yang Wang edited comment on FLINK-22509 at 4/30/21, 4:10 PM: - [~knaufk] [~rmetzger] I agree that we should deprecate {{-m yarn-cluster}}, which also means deprecating the {{FlinkYarnSessionCli}}. Instead, we should suggest that users migrate to the unified client executor interface, {{--target yarn-per-job}}. However, I hesitate to deprecate it too soon (e.g. in 1.13). The only reason we have not deprecated it since introducing the unified executor interface is CLI compatibility. I am pretty sure a lot of companies use {{flink run}} to integrate with their deployers in production, because we do not provide a very good client SDK or deployment interfaces. It would be a big burden and an unwelcome surprise for them if we do not let them know early enough.
> ./bin/flink run -m yarn-cluster -d submission leads to IllegalStateException > > > Key: FLINK-22509 > URL: https://issues.apache.org/jira/browse/FLINK-22509 > Project: Flink > Issue Type: Bug > Components: Deployment / YARN >Affects Versions: 1.13.0, 1.14.0 >Reporter: Robert Metzger >Assignee: Robert Metzger >Priority: Major > > Submitting a detached, per-job YARN cluster in Flink (like this: > {{./bin/flink run -m yarn-cluster -d > ./examples/streaming/TopSpeedWindowing.jar}}), leads to the following > exception: > {code} > 2021-04-28 11:39:00,786 INFO org.apache.flink.yarn.YarnClusterDescriptor > [] - Found Web Interface > ip-172-31-27-232.eu-central-1.compute.internal:45689 of application > 'application_1619607372651_0005'. > Job has been submitted with JobID 5543e81db9c2de78b646088891f23bfc > Exception in thread "Thread-4" java.lang.IllegalStateException: Trying to > access closed classloader. Please check if you store classloaders directly or > indirectly in static fields. If the stacktrace suggests that the leak occurs > in a third party library and cannot be fixed immediately, you can disable > this check with the configuration 'classloader.check-leaked-classloader'. 
> at > org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$SafetyNetWrapperClassLoader.ensureInner(FlinkUserCodeClassLoaders.java:164) > at > org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$SafetyNetWrapperClassLoader.getResource(FlinkUserCodeClassLoaders.java:183) > at > org.apache.hadoop.conf.Configuration.getResource(Configuration.java:2570) > at > org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2783) > at > org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2758) > at > org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2638) > at org.apache.hadoop.conf.Configuration.get(Configuration.java:1100) > at > org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1707) > at > org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1688) > at > org.apache.hadoop.util.ShutdownHookManager.getShutdownTimeout(ShutdownHookManager.java:183) > at > org.apache.hadoop.util.ShutdownHookManager.shutdownExecutor(ShutdownHookManager.java:145) > at > org.apache.hadoop.util.ShutdownHookManager.access$300(ShutdownHookManager.java:65) > at > org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:102) > {code} > The job is still running as expected. > Detached submission with {{./bin/flink run-application -t yarn-application > -d}} works as expected. This is also the documented approach. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (FLINK-22509) ./bin/flink run -m yarn-cluster -d submission leads to IllegalStateException
[ https://issues.apache.org/jira/browse/FLINK-22509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17337484#comment-17337484 ] Yang Wang edited comment on FLINK-22509 at 4/30/21, 4:10 PM: - [~knaufk] [~rmetzger] I agree with that we should deprecate {{-m yarn-cluster}}, which also means the {{FlinkYarnSessionCli}}. Instead, we need to suggest our users to migrate to the client unified executor interface {{--target yarn-per-job}}. However, I hesitate to deprecate it too soon(e.g. 1.13). The only reason why we have not deprecated it after introducing the unified executor interface is about the CLI compatibility. I am pretty sure a lot of companies are using the {{flink run}} to integrate with their deployers in production. Because we do not provide a very good client SDK or deploy interfaces. It will take a big burden and surprise for them if we do not let them know early enough. was (Author: fly_in_gis): [~knaufk] [~rmetzger] I agree with that we should deprecate {{\-m yarn-cluster}}, which also means the {{FlinkYarnSessionCli}}. Instead, we need to suggest our users to migrate to the client unified executor interface {{--target yarn-per-job}}. However, I hesitate to deprecate it too soon(e.g. 1.13). The only reason why we have not deprecate it after introducing the unified executor interface is about the CLI the compatibility. I am pretty sure a lot of companies are using the {{flink run}} to integrate with their deployers in production. Because we do not provide a very good client SDK or deploy interfaces. It will take a big burden and surprise for them if we do not let them know early enough. 
> ./bin/flink run -m yarn-cluster -d submission leads to IllegalStateException > > > Key: FLINK-22509 > URL: https://issues.apache.org/jira/browse/FLINK-22509 > Project: Flink > Issue Type: Bug > Components: Deployment / YARN >Affects Versions: 1.13.0, 1.14.0 >Reporter: Robert Metzger >Assignee: Robert Metzger >Priority: Major > > Submitting a detached, per-job YARN cluster in Flink (like this: > {{./bin/flink run -m yarn-cluster -d > ./examples/streaming/TopSpeedWindowing.jar}}), leads to the following > exception: > {code} > 2021-04-28 11:39:00,786 INFO org.apache.flink.yarn.YarnClusterDescriptor > [] - Found Web Interface > ip-172-31-27-232.eu-central-1.compute.internal:45689 of application > 'application_1619607372651_0005'. > Job has been submitted with JobID 5543e81db9c2de78b646088891f23bfc > Exception in thread "Thread-4" java.lang.IllegalStateException: Trying to > access closed classloader. Please check if you store classloaders directly or > indirectly in static fields. If the stacktrace suggests that the leak occurs > in a third party library and cannot be fixed immediately, you can disable > this check with the configuration 'classloader.check-leaked-classloader'. 
> at > org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$SafetyNetWrapperClassLoader.ensureInner(FlinkUserCodeClassLoaders.java:164) > at > org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$SafetyNetWrapperClassLoader.getResource(FlinkUserCodeClassLoaders.java:183) > at > org.apache.hadoop.conf.Configuration.getResource(Configuration.java:2570) > at > org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2783) > at > org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2758) > at > org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2638) > at org.apache.hadoop.conf.Configuration.get(Configuration.java:1100) > at > org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1707) > at > org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1688) > at > org.apache.hadoop.util.ShutdownHookManager.getShutdownTimeout(ShutdownHookManager.java:183) > at > org.apache.hadoop.util.ShutdownHookManager.shutdownExecutor(ShutdownHookManager.java:145) > at > org.apache.hadoop.util.ShutdownHookManager.access$300(ShutdownHookManager.java:65) > at > org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:102) > {code} > The job is still running as expected. > Detached submission with {{./bin/flink run-application -t yarn-application > -d}} works as expected. This is also the documented approach. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (FLINK-22509) ./bin/flink run -m yarn-cluster -d submission leads to IllegalStateException
[ https://issues.apache.org/jira/browse/FLINK-22509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17337484#comment-17337484 ] Yang Wang edited comment on FLINK-22509 at 4/30/21, 4:09 PM: - [~knaufk] [~rmetzger] I agree that we should deprecate {{-m yarn-cluster}}, which also means deprecating {{FlinkYarnSessionCli}}. Instead, we should encourage users to migrate to the unified client executor interface, {{--target yarn-per-job}}. However, I hesitate to deprecate it too soon (e.g., in 1.13). The only reason we have not deprecated it after introducing the unified executor interface is CLI compatibility. I am pretty sure a lot of companies use {{flink run}} to integrate with their deployers in production, because we do not provide a good client SDK or deployment interfaces. It would be a big burden and an unwelcome surprise for them if we do not give them enough advance notice.
[jira] [Commented] (FLINK-22509) ./bin/flink run -m yarn-cluster -d submission leads to IllegalStateException
[ https://issues.apache.org/jira/browse/FLINK-22509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17337484#comment-17337484 ] Yang Wang commented on FLINK-22509: --- [~knaufk] [~rmetzger] I agree that we should deprecate {{-m yarn-cluster}}, which also means deprecating {{FlinkYarnSessionCli}}. Instead, we should encourage users to migrate to the unified client executor interface, {{--target yarn-per-job}}. However, I hesitate to deprecate it too soon (e.g., in 1.13). The only reason we have not deprecated it after introducing the unified executor interface is CLI compatibility. I am pretty sure a lot of companies use {{flink run}} to integrate with their deployers in production, because we do not provide a good client SDK or deployment interfaces. It would be a big burden and an unwelcome surprise for them if we do not give them enough advance notice. -- This message was sent by Atlassian Jira (v8.3.4#803005)
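To summarize the submission variants discussed in this ticket as concrete commands (paths follow the ticket's example jar; treat this as an illustration, not new documentation):

```
# Legacy per-job syntax under discussion for deprecation
# (hits the exception above, though the job still runs):
./bin/flink run -m yarn-cluster -d ./examples/streaming/TopSpeedWindowing.jar

# Unified-executor equivalent suggested as the migration target:
./bin/flink run --target yarn-per-job -d ./examples/streaming/TopSpeedWindowing.jar

# Documented application-mode submission, which detaches as expected:
./bin/flink run-application -t yarn-application -d ./examples/streaming/TopSpeedWindowing.jar
```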
[GitHub] [flink] yittg commented on a change in pull request #15768: [FLINK-22451][table] Support (*) as parameter of UDFs in Table API
yittg commented on a change in pull request #15768: URL: https://github.com/apache/flink/pull/15768#discussion_r623990859

## File path: flink-table/flink-table-planner-blink/src/test/scala/org/apache/flink/table/planner/runtime/stream/table/CalcITCase.scala

@@ -529,6 +529,27 @@ class CalcITCase(mode: StateBackendMode) extends StreamingWithStateTestBase(mode
     val expected = List("0,0,0", "1,1,1", "2,2,2")
     assertEquals(expected.sorted, sink.getAppendResults.sorted)
   }
+
+  /**
+   * This is an edition of [[CalcITCase.testMap()]] that uses * as one argument of the UDFs.
+   */
+  @Test
+  def testUserDefinedFunctionWithStarParameter(): Unit = {
+    val ds = env.fromCollection(smallTupleData3).toTable(tEnv, 'a, 'b, 'c)
+      .map(Func23('*)).as("a", "b", "c", "d")
+      .map(Func24('*)).as("a", "b", "c", "d")
+      .map(Func1('b))

Review comment: original test case added

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #15656: [FLINK-22233][Table SQL / API]Modified the spelling error of word "constant" in source code
flinkbot edited a comment on pull request #15656: URL: https://github.com/apache/flink/pull/15656#issuecomment-822135263 ## CI report: * fb69e9a4c5c897e75659c28915f9b9aa08e2f0c3 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17466) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #15820: [FLINK-22535][runtime] CleanUp is invoked for task even when the task…
flinkbot edited a comment on pull request #15820: URL: https://github.com/apache/flink/pull/15820#issuecomment-830052015 ## CI report: * c548fc5c79450d898eebc1c6523be98472fe1cdd Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17467) * c11e6cb0c0817a53ef456462f610e3d079f590f9 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17472) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #15760: [FLINK-19606][table-runtime-blink] Introduce WindowJoinOperator and WindowJoinOperatorBuilder.
flinkbot edited a comment on pull request #15760: URL: https://github.com/apache/flink/pull/15760#issuecomment-826500326 ## CI report: * 7fadefe78bb0f8b0d7b61ee2a105e1cb3b5d17b5 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17459) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] yittg commented on a change in pull request #15768: [FLINK-22451][table] Support (*) as parameter of UDFs in Table API
yittg commented on a change in pull request #15768: URL: https://github.com/apache/flink/pull/15768#discussion_r623972440

## File path: flink-table/flink-table-planner-blink/src/test/scala/org/apache/flink/table/planner/runtime/stream/table/CalcITCase.scala

@@ -529,6 +529,27 @@ class CalcITCase(mode: StateBackendMode) extends StreamingWithStateTestBase(mode
     val expected = List("0,0,0", "1,1,1", "2,2,2")
     assertEquals(expected.sorted, sink.getAppendResults.sorted)
   }
+
+  /**
+   * This is an edition of [[CalcITCase.testMap()]] that uses * as one argument of the UDFs.
+   */
+  @Test
+  def testUserDefinedFunctionWithStarParameter(): Unit = {
+    val ds = env.fromCollection(smallTupleData3).toTable(tEnv, 'a, 'b, 'c)
+      .map(Func23('*)).as("a", "b", "c", "d")
+      .map(Func24('*)).as("a", "b", "c", "d")
+      .map(Func1('b))

Review comment: hha, I just modified the existing testMap case; let me rewrite the original substring case here.

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #14172: [FLINK-19545][e2e] Add e2e test for native Kubernetes HA
flinkbot edited a comment on pull request #14172: URL: https://github.com/apache/flink/pull/14172#issuecomment-73215 ## CI report: * bbe6f4a0e363170970b29269af446f0d93578e5f Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17469) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #15820: [FLINK-22535][runtime] CleanUp is invoked for task even when the task…
flinkbot edited a comment on pull request #15820: URL: https://github.com/apache/flink/pull/15820#issuecomment-830052015 ## CI report: * c548fc5c79450d898eebc1c6523be98472fe1cdd Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17467) * c11e6cb0c0817a53ef456462f610e3d079f590f9 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #15817: [FLINK-14393][webui] Add an option to enable/disable cancel job in we…
flinkbot edited a comment on pull request #15817: URL: https://github.com/apache/flink/pull/15817#issuecomment-829959024 ## CI report: * c14593a5ccd7896b041421ee4d3e433a7afc34ac Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17460) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] akalash commented on pull request #15820: [FLINK-22535][runtime] CleanUp is invoked for task even when the task…
akalash commented on pull request #15820: URL: https://github.com/apache/flink/pull/15820#issuecomment-830157511 @gaoyunhaii , @pnowojski, Thanks guys for your comments. I fixed them and reorganized my commits. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] akalash commented on a change in pull request #15820: [FLINK-22535][runtime] CleanUp is invoked for task even when the task…
akalash commented on a change in pull request #15820: URL: https://github.com/apache/flink/pull/15820#discussion_r623948875

## File path: flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/StreamTask.java

@@ -609,42 +615,54 @@ private void ensureNotCanceled() {
     @Override
     public final void invoke() throws Exception {
-        try {
-            // Allow invoking method 'invoke' without having to call 'restore' before it.
-            if (!isRunning) {
-                LOG.debug("Restoring during invoke will be called.");
-                restore();
-            }
+        runWithCleanUpOnFail(this::executeInvoke);
+
+        cleanUpInvoke();
+    }
+
+    private void executeInvoke() throws Exception {
+        // Allow invoking method 'invoke' without having to call 'restore' before it.
+        if (!isRunning) {
+            LOG.debug("Restoring during invoke will be called.");
+            restore();
+        }

-            // final check to exit early before starting to run
-            ensureNotCanceled();
+        // final check to exit early before starting to run
+        ensureNotCanceled();

-            // let the task do its work
-            runMailboxLoop();
+        // let the task do its work
+        runMailboxLoop();

-            // if this left the run() method cleanly despite the fact that this was canceled,
-            // make sure the "clean shutdown" is not attempted
-            ensureNotCanceled();
+        // if this left the run() method cleanly despite the fact that this was canceled,
+        // make sure the "clean shutdown" is not attempted
+        ensureNotCanceled();

-            afterInvoke();
+        afterInvoke();
+    }
+
+    private void runWithCleanUpOnFail(RunnableWithException run) throws Exception {
+        try {
+            run.run();
         } catch (Throwable invokeException) {
             failing = !canceled;
             try {
                 if (!canceled) {
-                    cancelTask();
+                    try {
+                        cancelTask();
+                    } catch (Throwable ex) {
+                        invokeException = firstOrSuppressed(ex, invokeException);
+                    }

Review comment: 1. done 2. now it has

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
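The pattern in the diff above — run the body, and if it throws, attempt cancellation while folding any cancellation failure into the primary exception as a suppressed one — can be sketched standalone. This is an illustrative, hedged re-implementation: the method names mirror the PR, but `firstOrSuppressed` is rewritten here rather than taken from Flink's actual ExceptionUtils, and the real StreamTask logic has more state (canceled/failing flags) than this sketch.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class CleanUpOnFailSketch {
    interface RunnableWithException { void run() throws Exception; }

    // Keep the first exception and attach the later one as suppressed,
    // mirroring the semantics of Flink's ExceptionUtils.firstOrSuppressed.
    static Throwable firstOrSuppressed(Throwable newer, Throwable previous) {
        if (previous == null) return newer;
        previous.addSuppressed(newer);
        return previous;
    }

    static void runWithCleanUpOnFail(RunnableWithException body, RunnableWithException cancel)
            throws Exception {
        try {
            body.run();
        } catch (Throwable invokeException) {
            try {
                cancel.run();
            } catch (Throwable ex) {
                // The cancel failure must not mask the original failure.
                invokeException = firstOrSuppressed(ex, invokeException);
            }
            if (invokeException instanceof Exception) throw (Exception) invokeException;
            throw (Error) invokeException;
        }
    }

    public static void main(String[] args) {
        AtomicBoolean cancelled = new AtomicBoolean(false);
        try {
            runWithCleanUpOnFail(
                    () -> { throw new IllegalStateException("task failed"); },
                    () -> { cancelled.set(true); throw new RuntimeException("cancel failed"); });
        } catch (Exception e) {
            // The original failure wins; the cancellation failure rides along as suppressed.
            System.out.println(e.getMessage());
            System.out.println(e.getSuppressed()[0].getMessage());
        }
        System.out.println(cancelled.get());
    }
}
```

The key design point, which the PR's nested try/catch preserves, is that a failing `cancelTask()` never replaces the exception that caused the failure in the first place.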
[GitHub] [flink] flinkbot edited a comment on pull request #15768: [FLINK-22451][table] Support (*) as parameter of UDFs in Table API
flinkbot edited a comment on pull request #15768: URL: https://github.com/apache/flink/pull/15768#issuecomment-826735938 ## CI report: * 48814bd4d49167579110479e3f3cecabff6d443a Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17307) * 11787dc2e0c372007bddc148e0d4aec2ed9a275f Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17470) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #15728: [FLINK-22379][runtime] CheckpointCoordinator checks the state of all …
flinkbot edited a comment on pull request #15728: URL: https://github.com/apache/flink/pull/15728#issuecomment-824960796 ## CI report: * d4f85948ee5281b189ba948a8ec512f62587f979 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17458) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] wuchong commented on a change in pull request #15768: [FLINK-22451][table] Support (*) as parameter of UDFs in Table API
wuchong commented on a change in pull request #15768: URL: https://github.com/apache/flink/pull/15768#discussion_r623933402

## File path: flink-table/flink-table-planner-blink/src/test/scala/org/apache/flink/table/planner/runtime/stream/table/CalcITCase.scala

@@ -529,6 +529,27 @@ class CalcITCase(mode: StateBackendMode) extends StreamingWithStateTestBase(mode
     val expected = List("0,0,0", "1,1,1", "2,2,2")
     assertEquals(expected.sorted, sink.getAppendResults.sorted)
   }
+
+  /**
+   * This is an edition of [[CalcITCase.testMap()]] that uses * as one argument of the UDFs.
+   */
+  @Test
+  def testUserDefinedFunctionWithStarParameter(): Unit = {
+    val ds = env.fromCollection(smallTupleData3).toTable(tEnv, 'a, 'b, 'c)
+      .map(Func23('*)).as("a", "b", "c", "d")
+      .map(Func24('*)).as("a", "b", "c", "d")
+      .map(Func1('b))

Review comment: Could you also test `where` and `select` too? I remember you tested them in the original test.

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-18027) ROW value constructor cannot deal with complex expressions
[ https://issues.apache.org/jira/browse/FLINK-18027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17337429#comment-17337429 ] Jark Wu commented on FLINK-18027: - Is it possible to register a custom {{ROW}} function to override the built-in {{SqlStdOperatorTable.ROW}}? I remember we override the {{JSON_VALUE}} function in internal branch, and {{JSON_VALUE}} is also defined in parser. > ROW value constructor cannot deal with complex expressions > -- > > Key: FLINK-18027 > URL: https://issues.apache.org/jira/browse/FLINK-18027 > Project: Flink > Issue Type: Bug > Components: Table SQL / API >Reporter: Benchao Li >Priority: Major > Labels: pull-request-available > > {code:java} > create table my_source ( > my_row row > ) with (...); > create table my_sink ( > my_row row > ) with (...); > insert into my_sink > select ROW(my_row.a, my_row.b) > from my_source;{code} > will throw excepions: > {code:java} > Exception in thread "main" org.apache.flink.table.api.SqlParserException: SQL > parse failed. Encountered "." at line 1, column 18.Exception in thread "main" > org.apache.flink.table.api.SqlParserException: SQL parse failed. Encountered > "." at line 1, column 18.Was expecting one of: ")" ... "," ... at > org.apache.flink.table.planner.calcite.CalciteParser.parse(CalciteParser.java:56) > at > org.apache.flink.table.planner.delegation.ParserImpl.parse(ParserImpl.java:64) > at > org.apache.flink.table.api.internal.TableEnvironmentImpl.sqlQuery(TableEnvironmentImpl.java:627) > at com.bytedance.demo.KafkaTableSource.main(KafkaTableSource.java:76)Caused > by: org.apache.calcite.sql.parser.SqlParseException: Encountered "." at line > 1, column 18.Was expecting one of: ")" ... "," ... 
at > org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.convertException(FlinkSqlParserImpl.java:416) > at > org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.normalizeException(FlinkSqlParserImpl.java:201) > at > org.apache.calcite.sql.parser.SqlParser.handleException(SqlParser.java:148) > at org.apache.calcite.sql.parser.SqlParser.parseQuery(SqlParser.java:163) at > org.apache.calcite.sql.parser.SqlParser.parseStmt(SqlParser.java:188) at > org.apache.flink.table.planner.calcite.CalciteParser.parse(CalciteParser.java:54) > ... 3 moreCaused by: org.apache.flink.sql.parser.impl.ParseException: > Encountered "." at line 1, column 18.Was expecting one of: ")" ... "," > ... at > org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.generateParseException(FlinkSqlParserImpl.java:36161) > at > org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.jj_consume_token(FlinkSqlParserImpl.java:35975) > at > org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.ParenthesizedSimpleIdentifierList(FlinkSqlParserImpl.java:21432) > at > org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.Expression3(FlinkSqlParserImpl.java:17164) > at > org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.Expression2b(FlinkSqlParserImpl.java:16820) > at > org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.Expression2(FlinkSqlParserImpl.java:16861) > at > org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.Expression(FlinkSqlParserImpl.java:16792) > at > org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.SelectExpression(FlinkSqlParserImpl.java:11091) > at > org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.SelectItem(FlinkSqlParserImpl.java:10293) > at > org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.SelectList(FlinkSqlParserImpl.java:10267) > at > org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.SqlSelect(FlinkSqlParserImpl.java:6943) > at > org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.LeafQuery(FlinkSqlParserImpl.java:658) > at > 
org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.LeafQueryOrExpr(FlinkSqlParserImpl.java:16775) > at > org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.QueryOrExpr(FlinkSqlParserImpl.java:16238) > at > org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.OrderedQueryOrExpr(FlinkSqlParserImpl.java:532) > at > org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.SqlStmt(FlinkSqlParserImpl.java:3761) > at > org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.SqlStmtEof(FlinkSqlParserImpl.java:3800) > at > org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.parseSqlStmtEof(FlinkSqlParserImpl.java:248) > at org.apache.calcite.sql.parser.SqlParser.parseQuery(SqlParser.java:161) > ... 5 more > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (FLINK-22424) Writing to already released buffers potentially causing data corruption during job failover/cancellation
[ https://issues.apache.org/jira/browse/FLINK-22424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski closed FLINK-22424. -- Resolution: Fixed merged to release-1.13 as da26733e484 and 65319b256c8 > Writing to already released buffers potentially causing data corruption > during job failover/cancellation > > > Key: FLINK-22424 > URL: https://issues.apache.org/jira/browse/FLINK-22424 > Project: Flink > Issue Type: Bug > Components: Runtime / Network >Affects Versions: 1.6.4, 1.7.2, 1.8.3, 1.9.3, 1.10.3, 1.11.3, 1.12.2, > 1.13.0 >Reporter: Piotr Nowojski >Assignee: Piotr Nowojski >Priority: Critical > Labels: pull-request-available > Fix For: 1.11.4, 1.14.0, 1.13.1, 1.12.4 > > > I modified the code to not re-use the same memory segments, but on recycling > always free up the segment. And what I have observed is a similar problem as > reported in FLINK-21181 ticket, but even more severe: > {noformat} > Caused by: java.lang.RuntimeException: segment has been freed > at > org.apache.flink.streaming.runtime.io.RecordWriterOutput.pushToRecordWriter(RecordWriterOutput.java:109) > at > org.apache.flink.streaming.runtime.io.RecordWriterOutput.collect(RecordWriterOutput.java:93) > at > org.apache.flink.streaming.runtime.io.RecordWriterOutput.collect(RecordWriterOutput.java:44) > at > org.apache.flink.streaming.api.operators.CountingOutput.collect(CountingOutput.java:50) > at > org.apache.flink.streaming.api.operators.CountingOutput.collect(CountingOutput.java:28) > at > org.apache.flink.streaming.api.operators.TimestampedCollector.collect(TimestampedCollector.java:50) > at > org.apache.flink.test.checkpointing.UnalignedCheckpointStressITCase$ReEmitAll.process(UnalignedCheckpointStressITCase.java:477) > at > org.apache.flink.test.checkpointing.UnalignedCheckpointStressITCase$ReEmitAll.process(UnalignedCheckpointStressITCase.java:468) > at > 
org.apache.flink.streaming.runtime.operators.windowing.functions.InternalIterableProcessWindowFunction.process(InternalIterableProcessWindowFunction.java:57) > at > org.apache.flink.streaming.runtime.operators.windowing.functions.InternalIterableProcessWindowFunction.process(InternalIterableProcessWindowFunction.java:32) > at > org.apache.flink.streaming.runtime.operators.windowing.WindowOperator.emitWindowContents(WindowOperator.java:577) > at > org.apache.flink.streaming.runtime.operators.windowing.WindowOperator.onProcessingTime(WindowOperator.java:533) > at > org.apache.flink.streaming.api.operators.InternalTimerServiceImpl.onProcessingTime(InternalTimerServiceImpl.java:284) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.invokeProcessingTimeCallback(StreamTask.java:1395) > ... 11 more > Caused by: java.lang.IllegalStateException: segment has been freed > at > org.apache.flink.core.memory.MemorySegment.put(MemorySegment.java:483) > at > org.apache.flink.core.memory.MemorySegment.put(MemorySegment.java:1398) > at > org.apache.flink.runtime.io.network.buffer.BufferBuilder.append(BufferBuilder.java:100) > at > org.apache.flink.runtime.io.network.buffer.BufferBuilder.appendAndCommit(BufferBuilder.java:82) > at > org.apache.flink.runtime.io.network.partition.BufferWritingResultPartition.appendUnicastDataForNewRecord(BufferWritingResultPartition.java:250) > at > org.apache.flink.runtime.io.network.partition.BufferWritingResultPartition.emitRecord(BufferWritingResultPartition.java:142) > at > org.apache.flink.runtime.io.network.api.writer.RecordWriter.emit(RecordWriter.java:104) > at > org.apache.flink.runtime.io.network.api.writer.ChannelSelectorRecordWriter.emit(ChannelSelectorRecordWriter.java:54) > at > org.apache.flink.streaming.runtime.io.RecordWriterOutput.pushToRecordWriter(RecordWriterOutput.java:107) > ... 24 more > {noformat} > That's happening also during cancellation/job failover. 
It's failing when > trying to write to already `free`'ed up buffer. Without my changes, this code > would silently write some data to a buffer that has already been > recycled/returned to the pool. If someone else would pick up this buffer, it > would easily lead to the data corruption. > As far as I can tell, the exact reason behind this is that the buffer to > which timer attempts to write to, has been released from > `ResultSubpartition#onConsumedSubpartition`, causing `BufferConsumer` to be > closed (which recycles/frees underlying memory segment ), while matching > `BufferBuilder` is still being used... -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (FLINK-10644) Batch Job: Speculative execution
[ https://issues.apache.org/jira/browse/FLINK-10644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17334427#comment-17334427 ] wangwj edited comment on FLINK-10644 at 4/30/21, 2:40 PM: -- [~trohrmann] Hi Till. I am from the search and recommendation department of Alibaba in China, and I'm happy to share and discuss my work here. Our big data processing platform uses Flink Batch to process extremely large volumes of data every day. Many long-tail tasks are produced daily, and we have to kill these processes manually, which leads to a poor user experience, so I tried to solve this problem. Speculative execution means two executions of an ExecutionVertex running at the same time, whereas failover means two tasks running at two different times. Based on this, the feature (speculative execution) is theoretically achievable. I have implemented speculative execution for batch jobs based on Blink, and it has had a significant effect in our production cluster. I did the following:
(1) Which kind of ExecutionJobVertex is suitable for the speculative execution feature in a batch job? The feature correlates with the implementation details of region failover, so an ExecutionJobVertex enables speculative execution only if all of its input and output edges are blocking (Condition A).
(2) How to identify a long-tail task? I identify long-tail tasks based on the interval between the current time and the execution's first create/deploying time before failover. When an ExecutionJobVertex meets Condition A and a configurable percentage of its executions have finished, the speculative execution thread starts to do real work: when the running time of an execution is greater than a configurable multiple of the median running time of the other finished executions in the ExecutionJobVertex, that execution is classified as long-tail (Condition B).
(3) How to make the speculative execution algorithm more precise? Based on the algorithm in Condition B, I solved the problem of long-tail tasks in our production cluster. As a next step, we may feed task throughput into the algorithm via the TaskManagers' heartbeat with the JobManager.
(4) How to schedule another execution in the same ExecutionVertex? We changed currentExecution in ExecutionVertex to a list, so there can be multiple executions in an ExecutionVertex at the same time, and we reuse the current scheduling logic to schedule the speculative execution.
(5) How to make the speculative task run on a different machine from the original execution? We implemented a per-job, machine-dimension blacklist. A machine's IP is added to the blacklist when an execution on it is recognized as long-tail, and removed when the entry expires. When executions are scheduled, we pass the blacklist to YARN's PlacementConstraint, which ensures the YARN container is not placed on a blacklisted machine.
(6) How to avoid errors when multiple executions finish at the same time in an ExecutionVertex? In ExecutionVertex's executionFinished() method, multi-thread synchronization ensures that only one execution can successfully finish in an ExecutionVertex; all the other executions go through the cancellation logic.
(7) How to deal with multiple sink files in one ExecutionVertex when the job sinks to files? When a batch job sinks to files, we add an executionAttemptID suffix to each file name, and finally delete or rename these files in finalizeOnMaster(). Here we should pay attention to the case of Flink streaming jobs processing bounded data sets.
(8) In a batch job with an all-to-all shuffle, how do the downstream original and speculative executions select the ResultSubPartition of the upstream executions? Two executions of an upstream ExecutionVertex produce two ResultPartitions. When the upstream ExecutionVertex finishes, we update the input channel of the downstream execution to point to the upstream execution that finished first. Note that when a downstream execution hits a DataConsumptionException, it restarts against the upstream execution that has already finished.
(9) How to display information about a speculative task on the Flink web UI? After implementing this feature, when the speculative execution runs faster than the original execution, the Flink UI shows the original task as cancelled, but the result of the job is correct, which fully matches our expectations. I don't know much about the web UI, so I will ask a colleague for help.
[~trohrmann] My implementation has played a big role in our production cluster at Alibaba.
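Two of the mechanisms in the comment above can be sketched in a few lines. The class and method names below are hypothetical (this is not actual Flink code): a median-based long-tail check as in Condition B, and a compare-and-set guard for point (6)'s "exactly one execution may finish" rule.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicBoolean;

/**
 * Illustrative sketch (hypothetical names, not Flink classes) of two ideas
 * from the comment above: Condition B's median-based long-tail detection and
 * point (6)'s single-winner finish synchronization.
 */
public class SpeculationSketch {

    /** Median of the given running times (milliseconds), on a sorted copy. */
    static double median(long[] runningTimesMs) {
        long[] sorted = runningTimesMs.clone();
        Arrays.sort(sorted);
        int n = sorted.length;
        return n % 2 == 1 ? sorted[n / 2] : (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;
    }

    /**
     * Condition B: an execution is long-tail when its running time exceeds a
     * configurable multiple of the median running time of finished executions.
     */
    static boolean isLongTail(long runningTimeMs, long[] finishedRunningTimesMs, double multiplier) {
        if (finishedRunningTimesMs.length == 0) {
            return false; // not enough finished executions to compare against
        }
        return runningTimeMs > multiplier * median(finishedRunningTimesMs);
    }

    /** Point (6): compare-and-set lets exactly one racing execution "finish". */
    static class FinishGuard {
        private final AtomicBoolean finished = new AtomicBoolean(false);

        /** Returns true for exactly one caller; all other callers should cancel. */
        boolean tryMarkFinished() {
            return finished.compareAndSet(false, true);
        }
    }
}
```

The AtomicBoolean guard is one of several possible implementations of the synchronization described in point (6); the comment does not say which primitive was actually used.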
[jira] [Updated] (FLINK-20724) Create a http handler for aggregating metrics from whole job
[ https://issues.apache.org/jira/browse/FLINK-20724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski updated FLINK-20724: --- Affects Version/s: 1.13.0 1.12.3 > Create a http handler for aggregating metrics from whole job > > > Key: FLINK-20724 > URL: https://issues.apache.org/jira/browse/FLINK-20724 > Project: Flink > Issue Type: Improvement > Components: Runtime / Metrics > Affects Versions: 1.13.0, 1.12.3 > Reporter: Piotr Nowojski > Priority: Major > Labels: stale-major > > This is an optimisation idea. > Create an http handler similar to {{AggregatingSubtasksMetricsHandler}}, but one that would aggregate metrics per task, from all of the job vertices. The new handler would take only {{JobID}} as a parameter, so that the Web UI can obtain {{max(isBackPressureRatio)}} / {{max(isCausingBackPressureRatio)}} per task in the job graph in one RPC. > This is related to FLINK-14712, where we invoke more REST calls to get the statistics per task/node (in order to color the nodes based on the back pressure and busy times). With this new handler, the WebUI could make one REST call to get all the metrics it needs, instead of one REST call per task.
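The aggregation the proposed handler would perform can be sketched as follows. The class and method names are hypothetical, not the actual handler API: for each task in the job graph, take the maximum of a metric over all of its subtasks.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical sketch (not the actual Flink handler API) of the per-task
 * aggregation proposed in FLINK-20724: for each task, the max of a metric
 * such as isBackPressureRatio across all of that task's subtasks.
 */
public class PerTaskMaxAggregator {

    /** Aggregates subtask metric values to one max value per task name. */
    static Map<String, Double> maxPerTask(Map<String, List<Double>> subtaskMetricsByTask) {
        Map<String, Double> result = new HashMap<>();
        subtaskMetricsByTask.forEach((task, subtaskValues) ->
                result.put(task, subtaskValues.stream()
                        .mapToDouble(Double::doubleValue)
                        .max()
                        .orElse(Double.NaN))); // no subtasks reported a value
        return result;
    }
}
```

With such an aggregation behind a single `JobID`-parameterized endpoint, the Web UI would need one REST call per job rather than one per task.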
[jira] [Updated] (FLINK-20724) Create a http handler for aggregating metrics from whole job
[ https://issues.apache.org/jira/browse/FLINK-20724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski updated FLINK-20724: --- Fix Version/s: (was: 1.14.0)
[jira] [Commented] (FLINK-22506) YARN job cluster stuck in retrying creating JobManager if savepoint is corrupted
[ https://issues.apache.org/jira/browse/FLINK-22506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17337421#comment-17337421 ] Paul Lin commented on FLINK-22506: -- [~knaufk] Actually I'm not using the application mode, and this issue has been around for a very long time. I've tried 1.12.1, and the problem still exists. [~trohrmann] I agree that it's hard to distinguish non-retryable errors from the other ones. A simple way to solve the problem would be to mark the attempt as failed when a retryable or non-retryable error occurs, and leave it to YARN to decide whether the application should be restarted. The total number of restarts would be bounded by `yarn.application-attempts` and `yarn.application-attempt-failures-validity-interval`. > YARN job cluster stuck in retrying creating JobManager if savepoint is > corrupted > > > Key: FLINK-22506 > URL: https://issues.apache.org/jira/browse/FLINK-22506 > Project: Flink > Issue Type: Improvement > Components: Deployment / YARN > Affects Versions: 1.11.3 > Reporter: Paul Lin > Priority: Major > > If a non-retryable error (e.g. the savepoint is corrupted or inaccessible) occurs during the initiation of the job manager, the job cluster exits with an error code. But since it does not mark the attempt as failed, it won't be counted as a failed attempt, and YARN will keep retrying forever.
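The two YARN-related options named in the comment above are existing Flink configuration keys; a minimal flink-conf.yaml fragment (values purely illustrative) showing how they would bound the restarts:

```yaml
# Cap how many times YARN restarts the application master (value illustrative).
yarn.application-attempts: 3
# Attempt failures older than this window (in ms) no longer count toward the cap.
yarn.application-attempt-failures-validity-interval: 600000
```

With these set, even repeated failures on a corrupted savepoint would exhaust the attempt budget instead of retrying forever.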
[jira] [Closed] (FLINK-15550) testCancelTaskExceptionAfterTaskMarkedFailed failed on azure
[ https://issues.apache.org/jira/browse/FLINK-15550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski closed FLINK-15550. -- Resolution: Cannot Reproduce > testCancelTaskExceptionAfterTaskMarkedFailed failed on azure > > > Key: FLINK-15550 > URL: https://issues.apache.org/jira/browse/FLINK-15550 > Project: Flink > Issue Type: Bug > Components: Runtime / Task >Affects Versions: 1.11.0 >Reporter: Yun Tang >Priority: Minor > Labels: auto-deprioritized-major > > Instance: > https://dev.azure.com/rmetzger/Flink/_build/results?buildId=4241=ms.vss-test-web.build-test-results-tab=12434=108939=debug > {code:java} > java.lang.AssertionError: expected: but was: > at > org.apache.flink.runtime.taskmanager.TaskTest.testCancelTaskExceptionAfterTaskMarkedFailed(TaskTest.java:525) > {code} > {code:java} > expected: but was: > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot edited a comment on pull request #15768: [FLINK-22451][table] Support (*) as parameter of UDFs in Table API
flinkbot edited a comment on pull request #15768: URL: https://github.com/apache/flink/pull/15768#issuecomment-826735938 ## CI report: * 48814bd4d49167579110479e3f3cecabff6d443a Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17307) * 11787dc2e0c372007bddc148e0d4aec2ed9a275f UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #14172: [FLINK-19545][e2e] Add e2e test for native Kubernetes HA
flinkbot edited a comment on pull request #14172: URL: https://github.com/apache/flink/pull/14172#issuecomment-73215 ## CI report: * 6a74f815c70c03f2b374f4d422e857468bef9f12 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10070) * bbe6f4a0e363170970b29269af446f0d93578e5f Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17469)
[GitHub] [flink] yittg commented on a change in pull request #15768: [FLINK-22451][table] Support (*) as parameter of UDFs in Table API
yittg commented on a change in pull request #15768: URL: https://github.com/apache/flink/pull/15768#discussion_r623903170 ## File path: flink-table/flink-table-planner-blink/src/test/java/org/apache/flink/table/planner/functions/CallFunctionITCase.java ## @@ -0,0 +1,98 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.flink.table.planner.functions; + +import org.apache.flink.runtime.testutils.MiniClusterResourceConfiguration; +import org.apache.flink.table.annotation.DataTypeHint; +import org.apache.flink.table.annotation.InputGroup; +import org.apache.flink.table.api.EnvironmentSettings; +import org.apache.flink.table.api.Table; +import org.apache.flink.table.api.TableEnvironment; +import org.apache.flink.table.functions.ScalarFunction; +import org.apache.flink.test.util.MiniClusterWithClientResource; +import org.apache.flink.types.Row; +import org.apache.flink.util.CloseableIterator; + +import org.apache.flink.shaded.guava18.com.google.common.collect.Lists; + +import org.junit.ClassRule; +import org.junit.Test; + +import java.util.ArrayList; +import java.util.List; + +import static org.apache.flink.table.api.Expressions.$; +import static org.apache.flink.table.api.Expressions.call; +import static org.hamcrest.CoreMatchers.equalTo; +import static org.junit.Assert.assertThat; + +/** IT tests for call functions. */ +public class CallFunctionITCase { Review comment: @wuchong Thanks, i have fixed it. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-22256) Persist checkpoint type information
[ https://issues.apache.org/jira/browse/FLINK-22256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Rohrmann updated FLINK-22256: -- Description: As a user, it is retrospectively difficult to determine what kind of checkpoint (i.e. incremental, unaligned, ...) was performed when looking only at the persisted checkpoint metadata. The only way would be to look into the execution configuration of the job which might not be available anymore and can be scattered across the application code and cluster configuration. It would be highly beneficial if such information would be part of the persisted metadata to not track these external pointers. It would also be great to persist the metadata information in a standardized format so that external projects don't need to use Flink's metadata serializers to access it. was: As a user, it is retrospectively difficult to determine what kind of checkpoint (i.e. incremental, unaligned, ...) was performed when looking only at the persisted checkpoint metadata. The only way would be to look into the execution configuration of the job which might not be available anymore and can be scattered across the application code and cluster configuration. It would be highly beneficial if such information would be part of the persisted metadata to not track these external pointers. > Persist checkpoint type information > --- > > Key: FLINK-22256 > URL: https://issues.apache.org/jira/browse/FLINK-22256 > Project: Flink > Issue Type: Improvement > Components: Runtime / Checkpointing >Reporter: Fabian Paul >Priority: Major > > As a user, it is retrospectively difficult to determine what kind of > checkpoint (i.e. incremental, unaligned, ...) was performed when looking only > at the persisted checkpoint metadata. > The only way would be to look into the execution configuration of the job > which might not be available anymore and can be scattered across the > application code and cluster configuration. 
> It would be highly beneficial if such information would be part of the > persisted metadata to not track these external pointers. > It would also be great to persist the metadata information in a standardized > format so that external projects don't need to use Flink's metadata > serializers to access it. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-22256) Persist checkpoint type information
[ https://issues.apache.org/jira/browse/FLINK-22256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Rohrmann updated FLINK-22256: -- Priority: Major (was: Minor)
[GitHub] [flink] flinkbot edited a comment on pull request #15819: [FLINK-22512][Hive] Fix issue that can't call current_timestamp with hive dialect for hive-3.1
flinkbot edited a comment on pull request #15819: URL: https://github.com/apache/flink/pull/15819#issuecomment-829984795 ## CI report: * 4c52c0c429eb766f22ac20a71e0dcd80b28ff718 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17457)
[GitHub] [flink] flinkbot edited a comment on pull request #14172: [FLINK-19545][e2e] Add e2e test for native Kubernetes HA
flinkbot edited a comment on pull request #14172: URL: https://github.com/apache/flink/pull/14172#issuecomment-73215 ## CI report: * 6a74f815c70c03f2b374f4d422e857468bef9f12 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10070) * bbe6f4a0e363170970b29269af446f0d93578e5f UNKNOWN
[GitHub] [flink] flinkbot edited a comment on pull request #15818: [FLINK-22539][python][docs] Restructure the Python dependency management documentation
flinkbot edited a comment on pull request #15818: URL: https://github.com/apache/flink/pull/15818#issuecomment-829970320 ## CI report: * d44f98b523a36e71a1f265768f4ea55a3aeb28ff Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17461)
[GitHub] [flink] flinkbot edited a comment on pull request #15789: [FLINK-21181][runtime] Wait for Invokable cancellation before releasing network resources
flinkbot edited a comment on pull request #15789: URL: https://github.com/apache/flink/pull/15789#issuecomment-827996694 ## CI report: * 4c5180310bf76e96f2665bf53531eccb1fa86421 UNKNOWN * f87ee85814873dee4ed181eee11df9a6758a916e Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17443) * 46e1be2c4832080bf1cb48c509b77cd88872d024 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17468)
[GitHub] [flink] flinkbot edited a comment on pull request #15817: [FLINK-14393][webui] Add an option to enable/disable cancel job in we…
flinkbot edited a comment on pull request #15817: URL: https://github.com/apache/flink/pull/15817#issuecomment-829959024 ## CI report: * 31f01fe2108f5e18f26f50e352d3ed353bd8ffc8 Azure: [CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17455) * c14593a5ccd7896b041421ee4d3e433a7afc34ac Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17460)
[GitHub] [flink] tillrohrmann commented on pull request #14172: [FLINK-19545][e2e] Add e2e test for native Kubernetes HA
tillrohrmann commented on pull request #14172: URL: https://github.com/apache/flink/pull/14172#issuecomment-830086650 Also, it seems that the test does not fully clean up the `minikube` cluster and Docker. Not sure whether this is due to my local setup or not.
[jira] [Commented] (FLINK-19916) Hadoop3 ShutdownHookManager visit closed ClassLoader
[ https://issues.apache.org/jira/browse/FLINK-19916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17337377#comment-17337377 ] Robert Metzger commented on FLINK-19916: I've reset the priority to Major since multiple tickets have been opened about this issue already. > Hadoop3 ShutdownHookManager visit closed ClassLoader > > > Key: FLINK-19916 > URL: https://issues.apache.org/jira/browse/FLINK-19916 > Project: Flink > Issue Type: Bug > Components: Connectors / Hadoop Compatibility >Reporter: Jingsong Lee >Priority: Major > Labels: auto-deprioritized-major > > {code:java} > Exception in thread "Thread-10" java.lang.IllegalStateException: Trying to > access closed classloader. Please check if you store classloaders directly or > indirectly in static fields. If the stacktrace suggests that the leak occurs > in a third party library and cannot be fixed immediately, you can disable > this check with the configuration 'classloader.check-leaked-classloader'. > at > org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$SafetyNetWrapperClassLoader.ensureInner(FlinkUserCodeClassLoaders.java:161) > at > org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$SafetyNetWrapperClassLoader.getResource(FlinkUserCodeClassLoaders.java:179) > at > org.apache.hadoop.conf.Configuration.getResource(Configuration.java:2780) > at > org.apache.hadoop.conf.Configuration.getStreamReader(Configuration.java:3036) > at > org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2995) > at > org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2968) > at > org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2848) > at org.apache.hadoop.conf.Configuration.get(Configuration.java:1200) > at > org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1812) > at > org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1789) > at > 
org.apache.hadoop.util.ShutdownHookManager.getShutdownTimeout(ShutdownHookManager.java:183) > at > org.apache.hadoop.util.ShutdownHookManager.shutdownExecutor(ShutdownHookManager.java:145) > at > org.apache.hadoop.util.ShutdownHookManager.access$300(ShutdownHookManager.java:65) > at > org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:102) > {code} > This is because Hadoop 3 starts asynchronous threads to execute some shutdown > hooks. > These hooks are run after the job is executed, as a result, the classloader > has been released, but in hooks, configuration still holds the released > classloader, so it will fail to throw an exception in this asynchronous > thread. > Now it doesn't affect our function, it just prints the exception stack on the > console. -- This message was sent by Atlassian Jira (v8.3.4#803005)
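The exception text above names its own escape hatch. A one-line flink-conf.yaml fragment suppressing the check (note this only hides the symptom; the closed classloader is still accessed by Hadoop's shutdown hook):

```yaml
# Disable the leaked-classloader check named in the exception (symptom only).
classloader.check-leaked-classloader: false
```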
[jira] [Updated] (FLINK-19916) Hadoop3 ShutdownHookManager visit closed ClassLoader
[ https://issues.apache.org/jira/browse/FLINK-19916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger updated FLINK-19916: --- Affects Version/s: 1.12.2
[jira] [Closed] (FLINK-21914) Trying to access closed classloader
[ https://issues.apache.org/jira/browse/FLINK-21914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger closed FLINK-21914. -- Resolution: Duplicate > Trying to access closed classloader > --- > > Key: FLINK-21914 > URL: https://issues.apache.org/jira/browse/FLINK-21914 > Project: Flink > Issue Type: Bug > Components: Connectors / Hive, Deployment / YARN >Affects Versions: 1.12.2 > Environment: flink: 1.12.2 > hadoop: 3.1.3 > hive: 3.1.2 > >Reporter: Spongebob >Priority: Critical > Attachments: app.log > > > I am trying to deploy flink application on yarn, but got this exception: > Exception in thread "Thread-9" java.lang.IllegalStateException: Trying to > access closed classloader. Please check if you store classloaders directly or > indirectly in static fields. If the stacktrace suggests that the leak occurs > in a third party library and cannot be fixed immediately, you can disable > this check with the configuration 'classloader.check-leaked-classloader'. > > This application tested pass on my local environment. And the application > detail is read and write into hive via flink table environment. you can view > attachment for yarn log which source and sink data info was deleted > Exception in thread "Thread-9" java.lang.IllegalStateException: Trying to > access closed classloader. Please check if you store classloaders directly or > indirectly in static fields. If the stacktrace suggests that the leak occurs > in a third party library and cannot be fixed immediately, you can disable > this check with the configuration 'classloader.check- > leaked-classloader'. > {code} > Exception in thread "Thread-9" java.lang.IllegalStateException: Trying to > access closed classloader. Please check if you store classloaders directly or > indirectly in static fields. 
If the stacktrace suggests that the leak occurs > in a third party library and cannot be fixed immediately, you can disable > this check with the configuration 'classloader.check-leaked-classloader'. > at > org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$SafetyNetWrapperClassLoader.ensureInner(FlinkUserCodeClassLoaders.java:164) > at > org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$SafetyNetWrapperClassLoader.getResource(FlinkUserCodeClassLoaders.java:183) > at > org.apache.hadoop.conf.Configuration.getResource(Configuration.java:2780) > at > org.apache.hadoop.conf.Configuration.getStreamReader(Configuration.java:3036) > at > org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2995) > at > org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2968) > at > org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2848) > at org.apache.hadoop.conf.Configuration.get(Configuration.java:1200) > at > org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1812) > at > org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1789) > at > org.apache.hadoop.util.ShutdownHookManager.getShutdownTimeout(ShutdownHookManager.java:183) > at > org.apache.hadoop.util.ShutdownHookManager.shutdownExecutor(ShutdownHookManager.java:145) > at > org.apache.hadoop.util.ShutdownHookManager.access$300(ShutdownHookManager.java:65) > at > org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:102) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-19916) Hadoop3 ShutdownHookManager visit closed ClassLoader
[ https://issues.apache.org/jira/browse/FLINK-19916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger updated FLINK-19916: --- Priority: Major (was: Minor)
[jira] [Commented] (FLINK-21914) Trying to access closed classloader
[ https://issues.apache.org/jira/browse/FLINK-21914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17337376#comment-17337376 ] Robert Metzger commented on FLINK-21914: I would like to close this ticket and track the resolution of the issue in FLINK-19916. > Trying to access closed classloader > --- > > Key: FLINK-21914 > URL: https://issues.apache.org/jira/browse/FLINK-21914 > Project: Flink > Issue Type: Bug > Components: Connectors / Hive, Deployment / YARN >Affects Versions: 1.12.2 > Environment: flink: 1.12.2 > hadoop: 3.1.3 > hive: 3.1.2 > >Reporter: Spongebob >Priority: Critical > Attachments: app.log > > > I am trying to deploy flink application on yarn, but got this exception: > Exception in thread "Thread-9" java.lang.IllegalStateException: Trying to > access closed classloader. Please check if you store classloaders directly or > indirectly in static fields. If the stacktrace suggests that the leak occurs > in a third party library and cannot be fixed immediately, you can disable > this check with the configuration 'classloader.check-leaked-classloader'. > > This application tested pass on my local environment. And the application > detail is read and write into hive via flink table environment. you can view > attachment for yarn log which source and sink data info was deleted > Exception in thread "Thread-9" java.lang.IllegalStateException: Trying to > access closed classloader. Please check if you store classloaders directly or > indirectly in static fields. If the stacktrace suggests that the leak occurs > in a third party library and cannot be fixed immediately, you can disable > this check with the configuration 'classloader.check- > leaked-classloader'. > {code} > Exception in thread "Thread-9" java.lang.IllegalStateException: Trying to > access closed classloader. Please check if you store classloaders directly or > indirectly in static fields. 
If the stacktrace suggests that the leak occurs > in a third party library and cannot be fixed immediately, you can disable > this check with the configuration 'classloader.check-leaked-classloader'. > at > org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$SafetyNetWrapperClassLoader.ensureInner(FlinkUserCodeClassLoaders.java:164) > at > org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$SafetyNetWrapperClassLoader.getResource(FlinkUserCodeClassLoaders.java:183) > at > org.apache.hadoop.conf.Configuration.getResource(Configuration.java:2780) > at > org.apache.hadoop.conf.Configuration.getStreamReader(Configuration.java:3036) > at > org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2995) > at > org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2968) > at > org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2848) > at org.apache.hadoop.conf.Configuration.get(Configuration.java:1200) > at > org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1812) > at > org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1789) > at > org.apache.hadoop.util.ShutdownHookManager.getShutdownTimeout(ShutdownHookManager.java:183) > at > org.apache.hadoop.util.ShutdownHookManager.shutdownExecutor(ShutdownHookManager.java:145) > at > org.apache.hadoop.util.ShutdownHookManager.access$300(ShutdownHookManager.java:65) > at > org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:102) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-21914) Trying to access closed classloader
[ https://issues.apache.org/jira/browse/FLINK-21914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17337375#comment-17337375 ] Robert Metzger commented on FLINK-21914: Just deploying on YARN doesn't cause the exceptions on the TaskManagers. I guess it's because you are running Hive code there. As far as I can tell, the exception is not nice, but it does not cause any further issues, as it is thrown after the shutdown hooks have been executed. > Trying to access closed classloader > --- > > Key: FLINK-21914 > URL: https://issues.apache.org/jira/browse/FLINK-21914 > Project: Flink > Issue Type: Bug > Components: Connectors / Hive, Deployment / YARN >Affects Versions: 1.12.2 > Environment: flink: 1.12.2 > hadoop: 3.1.3 > hive: 3.1.2 > >Reporter: Spongebob >Priority: Critical > Attachments: app.log > > > I am trying to deploy flink application on yarn, but got this exception: > Exception in thread "Thread-9" java.lang.IllegalStateException: Trying to > access closed classloader. Please check if you store classloaders directly or > indirectly in static fields. If the stacktrace suggests that the leak occurs > in a third party library and cannot be fixed immediately, you can disable > this check with the configuration 'classloader.check-leaked-classloader'. > > This application tested pass on my local environment. And the application > detail is read and write into hive via flink table environment. you can view > attachment for yarn log which source and sink data info was deleted > Exception in thread "Thread-9" java.lang.IllegalStateException: Trying to > access closed classloader. Please check if you store classloaders directly or > indirectly in static fields. If the stacktrace suggests that the leak occurs > in a third party library and cannot be fixed immediately, you can disable > this check with the configuration 'classloader.check- > leaked-classloader'. 
> {code} > Exception in thread "Thread-9" java.lang.IllegalStateException: Trying to > access closed classloader. Please check if you store classloaders directly or > indirectly in static fields. If the stacktrace suggests that the leak occurs > in a third party library and cannot be fixed immediately, you can disable > this check with the configuration 'classloader.check-leaked-classloader'. > at > org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$SafetyNetWrapperClassLoader.ensureInner(FlinkUserCodeClassLoaders.java:164) > at > org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$SafetyNetWrapperClassLoader.getResource(FlinkUserCodeClassLoaders.java:183) > at > org.apache.hadoop.conf.Configuration.getResource(Configuration.java:2780) > at > org.apache.hadoop.conf.Configuration.getStreamReader(Configuration.java:3036) > at > org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2995) > at > org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2968) > at > org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2848) > at org.apache.hadoop.conf.Configuration.get(Configuration.java:1200) > at > org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1812) > at > org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1789) > at > org.apache.hadoop.util.ShutdownHookManager.getShutdownTimeout(ShutdownHookManager.java:183) > at > org.apache.hadoop.util.ShutdownHookManager.shutdownExecutor(ShutdownHookManager.java:145) > at > org.apache.hadoop.util.ShutdownHookManager.access$300(ShutdownHookManager.java:65) > at > org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:102) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-21914) Trying to access closed classloader
[ https://issues.apache.org/jira/browse/FLINK-21914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger updated FLINK-21914: --- Component/s: Deployment / YARN > Trying to access closed classloader > --- > > Key: FLINK-21914 > URL: https://issues.apache.org/jira/browse/FLINK-21914 > Project: Flink > Issue Type: Bug > Components: Connectors / Hive, Deployment / YARN >Affects Versions: 1.12.2 > Environment: flink: 1.12.2 > hadoop: 3.1.3 > hive: 3.1.2 > >Reporter: Spongebob >Priority: Critical > Attachments: app.log > > > I am trying to deploy flink application on yarn, but got this exception: > Exception in thread "Thread-9" java.lang.IllegalStateException: Trying to > access closed classloader. Please check if you store classloaders directly or > indirectly in static fields. If the stacktrace suggests that the leak occurs > in a third party library and cannot be fixed immediately, you can disable > this check with the configuration 'classloader.check-leaked-classloader'. > > This application tested pass on my local environment. And the application > detail is read and write into hive via flink table environment. you can view > attachment for yarn log which source and sink data info was deleted > Exception in thread "Thread-9" java.lang.IllegalStateException: Trying to > access closed classloader. Please check if you store classloaders directly or > indirectly in static fields. If the stacktrace suggests that the leak occurs > in a third party library and cannot be fixed immediately, you can disable > this check with the configuration 'classloader.check- > leaked-classloader'. > {code} > Exception in thread "Thread-9" java.lang.IllegalStateException: Trying to > access closed classloader. Please check if you store classloaders directly or > indirectly in static fields. 
If the stacktrace suggests that the leak occurs > in a third party library and cannot be fixed immediately, you can disable > this check with the configuration 'classloader.check-leaked-classloader'. > at > org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$SafetyNetWrapperClassLoader.ensureInner(FlinkUserCodeClassLoaders.java:164) > at > org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$SafetyNetWrapperClassLoader.getResource(FlinkUserCodeClassLoaders.java:183) > at > org.apache.hadoop.conf.Configuration.getResource(Configuration.java:2780) > at > org.apache.hadoop.conf.Configuration.getStreamReader(Configuration.java:3036) > at > org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2995) > at > org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2968) > at > org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2848) > at org.apache.hadoop.conf.Configuration.get(Configuration.java:1200) > at > org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1812) > at > org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1789) > at > org.apache.hadoop.util.ShutdownHookManager.getShutdownTimeout(ShutdownHookManager.java:183) > at > org.apache.hadoop.util.ShutdownHookManager.shutdownExecutor(ShutdownHookManager.java:145) > at > org.apache.hadoop.util.ShutdownHookManager.access$300(ShutdownHookManager.java:65) > at > org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:102) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-21914) Trying to access closed classloader
[ https://issues.apache.org/jira/browse/FLINK-21914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger updated FLINK-21914: --- Component/s: (was: Deployment / YARN) Connectors / Hive > Trying to access closed classloader > --- > > Key: FLINK-21914 > URL: https://issues.apache.org/jira/browse/FLINK-21914 > Project: Flink > Issue Type: Bug > Components: Connectors / Hive >Affects Versions: 1.12.2 > Environment: flink: 1.12.2 > hadoop: 3.1.3 > hive: 3.1.2 > >Reporter: Spongebob >Priority: Critical > Attachments: app.log > > > I am trying to deploy flink application on yarn, but got this exception: > Exception in thread "Thread-9" java.lang.IllegalStateException: Trying to > access closed classloader. Please check if you store classloaders directly or > indirectly in static fields. If the stacktrace suggests that the leak occurs > in a third party library and cannot be fixed immediately, you can disable > this check with the configuration 'classloader.check-leaked-classloader'. > > This application tested pass on my local environment. And the application > detail is read and write into hive via flink table environment. you can view > attachment for yarn log which source and sink data info was deleted > Exception in thread "Thread-9" java.lang.IllegalStateException: Trying to > access closed classloader. Please check if you store classloaders directly or > indirectly in static fields. If the stacktrace suggests that the leak occurs > in a third party library and cannot be fixed immediately, you can disable > this check with the configuration 'classloader.check- > leaked-classloader'. > {code} > Exception in thread "Thread-9" java.lang.IllegalStateException: Trying to > access closed classloader. Please check if you store classloaders directly or > indirectly in static fields. 
If the stacktrace suggests that the leak occurs > in a third party library and cannot be fixed immediately, you can disable > this check with the configuration 'classloader.check-leaked-classloader'. > at > org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$SafetyNetWrapperClassLoader.ensureInner(FlinkUserCodeClassLoaders.java:164) > at > org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$SafetyNetWrapperClassLoader.getResource(FlinkUserCodeClassLoaders.java:183) > at > org.apache.hadoop.conf.Configuration.getResource(Configuration.java:2780) > at > org.apache.hadoop.conf.Configuration.getStreamReader(Configuration.java:3036) > at > org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2995) > at > org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2968) > at > org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2848) > at org.apache.hadoop.conf.Configuration.get(Configuration.java:1200) > at > org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1812) > at > org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1789) > at > org.apache.hadoop.util.ShutdownHookManager.getShutdownTimeout(ShutdownHookManager.java:183) > at > org.apache.hadoop.util.ShutdownHookManager.shutdownExecutor(ShutdownHookManager.java:145) > at > org.apache.hadoop.util.ShutdownHookManager.access$300(ShutdownHookManager.java:65) > at > org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:102) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-21914) Trying to access closed classloader
[ https://issues.apache.org/jira/browse/FLINK-21914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17337371#comment-17337371 ] Robert Metzger commented on FLINK-21914: This is a known issue, reported here FLINK-22509 and here already: FLINK-19916. The logs you've provided are interesting, because the issue also seem to appear on TaskManagers. > Trying to access closed classloader > --- > > Key: FLINK-21914 > URL: https://issues.apache.org/jira/browse/FLINK-21914 > Project: Flink > Issue Type: Bug > Components: Deployment / YARN >Affects Versions: 1.12.2 > Environment: flink: 1.12.2 > hadoop: 3.1.3 > hive: 3.1.2 > >Reporter: Spongebob >Priority: Critical > Attachments: app.log > > > I am trying to deploy flink application on yarn, but got this exception: > Exception in thread "Thread-9" java.lang.IllegalStateException: Trying to > access closed classloader. Please check if you store classloaders directly or > indirectly in static fields. If the stacktrace suggests that the leak occurs > in a third party library and cannot be fixed immediately, you can disable > this check with the configuration 'classloader.check-leaked-classloader'. > > This application tested pass on my local environment. And the application > detail is read and write into hive via flink table environment. you can view > attachment for yarn log which source and sink data info was deleted > Exception in thread "Thread-9" java.lang.IllegalStateException: Trying to > access closed classloader. Please check if you store classloaders directly or > indirectly in static fields. If the stacktrace suggests that the leak occurs > in a third party library and cannot be fixed immediately, you can disable > this check with the configuration 'classloader.check- > leaked-classloader'. > {code} > Exception in thread "Thread-9" java.lang.IllegalStateException: Trying to > access closed classloader. Please check if you store classloaders directly or > indirectly in static fields. 
If the stacktrace suggests that the leak occurs > in a third party library and cannot be fixed immediately, you can disable > this check with the configuration 'classloader.check-leaked-classloader'. > at > org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$SafetyNetWrapperClassLoader.ensureInner(FlinkUserCodeClassLoaders.java:164) > at > org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$SafetyNetWrapperClassLoader.getResource(FlinkUserCodeClassLoaders.java:183) > at > org.apache.hadoop.conf.Configuration.getResource(Configuration.java:2780) > at > org.apache.hadoop.conf.Configuration.getStreamReader(Configuration.java:3036) > at > org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2995) > at > org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2968) > at > org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2848) > at org.apache.hadoop.conf.Configuration.get(Configuration.java:1200) > at > org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1812) > at > org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1789) > at > org.apache.hadoop.util.ShutdownHookManager.getShutdownTimeout(ShutdownHookManager.java:183) > at > org.apache.hadoop.util.ShutdownHookManager.shutdownExecutor(ShutdownHookManager.java:145) > at > org.apache.hadoop.util.ShutdownHookManager.access$300(ShutdownHookManager.java:65) > at > org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:102) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-21914) Trying to access closed classloader
[ https://issues.apache.org/jira/browse/FLINK-21914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger updated FLINK-21914: --- Component/s: (was: API / Core) Deployment / YARN > Trying to access closed classloader > --- > > Key: FLINK-21914 > URL: https://issues.apache.org/jira/browse/FLINK-21914 > Project: Flink > Issue Type: Bug > Components: Deployment / YARN >Affects Versions: 1.12.2 > Environment: flink: 1.12.2 > hadoop: 3.1.3 > hive: 3.1.2 > >Reporter: Spongebob >Priority: Critical > Attachments: app.log > > > I am trying to deploy flink application on yarn, but got this exception: > Exception in thread "Thread-9" java.lang.IllegalStateException: Trying to > access closed classloader. Please check if you store classloaders directly or > indirectly in static fields. If the stacktrace suggests that the leak occurs > in a third party library and cannot be fixed immediately, you can disable > this check with the configuration 'classloader.check-leaked-classloader'. > > This application tested pass on my local environment. And the application > detail is read and write into hive via flink table environment. you can view > attachment for yarn log which source and sink data info was deleted > Exception in thread "Thread-9" java.lang.IllegalStateException: Trying to > access closed classloader. Please check if you store classloaders directly or > indirectly in static fields. If the stacktrace suggests that the leak occurs > in a third party library and cannot be fixed immediately, you can disable > this check with the configuration 'classloader.check- > leaked-classloader'. > {code} > Exception in thread "Thread-9" java.lang.IllegalStateException: Trying to > access closed classloader. Please check if you store classloaders directly or > indirectly in static fields. 
If the stacktrace suggests that the leak occurs > in a third party library and cannot be fixed immediately, you can disable > this check with the configuration 'classloader.check-leaked-classloader'. > at > org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$SafetyNetWrapperClassLoader.ensureInner(FlinkUserCodeClassLoaders.java:164) > at > org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$SafetyNetWrapperClassLoader.getResource(FlinkUserCodeClassLoaders.java:183) > at > org.apache.hadoop.conf.Configuration.getResource(Configuration.java:2780) > at > org.apache.hadoop.conf.Configuration.getStreamReader(Configuration.java:3036) > at > org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2995) > at > org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2968) > at > org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2848) > at org.apache.hadoop.conf.Configuration.get(Configuration.java:1200) > at > org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1812) > at > org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1789) > at > org.apache.hadoop.util.ShutdownHookManager.getShutdownTimeout(ShutdownHookManager.java:183) > at > org.apache.hadoop.util.ShutdownHookManager.shutdownExecutor(ShutdownHookManager.java:145) > at > org.apache.hadoop.util.ShutdownHookManager.access$300(ShutdownHookManager.java:65) > at > org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:102) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-22014) Flink JobManager failed to restart after failure in kubernetes HA setup
[ https://issues.apache.org/jira/browse/FLINK-22014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17337369#comment-17337369 ] Till Rohrmann commented on FLINK-22014: --- I couldn't really find anything suspicious in the logs. Would it be possible to configure a different filesystem for the HA storage directory [~mlushchytski]? Maybe it has something to do with S3. If not, then the problem should also arise with a different filesystem. > Flink JobManager failed to restart after failure in kubernetes HA setup > --- > > Key: FLINK-22014 > URL: https://issues.apache.org/jira/browse/FLINK-22014 > Project: Flink > Issue Type: Bug > Components: Deployment / Kubernetes >Affects Versions: 1.11.3, 1.12.2, 1.13.0 >Reporter: Mikalai Lushchytski >Assignee: Till Rohrmann >Priority: Major > Labels: k8s-ha, pull-request-available > Fix For: 1.11.4, 1.14.0, 1.12.4 > > Attachments: flink-logs.txt.zip, image-2021-04-19-11-17-58-215.png, > scalyr-logs (1).txt > > > After the JobManager pod failed and the new one started, it was not able to > recover jobs due to the absence of recovery data in storage - config map > pointed at not existing file. > > Due to this the JobManager pod entered into the `CrashLoopBackOff`state and > was not able to recover - each attempt failed with the same error so the > whole cluster became unrecoverable and not operating. > > I had to manually delete the config map and start the jobs again without the > save point. > > If I tried to emulate the failure further by deleting job manager pod > manually, the new pod every time recovered well and issue was not > reproducible anymore artificially. > > Below is the failure log: > {code:java} > 2021-03-26 08:22:57,925 INFO > org.apache.flink.runtime.resourcemanager.slotmanager.SlotManagerImpl [] - > Starting the SlotManager. 
> 2021-03-26 08:22:57,928 INFO > org.apache.flink.runtime.leaderretrieval.DefaultLeaderRetrievalService [] - > Starting DefaultLeaderRetrievalService with KubernetesLeaderRetrievalDriver > {configMapName='stellar-flink-cluster-dispatcher-leader'}. > 2021-03-26 08:22:57,931 INFO > org.apache.flink.runtime.jobmanager.DefaultJobGraphStore [] - Retrieved job > ids [198c46bac791e73ebcc565a550fa4ff6, 344f5ebc1b5c3a566b4b2837813e4940, > 96c4603a0822d10884f7fe536703d811, d9ded24224aab7c7041420b3efc1b6ba] from > KubernetesStateHandleStore{configMapName='stellar-flink-cluster-dispatcher-leader'} > 2021-03-26 08:22:57,933 INFO > org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess [] > - Trying to recover job with job id 198c46bac791e73ebcc565a550fa4ff6. > 2021-03-26 08:22:58,029 INFO > org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess [] > - Stopping SessionDispatcherLeaderProcess. > 2021-03-26 08:28:22,677 INFO > org.apache.flink.runtime.jobmanager.DefaultJobGraphStore [] - Stopping > DefaultJobGraphStore. 2021-03-26 08:28:22,681 ERROR > org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - Fatal error > occurred in the cluster entrypoint. java.util.concurrent.CompletionException: > org.apache.flink.util.FlinkRuntimeException: Could not recover job with job > id 198c46bac791e73ebcc565a550fa4ff6. >at java.util.concurrent.CompletableFuture.encodeThrowable(Unknown Source) > ~[?:?] >at java.util.concurrent.CompletableFuture.completeThrowable(Unknown > Source) [?:?] >at java.util.concurrent.CompletableFuture$AsyncSupply.run(Unknown Source) > [?:?] >at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [?:?] >at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [?:?] >at java.lang.Thread.run(Unknown Source) [?:?] Caused by: > org.apache.flink.util.FlinkRuntimeException: Could not recover job with job > id 198c46bac791e73ebcc565a550fa4ff6. 
>at > org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess.recoverJob(SessionDispatcherLeaderProcess.java:144 > undefined) ~[flink-dist_2.12-1.12.2.jar:1.12.2] >at > org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess.recoverJobs(SessionDispatcherLeaderProcess.java:122 > undefined) ~[flink-dist_2.12-1.12.2.jar:1.12.2] >at > org.apache.flink.runtime.dispatcher.runner.AbstractDispatcherLeaderProcess.supplyUnsynchronizedIfRunning(AbstractDispatcherLeaderProcess.java:198 > undefined) ~[flink-dist_2.12-1.12.2.jar:1.12.2] >at > org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess.recoverJobsIfRunning(SessionDispatcherLeaderProcess.java:113 > undefined) ~[flink-dist_2.12-1.12.2.jar:1.12.2] ... 4 more > Caused by: org.apache.flink.util.FlinkException: Could not retrieve submitted > JobGraph from state
[jira] [Commented] (FLINK-22509) ./bin/flink run -m yarn-cluster -d submission leads to IllegalStateException
[ https://issues.apache.org/jira/browse/FLINK-22509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17337368#comment-17337368 ] Robert Metzger commented on FLINK-22509: Closing as a duplicate of FLINK-19916. > ./bin/flink run -m yarn-cluster -d submission leads to IllegalStateException > > > Key: FLINK-22509 > URL: https://issues.apache.org/jira/browse/FLINK-22509 > Project: Flink > Issue Type: Bug > Components: Deployment / YARN >Affects Versions: 1.13.0, 1.14.0 >Reporter: Robert Metzger >Assignee: Robert Metzger >Priority: Major > > Submitting a detached, per-job YARN cluster in Flink (like this: > {{./bin/flink run -m yarn-cluster -d > ./examples/streaming/TopSpeedWindowing.jar}}), leads to the following > exception: > {code} > 2021-04-28 11:39:00,786 INFO org.apache.flink.yarn.YarnClusterDescriptor > [] - Found Web Interface > ip-172-31-27-232.eu-central-1.compute.internal:45689 of application > 'application_1619607372651_0005'. > Job has been submitted with JobID 5543e81db9c2de78b646088891f23bfc > Exception in thread "Thread-4" java.lang.IllegalStateException: Trying to > access closed classloader. Please check if you store classloaders directly or > indirectly in static fields. If the stacktrace suggests that the leak occurs > in a third party library and cannot be fixed immediately, you can disable > this check with the configuration 'classloader.check-leaked-classloader'. 
> at > org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$SafetyNetWrapperClassLoader.ensureInner(FlinkUserCodeClassLoaders.java:164) > at > org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$SafetyNetWrapperClassLoader.getResource(FlinkUserCodeClassLoaders.java:183) > at > org.apache.hadoop.conf.Configuration.getResource(Configuration.java:2570) > at > org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2783) > at > org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2758) > at > org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2638) > at org.apache.hadoop.conf.Configuration.get(Configuration.java:1100) > at > org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1707) > at > org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1688) > at > org.apache.hadoop.util.ShutdownHookManager.getShutdownTimeout(ShutdownHookManager.java:183) > at > org.apache.hadoop.util.ShutdownHookManager.shutdownExecutor(ShutdownHookManager.java:145) > at > org.apache.hadoop.util.ShutdownHookManager.access$300(ShutdownHookManager.java:65) > at > org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:102) > {code} > The job is still running as expected. > Detached submission with {{./bin/flink run-application -t yarn-application > -d}} works as expected. This is also the documented approach. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (FLINK-22509) ./bin/flink run -m yarn-cluster -d submission leads to IllegalStateException
[ https://issues.apache.org/jira/browse/FLINK-22509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger closed FLINK-22509. -- Resolution: Duplicate > ./bin/flink run -m yarn-cluster -d submission leads to IllegalStateException > > > Key: FLINK-22509 > URL: https://issues.apache.org/jira/browse/FLINK-22509 > Project: Flink > Issue Type: Bug > Components: Deployment / YARN >Affects Versions: 1.13.0, 1.14.0 >Reporter: Robert Metzger >Assignee: Robert Metzger >Priority: Major > > Submitting a detached, per-job YARN cluster in Flink (like this: > {{./bin/flink run -m yarn-cluster -d > ./examples/streaming/TopSpeedWindowing.jar}}), leads to the following > exception: > {code} > 2021-04-28 11:39:00,786 INFO org.apache.flink.yarn.YarnClusterDescriptor > [] - Found Web Interface > ip-172-31-27-232.eu-central-1.compute.internal:45689 of application > 'application_1619607372651_0005'. > Job has been submitted with JobID 5543e81db9c2de78b646088891f23bfc > Exception in thread "Thread-4" java.lang.IllegalStateException: Trying to > access closed classloader. Please check if you store classloaders directly or > indirectly in static fields. If the stacktrace suggests that the leak occurs > in a third party library and cannot be fixed immediately, you can disable > this check with the configuration 'classloader.check-leaked-classloader'. 
> at > org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$SafetyNetWrapperClassLoader.ensureInner(FlinkUserCodeClassLoaders.java:164) > at > org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$SafetyNetWrapperClassLoader.getResource(FlinkUserCodeClassLoaders.java:183) > at > org.apache.hadoop.conf.Configuration.getResource(Configuration.java:2570) > at > org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2783) > at > org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2758) > at > org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2638) > at org.apache.hadoop.conf.Configuration.get(Configuration.java:1100) > at > org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1707) > at > org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1688) > at > org.apache.hadoop.util.ShutdownHookManager.getShutdownTimeout(ShutdownHookManager.java:183) > at > org.apache.hadoop.util.ShutdownHookManager.shutdownExecutor(ShutdownHookManager.java:145) > at > org.apache.hadoop.util.ShutdownHookManager.access$300(ShutdownHookManager.java:65) > at > org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:102) > {code} > The job is still running as expected. > Detached submission with {{./bin/flink run-application -t yarn-application > -d}} works as expected. This is also the documented approach. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-19916) Hadoop3 ShutdownHookManager visit closed ClassLoader
[ https://issues.apache.org/jira/browse/FLINK-19916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17337367#comment-17337367 ] Robert Metzger commented on FLINK-19916: Where did you see this exception? I found the same issue while submitting a job with YARN: FLINK-22509. It seems that this exception is thrown during the shutdown of the executor itself, so ideally all the shutdown actions are completed. > Hadoop3 ShutdownHookManager visit closed ClassLoader > > > Key: FLINK-19916 > URL: https://issues.apache.org/jira/browse/FLINK-19916 > Project: Flink > Issue Type: Bug > Components: Connectors / Hadoop Compatibility >Reporter: Jingsong Lee >Priority: Minor > Labels: auto-deprioritized-major > > {code:java} > Exception in thread "Thread-10" java.lang.IllegalStateException: Trying to > access closed classloader. Please check if you store classloaders directly or > indirectly in static fields. If the stacktrace suggests that the leak occurs > in a third party library and cannot be fixed immediately, you can disable > this check with the configuration 'classloader.check-leaked-classloader'. 
> at org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$SafetyNetWrapperClassLoader.ensureInner(FlinkUserCodeClassLoaders.java:161)
> at org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$SafetyNetWrapperClassLoader.getResource(FlinkUserCodeClassLoaders.java:179)
> at org.apache.hadoop.conf.Configuration.getResource(Configuration.java:2780)
> at org.apache.hadoop.conf.Configuration.getStreamReader(Configuration.java:3036)
> at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2995)
> at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2968)
> at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2848)
> at org.apache.hadoop.conf.Configuration.get(Configuration.java:1200)
> at org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1812)
> at org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1789)
> at org.apache.hadoop.util.ShutdownHookManager.getShutdownTimeout(ShutdownHookManager.java:183)
> at org.apache.hadoop.util.ShutdownHookManager.shutdownExecutor(ShutdownHookManager.java:145)
> at org.apache.hadoop.util.ShutdownHookManager.access$300(ShutdownHookManager.java:65)
> at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:102)
> {code}
> This is because Hadoop 3 starts asynchronous threads to execute some shutdown hooks.
> These hooks run after the job has finished; by that time the classloader has already been released, but the Configuration held by the hooks still references the released classloader, so the hook fails with the above exception in this asynchronous thread.
> For now this does not affect functionality; it only prints the exception stack on the console.
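The failure mode described above can be reduced to a small sketch (illustrative only, not Flink or Hadoop code; class and member names here are invented): a shutdown hook keeps a reference to a class loader reachable through a static field, so the hook still touches the loader after the code that owned it has finished.

```java
import java.net.URL;
import java.net.URLClassLoader;

public class ShutdownHookLeak {

    // A class loader stored in a static field -- the pattern the
    // 'classloader.check-leaked-classloader' safety net warns about.
    static ClassLoader leakedLoader;

    // Simulates third-party code that registers a shutdown hook capturing
    // the loader; the hook runs only after main() has returned, when the
    // loader may already have been closed by its owner.
    static boolean installHook() {
        leakedLoader = new URLClassLoader(new URL[0]);
        Runtime.getRuntime().addShutdownHook(new Thread(() ->
                System.out.println("hook still references loader: " + (leakedLoader != null))));
        return leakedLoader != null;
    }

    public static void main(String[] args) {
        System.out.println("installed: " + installHook()); // prints "installed: true"
    }
}
```

On JVM exit the hook fires and still sees the loader, which is exactly why Flink's wrapper throws once the loader has been closed.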
[jira] [Commented] (FLINK-22139) Flink Jobmanager & Task Manager logs are not written to the log files
[ https://issues.apache.org/jira/browse/FLINK-22139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17337366#comment-17337366 ] Till Rohrmann commented on FLINK-22139: --- [~bhagi__R] I've tested deploying {{jobmanager-session-deployment-non-ha.yaml}} as specified [here|https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/resource-providers/standalone/kubernetes/#session-cluster-resource-definitions] and I could access the logs via {{kubectl logs pod-name}} as well as by logging into the container and accessing the file under {{/opt/flink/log}}. I am not sure why it isn't working for you. > Flink Jobmanager & Task Manager logs are not written to the log files > - > > Key: FLINK-22139 > URL: https://issues.apache.org/jira/browse/FLINK-22139 > Project: Flink > Issue Type: Bug > Components: Deployment / Kubernetes > Affects Versions: 1.12.2 > Environment: Kubernetes standalone Flink deployment with jobmanager HA enabled. > Reporter: Bhagi > Priority: Major > > Hi Team, > I am submitting jobs and restarting the job manager and task manager pods. Log files are created with the task manager and job manager names, but the job manager & task manager log file size is '0'. I am not sure whether some configuration is missing, or why the logs are not being written to their log files.
> # Task Manager pod###
> flink@flink-taskmanager-85b6585b7-hhgl7:~$ ls -lart log/
> total 0
> -rw-r--r-- 1 flink flink 0 Apr 7 09:35 flink--taskexecutor-0-flink-taskmanager-85b6585b7-hhgl7.log
> flink@flink-taskmanager-85b6585b7-hhgl7:~$
> ### Jobmanager pod Logs #
> flink@flink-jobmanager-f6db89b7f-lq4ps:~$
> -rw-r--r-- 1 7148739 flink 0 Apr 7 06:36 flink--standalonesession-0-flink-jobmanager-f6db89b7f-gtkx5.log
> -rw-r--r-- 1 7148739 flink 0 Apr 7 06:36 flink--standalonesession-0-flink-jobmanager-f6db89b7f-wnrfm.log
> -rw-r--r-- 1 7148739 flink 0 Apr 7 06:37 flink--standalonesession-0-flink-jobmanager-f6db89b7f-2b2fs.log
> -rw-r--r-- 1 7148739 flink 0 Apr 7 06:37 flink--standalonesession-0-flink-jobmanager-f6db89b7f-7kdhh.log
> -rw-r--r-- 1 7148739 flink 0 Apr 7 09:35 flink--standalonesession-0-flink-jobmanager-f6db89b7f-twhkt.log
> drwxrwxrwx 2 7148739 flink 35 Apr 7 09:35 .
> -rw-r--r-- 1 7148739 flink 0 Apr 7 09:35 flink--standalonesession-0-flink-jobmanager-f6db89b7f-lq4ps.log
> flink@flink-jobmanager-f6db89b7f-lq4ps:~$
> I configured log4j.properties for Flink:
> log4j.properties: |+
> monitorInterval=30
> rootLogger.level = INFO
> rootLogger.appenderRef.file.ref = MainAppender
> logger.flink.name = org.apache.flink
> logger.flink.level = INFO
> logger.akka.name = akka
> logger.akka.level = INFO
> appender.main.name = MainAppender
> appender.main.type = RollingFile
> appender.main.append = true
> appender.main.fileName = ${sys:log.file}
> appender.main.filePattern = ${sys:log.file}.%i
> appender.main.layout.type = PatternLayout
> appender.main.layout.pattern = %d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n
> appender.main.policies.type = Policies
> appender.main.policies.size.type = SizeBasedTriggeringPolicy
> appender.main.policies.size.size = 100MB
> appender.main.policies.startup.type = OnStartupTriggeringPolicy
> appender.main.strategy.type = DefaultRolloverStrategy
> appender.main.strategy.max = ${env:MAX_LOG_FILE_NUMBER:-10}
> logger.netty.name = org.apache.flink.shaded.akka.org.jboss.netty.channel.DefaultChannelPipeline
> logger.netty.level = OFF
[GitHub] [flink] flinkbot edited a comment on pull request #15820: [FLINK-22535][runtime] CleanUp is invoked for task even when the task…
flinkbot edited a comment on pull request #15820: URL: https://github.com/apache/flink/pull/15820#issuecomment-830052015 ## CI report: * c548fc5c79450d898eebc1c6523be98472fe1cdd Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17467) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #15656: [FLINK-22233][Table SQL / API]Modified the spelling error of word "constant" in source code
flinkbot edited a comment on pull request #15656: URL: https://github.com/apache/flink/pull/15656#issuecomment-822135263 ## CI report: * 6358c33368746cf2c93463ab423f5c9ff11b641b Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17410) * fb69e9a4c5c897e75659c28915f9b9aa08e2f0c3 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17466)
[GitHub] [flink] flinkbot edited a comment on pull request #15131: [FLINK-21700][security] Add an option to disable credential retrieval on a secure cluster
flinkbot edited a comment on pull request #15131: URL: https://github.com/apache/flink/pull/15131#issuecomment-794957907 ## CI report: * 60812bb2804fd21d56a47a23ca4b42a04300f1c0 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17419) * 8cb12deac019898bb69f7a5faf7f803dd27d71d0 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17465)
[jira] [Commented] (FLINK-18027) ROW value constructor cannot deal with complex expressions
[ https://issues.apache.org/jira/browse/FLINK-18027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17337350#comment-17337350 ] Timo Walther commented on FLINK-18027: -- I would keep the naming consistent with our existing data types. `map_`, `array_`, `row_` to find functions easier. Furthermore, we still have user-defined structured types which could cause confusion if we call it `struct`. > ROW value constructor cannot deal with complex expressions > -- > > Key: FLINK-18027 > URL: https://issues.apache.org/jira/browse/FLINK-18027 > Project: Flink > Issue Type: Bug > Components: Table SQL / API >Reporter: Benchao Li >Priority: Major > Labels: pull-request-available > > {code:java} > create table my_source ( > my_row row > ) with (...); > create table my_sink ( > my_row row > ) with (...); > insert into my_sink > select ROW(my_row.a, my_row.b) > from my_source;{code} > will throw excepions: > {code:java} > Exception in thread "main" org.apache.flink.table.api.SqlParserException: SQL > parse failed. Encountered "." at line 1, column 18.Exception in thread "main" > org.apache.flink.table.api.SqlParserException: SQL parse failed. Encountered > "." at line 1, column 18.Was expecting one of: ")" ... "," ... at > org.apache.flink.table.planner.calcite.CalciteParser.parse(CalciteParser.java:56) > at > org.apache.flink.table.planner.delegation.ParserImpl.parse(ParserImpl.java:64) > at > org.apache.flink.table.api.internal.TableEnvironmentImpl.sqlQuery(TableEnvironmentImpl.java:627) > at com.bytedance.demo.KafkaTableSource.main(KafkaTableSource.java:76)Caused > by: org.apache.calcite.sql.parser.SqlParseException: Encountered "." at line > 1, column 18.Was expecting one of: ")" ... "," ... 
at > org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.convertException(FlinkSqlParserImpl.java:416) > at > org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.normalizeException(FlinkSqlParserImpl.java:201) > at > org.apache.calcite.sql.parser.SqlParser.handleException(SqlParser.java:148) > at org.apache.calcite.sql.parser.SqlParser.parseQuery(SqlParser.java:163) at > org.apache.calcite.sql.parser.SqlParser.parseStmt(SqlParser.java:188) at > org.apache.flink.table.planner.calcite.CalciteParser.parse(CalciteParser.java:54) > ... 3 moreCaused by: org.apache.flink.sql.parser.impl.ParseException: > Encountered "." at line 1, column 18.Was expecting one of: ")" ... "," > ... at > org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.generateParseException(FlinkSqlParserImpl.java:36161) > at > org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.jj_consume_token(FlinkSqlParserImpl.java:35975) > at > org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.ParenthesizedSimpleIdentifierList(FlinkSqlParserImpl.java:21432) > at > org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.Expression3(FlinkSqlParserImpl.java:17164) > at > org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.Expression2b(FlinkSqlParserImpl.java:16820) > at > org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.Expression2(FlinkSqlParserImpl.java:16861) > at > org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.Expression(FlinkSqlParserImpl.java:16792) > at > org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.SelectExpression(FlinkSqlParserImpl.java:11091) > at > org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.SelectItem(FlinkSqlParserImpl.java:10293) > at > org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.SelectList(FlinkSqlParserImpl.java:10267) > at > org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.SqlSelect(FlinkSqlParserImpl.java:6943) > at > org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.LeafQuery(FlinkSqlParserImpl.java:658) > at > 
org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.LeafQueryOrExpr(FlinkSqlParserImpl.java:16775) > at > org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.QueryOrExpr(FlinkSqlParserImpl.java:16238) > at > org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.OrderedQueryOrExpr(FlinkSqlParserImpl.java:532) > at > org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.SqlStmt(FlinkSqlParserImpl.java:3761) > at > org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.SqlStmtEof(FlinkSqlParserImpl.java:3800) > at > org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.parseSqlStmtEof(FlinkSqlParserImpl.java:248) > at org.apache.calcite.sql.parser.SqlParser.parseQuery(SqlParser.java:161) > ... 5 more > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-21949) Support collect to array aggregate function
[ https://issues.apache.org/jira/browse/FLINK-21949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17337348#comment-17337348 ] Jark Wu commented on FLINK-21949: - My only concern with ARRAY_AGG is that it sounds similar to LISTAGG but returns a different type (ARRAY vs. STRING). LISTAGG was introduced in SQL:2016: https://modern-sql.com/feature/listagg > Support collect to array aggregate function > --- > > Key: FLINK-21949 > URL: https://issues.apache.org/jira/browse/FLINK-21949 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / API > Affects Versions: 1.12.0 > Reporter: jiabao sun > Priority: Minor > > Some NoSQL databases, such as MongoDB and Elasticsearch, support nested data types. > Aggregating multiple rows into an ARRAY is a common requirement. > The CollectToArray function is similar to Collect, except that it returns ARRAY instead of MULTISET.
[GitHub] [flink] lirui-apache commented on a change in pull request #15712: [FLINK-22400][hive connect]fix NPE problem when convert flink object for Map
lirui-apache commented on a change in pull request #15712: URL: https://github.com/apache/flink/pull/15712#discussion_r623836414 ## File path: flink-connectors/flink-connector-hive/src/test/java/org/apache/flink/connectors/hive/TableEnvHiveConnectorITCase.java ## @@ -531,6 +532,52 @@ public void testLocationWithComma() throws Exception { } } +@Test +public void testReadHiveDataWithEmptyMapForHiveShim20X() throws Exception { +TableEnvironment tableEnv = getTableEnvWithHiveCatalog(); + +try { +// Flink to write parquet file +String folderURI = TEMPORARY_FOLDER.newFolder().toURI().toString(); Review comment: There is already a `tempFolder` member in this test. ## File path: flink-connectors/flink-connector-hive/src/test/java/org/apache/flink/connectors/hive/TableEnvHiveConnectorITCase.java ## @@ -531,6 +532,52 @@ public void testLocationWithComma() throws Exception { } } +@Test +public void testReadHiveDataWithEmptyMapForHiveShim20X() throws Exception { +TableEnvironment tableEnv = getTableEnvWithHiveCatalog(); + +try { +// Flink to write parquet file Review comment: Does this issue only happen with Parquet tables? If not, we can just use a text table so that the test can be simpler.
[GitHub] [flink] pnowojski commented on a change in pull request #15820: [FLINK-22535][runtime] CleanUp is invoked for task even when the task…
pnowojski commented on a change in pull request #15820: URL: https://github.com/apache/flink/pull/15820#discussion_r623819184 ## File path: flink-streaming-java/src/test/java/org/apache/flink/streaming/runtime/tasks/StreamTaskTest.java ## @@ -1619,6 +1616,34 @@ public void testTaskAvoidHangingAfterSnapshotStateThrownException() throws Excep } } +@Test +public void testCleanUpResourcesWhenFailingDuringInit() throws Exception { +// given: Configured SourceStreamTask with source which fails during initialization. +Configuration taskManagerConfig = new Configuration(); +taskManagerConfig.setString(STATE_BACKEND, TestMemoryStateBackendFactory.class.getName()); + +StreamConfig cfg = new StreamConfig(new Configuration()); +cfg.setStateKeySerializer(mock(TypeSerializer.class)); +cfg.setOperatorID(new OperatorID(4712L, 43L)); + +cfg.setStreamOperator(new TestStreamSource<>(new InitFailedSource())); +cfg.setTimeCharacteristic(TimeCharacteristic.ProcessingTime); + +try (NettyShuffleEnvironment shuffleEnv = new NettyShuffleEnvironmentBuilder().build()) { +Task task = createTask(SourceStreamTask.class, shuffleEnv, cfg, taskManagerConfig); + +// when: Task starts. +task.startTaskThread(); + +// wait for clean termination. +task.getExecutingThread().join(); + +// then: The task should clean up all resources even when it failed on init. +assertEquals(ExecutionState.FAILED, task.getExecutionState()); +assertFalse(InitFailedSource.resourceAllocated); Review comment: nit: rename and invert boolean logic to `assertTrue(InitFailedSource.wasClosed)`? It would be more accurate compared to "allocated". ## File path: flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/StreamTask.java ## @@ -609,42 +615,54 @@ private void ensureNotCanceled() { @Override public final void invoke() throws Exception { -try { -// Allow invoking method 'invoke' without having to call 'restore' before it. 
-if (!isRunning) { -LOG.debug("Restoring during invoke will be called."); -restore(); -} +runWithCleanUpOnFail(this::executeInvoke); + +cleanUpInvoke(); +} + +private void executeInvoke() throws Exception { +// Allow invoking method 'invoke' without having to call 'restore' before it. +if (!isRunning) { +LOG.debug("Restoring during invoke will be called."); +restore(); Review comment: `executeRestore()` + add a test for not swallowing exceptions if `restore` hasn't been invoked? ## File path: flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/StreamTask.java ## @@ -609,42 +615,54 @@ private void ensureNotCanceled() { @Override public final void invoke() throws Exception { -try { -// Allow invoking method 'invoke' without having to call 'restore' before it. -if (!isRunning) { -LOG.debug("Restoring during invoke will be called."); -restore(); -} +runWithCleanUpOnFail(this::executeInvoke); + +cleanUpInvoke(); +} + +private void executeInvoke() throws Exception { +// Allow invoking method 'invoke' without having to call 'restore' before it. 
+if (!isRunning) { +LOG.debug("Restoring during invoke will be called."); +restore(); +} -// final check to exit early before starting to run -ensureNotCanceled(); +// final check to exit early before starting to run +ensureNotCanceled(); -// let the task do its work -runMailboxLoop(); +// let the task do its work +runMailboxLoop(); -// if this left the run() method cleanly despite the fact that this was canceled, -// make sure the "clean shutdown" is not attempted -ensureNotCanceled(); +// if this left the run() method cleanly despite the fact that this was canceled, +// make sure the "clean shutdown" is not attempted +ensureNotCanceled(); -afterInvoke(); +afterInvoke(); +} + +private void runWithCleanUpOnFail(RunnableWithException run) throws Exception { +try { +run.run(); } catch (Throwable invokeException) { failing = !canceled; try { if (!canceled) { -cancelTask(); +try { +cancelTask(); +} catch (Throwable ex) { +invokeException = firstOrSuppressed(ex, invokeException); +} Review comment: 1. shouldn't this
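The `firstOrSuppressed` call in the diff above follows a common pattern: when the cleanup triggered by a failure itself throws, the cleanup exception is attached as a suppressed exception to the original failure instead of replacing it. A minimal standalone sketch of that pattern (simplified; the real Flink utility lives in `ExceptionUtils`, and the demo class name here is invented):

```java
public class SuppressedDemo {

    // Keep the first failure as the primary exception; attach any later
    // failure (e.g. from cancelTask()) as a suppressed exception so
    // neither error is lost.
    static Throwable firstOrSuppressed(Throwable newException, Throwable previous) {
        if (previous != null && previous != newException) {
            previous.addSuppressed(newException);
            return previous;
        }
        return newException;
    }

    public static void main(String[] args) {
        Throwable invokeException = new IllegalStateException("task failed");
        Throwable cancelException = new RuntimeException("cancelTask failed");
        Throwable result = firstOrSuppressed(cancelException, invokeException);
        System.out.println(result.getMessage());                    // task failed
        System.out.println(result.getSuppressed()[0].getMessage()); // cancelTask failed
    }
}
```

The caller always rethrows the returned throwable, so the original invoke failure stays the top-level error and the cancellation failure remains visible in the stack trace.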
[jira] [Commented] (FLINK-18027) ROW value constructor cannot deal with complex expressions
[ https://issues.apache.org/jira/browse/FLINK-18027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17337346#comment-17337346 ] Jark Wu commented on FLINK-18027: - What about {{STRUCT(...)}} ? This is also a well-known word and Spark provides this function to construct row. https://spark.apache.org/docs/latest/api/sql/#struct > ROW value constructor cannot deal with complex expressions > -- > > Key: FLINK-18027 > URL: https://issues.apache.org/jira/browse/FLINK-18027 > Project: Flink > Issue Type: Bug > Components: Table SQL / API > Reporter: Benchao Li > Priority: Major > Labels: pull-request-available
[GitHub] [flink] flinkbot commented on pull request #15820: [FLINK-22535][runtime] CleanUp is invoked for task even when the task…
flinkbot commented on pull request #15820: URL: https://github.com/apache/flink/pull/15820#issuecomment-830052015 ## CI report: * c548fc5c79450d898eebc1c6523be98472fe1cdd UNKNOWN
[GitHub] [flink] flinkbot edited a comment on pull request #15789: [FLINK-21181][runtime] Wait for Invokable cancellation before releasing network resources
flinkbot edited a comment on pull request #15789: URL: https://github.com/apache/flink/pull/15789#issuecomment-827996694 ## CI report: * 4c5180310bf76e96f2665bf53531eccb1fa86421 UNKNOWN * f87ee85814873dee4ed181eee11df9a6758a916e Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17443) * 46e1be2c4832080bf1cb48c509b77cd88872d024 UNKNOWN
[GitHub] [flink] flinkbot edited a comment on pull request #15656: [FLINK-22233][Table SQL / API]Modified the spelling error of word "constant" in source code
flinkbot edited a comment on pull request #15656: URL: https://github.com/apache/flink/pull/15656#issuecomment-822135263 ## CI report: * 6358c33368746cf2c93463ab423f5c9ff11b641b Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17410) * fb69e9a4c5c897e75659c28915f9b9aa08e2f0c3 UNKNOWN
[GitHub] [flink] flinkbot edited a comment on pull request #15131: [FLINK-21700][security] Add an option to disable credential retrieval on a secure cluster
flinkbot edited a comment on pull request #15131: URL: https://github.com/apache/flink/pull/15131#issuecomment-794957907 ## CI report: * 60812bb2804fd21d56a47a23ca4b42a04300f1c0 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17419) * 8cb12deac019898bb69f7a5faf7f803dd27d71d0 UNKNOWN
[GitHub] [flink] gaoyunhaii commented on a change in pull request #15820: [FLINK-22535][runtime] CleanUp is invoked for task even when the task…
gaoyunhaii commented on a change in pull request #15820: URL: https://github.com/apache/flink/pull/15820#discussion_r623824848 ## File path: flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/StreamTask.java ## @@ -530,6 +532,10 @@ protected Counter setupNumRecordsInCounter(StreamOperator streamOperator) { @Override public void restore() throws Exception { +runWithCleanUpOnFail(this::executeRestore); Review comment: Would it be better to make `restore` also `final`, and let the subclasses override `executeRestore` instead? Since the exception handling should always be required. ## File path: flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/StreamTask.java ## @@ -609,42 +615,54 @@ private void ensureNotCanceled() { @Override public final void invoke() throws Exception { -try { -// Allow invoking method 'invoke' without having to call 'restore' before it. -if (!isRunning) { -LOG.debug("Restoring during invoke will be called."); -restore(); -} +runWithCleanUpOnFail(this::executeInvoke); + +cleanUpInvoke(); +} + +private void executeInvoke() throws Exception { +// Allow invoking method 'invoke' without having to call 'restore' before it. +if (!isRunning) { +LOG.debug("Restoring during invoke will be called."); +restore(); Review comment: Would it be better to call `executeRestore` here to avoid nested exception handling?
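The suggestion to make `restore` final and have subclasses override `executeRestore` is the classic template-method shape: the final entry point owns the clean-up-on-failure handling, and subclasses only supply the step. A minimal sketch (method names follow the diff; the classes and the `cleanedUp` flag are invented stand-ins, not Flink code):

```java
public class RestoreTemplate {

    interface RunnableWithException { void run() throws Exception; }

    abstract static class TaskSkeleton {
        boolean cleanedUp;

        // Final entry point: subclasses cannot bypass the failure handling.
        public final void restore() throws Exception {
            runWithCleanUpOnFail(this::executeRestore);
        }

        // The only method subclasses override.
        protected abstract void executeRestore() throws Exception;

        private void runWithCleanUpOnFail(RunnableWithException run) throws Exception {
            try {
                run.run();
            } catch (Exception e) {
                cleanedUp = true; // stand-in for releasing task resources
                throw e;
            }
        }
    }

    public static void main(String[] args) {
        TaskSkeleton failing = new TaskSkeleton() {
            @Override
            protected void executeRestore() throws Exception {
                throw new Exception("init failed");
            }
        };
        try {
            failing.restore();
        } catch (Exception expected) {
            System.out.println("cleanedUp=" + failing.cleanedUp); // prints "cleanedUp=true"
        }
    }
}
```

With this shape, the exception handling runs exactly once regardless of which subclass step fails, which is what the reviewer's second comment (avoiding nested handling) is after.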
[jira] [Commented] (FLINK-18027) ROW value constructor cannot deal with complex expressions
[ https://issues.apache.org/jira/browse/FLINK-18027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17337323#comment-17337323 ]

Timo Walther commented on FLINK-18027:
--------------------------------------

[~jark] what do you think about introducing a new built-in function `row_from()` similar to `map_from_arrays` etc. to avoid all these problems with Calcite's parser? `ROW()` seems to cause a lot of issues that could actually be solved by a simple built-in function.

> ROW value constructor cannot deal with complex expressions
> ----------------------------------------------------------
>
>                 Key: FLINK-18027
>                 URL: https://issues.apache.org/jira/browse/FLINK-18027
>             Project: Flink
>          Issue Type: Bug
>          Components: Table SQL / API
>            Reporter: Benchao Li
>            Priority: Major
>              Labels: pull-request-available
>
> {code:java}
> create table my_source (
>   my_row row<a INT, b INT>
> ) with (...);
>
> create table my_sink (
>   my_row row<a INT, b INT>
> ) with (...);
>
> insert into my_sink
> select ROW(my_row.a, my_row.b)
> from my_source;
> {code}
> will throw exceptions:
> {code:java}
> Exception in thread "main" org.apache.flink.table.api.SqlParserException: SQL parse failed. Encountered "." at line 1, column 18.
> Was expecting one of: ")" ... "," ...
> 	at org.apache.flink.table.planner.calcite.CalciteParser.parse(CalciteParser.java:56)
> 	at org.apache.flink.table.planner.delegation.ParserImpl.parse(ParserImpl.java:64)
> 	at org.apache.flink.table.api.internal.TableEnvironmentImpl.sqlQuery(TableEnvironmentImpl.java:627)
> 	at com.bytedance.demo.KafkaTableSource.main(KafkaTableSource.java:76)
> Caused by: org.apache.calcite.sql.parser.SqlParseException: Encountered "." at line 1, column 18.
> Was expecting one of: ")" ... "," ...
> 	at org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.convertException(FlinkSqlParserImpl.java:416)
> 	at org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.normalizeException(FlinkSqlParserImpl.java:201)
> 	at org.apache.calcite.sql.parser.SqlParser.handleException(SqlParser.java:148)
> 	at org.apache.calcite.sql.parser.SqlParser.parseQuery(SqlParser.java:163)
> 	at org.apache.calcite.sql.parser.SqlParser.parseStmt(SqlParser.java:188)
> 	at org.apache.flink.table.planner.calcite.CalciteParser.parse(CalciteParser.java:54)
> 	... 3 more
> Caused by: org.apache.flink.sql.parser.impl.ParseException: Encountered "." at line 1, column 18.
> Was expecting one of: ")" ... "," ...
> 	at org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.generateParseException(FlinkSqlParserImpl.java:36161)
> 	at org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.jj_consume_token(FlinkSqlParserImpl.java:35975)
> 	at org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.ParenthesizedSimpleIdentifierList(FlinkSqlParserImpl.java:21432)
> 	at org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.Expression3(FlinkSqlParserImpl.java:17164)
> 	at org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.Expression2b(FlinkSqlParserImpl.java:16820)
> 	at org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.Expression2(FlinkSqlParserImpl.java:16861)
> 	at org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.Expression(FlinkSqlParserImpl.java:16792)
> 	at org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.SelectExpression(FlinkSqlParserImpl.java:11091)
> 	at org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.SelectItem(FlinkSqlParserImpl.java:10293)
> 	at org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.SelectList(FlinkSqlParserImpl.java:10267)
> 	at org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.SqlSelect(FlinkSqlParserImpl.java:6943)
> 	at org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.LeafQuery(FlinkSqlParserImpl.java:658)
> 	at org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.LeafQueryOrExpr(FlinkSqlParserImpl.java:16775)
> 	at org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.QueryOrExpr(FlinkSqlParserImpl.java:16238)
> 	at org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.OrderedQueryOrExpr(FlinkSqlParserImpl.java:532)
> 	at org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.SqlStmt(FlinkSqlParserImpl.java:3761)
> 	at org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.SqlStmtEof(FlinkSqlParserImpl.java:3800)
> 	at org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.parseSqlStmtEof(FlinkSqlParserImpl.java:248)
> 	at org.apache.calcite.sql.parser.SqlParser.parseQuery(SqlParser.java:161)
> 	... 5 more
> {code}

--
This message was sent by Atlassian Jira (v8.3.4#803005)
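To make Timo Walther's proposal above concrete, here is a hypothetical sketch of how the failing query from the issue could look once a `row_from()` built-in existed. The function name comes directly from the comment; it is not implemented in Flink, so this is illustration only:

```sql
-- Fails today: Calcite's parser cannot handle dotted field access
-- inside the ROW(...) value constructor:
-- INSERT INTO my_sink SELECT ROW(my_row.a, my_row.b) FROM my_source;

-- Hypothetical equivalent with the proposed built-in. A plain function
-- call is parsed like map_from_arrays(...), so the dotted expressions
-- are just ordinary arguments and never hit the special ROW grammar:
INSERT INTO my_sink
SELECT row_from(my_row.a, my_row.b)
FROM my_source;
```

The appeal of this approach is that it sidesteps the parser entirely instead of patching Calcite's `ROW` grammar rule.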
[GitHub] [flink] flinkbot commented on pull request #15820: [FLINK-22535][runtime] CleanUp is invoked for task even when the task…
flinkbot commented on pull request #15820:
URL: https://github.com/apache/flink/pull/15820#issuecomment-830041840

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review.

## Automated Checks
Last check on commit c548fc5c79450d898eebc1c6523be98472fe1cdd (Fri Apr 30 11:53:57 UTC 2021)

**Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up to date!

Mention the bot in a comment to re-run the automated checks.

## Review Progress

* ❓ 1. The [description] looks good.
* ❓ 2. There is [consensus] that the contribution should go into Flink.
* ❓ 3. Needs [attention] from.
* ❓ 4. The change fits into the overall [architecture].
* ❓ 5. Overall code [quality] is good.

Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process.

The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required.

Bot commands
  The @flinkbot bot supports the following commands:

 - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`)
 - `@flinkbot approve all` to approve all aspects
 - `@flinkbot approve-until architecture` to approve everything until `architecture`
 - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention
 - `@flinkbot disapprove architecture` to remove an approval you gave earlier
[jira] [Closed] (FLINK-12710) Unify built-in and user-defined functions in the API modules
[ https://issues.apache.org/jira/browse/FLINK-12710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Timo Walther closed FLINK-12710.
--------------------------------
    Resolution: Implemented

Implemented as part of FLINK-20522.

> Unify built-in and user-defined functions in the API modules
> ------------------------------------------------------------
>
>                 Key: FLINK-12710
>                 URL: https://issues.apache.org/jira/browse/FLINK-12710
>             Project: Flink
>          Issue Type: New Feature
>          Components: Table SQL / API
>            Reporter: Timo Walther
>            Priority: Major
>              Labels: auto-unassigned, pull-request-available
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently, there are three completely different stacks of functions: Table API builtins, SQL builtins, and user-defined functions.
> Both the Blink and the legacy planner define a separate list of functions and implementations with different type systems and type checking logic.
> The long-term goal of this issue is to unify all 6 different stacks into a common one. This includes better support for type inference, which relates to FLINK-12251.
[GitHub] [flink] flinkbot edited a comment on pull request #15817: [FLINK-14393][webui] Add an option to enable/disable cancel job in we…
flinkbot edited a comment on pull request #15817:
URL: https://github.com/apache/flink/pull/15817#issuecomment-829959024

## CI report:

* 31f01fe2108f5e18f26f50e352d3ed353bd8ffc8 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17455)
* c14593a5ccd7896b041421ee4d3e433a7afc34ac Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17460)

Bot commands
  The @flinkbot bot supports the following commands:

 - `@flinkbot run travis` re-run the last Travis build
 - `@flinkbot run azure` re-run the last Azure build