[GitHub] [flink] liyubin117 commented on pull request #18361: [FLINK-25631][table] Support enhanced `show tables` syntax
liyubin117 commented on pull request #18361: URL: https://github.com/apache/flink/pull/18361#issuecomment-1012879158

Hi @wuchong, could you please help review this? Thanks very much!
[jira] [Commented] (FLINK-25654) Remove the redundant lock in SortMergeResultPartition
[ https://issues.apache.org/jira/browse/FLINK-25654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17476005#comment-17476005 ] zhangtianyu commented on FLINK-25654:

t-1

-- Original message --
From: Yingjie Cao (Jira) "j...@apache.org"
Date: 2022/01/14 15:42:00 (Friday)
To: d...@flink.apache.org
Cc:
Subject: [jira] [Created] (FLINK-25654) Remove the redundant lock in SortMergeResultPartition

Yingjie Cao created FLINK-25654:
Summary: Remove the redundant lock in SortMergeResultPartition
Key: FLINK-25654
URL: https://issues.apache.org/jira/browse/FLINK-25654
Project: Flink
Issue Type: Sub-task
Components: Runtime / Network
Affects Versions: 1.14.3
Reporter: Yingjie Cao
Fix For: 1.15.0, 1.14.4

After FLINK-2372, the task canceler will never call the close method of ResultPartition, which reduces some race conditions and simplifies the code. This ticket aims to remove some redundant locks in SortMergeResultPartition.
[jira] [Created] (FLINK-25654) Remove the redundant lock in SortMergeResultPartition
Yingjie Cao created FLINK-25654:
Summary: Remove the redundant lock in SortMergeResultPartition
Key: FLINK-25654
URL: https://issues.apache.org/jira/browse/FLINK-25654
Project: Flink
Issue Type: Sub-task
Components: Runtime / Network
Affects Versions: 1.14.3
Reporter: Yingjie Cao
Fix For: 1.15.0, 1.14.4

After FLINK-2372, the task canceler will never call the close method of ResultPartition, which reduces some race conditions and simplifies the code. This ticket aims to remove some redundant locks in SortMergeResultPartition.
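For orientation, the kind of simplification the ticket describes can be sketched as follows; this is purely illustrative, with a hypothetical class, not Flink's actual SortMergeResultPartition code. Once the task canceler can no longer call close() concurrently, the guard lock around release becomes redundant.

{code:java}
// Illustrative only (hypothetical class, not the real SortMergeResultPartition):
// with a single releasing thread, the synchronized guard can be dropped.
class ResultPartitionSketch {
    private final Object lock = new Object();
    private boolean released;

    // Before: release could race with the task canceler, so it needed the lock.
    void releaseGuarded() {
        synchronized (lock) {
            if (!released) {
                released = true;
                // ... release buffers ...
            }
        }
    }

    // After: only the task thread can reach this point, so a plain check suffices.
    void release() {
        if (!released) {
            released = true;
            // ... release buffers ...
        }
    }
}
{code}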
[jira] [Comment Edited] (FLINK-25649) Scheduling jobs fails with org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException
[ https://issues.apache.org/jira/browse/FLINK-25649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17475999#comment-17475999 ] Till Rohrmann edited comment on FLINK-25649 at 1/14/22, 7:30 AM:

cc [~dmvk], [~zhuzh]

was (Author: till.rohrmann): cc [~dmvk]

> Scheduling jobs fails with org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException
>
> Key: FLINK-25649
> URL: https://issues.apache.org/jira/browse/FLINK-25649
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Coordination
> Affects Versions: 1.13.1
> Reporter: Gil De Grove
> Priority: Major
>
> Following comment from Till on this [SO question|https://stackoverflow.com/questions/70683048/scheduling-jobs-fails-with-org-apache-flink-runtime-jobmanager-scheduler-noresou?noredirect=1#comment124980546_70683048]
> h2. *Summary*
> We are currently experiencing a scheduling issue with our Flink cluster.
> The symptoms are that some/most/all of our tasks (it depends; the symptoms are not always the same) are shown as _SCHEDULED_ but fail after a timeout. The jobs are then shown as _RUNNING_.
> The failing exception is the following one:
> {{Caused by: java.util.concurrent.CompletionException: org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: Slot request bulk is not fulfillable! Could not allocate the required slot within slot request timeout}}
> After analysis, we assume (we cannot prove it, as there are not many logs for that part of the code) that the failure is due to a deadlock/race condition that happens when several jobs are submitted to the Flink cluster at the same time, even though we have enough slots available in the cluster.
> We actually see the error with 52 available task slots and 12 jobs that are not scheduled.
> h2. Additional information
> * Flink version: 1.13.1 commit a7f3192
> * Flink cluster in session mode
> * 2 job managers using k8s HA mode (resource requests: 2 CPU, 4Gb RAM, limits set on memory to 4Gb)
> * 50 task managers with 2 slots each (resource requests: 2 CPUs, 2GB RAM. No limits set).
> * Our Flink cluster is shut down every night and restarted every morning.
> The error seems to occur when a lot of jobs need to be scheduled. The jobs are configured to restore their state, and we do not see any issues for jobs that are scheduled and run correctly; it really seems to be related to a scheduling issue.
[jira] [Commented] (FLINK-25649) Scheduling jobs fails with org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException
[ https://issues.apache.org/jira/browse/FLINK-25649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17475999#comment-17475999 ] Till Rohrmann commented on FLINK-25649:

cc [~dmvk]

> Scheduling jobs fails with org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException
[jira] [Created] (FLINK-25653) Move buffer recycle in SortMergeSubpartitionReader out of lock to avoid deadlock
Yingjie Cao created FLINK-25653:
Summary: Move buffer recycle in SortMergeSubpartitionReader out of lock to avoid deadlock
Key: FLINK-25653
URL: https://issues.apache.org/jira/browse/FLINK-25653
Project: Flink
Issue Type: Sub-task
Components: Runtime / Network
Affects Versions: 1.13.5, 1.14.3
Reporter: Yingjie Cao
Fix For: 1.15.0, 1.13.6, 1.14.4

In the current sort-shuffle implementation, the different lock orders in SortMergeSubpartitionReader and SortMergeResultPartitionReadScheduler can cause a deadlock. To solve the problem, we can move the buffer recycling in SortMergeSubpartitionReader out of the lock.
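The fix pattern the ticket describes can be sketched as follows; the names (lock, pendingRecycles) are hypothetical, not the actual Flink classes. Buffers are collected under the subpartition lock but recycled only after the lock is released.

{code:java}
// Schematic only: collect the buffers under the subpartition lock, then recycle
// them after leaving the lock, so this thread never calls into the buffer pool
// (which takes its own lock) while still holding the subpartition lock.
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

class SubpartitionReaderSketch {
    private final Object lock = new Object();
    private final ArrayDeque<Runnable> pendingRecycles = new ArrayDeque<>();

    void releaseAllBuffers() {
        List<Runnable> toRecycle;
        synchronized (lock) {
            toRecycle = new ArrayList<>(pendingRecycles);
            pendingRecycles.clear();
        }
        // Recycling outside the lock breaks the circular wait with the read scheduler.
        toRecycle.forEach(Runnable::run);
    }
}
{code}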
[GitHub] [flink-table-store] JingsongLi commented on a change in pull request #7: [FLINK-25643] Introduce Predicate to table store
JingsongLi commented on a change in pull request #7: URL: https://github.com/apache/flink-table-store/pull/7#discussion_r784608572

## File path: flink-table-store-core/src/main/java/org/apache/flink/table/store/file/predicate/Equal.java

@@ -0,0 +1,52 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.store.file.predicate;
+
+import org.apache.flink.table.store.file.stats.FieldStats;
+
+import static org.apache.flink.util.Preconditions.checkNotNull;
+
+/** A {@link Predicate} to eval equal. */
+public class Equal implements Predicate {
+
+    private final int index;
+
+    private final Literal literal;
+
+    public Equal(int index, Literal literal) {
+        this.index = index;
+        this.literal = checkNotNull(literal);
+    }
+
+    @Override
+    public boolean test(Object[] values) {
+        Object field = values[index];
+        return field != null && literal.compareValueTo(field) == 0;
+    }
+
+    @Override
+    public boolean test(long rowCount, FieldStats[] fieldStats) {
+        FieldStats stats = fieldStats[index];
+        if (rowCount == stats.nullCount()) {
+            return false;
+        }
+        return literal.compareValueTo(stats.minValue()) >= 0
+                && literal.compareValueTo(stats.maxValue()) <= 0;

Review comment: I think this is currently something that needs to be improved and should not exist; we need to complete the statistics before integrating this.
[GitHub] [flink-table-store] tsreaper commented on a change in pull request #7: [FLINK-25643] Introduce Predicate to table store
tsreaper commented on a change in pull request #7: URL: https://github.com/apache/flink-table-store/pull/7#discussion_r784594032

## File path: flink-table-store-core/src/main/java/org/apache/flink/table/store/file/predicate/Equal.java

+    @Override
+    public boolean test(long rowCount, FieldStats[] fieldStats) {
+        FieldStats stats = fieldStats[index];
+        if (rowCount == stats.nullCount()) {
+            return false;
+        }
+        return literal.compareValueTo(stats.minValue()) >= 0
+                && literal.compareValueTo(stats.maxValue()) <= 0;

Review comment: `stats.minValue()` and `stats.maxValue()` can be null even for comparable types (see the current implementation of `SstFile.RollingFile#finish`). Please deal with this case and add more tests.
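A minimal sketch of the null-safe handling the reviewer asks for, using the Equal/FieldStats/Literal signatures from the diff above; treating missing statistics as "cannot filter" is an assumption here, not the committed fix.

```java
@Override
public boolean test(long rowCount, FieldStats[] fieldStats) {
    FieldStats stats = fieldStats[index];
    if (rowCount == stats.nullCount()) {
        // Every value is null, so equality with a non-null literal cannot hold.
        return false;
    }
    if (stats.minValue() == null || stats.maxValue() == null) {
        // No usable min/max statistics: stay on the safe side and keep the file.
        return true;
    }
    return literal.compareValueTo(stats.minValue()) >= 0
            && literal.compareValueTo(stats.maxValue()) <= 0;
}
```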
[GitHub] [flink] Airblader edited a comment on pull request #17711: [FLINK-24737][runtime-web] Update outdated web dependencies
Airblader edited a comment on pull request #17711: URL: https://github.com/apache/flink/pull/17711#issuecomment-1012833850

I would prefer bumping this in a separate PR using the JIRA issue @MartijnVisser pointed to. I'm happy to review and merge that one as well, of course (no vacation in between this time).
[GitHub] [flink] Airblader commented on pull request #17711: [FLINK-24737][runtime-web] Update outdated web dependencies
Airblader commented on pull request #17711: URL: https://github.com/apache/flink/pull/17711#issuecomment-1012833850

I would prefer bumping this in a separate PR using the JIRA issue @MartijnVisser pointed to.
[GitHub] [flink-table-store] JingsongLi commented on a change in pull request #8: [FLINK-25630] Introduce manifest related data structure and files
JingsongLi commented on a change in pull request #8: URL: https://github.com/apache/flink-table-store/pull/8#discussion_r784581070

## File path: flink-table-store-core/src/main/java/org/apache/flink/table/store/file/manifest/ManifestEntrySerializer.java

@@ -0,0 +1,63 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.store.file.manifest;
+
+import org.apache.flink.table.data.GenericRowData;
+import org.apache.flink.table.data.RowData;
+import org.apache.flink.table.runtime.typeutils.RowDataSerializer;
+import org.apache.flink.table.store.file.ValueKind;
+import org.apache.flink.table.store.file.mergetree.sst.SstFileMetaSerializer;
+import org.apache.flink.table.store.file.utils.ObjectSerializer;
+import org.apache.flink.table.types.logical.RowType;
+
+/** Serializer for {@link ManifestEntry}. */
+public class ManifestEntrySerializer extends ObjectSerializer<ManifestEntry> {
+

Review comment: Can you add `serialVersionUID` to all `ObjectSerializer`s?
[jira] [Created] (FLINK-25652) Can "duration" and "records received" be updated at the same time in WebUI's task detail?
jeff-zou created FLINK-25652:
Summary: Can "duration" and "records received" be updated at the same time in WebUI's task detail?
Key: FLINK-25652
URL: https://issues.apache.org/jira/browse/FLINK-25652
Project: Flink
Issue Type: Improvement
Components: Runtime / Web Frontend
Affects Versions: 1.13.2
Reporter: jeff-zou
Attachments: 20220114145221.png

Can "duration" and "records received" be updated at the same time in the WebUI's task detail? Then I could get a more precise QPS, whose value equals "records received" divided by "duration".

!20220114145221.png!
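For illustration only, with made-up numbers: the computed rate is precise only if both values are sampled at the same instant, which is what this ticket asks for.

{code:java}
long recordsReceived = 1_200_000L; // "Records Received" from the task detail
long durationMillis = 600_000L;    // "Duration" from the task detail (10 minutes)
double qps = recordsReceived / (durationMillis / 1000.0); // = 2000.0 records per second
{code}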
[GitHub] [flink] chenzihao5 commented on pull request #18261: [hotfix][javadocs] Fix the doc content in the createFieldGetter method of RowData
chenzihao5 commented on pull request #18261: URL: https://github.com/apache/flink/pull/18261#issuecomment-1012819645

> Thank you for your contribution.
>
> Please update your commit message to follow the Code Style & Quality Guidelines, which can be found at https://flink.apache.org/contributing/code-style-and-quality-pull-requests.html#commit-naming-conventions. Right now it mentions `[docs]` as what it fixes; this should be `[javadocs]`.

@MartijnVisser Thanks for the correction. I have made the change.
[GitHub] [flink-table-store] tsreaper opened a new pull request #8: [FLINK-25630] Introduce manifest related data structure and files
tsreaper opened a new pull request #8: URL: https://github.com/apache/flink-table-store/pull/8

Manifest files are the metadata for sst files. They record the changes between two snapshots.

* Introduce manifest entry and manifest file meta
* Add tests for FileStorePathFactory
* Introduce ManifestFile and ManifestList
[GitHub] [flink] zjureel closed pull request #18346: [DO NOT MERGE][FLINK-25085] Add scheduled executor service in MainThreadExecutor and close it when the server reaches termination
zjureel closed pull request #18346: URL: https://github.com/apache/flink/pull/18346
[GitHub] [flink] flinkbot commented on pull request #18361: [FLINK-25631][table] Support enhanced `show tables` syntax
flinkbot commented on pull request #18361: URL: https://github.com/apache/flink/pull/18361#issuecomment-1012808315

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review.

## Automated Checks
Last check on commit 4aec51002d41ee5968d7c9cd8b912f7538785aad (Fri Jan 14 06:24:39 UTC 2022)
✅ no warnings

Mention the bot in a comment to re-run the automated checks.

## Review Progress
* ❓ 1. The [description] looks good.
* ❓ 2. There is [consensus] that the contribution should go into Flink.
* ❓ 3. Needs [attention] from.
* ❓ 4. The change fits into the overall [architecture].
* ❓ 5. Overall code [quality] is good.

Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required.

Bot commands:
The @flinkbot bot supports the following commands:
- `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until `architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
[GitHub] [flink] wsry closed pull request #17936: [FLINK-24954][network] Refresh read buffer request timeout on buffer recycling/requesting for sort-shuffle
wsry closed pull request #17936: URL: https://github.com/apache/flink/pull/17936
[jira] [Updated] (FLINK-25631) Support enhanced `show tables` statement
[ https://issues.apache.org/jira/browse/FLINK-25631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-25631:

Labels: pull-request-available (was: )

> Support enhanced `show tables` statement
>
> Key: FLINK-25631
> URL: https://issues.apache.org/jira/browse/FLINK-25631
> Project: Flink
> Issue Type: New Feature
> Components: Table SQL / API
> Affects Versions: 1.14.4
> Reporter: Yubin Li
> Assignee: Yubin Li
> Priority: Major
> Labels: pull-request-available
>
> Enhanced `show tables` statements like `show tables from db1 like 't%'` are broadly supported in many popular data processing engines like Presto/Trino/Spark: https://spark.apache.org/docs/latest/sql-ref-syntax-aux-show-tables.html
> I have investigated the syntax of the engines mentioned above.
> We could use such a statement to easily show the tables of specified databases without switching databases frequently; also, we could use a regexp pattern to quickly find the tables of interest among plenty of tables. Besides, the new statement is fully compatible with the old one, so users can use `show tables` as before.
> h3. SHOW TABLES [ ( FROM | IN ) [catalog.]db ] [ [NOT] LIKE regex_pattern ]
> I have implemented the syntax; demo as below:
> {code:java}
> Flink SQL> create database d1;
> [INFO] Execute statement succeed.
> Flink SQL> create table d1.b1(id int) with ('connector'='print');
> [INFO] Execute statement succeed.
> Flink SQL> create table t1(id int) with ('connector'='print');
> [INFO] Execute statement succeed.
> Flink SQL> create table m1(id int) with ('connector'='print');
> [INFO] Execute statement succeed.
> Flink SQL> show tables like 'm%';
> ++
> | table name |
> ++
> | m1 |
> ++
> 1 row in set
> Flink SQL> show tables from d1 like 'b%';
> ++
> | table name |
> ++
> | b1 |
> ++
> 1 row in set{code}
[GitHub] [flink] liyubin117 opened a new pull request #18361: [FLINK-25631][table] Support enhanced `show tables` syntax
liyubin117 opened a new pull request #18361: URL: https://github.com/apache/flink/pull/18361

## What is the purpose of the change
Support enhanced `show tables` syntax like `show tables from catalog1.db1 like 't%'`.

## Brief change log
* SHOW TABLES [ ( FROM | IN ) [catalog.]db ] [ [NOT] LIKE regex_pattern ]

## Verifying this change
Docs only change.

## Does this pull request potentially affect one of the following parts:
- Dependencies (does it add or upgrade a dependency): no
- The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no
- The serializers: no
- The runtime per-record code paths (performance sensitive): no
- Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: no
- The S3 file system connector: no

## Documentation
- Does this pull request introduce a new feature? yes
- If yes, how is the feature documented? yes
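For reference, this is how the proposed statements could be exercised from Java, assuming the grammar in this PR; `tableEnv` here stands for an ordinary `TableEnvironment` instance.

```java
// Assumes the SHOW TABLES extension from this PR is available.
tableEnv.executeSql("SHOW TABLES FROM d1 LIKE 'b%'").print();
tableEnv.executeSql("SHOW TABLES IN catalog1.d1 NOT LIKE 't%'").print();
```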
[GitHub] [flink] paul8263 commented on pull request #18334: [BP-1.14][FLINK-22821][core] Stabilize NetUtils#getAvailablePort in order to a…
paul8263 commented on pull request #18334: URL: https://github.com/apache/flink/pull/18334#issuecomment-1012787097

Hi @zentol,

Thank you very much for your reply. I have tested Netty's ServerBootstrap, and it turns out that if we bind 0 as the port number, the server will choose a random unallocated port.

I agree with you that for Netty-related tests the file lock could be simplified by assigning port 0 to the NettyServer, as the file-lock implementation is more cumbersome. I think we may need to discuss whether it makes sense to keep using the uniform logic of calling NetUtils#getAvailablePort.
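A minimal self-contained sketch of the behavior described above, using standard Netty 4.x APIs; the handler setup is intentionally empty because only the port selection is being demonstrated.

```java
import io.netty.bootstrap.ServerBootstrap;
import io.netty.channel.Channel;
import io.netty.channel.ChannelInitializer;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.SocketChannel;
import io.netty.channel.socket.nio.NioServerSocketChannel;

import java.net.InetSocketAddress;

public class BindPortZeroDemo {
    public static void main(String[] args) throws Exception {
        NioEventLoopGroup group = new NioEventLoopGroup();
        try {
            ServerBootstrap bootstrap = new ServerBootstrap()
                    .group(group)
                    .channel(NioServerSocketChannel.class)
                    .childHandler(new ChannelInitializer<SocketChannel>() {
                        @Override
                        protected void initChannel(SocketChannel ch) {}
                    });
            // Binding port 0 asks the OS for any free port ...
            Channel channel = bootstrap.bind(0).sync().channel();
            // ... and the actually chosen port can be read back afterwards.
            int actualPort = ((InetSocketAddress) channel.localAddress()).getPort();
            System.out.println("Bound to port " + actualPort);
        } finally {
            group.shutdownGracefully();
        }
    }
}
```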
[jira] [Commented] (FLINK-25483) When FlinkSQL writes ES, it will not write and update the null value field
[ https://issues.apache.org/jira/browse/FLINK-25483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17475943#comment-17475943 ] Martijn Visser commented on FLINK-25483:

[~empcl] I understand that. What I mean is that with that given SQL statement, the user explicitly doesn't take into account that a null value could end up in the result. I disagree with the statement "Users do not want our null values to affect the original records in Elasticsearch". There are users that do want this to happen. Like I said before, if you have this actual requirement, I think the best way to solve it is to have a custom Elasticsearch sink that you modify to your requirement. This should not end up in Flink.

> When FlinkSQL writes ES, it will not write and update the null value field
>
> Key: FLINK-25483
> URL: https://issues.apache.org/jira/browse/FLINK-25483
> Project: Flink
> Issue Type: New Feature
> Components: Table SQL / Ecosystem
> Reporter: 陈磊
> Priority: Minor
>
> When using Flink SQL to consume from Kafka and write to Elasticsearch, sometimes some fields do not exist, and we do not want to write the missing fields to Elasticsearch. How should this situation be handled?
> For example: the source data has 3 fields, a, b, c:
> insert into table2
> select
> a,b,c
> from table1
> When b=null, we only want to write a and c.
> When c=null, we only want to write a and b.
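As a sketch of the custom-sink approach suggested above (a user-side tweak, not a Flink feature): strip null-valued fields from the document map before it reaches the Elasticsearch request, so an update does not overwrite existing fields with null.

{code:java}
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

Map<String, Object> document = new HashMap<>();
document.put("a", 1);
document.put("b", null); // field missing from the incoming record
document.put("c", "x");
document.values().removeIf(Objects::isNull); // only a and c are indexed
{code}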
[GitHub] [flink] MartijnVisser commented on pull request #15140: [FLINK-20628][connectors/rabbitmq2] RabbitMQ connector using new connector API
MartijnVisser commented on pull request #15140: URL: https://github.com/apache/flink/pull/15140#issuecomment-1012787224

@SteNicholas Sounds good. Keep in mind that we might not want to merge new connectors in 1.16, depending on how quickly the external connector discussion progresses and the building blocks are delivered. It could actually be that we start moving out connectors in 1.16.
[jira] [Commented] (FLINK-25420) Port JDBC Source to new Source API (FLIP-27)
[ https://issues.apache.org/jira/browse/FLINK-25420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17475942#comment-17475942 ] Martijn Visser commented on FLINK-25420:

[~RocMarshal] I'm not sure a FLIP is needed (in the end, it should just be an implementation of a Flink API; it's not introducing/breaking anything in Flink). Outlining it in this Jira could be sufficient, or if you want to validate the implementation steps, another option could be to have a [DISCUSS] thread on the dev mailing list to validate those steps.

> Port JDBC Source to new Source API (FLIP-27)
>
> Key: FLINK-25420
> URL: https://issues.apache.org/jira/browse/FLINK-25420
> Project: Flink
> Issue Type: Improvement
> Reporter: Martijn Visser
> Priority: Major
>
> The current JDBC connector is using the old SourceFunction interface, which is going to be deprecated. We should port/refactor the JDBC Source to use the new Source API, based on FLIP-27: https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface
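To make the implementation steps more concrete, the FLIP-27 entry point such a port would have to implement looks roughly like this; `JdbcSplit`, `JdbcEnumeratorState`, `JdbcSourceReader` and `JdbcSplitEnumerator` are hypothetical names, not an agreed design.

{code:java}
import org.apache.flink.api.connector.source.Boundedness;
import org.apache.flink.api.connector.source.Source;
import org.apache.flink.api.connector.source.SourceReader;
import org.apache.flink.api.connector.source.SourceReaderContext;
import org.apache.flink.api.connector.source.SplitEnumerator;
import org.apache.flink.api.connector.source.SplitEnumeratorContext;
import org.apache.flink.table.data.RowData;

public class JdbcSource implements Source<RowData, JdbcSplit, JdbcEnumeratorState> {

    @Override
    public Boundedness getBoundedness() {
        return Boundedness.BOUNDED; // a CDC-style variant could be CONTINUOUS_UNBOUNDED
    }

    @Override
    public SourceReader<RowData, JdbcSplit> createReader(SourceReaderContext context) {
        return new JdbcSourceReader(context); // hypothetical reader
    }

    @Override
    public SplitEnumerator<JdbcSplit, JdbcEnumeratorState> createEnumerator(
            SplitEnumeratorContext<JdbcSplit> context) {
        return new JdbcSplitEnumerator(context); // hypothetical enumerator
    }

    // restoreEnumerator(...) plus the split and checkpoint serializer factories
    // required by the Source interface are omitted from this sketch.
}
{code}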
[GitHub] [flink] flinkbot commented on pull request #18360: [FLINK-25329][runtime] Support memory execution graph store in session cluster
flinkbot commented on pull request #18360: URL: https://github.com/apache/flink/pull/18360#issuecomment-1012784974

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review.

## Automated Checks
Last check on commit ff1d1019b6b2d674dd699d398299233dea6cf93e (Fri Jan 14 05:19:58 UTC 2022)

**Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up to date!
* **This pull request references an unassigned [Jira ticket](https://issues.apache.org/jira/browse/FLINK-25329).** According to the [code contribution guide](https://flink.apache.org/contributing/contribute-code.html), tickets need to be assigned before starting with the implementation work.

Mention the bot in a comment to re-run the automated checks.

## Review Progress
* ❓ 1. The [description] looks good.
* ❓ 2. There is [consensus] that the contribution should go into Flink.
* ❓ 3. Needs [attention] from.
* ❓ 4. The change fits into the overall [architecture].
* ❓ 5. Overall code [quality] is good.

Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required.
[jira] [Updated] (FLINK-25329) Improvement of execution graph store in flink session cluster for jobs
[ https://issues.apache.org/jira/browse/FLINK-25329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-25329:

Labels: pull-request-available (was: )

> Improvement of execution graph store in flink session cluster for jobs
>
> Key: FLINK-25329
> URL: https://issues.apache.org/jira/browse/FLINK-25329
> Project: Flink
> Issue Type: Sub-task
> Components: Runtime / Coordination
> Affects Versions: 1.14.0, 1.12.5, 1.13.3
> Reporter: Shammon
> Priority: Major
> Labels: pull-request-available
>
> A Flink session cluster uses files to store info about jobs after they reach termination, via `FileExecutionGraphInfoStore`; each job generates one file. When the cluster executes many small jobs concurrently, there will be many disk-related operations, which will
> 1. Increase the CPU usage of the `Dispatcher`
> 2. Decrease the performance of the jobs in the cluster
> We hope to improve the disk operations in `FileExecutionGraphInfoStore` to increase the performance of the session cluster, or to support a memory store.
[GitHub] [flink] zjureel opened a new pull request #18360: [FLINK-25329][runtime] Support memory execution graph store in session cluster
zjureel opened a new pull request #18360: URL: https://github.com/apache/flink/pull/18360

## What is the purpose of the change
This PR aims to support a memory execution graph store in session clusters.

## Brief change log
* Improve MemoryExecutionGraphInfoStore to support expiration time and maximum capacity
* Support memory and file execution graph stores in SessionClusterEntrypoint

## Verifying this change
This change added tests and can be verified as follows:
* Add `MemoryExecutionGraphInfoStoreTest` to test the methods in `MemoryExecutionGraphInfoStore`

## Does this pull request potentially affect one of the following parts:
- Dependencies (does it add or upgrade a dependency): no
- The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no
- The serializers: no
- The runtime per-record code paths (performance sensitive): no
- Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
- The S3 file system connector: no

## Documentation
- Does this pull request introduce a new feature? no
- If yes, how is the feature documented? not applicable
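One plausible way to bound such a store by capacity and age is a Guava cache (shaded inside Flink), which the existing file-based store already uses for its size/expiration bookkeeping; whether the PR's `MemoryExecutionGraphInfoStore` is built exactly this way is not shown here, and `maximumCapacity`/`expirationTimeMillis` are assumed configuration values.

```java
import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;
import java.util.concurrent.TimeUnit;

// Sketch: entries are evicted once the capacity is exceeded or once they have
// been stored longer than the configured expiration time.
Cache<JobID, ExecutionGraphInfo> storedGraphs = CacheBuilder.newBuilder()
        .maximumSize(maximumCapacity)
        .expireAfterWrite(expirationTimeMillis, TimeUnit.MILLISECONDS)
        .build();
```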
[jira] [Commented] (FLINK-25584) Azure failed on install node and npm for Flink : Runtime web
[ https://issues.apache.org/jira/browse/FLINK-25584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17475937#comment-17475937 ] Martijn Visser commented on FLINK-25584:

+1

> Azure failed on install node and npm for Flink : Runtime web
>
> Key: FLINK-25584
> URL: https://issues.apache.org/jira/browse/FLINK-25584
> Project: Flink
> Issue Type: Bug
> Components: Build System / Azure Pipelines, Runtime / Web Frontend
> Affects Versions: 1.15.0, 1.14.2
> Reporter: Yun Gao
> Priority: Critical
> Labels: test-stability
>
> {code:java}
> [INFO] --- frontend-maven-plugin:1.11.0:install-node-and-npm (install node and npm) @ flink-runtime-web ---
> [INFO] Installing node version v12.14.1
> [INFO] Downloading https://nodejs.org/dist/v12.14.1/node-v12.14.1-linux-x64.tar.gz to /__w/1/.m2/repository/com/github/eirslett/node/12.14.1/node-12.14.1-linux-x64.tar.gz
> [INFO] No proxies configured
> [INFO] No proxy was configured, downloading directly
> [INFO]
> [INFO] Reactor Summary:
>
> [ERROR] Failed to execute goal com.github.eirslett:frontend-maven-plugin:1.11.0:install-node-and-npm (install node and npm) on project flink-runtime-web: Could not download Node.js: Could not download https://nodejs.org/dist/v12.14.1/node-v12.14.1-linux-x64.tar.gz: Remote host terminated the handshake: SSL peer shut down incorrectly -> [Help 1]
> [ERROR]
> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> [ERROR]
> [ERROR] For more information about the errors and possible solutions, please read the following articles:
> [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
> [ERROR]
> [ERROR] After correcting the problems, you can resume the build with the command
> [ERROR] mvn -rf :flink-runtime-web
> {code}
> (The reactor summary is omitted.)
[GitHub] [flink] MartijnVisser commented on pull request #18257: [FLINK-25504][Kafka] Upgrade Kafka Client to 2.8.1 and Confluent Platform to 6.2.2
MartijnVisser commented on pull request #18257: URL: https://github.com/apache/flink/pull/18257#issuecomment-1012779888

NB: Flinkbot reports the CI as still `PENDING`, but it has completed successfully.
[GitHub] [flink] MartijnVisser commented on pull request #17711: [FLINK-24737][runtime-web] Update outdated web dependencies
MartijnVisser commented on pull request #17711: URL: https://github.com/apache/flink/pull/17711#issuecomment-1012774822

@yangjunhan I'm not sure if it's better to make it a separate PR (since we also have https://issues.apache.org/jira/browse/FLINK-24768 open for this) or if we should do it in a separate commit. Let's see what @Airblader says on that. It would be great to have this update in place!
[GitHub] [flink] yangjunhan commented on pull request #17711: [FLINK-24737][runtime-web] Update outdated web dependencies
yangjunhan commented on pull request #17711: URL: https://github.com/apache/flink/pull/17711#issuecomment-1012727601

> @yangjunhan I apologize for the late response; I was on vacation over the holidays.
>
> The changes overall look fine to me, except for the open point around the linting output. I think the solution you proposed sounds good; at least I can't think of anything better either.

@Airblader Hope you enjoyed your holidays. Since NG-ZORRO-antd recently released its new version, I am considering bumping Runtime-Web directly to Angular v13 and NG-ZORRO-antd v13 in this PR, if that is ok with you and @MartijnVisser.
[GitHub] [flink] flinkbot commented on pull request #18359: [FLINK-25484][connectors/filesystem] Support inactivityInterval config in table api
flinkbot commented on pull request #18359: URL: https://github.com/apache/flink/pull/18359#issuecomment-1012727099

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review.

## Automated Checks
Last check on commit 111d05bc4d409c357223c238e67bcb34880a1951 (Fri Jan 14 03:53:33 UTC 2022)

**Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up to date!

Mention the bot in a comment to re-run the automated checks.

## Review Progress
* ❓ 1. The [description] looks good.
* ❓ 2. There is [consensus] that the contribution should go into Flink.
* ❓ 3. Needs [attention] from.
* ❓ 4. The change fits into the overall [architecture].
* ❓ 5. Overall code [quality] is good.

Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required.
[jira] [Updated] (FLINK-25484) TableRollingPolicy does not support the inactivityInterval config which is supported in the DataStream API
[ https://issues.apache.org/jira/browse/FLINK-25484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-25484:

Labels: pull-request-available (was: )

> TableRollingPolicy does not support the inactivityInterval config which is supported in the DataStream API
>
> Key: FLINK-25484
> URL: https://issues.apache.org/jira/browse/FLINK-25484
> Project: Flink
> Issue Type: Improvement
> Components: Connectors / FileSystem, Table SQL / Ecosystem
> Affects Versions: 1.15.0
> Reporter: LiChang
> Assignee: LiChang
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.15.0
> Attachments: image-2022-01-13-11-36-24-102.png
>
> TableRollingPolicy does not support an inactivityInterval config:
>
> public static class TableRollingPolicy extends CheckpointRollingPolicy {
>
>     private final boolean rollOnCheckpoint;
>     private final long rollingFileSize;
>     private final long rollingTimeInterval;
>
>     public TableRollingPolicy(
>             boolean rollOnCheckpoint, long rollingFileSize, long rollingTimeInterval) {
>         this.rollOnCheckpoint = rollOnCheckpoint;
>         Preconditions.checkArgument(rollingFileSize > 0L);
>         Preconditions.checkArgument(rollingTimeInterval > 0L);
>         this.rollingFileSize = rollingFileSize;
>         this.rollingTimeInterval = rollingTimeInterval;
>     }
[GitHub] [flink] lichang-bd opened a new pull request #18359: [FLINK-25484][connectors/filesystem] Support inactivityInterval config in table api #18344
lichang-bd opened a new pull request #18359: URL: https://github.com/apache/flink/pull/18359

Change-Id: Ica8ba43307476198c4f525ac40a54ab862b58ea7

## What is the purpose of the change
Support the inactivityInterval config in the Table API.

## Brief change log
* Add SINK_ROLLING_POLICY_INACTIVITY_INTERVAL table config
* Support the file inactivity check when rolling, which is already supported in the DataStream API

## Verifying this change
This change is a trivial rework / code cleanup without any test coverage.

## Documentation
- Does this pull request introduce a new feature? no
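The inactivity check itself is small; here is a sketch against the DataStream sink's `RollingPolicy`/`PartFileInfo` API, where the `inactivityInterval` field name is an assumption for the value read from the new table config.

```java
@Override
public boolean shouldRollOnProcessingTime(PartFileInfo<String> partFileState, long currentTime) {
    // Roll when the part file is too old overall, or when no records have been
    // written to it for longer than the configured inactivity interval.
    return currentTime - partFileState.getCreationTime() >= rollingTimeInterval
            || currentTime - partFileState.getLastUpdateTime() >= inactivityInterval;
}
```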
[jira] [Assigned] (FLINK-25312) Hive supports managed table
[ https://issues.apache.org/jira/browse/FLINK-25312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee reassigned FLINK-25312:

Assignee: Jane Chan

> Hive supports managed table
>
> Key: FLINK-25312
> URL: https://issues.apache.org/jira/browse/FLINK-25312
> Project: Flink
> Issue Type: Sub-task
> Components: Connectors / Hive
> Reporter: Jingsong Lee
> Assignee: Jane Chan
> Priority: Major
> Fix For: 1.15.0
[jira] [Commented] (FLINK-25312) Hive supports managed table
[ https://issues.apache.org/jira/browse/FLINK-25312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17475921#comment-17475921 ] Jingsong Lee commented on FLINK-25312:

[~qingyue] Thanks!

> Hive supports managed table
>
> Key: FLINK-25312
> URL: https://issues.apache.org/jira/browse/FLINK-25312
> Project: Flink
> Issue Type: Sub-task
> Components: Connectors / Hive
> Reporter: Jingsong Lee
> Assignee: Jane Chan
> Priority: Major
> Fix For: 1.15.0
[jira] [Created] (FLINK-25651) Flink 1.14.2 DataStream Connectors Kafka Deserializer example method uses the wrong parameter
shouzuo meng created FLINK-25651:
Summary: Flink 1.14.2 DataStream Connectors Kafka Deserializer example method uses the wrong parameter
Key: FLINK-25651
URL: https://issues.apache.org/jira/browse/FLINK-25651
Project: Flink
Issue Type: Bug
Components: Connectors / Kafka
Affects Versions: 1.14.2
Reporter: shouzuo meng
Attachments: deserializer.png

In the official documentation, the DataStream Connectors Kafka Deserializer section that introduces KafkaRecordDeserializationSchema.valueOnly uses the wrong parameter.
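For comparison, a usage that compiles against the 1.14 connector API (the broker and topic names are placeholders); which variant the documentation example should show is exactly what this ticket is about.

{code:java}
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;

// Value-only deserialization via a Flink DeserializationSchema.
KafkaSource<String> source = KafkaSource.<String>builder()
        .setBootstrapServers("broker:9092")
        .setTopics("input-topic")
        .setValueOnlyDeserializer(new SimpleStringSchema())
        .build();
// The static factory variant wraps a Kafka Deserializer class instead, e.g.
// KafkaRecordDeserializationSchema.valueOnly(StringDeserializer.class)
{code}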
[GitHub] [flink] flinkbot commented on pull request #18358: Update kafka.md
flinkbot commented on pull request #18358: URL: https://github.com/apache/flink/pull/18358#issuecomment-1012712193

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review.

## Automated Checks
Last check on commit f081bab4fe89823da05cbad2f0be0ebae668e310 (Fri Jan 14 03:17:11 UTC 2022)

**Warnings:**
* Documentation files were touched, but no `docs/content.zh/` files: Update Chinese documentation or file Jira ticket.
* **Invalid pull request title: No valid Jira ID provided**

Mention the bot in a comment to re-run the automated checks.

## Review Progress
* ❓ 1. The [description] looks good.
* ❓ 2. There is [consensus] that the contribution should go into Flink.
* ❓ 3. Needs [attention] from.
* ❓ 4. The change fits into the overall [architecture].
* ❓ 5. Overall code [quality] is good.

Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required.
[GitHub] [flink] jelly-1203 opened a new pull request #18358: Update kafka.md
jelly-1203 opened a new pull request #18358: URL: https://github.com/apache/flink/pull/18358

## What is the purpose of the change
Modify the parameters of the `KafkaRecordDeserializationSchema.valueOnly` method in the Kafka documentation example.
[jira] [Commented] (FLINK-25634) flink-read-onyarn-configuration
[ https://issues.apache.org/jira/browse/FLINK-25634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17475910#comment-17475910 ] 宇宙先生 commented on FLINK-25634:

Thanks for the review. I can't.

> flink-read-onyarn-configuration
>
> Key: FLINK-25634
> URL: https://issues.apache.org/jira/browse/FLINK-25634
> Project: Flink
> Issue Type: Bug
> Affects Versions: 1.11.2
> Reporter: 宇宙先生
> Priority: Major
> Attachments: image-2022-01-13-10-02-57-803.png, image-2022-01-13-10-03-28-230.png, image-2022-01-13-10-07-02-908.png, image-2022-01-13-10-09-36-890.png, image-2022-01-13-10-10-44-945.png, image-2022-01-13-10-14-18-918.png, image-2022-01-13-14-19-43-539.png, image-2022-01-13-16-33-24-459.png, image-2022-01-13-16-33-46-046.png, image-2022-01-13-16-35-37-432.png, image-2022-01-13-19-17-01-097.png, image-2022-01-13-19-17-10-731.png, image-2022-01-13-19-17-15-511.png
>
> In the Flink source code:
> !image-2022-01-13-10-14-18-918.png!
> Set the number of retries for failed YARN ApplicationMasters/JobManagers in high availability mode. This value is usually limited by YARN. By default, it's 1 in the standalone case and 2 in the high availability case.
> In my cluster, the number of retries for failed YARN ApplicationMasters is 2; YARN's configuration also looks like this:
> !image-2022-01-13-10-07-02-908.png!
> But it keeps restarting when my task fails:
> !image-2022-01-13-10-10-44-945.png!
> I would like to know why the configuration is not taking effect. Sincere thanks!
[GitHub] [flink] lichang-bd closed pull request #18357: Flink 25484
lichang-bd closed pull request #18357: URL: https://github.com/apache/flink/pull/18357 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot commented on pull request #18357: Flink 25484
flinkbot commented on pull request #18357: URL: https://github.com/apache/flink/pull/18357#issuecomment-1012700380 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit 111d05bc4d409c357223c238e67bcb34880a1951 (Fri Jan 14 02:46:36 UTC 2022) **Warnings:** * No documentation files were touched! Remember to keep the Flink docs up to date! * **Invalid pull request title: No valid Jira ID provided** Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] lichang-bd opened a new pull request #18357: Flink 25484
lichang-bd opened a new pull request #18357: URL: https://github.com/apache/flink/pull/18357 Change-Id: Ica8ba43307476198c4f525ac40a54ab862b58ea7 ## What is the purpose of the change Support the inactivityInterval config in the Table API ## Brief change log - *Add SINK_ROLLING_POLICY_INACTIVITY_INTERVAL table config* - *Support the file-inactivity check when rolling, which is already supported in the DataStream API* ## Verifying this change This change is a trivial rework / code cleanup without any test coverage. ## Documentation - Does this pull request introduce a new feature? (no) - If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
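For reference, a sketch of how such an option would be used from SQL once wired into the filesystem connector. The key below mirrors the existing `sink.rolling-policy.*` options; the exact option name ultimately depends on this PR, and the table definition is invented for the example:

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class RollingPolicyExample {
    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(
                        EnvironmentSettings.newInstance().inStreamingMode().build());

        // 'sink.rolling-policy.inactivity-interval' is the option this PR adds:
        // part files that received no records for the given interval are rolled
        // on the next check, in addition to the existing file-size and
        // rollover-interval policies.
        tEnv.executeSql(
                "CREATE TABLE fs_sink (\n"
                        + "  id BIGINT,\n"
                        + "  name STRING\n"
                        + ") WITH (\n"
                        + "  'connector' = 'filesystem',\n"
                        + "  'path' = 'file:///tmp/fs-sink',\n"
                        + "  'format' = 'json',\n"
                        + "  'sink.rolling-policy.file-size' = '128MB',\n"
                        + "  'sink.rolling-policy.rollover-interval' = '30 min',\n"
                        + "  'sink.rolling-policy.inactivity-interval' = '5 min'\n"
                        + ")");
    }
}
```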
[GitHub] [flink] lichang-bd closed pull request #18344: [Flink-25484][Filesystem] Support inactivityInterval config in table api
lichang-bd closed pull request #18344: URL: https://github.com/apache/flink/pull/18344 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-25329) Improvement of execution graph store in flink session cluster for jobs
[ https://issues.apache.org/jira/browse/FLINK-25329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shammon updated FLINK-25329: Description: Flink session cluster uses files to store info of jobs after they reach termination with `FileExecutionGraphInfoStore`, each job will generate one file. When the cluster executes many small jobs concurrently, there will be many disk related operations, which will 1> Increase the CPU usage of `Dispatcher` 2> Decrease the performance of the jobs in the cluster. We hope to improve the disk operations in `FileExecutionGraphInfoStore` to increase the performance of session cluster, or support memory store. was: Flink session cluster uses files to store info of jobs after they reach termination with `FileExecutionGraphInfoStore`, each job will generate one file. When the cluster executes many small jobs concurrently, there will be many disk related operations, which will 1> Increase the CPU usage of `Dispatcher` 2> Decrease the performance of the jobs in the cluster. We hope to improve the disk operations in `FileExecutionGraphInfoStore` to increase the performance of session cluster. > Improvement of execution graph store in flink session cluster for jobs > -- > > Key: FLINK-25329 > URL: https://issues.apache.org/jira/browse/FLINK-25329 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination >Affects Versions: 1.14.0, 1.12.5, 1.13.3 >Reporter: Shammon >Priority: Major > > Flink session cluster uses files to store info of jobs after they reach > termination with `FileExecutionGraphInfoStore`, each job will generate one > file. When the cluster executes many small jobs concurrently, there will be > many disk related operations, which will > 1> Increase the CPU usage of `Dispatcher` > 2> Decrease the performance of the jobs in the cluster. > We hope to improve the disk operations in `FileExecutionGraphInfoStore` to > increase the performance of session cluster, or support memory store. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (FLINK-25329) Improvement of execution graph store in flink session cluster for jobs
[ https://issues.apache.org/jira/browse/FLINK-25329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shammon updated FLINK-25329: Summary: Improvement of execution graph store in flink session cluster for jobs (was: Improvement of disk operations in flink session cluster for jobs) > Improvement of execution graph store in flink session cluster for jobs > -- > > Key: FLINK-25329 > URL: https://issues.apache.org/jira/browse/FLINK-25329 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination >Affects Versions: 1.14.0, 1.12.5, 1.13.3 >Reporter: Shammon >Priority: Major > > Flink session cluster uses files to store info of jobs after they reach > termination with `FileExecutionGraphInfoStore`, each job will generate one > file. When the cluster executes many small jobs concurrently, there will be > many disk related operations, which will > 1> Increase the CPU usage of `Dispatcher` > 2> Decrease the performance of the jobs in the cluster. > We hope to improve the disk operations in `FileExecutionGraphInfoStore` to > increase the performance of session cluster. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[GitHub] [flink] liuyongvs commented on pull request #17777: [FLINK-24886][core] TimeUtils supports the form of m.
liuyongvs commented on pull request #17777: URL: https://github.com/apache/flink/pull/17777#issuecomment-1012689676 Hi @tillrohrmann @Thesharing, do you have time to take a look? Sorry to @ you. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-25614) Let LocalWindowAggregate be chained with upstream
[ https://issues.apache.org/jira/browse/FLINK-25614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17475897#comment-17475897 ] Q Kang commented on FLINK-25614: Hi [~wenlong.lwl], I have fixed `SlicingWindowAggOperatorTest`, but there are still many other places that are heavily affected by this change. Maybe we should consult [~jark] & [~jingzhang] for details about the reason to use Long instead of TimestampData. > Let LocalWindowAggregate be chained with upstream > - > > Key: FLINK-25614 > URL: https://issues.apache.org/jira/browse/FLINK-25614 > Project: Flink > Issue Type: Improvement > Components: Table SQL / Runtime >Affects Versions: 1.14.2 >Reporter: Q Kang >Priority: Minor > Labels: pull-request-available > > When enabling the two-phase aggregation (local-global) strategy for window TVFs, > the physical plan is shown as follows: > {code:java} > TableSourceScan -> Calc -> WatermarkAssigner -> Calc > || > || [FORWARD] > || > LocalWindowAggregate > || > || [HASH] > || > GlobalWindowAggregate > || > || > ...{code} > We can let the `LocalWindowAggregate` node be chained with upstream operators > in order to improve efficiency, just like its non-windowing counterpart > `LocalGroupAggregate`. > -- This message was sent by Atlassian Jira (v8.20.1#820001)
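For readers who want to reproduce the plan above: two-phase (local/global) aggregation can be requested through the optimizer configuration. A sketch, assuming a registered table `orders` with a watermarked timestamp column `ts` (both names invented for the example):

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class TwoPhaseWindowAgg {
    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(
                        EnvironmentSettings.newInstance().inStreamingMode().build());

        // Prefer the two-phase (local/global) aggregation strategy, which
        // produces the LocalWindowAggregate -> GlobalWindowAggregate plan
        // discussed in this ticket.
        tEnv.getConfig()
                .getConfiguration()
                .setString("table.optimizer.agg-phase-strategy", "TWO_PHASE");

        // EXPLAIN shows whether LocalWindowAggregate ends up chained with the
        // upstream Calc/WatermarkAssigner operators.
        tEnv.executeSql(
                "EXPLAIN SELECT window_start, window_end, COUNT(*)\n"
                        + "FROM TABLE(TUMBLE(TABLE orders, DESCRIPTOR(ts), INTERVAL '1' MINUTES))\n"
                        + "GROUP BY window_start, window_end")
                .print();
    }
}
```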
[jira] [Commented] (FLINK-25312) Hive supports managed table
[ https://issues.apache.org/jira/browse/FLINK-25312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17475895#comment-17475895 ] Jane Chan commented on FLINK-25312: --- Hi [~lzljs3620320], I'm interested in this subtask; would you mind assigning it to me? > Hive supports managed table > --- > > Key: FLINK-25312 > URL: https://issues.apache.org/jira/browse/FLINK-25312 > Project: Flink > Issue Type: Sub-task > Components: Connectors / Hive >Reporter: Jingsong Lee >Priority: Major > Fix For: 1.15.0 > > -- This message was sent by Atlassian Jira (v8.20.1#820001)
[GitHub] [flink] xintongsong commented on pull request #15599: [FLINK-11838][flink-gs-fs-hadoop] Create Google Storage file system with recoverable writer support
xintongsong commented on pull request #15599: URL: https://github.com/apache/flink/pull/15599#issuecomment-1012679021 @galenwarren, The commit message doesn't need to be too detailed, as more details can be found via the JIRA id. Something that briefly describes the major changes would be good enough. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] zjureel commented on pull request #18303: [FLINK-25085][runtime] Add a scheduled thread pool for periodic tasks in RpcEndpoint
zjureel commented on pull request #18303: URL: https://github.com/apache/flink/pull/18303#issuecomment-1012656102 @flinkbot run azure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] JingGe commented on pull request #18330: [FLINK-25570][streaming] Add topology extension points to Sink V2
JingGe commented on pull request #18330: URL: https://github.com/apache/flink/pull/18330#issuecomment-1012608343 Thanks for your contributions! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] JingGe commented on a change in pull request #18330: [FLINK-25570][streaming] Add topology extension points to Sink V2
JingGe commented on a change in pull request #18330: URL: https://github.com/apache/flink/pull/18330#discussion_r784393519 ## File path: flink-streaming-java/src/main/java/org/apache/flink/streaming/api/connector/sink2/CommittableSummary.java ## @@ -0,0 +1,87 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.streaming.api.connector.sink2; + +import org.apache.flink.annotation.Experimental; + +import javax.annotation.Nullable; + +import java.util.OptionalLong; + +/** + * This class tracks the information about committables belonging to one checkpoint coming from one + * subtask. + * + * <p>It is sent to down-stream consumers to depict the progress of the committing process. + * + * @param <CommT> type of the committable + */ +@Experimental +public class CommittableSummary<CommT> implements CommittableMessage<CommT> { Review comment: It seems that `<CommT>` is unused. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] JingGe commented on a change in pull request #18330: [FLINK-25570][streaming] Add topology extension points to Sink V2
JingGe commented on a change in pull request #18330: URL: https://github.com/apache/flink/pull/18330#discussion_r784386592 ## File path: flink-streaming-java/src/main/java/org/apache/flink/streaming/api/connector/sink2/CommittableMessageTypeInfo.java ## @@ -0,0 +1,166 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.streaming.api.connector.sink2; + +import org.apache.flink.annotation.Experimental; +import org.apache.flink.api.common.ExecutionConfig; +import org.apache.flink.api.common.typeinfo.TypeInformation; +import org.apache.flink.api.common.typeutils.TypeSerializer; +import org.apache.flink.api.connector.sink2.Committer; +import org.apache.flink.api.connector.sink2.SinkWriter; +import org.apache.flink.core.io.SimpleVersionedSerializer; +import org.apache.flink.core.io.SimpleVersionedSerializerTypeSerializerProxy; +import org.apache.flink.util.function.SerializableSupplier; + +import java.util.Objects; + +/** The message send from {@link SinkWriter} to {@link Committer}. */ +@Experimental +public class CommittableMessageTypeInfo<CommT> extends TypeInformation<CommittableMessage<CommT>> { + +private final SerializableSupplier<SimpleVersionedSerializer<CommT>> +committableSerializerFactory; + +private CommittableMessageTypeInfo( +SerializableSupplier<SimpleVersionedSerializer<CommT>> committableSerializerFactory) { +this.committableSerializerFactory = committableSerializerFactory; +} + +/** + * Returns the type information based on the serializer for a {@link CommittableMessage}. + * + * @param committableSerializerFactory factory to create the serializer for a {@link + * CommittableMessage} + * @param <CommT> type of the committable + * @return + */ +public static <CommT> +TypeInformation<CommittableMessage<CommT>> forCommittableSerializerFactory( Review comment: NIT: public static <CommT> TypeInformation<CommittableMessage<CommT>> for(SerializableSupplier<SimpleVersionedSerializer<CommT>> committableSerializerFactory) ## File path: flink-streaming-java/src/main/java/org/apache/flink/streaming/api/connector/sink2/CommittableSummary.java ## @@ -0,0 +1,87 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.streaming.api.connector.sink2; + +import org.apache.flink.annotation.Experimental; + +import javax.annotation.Nullable; + +import java.util.OptionalLong; + +/** + * This class tracks the information about committables belonging to one checkpoint coming from one + * subtask. + * + * <p>It is sent to down-stream consumers to depict the progress of the committing process. + * + * @param <CommT> type of the committable + */ +@Experimental +public class CommittableSummary<CommT> implements CommittableMessage<CommT> { Review comment: It seems that `<CommT>` is unused. ## File path: flink-streaming-java/src/main/java/org/apache/flink/streaming/api/connector/sink2/CommittableMessageTypeInfo.java ## @@ -0,0 +1,166 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed
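An aside on the review point above: a type parameter can be legitimately "unused" in a class body when it only carries compile-time information. A simplified sketch of that pattern (not the actual Flink classes; fields and names invented for illustration):

```java
// A "phantom" type parameter: CommT is stored in no field, but it lets the
// compiler track which committable type a message belongs to as messages
// flow between SinkWriter and Committer operators.
interface CommittableMessage<CommT> {}

final class CommittableSummary<CommT> implements CommittableMessage<CommT> {
    private final int subtaskId;     // which subtask produced the committables
    private final long checkpointId; // which checkpoint they belong to

    CommittableSummary(int subtaskId, long checkpointId) {
        this.subtaskId = subtaskId;
        this.checkpointId = checkpointId;
    }
}
```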
[jira] [Comment Edited] (FLINK-25646) Document buffer debloating issues with high parallelism
[ https://issues.apache.org/jira/browse/FLINK-25646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17475778#comment-17475778 ] Mason Chen edited comment on FLINK-25646 at 1/13/22, 10:01 PM: --- Hi [~akalashnikov], thanks for the documentation! Could you attach a reference to the benchmark? I'd like to take a look. Is the observation with unaligned and/or aligned checkpoints? was (Author: mason6345): Hi [~akalashnikov], thanks for the documentation! Could you attach a reference to the benchmark? I'd like to take a look > Document buffer debloating issues with high parallelism > --- > > Key: FLINK-25646 > URL: https://issues.apache.org/jira/browse/FLINK-25646 > Project: Flink > Issue Type: Improvement > Components: Documentation, Runtime / Network >Affects Versions: 1.14.0 >Reporter: Anton Kalashnikov >Assignee: Anton Kalashnikov >Priority: Major > Labels: pull-request-available > > According to recent benchmarks, there are some problems with buffer debloating > when a job has high parallelism. What counts as high parallelism differs from > job to job, but in general it is more than 200. So it makes sense to document > that problem and propose the solution - increasing the number of buffers. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (FLINK-25646) Document buffer debloating issues with high parallelism
[ https://issues.apache.org/jira/browse/FLINK-25646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17475778#comment-17475778 ] Mason Chen commented on FLINK-25646: Hi [~akalashnikov], thanks for the documentation! Could you attach a reference to the benchmark? I'd like to take a look. > Document buffer debloating issues with high parallelism > --- > > Key: FLINK-25646 > URL: https://issues.apache.org/jira/browse/FLINK-25646 > Project: Flink > Issue Type: Improvement > Components: Documentation, Runtime / Network >Affects Versions: 1.14.0 >Reporter: Anton Kalashnikov >Assignee: Anton Kalashnikov >Priority: Major > Labels: pull-request-available > > According to recent benchmarks, there are some problems with buffer debloating > when a job has high parallelism. What counts as high parallelism differs from > job to job, but in general it is more than 200. So it makes sense to document > that problem and propose the solution - increasing the number of buffers. -- This message was sent by Atlassian Jira (v8.20.1#820001)
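The workaround mentioned in the issue, increasing the number of buffers, maps to the network memory options. A hedged sketch of the relevant keys (values are illustrative; in a real cluster these belong in flink-conf.yaml, and whether more buffers help depends on the job):

```java
import org.apache.flink.configuration.Configuration;

public class BufferDebloatTuning {
    public static void main(String[] args) {
        Configuration conf = new Configuration();

        // Enable the buffer debloater (available since Flink 1.14).
        conf.setString("taskmanager.network.memory.buffer-debloat.enabled", "true");

        // With very high parallelism, give each channel/gate more buffers so
        // the debloater has room to shrink buffer sizes without starving
        // throughput (defaults: 2 exclusive buffers per channel, 8 floating
        // buffers per gate).
        conf.setString("taskmanager.network.memory.buffers-per-channel", "4");
        conf.setString("taskmanager.network.memory.floating-buffers-per-gate", "16");
    }
}
```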
[jira] [Commented] (FLINK-21896) GroupAggregateJsonPlanTest.testDistinctAggCalls fail
[ https://issues.apache.org/jira/browse/FLINK-21896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17475743#comment-17475743 ] Soumyajit Sahu commented on FLINK-21896: [~godfreyhe] [~twalthr] This would break upgrades from Flink 1.12 to 1.13, right? If serialization was done using Flink 1.12, and the app was then redeployed using Flink 1.13, the deserialization throws this error. To mitigate, we had to delete our app's state and start the Flink app afresh. java.io.InvalidClassException: org.apache.flink.table.types.logical.LogicalType; local class incompatible: stream classdesc serialVersionUID = -7381419642101800907, local class serialVersionUID = 1 at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:699) at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:2003) at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1850) at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:2003) at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1850) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2160) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1667) at java.io.ObjectInputStream.readObject(ObjectInputStream.java:503) at java.io.ObjectInputStream.readObject(ObjectInputStream.java:461) at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:615) at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:593) > GroupAggregateJsonPlanTest.testDistinctAggCalls fail > > > Key: FLINK-21896 > URL: https://issues.apache.org/jira/browse/FLINK-21896 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.13.0 >Reporter: Guowei Ma >Assignee: godfrey he >Priority: Major > Labels: pull-request-available, test-stability > Fix For: 1.13.0 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15083&view=logs&j=f66801b3-5d8b-58b4-03aa-cc67e0663d23&t=1abe556e-1530-599d-b2c7-b8c00d549e53&l=6364 > {code:java} > at > org.apache.flink.table.planner.plan.nodes.exec.stream.GroupAggregateJsonPlanTest.testDistinctAggCalls(GroupAggregateJsonPlanTest.java:148) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:566) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:239) > at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55) > at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48) > at org.junit.rules.RunRules.evaluate(RunRules.java:20) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) > at > 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) > at org.junit.runners.ParentRunner.run(ParentRunner.java:363) > at org.junit.runners.Suite.runChild(Suite.java:128) > at org.junit.runners.Suite.runChild(Suite.java:27) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) > {code} -- This message was sent by Atlassian Jira (v8.20.1#820001)
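The `InvalidClassException` above is standard Java serialization behavior: when a serializable class declares no explicit `serialVersionUID`, one is computed from the class shape, so any structural change between releases makes old bytes unreadable. A generic illustration (not Flink code; class and field invented for the example):

```java
import java.io.Serializable;

// If this class were serialized without an explicit serialVersionUID and a
// field or method were later added, the JVM-computed UID would change and
// reading the old bytes would fail with "local class incompatible". Pinning
// the UID (as Flink 1.13 did for LogicalType, setting it to 1) keeps
// compatible evolutions of the class readable.
public class MyState implements Serializable {
    private static final long serialVersionUID = 1L; // pinned explicitly

    private final String name;

    public MyState(String name) {
        this.name = name;
    }
}
```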
[GitHub] [flink] rkhachatryan commented on a change in pull request #18324: [FLINK-25557][checkpoint] Introduce incremental/full checkpoint size stats
rkhachatryan commented on a change in pull request #18324: URL: https://github.com/apache/flink/pull/18324#discussion_r784314617 ## File path: flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/SubtaskStateStats.java ## @@ -85,6 +88,7 @@ this.subtaskIndex = subtaskIndex; checkArgument(stateSize >= 0, "Negative state size"); this.stateSize = stateSize; +this.incrementalStateSize = incrementalStateSize; Review comment: Add `checkState` similar to `stateSize`? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
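For context, the suggestion amounts to guarding the new field the same way the existing `stateSize` field is guarded in the same constructor, using `checkArgument` as the surrounding code does. A minimal sketch of the resulting fragment (not the final patch):

```java
// Sketch only: mirror the existing stateSize guard for the new field.
checkArgument(stateSize >= 0, "Negative state size");
this.stateSize = stateSize;
checkArgument(incrementalStateSize >= 0, "Negative incremental state size");
this.incrementalStateSize = incrementalStateSize;
```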
[GitHub] [flink] rkhachatryan commented on a change in pull request #18324: [FLINK-25557][checkpoint] Introduce incremental/full checkpoint size stats
rkhachatryan commented on a change in pull request #18324: URL: https://github.com/apache/flink/pull/18324#discussion_r784312461 ## File path: flink-state-backends/flink-statebackend-changelog/src/main/java/org/apache/flink/state/changelog/ChangelogKeyedStateBackend.java ## @@ -368,19 +372,34 @@ public boolean deregisterKeySelectionListener(KeySelectionListener listener) // collections don't change once started and handles are immutable List prevDeltaCopy = new ArrayList<>(changelogStateBackendStateCopy.getRestoredNonMaterialized()); +long incrementalMaterializeSize = 0L; if (delta != null && delta.getStateSize() > 0) { prevDeltaCopy.add(delta); +incrementalMaterializeSize += delta.getIncrementalStateSize(); } if (prevDeltaCopy.isEmpty() && changelogStateBackendStateCopy.getMaterializedSnapshot().isEmpty()) { return SnapshotResult.empty(); } else { +List materializedSnapshot = +changelogStateBackendStateCopy.getMaterializedSnapshot(); +for (KeyedStateHandle keyedStateHandle : materializedSnapshot) { +if (!lastCompletedHandles.contains(keyedStateHandle)) { +incrementalMaterializeSize += keyedStateHandle.getStateSize(); Review comment: (ditto [1st comment](https://github.com/apache/flink/pull/18324#discussion_r784298983)): I think we should NOT count materialized state size in incremental state size -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] rkhachatryan commented on a change in pull request #18324: [FLINK-25557][checkpoint] Introduce incremental/full checkpoint size stats
rkhachatryan commented on a change in pull request #18324: URL: https://github.com/apache/flink/pull/18324#discussion_r784309007 ## File path: flink-runtime/src/main/java/org/apache/flink/runtime/state/StateObject.java ## @@ -63,4 +63,14 @@ * @return Size of the state in bytes. */ long getStateSize(); + +/** + * Returns the incremental state size in bytes. If the size is unknown, this method would return + * same result as {@link #getStateSize()}. + * + * @return Size of incremental state in bytes. + */ +default long getIncrementalStateSize() { +return getStateSize(); +} Review comment: 1. Conceptually, incremental state is only relevant to `CompositeStateHandle` (which defines `registerSharedStates` method). WDYT about moving this method there? 2. Then we could force non-default implementation 3. In javadoc, could you clarify what "incremental" means (as per [comment above](https://github.com/apache/flink/pull/18324#discussion_r784298983)) 4. In javadoc, could you clarify the relation to channel state? Or maybe in some other place, like `OperatorSubtaskState` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] rkhachatryan commented on a change in pull request #18324: [FLINK-25557][checkpoint] Introduce incremental/full checkpoint size stats
rkhachatryan commented on a change in pull request #18324: URL: https://github.com/apache/flink/pull/18324#discussion_r784305375 ## File path: flink-runtime/src/main/java/org/apache/flink/runtime/state/IncrementalRemoteKeyedStateHandle.java ## @@ -204,6 +203,21 @@ public long getStateSize() { return size; } +@Override +public long getIncrementalStateSize() { +long size = StateUtil.getStateSize(metaStateHandle); + +for (StreamStateHandle sharedStateHandle : sharedState.values()) { +size += sharedStateHandle.getIncrementalStateSize(); Review comment: I guess this only works because `PlaceholderStreamStateHandle.getIncrementalStateSize` returns `0`, right? But backend isn't requried to return placeholder IMO; in fact, it currently doesn't - without FLINK-25395/ #18297 (In the future, the latter PR quite likely will be reverted I think). WDYT about computing incremental state size externally (in `RocksIncrementalSnapshotStrategy`) and storing it in metadata? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] rkhachatryan commented on a change in pull request #18324: [FLINK-25557][checkpoint] Introduce incremental/full checkpoint size stats
rkhachatryan commented on a change in pull request #18324: URL: https://github.com/apache/flink/pull/18324#discussion_r784298983 ## File path: flink-runtime/src/main/java/org/apache/flink/runtime/state/changelog/ChangelogStateBackendHandle.java ## @@ -113,6 +124,19 @@ public long getStateSize() { + nonMaterialized.stream().mapToLong(StateObject::getStateSize).sum(); } +@Override +public long getIncrementalStateSize() { +long incrementalStateSize = +incrementalMaterializeSize == undefinedIncrementalMaterializeSize +? materialized.stream() + .mapToLong(StateObject::getIncrementalStateSize) +.sum() +: incrementalMaterializeSize; Review comment: Depending on how we define "incremental state size", materialized part should be included or not: 1. if it's everything that was uploaded for **this** checkpoint, then it should 1. if it's the difference from the previous checkpoint, it should **not** be included Right? It seems problematic to find out what exactly was uploaded for **this** checkpoint because multiple checkpoints will likely include the same materialized state, and therefore report the same incremental state multiple times. Besides that, the 2nd option seems more intuitive to me personally. WDYT? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
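To make the two candidate definitions concrete, a small worked example with invented numbers: suppose an incremental RocksDB checkpoint references 100 MiB of SST files already uploaded by earlier checkpoints and uploads 10 MiB of new files.

```java
long reusedSharedState = 100L * 1024 * 1024; // already uploaded by earlier checkpoints
long newlyUploaded = 10L * 1024 * 1024;      // written for this checkpoint only

// The full checkpoint size counts everything the checkpoint references:
long fullSize = reusedSharedState + newlyUploaded;   // 110 MiB

// Definition 1 ("uploaded for this checkpoint") would also re-count a fresh
// materialization in every later checkpoint that reuses it; definition 2
// ("difference from the previous checkpoint") reports just the delta:
long incrementalSize = newlyUploaded;                // 10 MiB
```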
[GitHub] [flink] rkhachatryan commented on a change in pull request #18324: [FLINK-25557][checkpoint] Introduce incremental/full checkpoint size stats
rkhachatryan commented on a change in pull request #18324: URL: https://github.com/apache/flink/pull/18324#discussion_r784301442 ## File path: flink-runtime/src/main/java/org/apache/flink/runtime/state/changelog/ChangelogStateBackendHandle.java ## @@ -50,17 +50,28 @@ class ChangelogStateBackendHandleImpl implements ChangelogStateBackendHandle { private static final long serialVersionUID = 1L; +private static final long undefinedIncrementalMaterializeSize = -1L; Review comment: Why don't we store this size in checkpoint metadata? So that we can get rid of unknown size and show the correct size after recovery? Nit: `UNDEFINED_INCREMENTAL_MATERIALIZE_SIZE` ? ## File path: flink-runtime-web/web-dashboard/src/app/pages/job/checkpoints/detail/job-checkpoints-detail.component.html ## @@ -45,12 +45,13 @@ > - + Name Acknowledged Latest Acknowledgment End to End Duration -Checkpointed Data Size +Incremental Checkpoint Data Size +Full Checkpoint Data Size Review comment: WDYT about adding a tooltip here and/or in other added tags? (maybe copying the javadoc). I think `title` attribute should work. ## File path: flink-state-backends/flink-statebackend-changelog/src/main/java/org/apache/flink/state/changelog/ChangelogKeyedStateBackend.java ## @@ -368,19 +372,34 @@ public boolean deregisterKeySelectionListener(KeySelectionListener listener) // collections don't change once started and handles are immutable List prevDeltaCopy = new ArrayList<>(changelogStateBackendStateCopy.getRestoredNonMaterialized()); +long incrementalMaterializeSize = 0L; if (delta != null && delta.getStateSize() > 0) { prevDeltaCopy.add(delta); +incrementalMaterializeSize += delta.getIncrementalStateSize(); Review comment: :+1: We should **not** count `prevDeltaCopy.getIncrementalStateSize()` in `incrementalMaterializeSize`. Could you add a comment that it's inttentional? ## File path: flink-runtime/src/main/java/org/apache/flink/runtime/state/changelog/ChangelogStateBackendHandle.java ## @@ -113,6 +124,19 @@ public long getStateSize() { + nonMaterialized.stream().mapToLong(StateObject::getStateSize).sum(); } +@Override +public long getIncrementalStateSize() { +long incrementalStateSize = +incrementalMaterializeSize == undefinedIncrementalMaterializeSize +? materialized.stream() + .mapToLong(StateObject::getIncrementalStateSize) +.sum() +: incrementalMaterializeSize; Review comment: Depending on how we define "incremental state size", materialized part should be included or not: 1. if it's everything that was uploaded for **this** checkpoint, then it should 1. if it's the difference from the previous checkpoint, it should **not** be included Right? And it's problematic to find out what exactly was uploaded for **this** checkpoint because multiple checkpoints will likely include the same materialized state, and therefore report the same incremental state multiple times. Besides that, the 2nd option seems more intuitive to me personally. WDYT? 
## File path: flink-state-backends/flink-statebackend-changelog/src/main/java/org/apache/flink/state/changelog/ChangelogKeyedStateBackend.java ## @@ -368,19 +372,34 @@ public boolean deregisterKeySelectionListener(KeySelectionListener listener) // collections don't change once started and handles are immutable List prevDeltaCopy = new ArrayList<>(changelogStateBackendStateCopy.getRestoredNonMaterialized()); +long incrementalMaterializeSize = 0L; if (delta != null && delta.getStateSize() > 0) { prevDeltaCopy.add(delta); +incrementalMaterializeSize += delta.getIncrementalStateSize(); } if (prevDeltaCopy.isEmpty() && changelogStateBackendStateCopy.getMaterializedSnapshot().isEmpty()) { return SnapshotResult.empty(); } else { +List materializedSnapshot = +changelogStateBackendStateCopy.getMaterializedSnapshot(); +for (KeyedStateHandle keyedStateHandle : materializedSnapshot) { +if (!lastCompletedHandles.contains(keyedStateHandle)) { +incrementalMaterializeSize += keyedStateHandle.getStateSize(); Review comment: (ditto 1st comment): I think we should NOT count materialized state size in incremental state size ## File path: flink-runtime/src/main/java/org/apache/flink/runtime/state/changelog/ChangelogState
[jira] [Commented] (FLINK-25505) Fix NetworkBufferPoolTest, SystemResourcesCounterTest on Apple M1
[ https://issues.apache.org/jira/browse/FLINK-25505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17475724#comment-17475724 ] Robert Metzger commented on FLINK-25505: To fix the SystemResourcesCounterTest, we need to bump oshi-core to version 5.5.0 (that's when the last fix mentioning M1 got in): https://github.com/oshi/oshi/blob/master/CHANGELOG.md This requires us to fix some system metrics code. > Fix NetworkBufferPoolTest, SystemResourcesCounterTest on Apple M1 > -- > > Key: FLINK-25505 > URL: https://issues.apache.org/jira/browse/FLINK-25505 > Project: Flink > Issue Type: Bug > Components: Runtime / Metrics, Runtime / Network >Affects Versions: 1.15.0 >Reporter: Robert Metzger >Priority: Major > > As discussed in https://issues.apache.org/jira/browse/FLINK-23230, some tests > in flink-runtime are not passing on M1 / Apple Silicon Macs. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[GitHub] [flink] galenwarren commented on pull request #15599: [FLINK-11838][flink-gs-fs-hadoop] Create Google Storage file system with recoverable writer support
galenwarren commented on pull request #15599: URL: https://github.com/apache/flink/pull/15599#issuecomment-1012485195 @xintongsong Sounds great! Many thanks for all your work and guidance on this project. Once we get the CI issues resolved, I'll squash everything so it will be ready to merge. You mentioned before using an appropriate commit message; would something based on the initial description in the PR work, i.e. > This PR adds RecoverableWriter support for Google Storage (GS), to allow writing to GS buckets from Flink's StreamingFileSink. To do this, we implement and register a FileSystem implementation for GS and then provide a RecoverableWriter implementation via createRecoverableWriter. ... or were you thinking of something more detailed? @zentol Thanks for your suggestions. I'm trying one other thing first -- when we first started this PR, the versions of Guava and some other dependencies in `flink-fs-hadoop-shaded` were older than what was required by some of the Google libraries we use; so, I had to exclude some dependencies in `flink-fs-hadoop-shaded` in favor of the Google-supplied ones. Since then, `flink-fs-hadoop-shaded` has been updated, and its versions seem fine. So I'm reversing things, excluding dependencies from `google-cloud-storage` and `gcs-connector` that are supplied by `flink-fs-hadoop-shaded`. This does seem to have reduced the number of different grpc versions in the dependency tree, and I'm curious if it will affect our CI issues. I'm waiting for the CI build to occur now. If this doesn't fix it, I'll follow through on your dependency-convergence suggestion. Last, I've excluded the `org.checkerframework:checker-compat-qual`, `org.checkerframework:checker-qual`, and `org.codehaus.mojo:animal-sniffer-annotations` libraries and removed their license files and entries in NOTICE. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] galenwarren commented on pull request #15599: [FLINK-11838][flink-gs-fs-hadoop] Create Google Storage file system with recoverable writer support
galenwarren commented on pull request #15599: URL: https://github.com/apache/flink/pull/15599#issuecomment-1012456967 @flinkbot run azure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] codecholeric commented on pull request #18333: [FLINK-25220][test] Write an architectural rule for all IT cases w.r.t. the MiniCluster
codecholeric commented on pull request #18333: URL: https://github.com/apache/flink/pull/18333#issuecomment-1012456359 Hey, I quickly looked over it (it's quite late for me already :wink:), but here are my 2 cents on why things probably differ (as you already guessed in the last comments): For the case of `KafkaShuffleITCase` or `FlinkKafkaInternalProducerITCase` the `test-jar` goal is defined like this (XML reconstructed from the flattened snippet; the tags were stripped in transit):
```
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-jar-plugin</artifactId>
  <executions>
    <execution>
      <goals>
        <goal>test-jar</goal>
      </goals>
      <configuration>
        <includes>
          <include>**/KafkaTestEnvironment*</include>
          <include>**/testutils/*</include>
          <include>META-INF/LICENSE</include>
          <include>META-INF/NOTICE</include>
        </includes>
      </configuration>
    </execution>
  </executions>
</plugin>
```
AFAIK IntelliJ doesn't support this fine-grained configuration. It emulates Maven, and if there is `test-jar` in one project and `test-jar` as a dependency in another, then it will simply dump all classes compiled from `src/test/java` (i.e. `target/test-classes`) onto the classpath. I think that's why `KafkaShuffleITCase` doesn't appear when you run it with Maven (I also checked the contents of the `test-jar` archive, and `KafkaShuffleITCase` is indeed not contained). I see 2 solutions: 1) do you really need this optimization to only include a little bit of those classes in the test jars? Most other modules seem to simply dump everything in the test jars, which makes it simpler, and the difference here should vanish (after all, test jars are not released, right? Thus, maybe it's not really relevant to optimize for a couple KB more or less?) 2) exclude those classes from the test, even though they are not tested with Maven (but then it also doesn't hurt) Theoretically there would be 3), don't go via classpath, but compile everything and then just scan the Flink root folder for `.class` files on the file system, but I wouldn't go this way if you can get around it. If there is any way, I would simply follow 1), because that's also the only way to really test all relevant classes in a downstream module. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] JingGe commented on a change in pull request #18322: [FLINK-25589][docs][connectors] Update Chinese version of Elasticsearch connector docs
JingGe commented on a change in pull request #18322: URL: https://github.com/apache/flink/pull/18322#discussion_r784252017 ## File path: docs/content.zh/docs/connectors/datastream/elasticsearch.md ## @@ -294,131 +210,163 @@ input.addSink(esSinkBuilder.build) ### Elasticsearch Sinks 和容错 -启用 Flink checkpoint 后,Flink Elasticsearch Sink 保证至少一次将操作请求发送到 Elasticsearch 集群。 +默认情况下,Flink Elasticsearch Sink 不会提供任何传递健壮性的保障。 +用户可以通过配置自行启用 Elasticsearch sink 的 at-least-once 语义。 Review comment: ```suggestion 用户可以选择启用 Elasticsearch sink 的 at-least-once 语义。 ``` ## File path: docs/content.zh/docs/connectors/datastream/elasticsearch.md ## @@ -40,16 +40,12 @@ under the License. - -5.x -{{< artifact flink-connector-elasticsearch5 >}} - 6.x Review comment: ```suggestion <= 6.3.1 ``` ## File path: docs/content.zh/docs/connectors/table/elasticsearch.md ## @@ -192,31 +177,24 @@ CREATE TABLE myUserTable ( sink.bulk-flush.backoff.max-retries 可选 - 8 + (none) Integer 最大回退重试次数。 sink.bulk-flush.backoff.delay 可选 - 50ms + (none) Duration 每次回退尝试之间的延迟。对于 CONSTANT 回退策略,该值是每次重试之间的延迟。对于 EXPONENTIAL 回退策略,该值是初始的延迟。 Review comment: ```suggestion 每次退避尝试之间的延迟。对于 CONSTANT 退避策略,该值是每次重试之间的延迟。对于 EXPONENTIAL 退避策略,该值是初始的延迟。 ``` ## File path: docs/content.zh/docs/connectors/datastream/elasticsearch.md ## @@ -40,16 +40,12 @@ under the License. - -5.x -{{< artifact flink-connector-elasticsearch5 >}} - 6.x {{< artifact flink-connector-elasticsearch6 >}} -7 及更高版本 +7.x Review comment: ```suggestion <= 7.5.1 ``` ## File path: docs/content.zh/docs/connectors/datastream/elasticsearch.md ## @@ -294,131 +210,163 @@ input.addSink(esSinkBuilder.build) ### Elasticsearch Sinks 和容错 -启用 Flink checkpoint 后,Flink Elasticsearch Sink 保证至少一次将操作请求发送到 Elasticsearch 集群。 +默认情况下,Flink Elasticsearch Sink 不会提供任何传递健壮性的保障。 +用户可以通过配置自行启用 Elasticsearch sink 的 at-least-once 语义。 + +启用 Flink checkpoint 后,Flink Elasticsearch Sink 可以保证至少一次将操作请求发送到 Elasticsearch 集群。 Review comment: ```suggestion 通过启用 Flink checkpoint ,Flink Elasticsearch Sink 可以保证至少一次将操作请求发送到 Elasticsearch 集群。 ``` ## File path: docs/content.zh/docs/connectors/datastream/elasticsearch.md ## @@ -294,131 +210,163 @@ input.addSink(esSinkBuilder.build) ### Elasticsearch Sinks 和容错 -启用 Flink checkpoint 后,Flink Elasticsearch Sink 保证至少一次将操作请求发送到 Elasticsearch 集群。 +默认情况下,Flink Elasticsearch Sink 不会提供任何传递健壮性的保障。 +用户可以通过配置自行启用 Elasticsearch sink 的 at-least-once 语义。 + +启用 Flink checkpoint 后,Flink Elasticsearch Sink 可以保证至少一次将操作请求发送到 Elasticsearch 集群。 这是通过在进行 checkpoint 时等待 `BulkProcessor` 中所有挂起的操作请求来实现。 这有效地保证了在触发 checkpoint 之前所有的请求被 Elasticsearch 成功确认,然后继续处理发送到 sink 的记录。 关于 checkpoint 和容错的更多详细信息,请参见[容错文档]({{< ref "docs/learn-flink/fault_tolerance" >}})。 -要使用具有容错特性的 Elasticsearch Sinks,需要在执行环境中启用作业拓扑的 checkpoint: +要使用具有容错特性的 Elasticsearch Sinks,需要配置启用 at-least-once 分发并且在执行环境中启用作业拓扑的 checkpoint: {{< tabs "d00d1e93-4844-40d7-b0ec-9ec37e73145e" >}} {{< tab "Java" >}} +Elasticsearch 6: ```java final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment(); env.enableCheckpointing(5000); // 每 5000 毫秒执行一次 checkpoint + +Elasticsearch6SinkBuilder sinkBuilder = new Elasticsearch6SinkBuilder() +.setDeliveryGuarantee(DeliveryGuarantee.AT_LEAST_ONCE) +.setHosts(new HttpHost("127.0.0.1", 9200, "http")) +.setEmitter( +(element, context, indexer) -> +indexer.add(createIndexRequest(element))); +``` + +Elasticsearch 7: +```java +final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment(); +env.enableCheckpointing(5000); // 每 5000 毫秒执行一次 checkpoint + +Elasticsearch7SinkBuilder 
sinkBuilder = new Elasticsearch7SinkBuilder() +.setDeliveryGuarantee(DeliveryGuarantee.AT_LEAST_ONCE) +.setHosts(new HttpHost("127.0.0.1", 9200, "http")) +.setEmitter( +(element, context, indexer) -> +indexer.add(createIndexRequest(element))); ``` {{< /tab >}} {{< tab "Scala" >}} +Elasticsearch 6: ```scala val env = StreamExecutionEnvironment.getExecutionEnvironment() env.enableCheckpointing(5000) // 每 5000 毫秒执行一次 checkpoint + +val sinkBuilder = new Elasticsearch6SinkBuilder[String] + .setDeliveryGuarantee(DeliveryGuarantee.AT_LEAST_ONCE) + .setHosts(new HttpHost("127.0.0.1", 9200, "http")) + .setEmitter((element: String, context: SinkWriter.Context, indexer: RequestIndexer) => + indexer.add(createIndexRequest(element))) +``` + +Elasticsearch 7: +```scala +val env = StreamExecutionEnvironment.getExecutionEnvironment() +env.enableCheckpointing(5000) // 每 5000 毫秒执行一次 checkpoint + +val sinkBuilder = new Elasticse
[GitHub] [flink] rkhachatryan commented on a change in pull request #18297: [FLINK-25395][state] Fix TM error handling when uploading shared state
rkhachatryan commented on a change in pull request #18297: URL: https://github.com/apache/flink/pull/18297#discussion_r784256807 ## File path: flink-runtime/src/main/java/org/apache/flink/runtime/state/changelog/ChangelogStateHandleStreamImpl.java ## @@ -83,11 +80,7 @@ public KeyedStateHandle getIntersection(KeyGroupRange keyGroupRange) { @Override public void discardState() throws Exception { -// discard only on TM; on JM, shared state is removed on subsumption -if (stateRegistry == null) { -bestEffortDiscardAllStateObjects( -handlesAndOffsets.stream().map(t2 -> t2.f0).collect(Collectors.toList())); -} +// do nothing: rely on SharedStateRegistry for cleanup after ACK; Review comment: The in-memory object will be GCed on checkpoint subsumption; The actual handles will be discarded by `SharedStateRegistry` when no longer used (they are registered in `registerSharedStates`). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Closed] (FLINK-25341) Improve casting STRUCTURED type to STRING
[ https://issues.apache.org/jira/browse/FLINK-25341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timo Walther closed FLINK-25341. Fix Version/s: 1.15.0 Resolution: Fixed Fixed in master: 0157aa3bf9cd7bce32cd4abb6fa7b6fdd1de2677 > Improve casting STRUCTURED type to STRING > - > > Key: FLINK-25341 > URL: https://issues.apache.org/jira/browse/FLINK-25341 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / Planner >Reporter: Timo Walther >Assignee: Francesco Guardiani >Priority: Major > Labels: pull-request-available > Fix For: 1.15.0 > > > Structured types currently use ROW to STRING logic. However, this is not very > useful for users, as the field order might be determined by Flink. Also, > structured types have the nice property of defining a custom {{toString}} and > attribute names. > I would suggest the following: > If the structured type has a {{StructuredType.getImplementationClass}}, > convert to the external class and call {{toString}}. > If no implementation class is present or the toString is not possible, use > the string representation of maps. -- This message was sent by Atlassian Jira (v8.20.1#820001)
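To illustrate the improvement with a hypothetical POJO (the class and values below are invented for the example):

```java
public class StructuredToStringExample {

    /** Hypothetical structured type with an implementation class and custom toString(). */
    public static class Money {
        public int amount;
        public String currency;

        public Money(int amount, String currency) {
            this.amount = amount;
            this.currency = currency;
        }

        @Override
        public String toString() {
            return amount + " " + currency;
        }
    }

    public static void main(String[] args) {
        // With the fix, CAST(moneyColumn AS STRING) in SQL can route through this
        // toString() and yield "100 EUR" instead of the ROW-style "(100, EUR)".
        System.out.println(new Money(100, "EUR"));
    }
}
```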
[GitHub] [flink] twalthr closed pull request #18182: [FLINK-25341][table-planner] Add StructuredToStringCastRule to support user POJO toString implementation
twalthr closed pull request #18182: URL: https://github.com/apache/flink/pull/18182 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] JingGe commented on a change in pull request #18333: [FLINK-25220][test] Write an architectural rule for all IT cases w.r.t. the MiniCluster
JingGe commented on a change in pull request #18333: URL: https://github.com/apache/flink/pull/18333#discussion_r784026310 ## File path: flink-architecture-tests/src/test/java/org/apache/flink/architecture/common/JavaFieldPredicates.java ## @@ -0,0 +1,129 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.architecture.common; + +import com.tngtech.archunit.base.DescribedPredicate; +import com.tngtech.archunit.core.domain.JavaAnnotation; +import com.tngtech.archunit.core.domain.JavaField; +import com.tngtech.archunit.core.domain.JavaModifier; + +import java.lang.annotation.Annotation; +import java.util.Arrays; + +/** Fine-grained predicates focus on the JavaField. */ +public class JavaFieldPredicates { + +/** + * Match the public modifier of the {@link JavaField}. + * + * @return A {@link DescribedPredicate} returning true, if and only if the tested {@link + * JavaField} has the public modifier. + */ +public static DescribedPredicate<JavaField> isPublic() { +return DescribedPredicate.describe( +"public", field -> field.getModifiers().contains(JavaModifier.PUBLIC)); +} + +/** + * Match the static modifier of the {@link JavaField}. + * + * @return A {@link DescribedPredicate} returning true, if and only if the tested {@link + * JavaField} has the static modifier. + */ +public static DescribedPredicate<JavaField> isStatic() { +return DescribedPredicate.describe( +"static", field -> field.getModifiers().contains(JavaModifier.STATIC)); +} + +/** + * Match none static modifier of the {@link JavaField}. + * + * @return A {@link DescribedPredicate} returning true, if and only if the tested {@link + * JavaField} has no static modifier. + */ +public static DescribedPredicate<JavaField> isNotStatic() { +return DescribedPredicate.describe( +"not static", field -> !field.getModifiers().contains(JavaModifier.STATIC)); +} + +/** + * Match the final modifier of the {@link JavaField}. + * + * @return A {@link DescribedPredicate} returning true, if and only if the tested {@link + * JavaField} has the final modifier. + */ +public static DescribedPredicate<JavaField> isFinal() { +return DescribedPredicate.describe( +"final", field -> field.getModifiers().contains(JavaModifier.FINAL)); +} Review comment: I think it is actually the other way around. There are too many combinations. Methods like `arePublicStaticFinalOfType` provide the convenience for the most common scenarios. These fine-grained methods will be used for other combinations, like how Lego works. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
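To make the "Lego" point concrete, a sketch of how the fine-grained predicates compose, assuming ArchUnit's standard `DescribedPredicate` combinators (`and`, `getDescription`) and the statically imported helpers from the diff above:

```java
import com.tngtech.archunit.base.DescribedPredicate;
import com.tngtech.archunit.core.domain.JavaField;

import static org.apache.flink.architecture.common.JavaFieldPredicates.isFinal;
import static org.apache.flink.architecture.common.JavaFieldPredicates.isPublic;
import static org.apache.flink.architecture.common.JavaFieldPredicates.isStatic;

public class PredicateCompositionSketch {
    public static void main(String[] args) {
        // The same combination a convenience method like arePublicStaticFinalOfType
        // would offer, assembled from the fine-grained building blocks:
        DescribedPredicate<JavaField> publicStaticFinal =
                isPublic().and(isStatic()).and(isFinal());
        System.out.println(publicStaticFinal.getDescription());
    }
}
```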
[GitHub] [flink] flinkbot commented on pull request #18356: [hotfix][flink-annotations] Add v1.15 as the next Flink version to master
flinkbot commented on pull request #18356: URL: https://github.com/apache/flink/pull/18356#issuecomment-1012346346 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit a10aee84fae72f74cb17b2b9f736b8d04d6e6200 (Thu Jan 13 17:22:12 UTC 2022) **Warnings:** * No documentation files were touched! Remember to keep the Flink docs up to date! Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] matriv opened a new pull request #18356: [hotfix][flink-annotations] Add v1.15 as the next Flink version to master
matriv opened a new pull request #18356: URL: https://github.com/apache/flink/pull/18356 The upcoming version to be released needs to already exist in master, so that when the new release branch is created from master, the version of the release is already there. Follows: #18340 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] twalthr commented on a change in pull request #18352: [FLINK-15585][table] Improve function identifier string in plan digest
twalthr commented on a change in pull request #18352: URL: https://github.com/apache/flink/pull/18352#discussion_r784160659 ## File path: flink-table/flink-table-common/src/test/java/org/apache/flink/table/functions/UserDefinedFunctionHelperTest.java ## @@ -143,34 +162,27 @@ private void testErrorMessage() { @Nullable String expectedErrorMessage; -TestSpec(Class functionClass) { +TestSpec( +@Nullable Class functionClass, +@Nullable UserDefinedFunction functionInstance, +@Nullable CatalogFunction catalogFunction) { this.functionClass = functionClass; -this.functionInstance = null; -this.catalogFunction = null; -} - -TestSpec(UserDefinedFunction functionInstance) { -this.functionClass = null; this.functionInstance = functionInstance; -this.catalogFunction = null; -} - -TestSpec(CatalogFunction catalogFunction) { -this.functionClass = null; -this.functionInstance = null; this.catalogFunction = catalogFunction; } static TestSpec forClass(Class function) { -return new TestSpec(function); +return new TestSpec(function, null, null); } static TestSpec forInstance(UserDefinedFunction function) { -return new TestSpec(function); +return new TestSpec(null, function, null); } -static TestSpec forCatalogFunction(String className, FunctionLanguage language) { -return new TestSpec(new CatalogFunctionMock(className, language)); +static TestSpec forCatalogFunction( +String className, +@SuppressWarnings("SameParameterValue") FunctionLanguage language) { Review comment: You are right. I will simply always choose Java for now. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] twalthr commented on a change in pull request #18352: [FLINK-15585][table] Improve function identifier string in plan digest
twalthr commented on a change in pull request #18352: URL: https://github.com/apache/flink/pull/18352#discussion_r784159595 ## File path: flink-table/flink-table-common/src/main/java/org/apache/flink/table/functions/UserDefinedFunctionHelper.java ## @@ -243,6 +244,36 @@ public static void prepareInstance(ReadableConfig config, UserDefinedFunction fu cleanFunction(config, function); } +/** + * Returns whether a {@link UserDefinedFunction} can be easily serialized and identified by only + * a fully qualified class name. It must have a default constructor and no serializable fields. + */ +public static boolean isClassNameSerializable(UserDefinedFunction function) { +final Class<?> functionClass = function.getClass(); +if (!InstantiationUtil.hasPublicNullaryConstructor(functionClass)) { +// function must be parameterized +return false; +} +Class<?> currentClass = functionClass; +while (!currentClass.equals(UserDefinedFunction.class)) { +for (Field field : currentClass.getDeclaredFields()) { +if (!Modifier.isTransient(field.getModifiers()) +&& !Modifier.isStatic(field.getModifiers())) { +// function seems to be stateful +return false; +} +} +currentClass = currentClass.getSuperclass(); +} Review comment: Even with a marker interface you need to perform this check. User-defined functions might come in various flavors. And the more interfaces, the harder it gets to get the job done. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
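As a hedged illustration of what this reflective check accepts and rejects (the class names below are made up for the example, not taken from the PR):

```java
import org.apache.flink.table.functions.ScalarFunction;

// Accepted by the check: public nullary constructor and no non-transient,
// non-static instance fields, so the fully qualified class name alone is
// enough to identify and re-instantiate the function.
public class UpperCaseFunction extends ScalarFunction {
    public String eval(String s) {
        return s == null ? null : s.toUpperCase();
    }
}

// Rejected: the instance field makes the function parameterized/stateful,
// so a class name alone cannot reconstruct an equivalent instance.
class PrefixFunction extends ScalarFunction {
    private final String prefix;

    public PrefixFunction(String prefix) {
        this.prefix = prefix;
    }

    public String eval(String s) {
        return prefix + s;
    }
}
```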
[GitHub] [flink] echauchot commented on pull request #18284: [FLINK-21407][docs][formats] Remove MongoDb connector as it is not supported in DataStream API
echauchot commented on pull request #18284: URL: https://github.com/apache/flink/pull/18284#issuecomment-1012320859 > Sorry again, but do we expect more changes to come from https://issues.apache.org/jira/browse/FLINK-21407? If so, we should create subtasks for the different connectors and formats. I think we should start with that right away, because now we have multiple PRs coming from the same ticket to the same branch. Hi @fapaul, I had the same question: we don't expect other changes, as @MartijnVisser answered above; there are no other connectors to clean. There were indeed 2 PRs on each branch (1.13, 1.14, master): the first one adds the formats and the second one removes MongoDb, which was added by mistake. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] gaoyunhaii commented on a change in pull request #18330: [FLINK-25570][streaming] Add topology extension points to Sink V2
gaoyunhaii commented on a change in pull request #18330: URL: https://github.com/apache/flink/pull/18330#discussion_r784116528 ## File path: flink-streaming-java/src/main/java/org/apache/flink/streaming/api/connector/sink2/CommittableMessage.java ## @@ -0,0 +1,36 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.streaming.api.connector.sink2; + +import org.apache.flink.annotation.Experimental; + +import java.util.OptionalLong; + +/** The message sent from {@link SinkWriter} to {@link Committer}. */ Review comment: The two links would need imports. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
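For reference, a sketch of the imports the reviewer is pointing at; the package names are my assumption, based on where the Sink V2 `SinkWriter` and `Committer` interfaces live:

```java
// Assumed locations of the linked interfaces (Sink V2 API); verify against
// the actual Flink source tree before relying on them.
import org.apache.flink.api.connector.sink2.Committer;
import org.apache.flink.api.connector.sink2.SinkWriter;
```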
[GitHub] [flink] CrynetLogistics commented on a change in pull request #18165: [FLINK-24904][docs] Updated docs to reflect new KDS Sink and deprecat…
CrynetLogistics commented on a change in pull request #18165: URL: https://github.com/apache/flink/pull/18165#discussion_r784138446 ## File path: docs/content/docs/connectors/datastream/kinesis.md ## @@ -29,9 +29,26 @@ under the License. The Kinesis connector provides access to [Amazon AWS Kinesis Streams](http://aws.amazon.com/kinesis/streams/). -To use the connector, add the following Maven dependency to your project: - -{{< artifact flink-connector-kinesis >}} +To use this connector, add one or more of the following dependencies to your project, depending on whether you are reading from and/or writing to Kinesis Data Streams: + + + + + KDS Connectivity + Maven Dependency + + + + +Source +{{< artifact flink-connector-kinesis >}} + + +Sink +{{< artifact flink-connector-aws-kinesis-data-streams >}} + + + {{< hint warning >}} **Attention** Prior to Flink version 1.10.0 the `flink-connector-kinesis` has a dependency on code licensed under the [Amazon Software License](https://aws.amazon.com/asl/). Review comment: You're right. Is it time to remove it? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] CrynetLogistics commented on a change in pull request #18165: [FLINK-24904][docs] Updated docs to reflect new KDS Sink and deprecat…
CrynetLogistics commented on a change in pull request #18165: URL: https://github.com/apache/flink/pull/18165#discussion_r784136808 ## File path: docs/content/docs/connectors/datastream/kinesis.md ## @@ -566,124 +583,126 @@ Retry and backoff parameters can be configured using the `ConsumerConfigConstant this is called once per stream during stream consumer deregistration, unless the `NONE` or `EAGER` registration strategy is configured. Retry and backoff parameters can be configured using the `ConsumerConfigConstants.DEREGISTER_STREAM_*` keys. -## Kinesis Producer - -The `FlinkKinesisProducer` uses [Kinesis Producer Library (KPL)](http://docs.aws.amazon.com/streams/latest/dev/developing-producers-with-kpl.html) to put data from a Flink stream into a Kinesis stream. - -Note that the producer is not participating in Flink's checkpointing and doesn't provide exactly-once processing guarantees. Also, the Kinesis producer does not guarantee that records are written in order to the shards (See [here](https://github.com/awslabs/amazon-kinesis-producer/issues/23) and [here](http://docs.aws.amazon.com/kinesis/latest/APIReference/API_PutRecord.html#API_PutRecord_RequestSyntax) for more details). +## Kinesis Data Streams Sink -In case of a failure or a resharding, data will be written again to Kinesis, leading to duplicates. This behavior is usually called "at-least-once" semantics. +The Kinesis Data Streams sink (hereafter "Kinesis sink") uses the [AWS v2 SDK for Java](https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/home.html) to write data from a Flink stream into a Kinesis stream. -To put data into a Kinesis stream, make sure the stream is marked as "ACTIVE" in the AWS dashboard. +To write data into a Kinesis stream, make sure the stream is marked as "ACTIVE" in the AWS dashboard. For the monitoring to work, the user accessing the stream needs access to the CloudWatch service. {{< tabs "6df3b696-c2ca-4f44-bea0-96cf8275d61c" >}} {{< tab "Java" >}} ```java -Properties producerConfig = new Properties(); -// Required configs -producerConfig.put(AWSConfigConstants.AWS_REGION, "us-east-1"); -producerConfig.put(AWSConfigConstants.AWS_ACCESS_KEY_ID, "aws_access_key_id"); -producerConfig.put(AWSConfigConstants.AWS_SECRET_ACCESS_KEY, "aws_secret_access_key"); -// Optional configs -producerConfig.put("AggregationMaxCount", "4294967295"); -producerConfig.put("CollectionMaxCount", "1000"); -producerConfig.put("RecordTtl", "3"); -producerConfig.put("RequestTimeout", "6000"); -producerConfig.put("ThreadPoolSize", "15"); - -// Disable Aggregation if it's not supported by a consumer -// producerConfig.put("AggregationEnabled", "false"); -// Switch KinesisProducer's threading model -// producerConfig.put("ThreadingModel", "PER_REQUEST"); - -FlinkKinesisProducer kinesis = new FlinkKinesisProducer<>(new SimpleStringSchema(), producerConfig); -kinesis.setFailOnError(true); -kinesis.setDefaultStream("kinesis_stream_name"); -kinesis.setDefaultPartition("0"); +ElementConverter elementConverter = +KinesisDataStreamsSinkElementConverter.builder() +.setSerializationSchema(new SimpleStringSchema()) +.setPartitionKeyGenerator(element -> String.valueOf(element.hashCode())) +.build(); + +Properties sinkProperties = new Properties(); +// Required +sinkProperties.put(AWSConfigConstants.AWS_REGION, "us-east-1"); + +// Optional, provide via alternative routes e.g. 
environment variables +sinkProperties.put(AWSConfigConstants.AWS_ACCESS_KEY_ID, "aws_access_key_id"); +sinkProperties.put(AWSConfigConstants.AWS_SECRET_ACCESS_KEY, "aws_secret_access_key"); + +KinesisDataStreamsSink kdsSink = +KinesisDataStreamsSink.builder() +.setKinesisClientProperties(sinkProperties)// Required +.setElementConverter(elementConverter) // Required +.setStreamName("your-stream-name") // Required +.setFailOnError(false) // Optional +.setMaxBatchSize(500) // Optional +.setMaxInFlightRequests(16)// Optional +.setMaxBufferedRequests(10_000)// Optional +.setMaxBatchSizeInBytes(5 * 1024 * 1024) // Optional +.setMaxTimeInBufferMS(5000)// Optional +.setMaxRecordSizeInBytes(1 * 1024 * 1024) // Optional +.build(); DataStream simpleStringStream = ...; -simpleStringStream.addSink(kinesis); +simpleStringStream.sinkTo(kdsSink); ``` {{< /tab >}} {{< tab "Scala" >}} ```scala -val producerConfig = new Properties() -// Required configs -producerConfig.put(AWSConfigConstants.AWS_REGION, "us-east-1") -producerConfig.put(AWSConfigConstants.AWS_ACCESS_KEY_ID, "aws_access_key_id") -producerConfig.put(AWSConfigConstants.AWS_SECRET_ACCESS_KEY, "aws_secret_access_key") -// Optional KPL configs -producerConfig.put("AggregationMaxCount", "4294967295") -prod
[GitHub] [flink] fapaul commented on pull request #18284: [FLINK-21407][docs][formats] Remove MongoDb connector as it is not supported in DataStream API
fapaul commented on pull request #18284: URL: https://github.com/apache/flink/pull/18284#issuecomment-1012308846 Sorry again, but do we expect more changes to come from https://issues.apache.org/jira/browse/FLINK-21407? If so, we should create subtasks for the different connectors and formats. I think we should start with that right away, because now we have multiple PRs coming from the same ticket to the same branch. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] fapaul commented on pull request #17937: [FLINK-25044][testing][Pulsar Connector] Add More Unit Test For Pulsar Source
fapaul commented on pull request #17937: URL: https://github.com/apache/flink/pull/17937#issuecomment-1012304383 Looks good. Please squash and rebase, then I'll take care of merging. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] slinkydeveloper commented on a change in pull request #18352: [FLINK-15585][table] Improve function identifier string in plan digest
slinkydeveloper commented on a change in pull request #18352: URL: https://github.com/apache/flink/pull/18352#discussion_r784104335 ## File path: flink-table/flink-table-common/src/main/java/org/apache/flink/table/functions/UserDefinedFunctionHelper.java ## @@ -243,6 +244,36 @@ public static void prepareInstance(ReadableConfig config, UserDefinedFunction fu cleanFunction(config, function); } +/** + * Returns whether a {@link UserDefinedFunction} can be easily serialized and identified by only + * a fully qualified class name. It must have a default constructor and no serializable fields. + */ +public static boolean isClassNameSerializable(UserDefinedFunction function) { +final Class<?> functionClass = function.getClass(); +if (!InstantiationUtil.hasPublicNullaryConstructor(functionClass)) { +// function must be parameterized +return false; +} +Class<?> currentClass = functionClass; +while (!currentClass.equals(UserDefinedFunction.class)) { +for (Field field : currentClass.getDeclaredFields()) { +if (!Modifier.isTransient(field.getModifiers()) +&& !Modifier.isStatic(field.getModifiers())) { +// function seems to be stateful +return false; +} +} +currentClass = currentClass.getSuperclass(); +} Review comment: That sounds like a very magical way to find out whether a function is stateful or not. Don't we have a marker interface for that? ## File path: flink-table/flink-table-common/src/test/java/org/apache/flink/table/functions/UserDefinedFunctionHelperTest.java ## @@ -143,34 +162,27 @@ private void testErrorMessage() { @Nullable String expectedErrorMessage; -TestSpec(Class functionClass) { +TestSpec( +@Nullable Class functionClass, +@Nullable UserDefinedFunction functionInstance, +@Nullable CatalogFunction catalogFunction) { this.functionClass = functionClass; -this.functionInstance = null; -this.catalogFunction = null; -} - -TestSpec(UserDefinedFunction functionInstance) { -this.functionClass = null; this.functionInstance = functionInstance; -this.catalogFunction = null; -} - -TestSpec(CatalogFunction catalogFunction) { -this.functionClass = null; -this.functionInstance = null; this.catalogFunction = catalogFunction; } static TestSpec forClass(Class function) { -return new TestSpec(function); +return new TestSpec(function, null, null); } static TestSpec forInstance(UserDefinedFunction function) { -return new TestSpec(function); +return new TestSpec(null, function, null); } -static TestSpec forCatalogFunction(String className, FunctionLanguage language) { -return new TestSpec(new CatalogFunctionMock(className, language)); +static TestSpec forCatalogFunction( +String className, +@SuppressWarnings("SameParameterValue") FunctionLanguage language) { Review comment: What warning is this one? ## File path: flink-table/flink-table-common/src/main/java/org/apache/flink/table/functions/UserDefinedFunction.java ## @@ -58,9 +60,13 @@ /** Returns a unique, serialized representation for this function. */ public final String functionIdentifier() { +final String className = getClass().getName(); +if (isClassNameSerializable(this)) { +return className; +} final String md5 = EncodingUtils.hex(EncodingUtils.md5(EncodingUtils.encodeObjectToString(this))); -return getClass().getName().replace('.', '$').concat("$").concat(md5); +return className.concat("$").concat(md5); Review comment: Does this logic work when the function is implemented as a Scala `object`? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] fapaul commented on pull request #18330: [FLINK-25570][streaming] Add topology extension points to Sink V2
fapaul commented on pull request #18330: URL: https://github.com/apache/flink/pull/18330#issuecomment-1012277251 @flinkbot run azure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot commented on pull request #18355: [FLINK-25160][docs] Clarified purpose of execution.checkpointing.tole…
flinkbot commented on pull request #18355: URL: https://github.com/apache/flink/pull/18355#issuecomment-1012276469 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit 6b89b5c2cb9734084a479e7dfcea82a3218dd3ff (Thu Jan 13 16:04:21 UTC 2022) **Warnings:** * No documentation files were touched! Remember to keep the Flink docs up to date! Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required. Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] leonardBang commented on pull request #17657: [FLINK-24745][format][json] Add support for Oracle OGG JSON format parser
leonardBang commented on pull request #17657: URL: https://github.com/apache/flink/pull/17657#issuecomment-1012275729 > @leonardBang I'm sorry, I've been a bit busy for a while; in fact, I probably won't be able to fix the bugs this month. Should I close this PR for now, or set it to draft mode and submit it when I'm done fixing it? It's okay, let me find someone to help take over this PR; we hope this feature can appear in 1.15. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-25160) Make doc clear: tolerable-failed-checkpoints counts consecutive failures
[ https://issues.apache.org/jira/browse/FLINK-25160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-25160: --- Labels: pull-request-available (was: ) > Make doc clear: tolerable-failed-checkpoints counts consecutive failures > > > Key: FLINK-25160 > URL: https://issues.apache.org/jira/browse/FLINK-25160 > Project: Flink > Issue Type: Improvement > Components: Documentation >Affects Versions: 1.14.0, 1.12.5, 1.13.3 >Reporter: Jun Qin >Assignee: Anton Kalashnikov >Priority: Major > Labels: pull-request-available > > According to the code, tolerable-failed-checkpoints counts the consecutive > failures. We should make this clear in the doc > [config|https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/config/] > -- This message was sent by Atlassian Jira (v8.20.1#820001)
[GitHub] [flink] akalash opened a new pull request #18355: [FLINK-25160][docs] Clarified purpose of execution.checkpointing.tole…
akalash opened a new pull request #18355: URL: https://github.com/apache/flink/pull/18355 …rable-failed-checkpoints ## What is the purpose of the change Clarified that execution.checkpointing.tolerable-failed-checkpoints applies to consecutive failures. ## Verifying this change This is a documentation-only change. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (yes / **no**) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (**yes** / no) - The serializers: (yes / **no** / don't know) - The runtime per-record code paths (performance sensitive): (yes / **no** / don't know) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / **no** / don't know) - The S3 file system connector: (yes / **no** / don't know) ## Documentation - Does this pull request introduce a new feature? (yes / **no**) - If yes, how is the feature documented? (not applicable / **docs** / JavaDocs / not documented) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
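A hedged sketch of the semantics being documented, using the programmatic counterpart of the `execution.checkpointing.tolerable-failed-checkpoints` key (the `CheckpointConfig` setter shown here is the standard API; the threshold value 3 is just an example):

```java
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class TolerableFailedCheckpointsExample {
    public static void main(String[] args) {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();
        // Trigger a checkpoint every 60 seconds.
        env.enableCheckpointing(60_000L);
        // Programmatic equivalent of
        // execution.checkpointing.tolerable-failed-checkpoints: 3.
        // The job fails only after 3 consecutive checkpoint failures; a
        // successful checkpoint in between resets the counter, which is
        // exactly the "consecutive" behavior the PR clarifies.
        env.getCheckpointConfig().setTolerableCheckpointFailureNumber(3);
    }
}
```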
[GitHub] [flink] JingGe edited a comment on pull request #18333: [FLINK-25220][test] Write an architectural rule for all IT cases w.r.t. the MiniCluster
JingGe edited a comment on pull request #18333: URL: https://github.com/apache/flink/pull/18333#issuecomment-1012252594 One additional piece of information I found to share (thanks @fapaul): some Flink modules customize the test-jar build logic, so it turns out that not all test code can be found in the test jars. This might also be a reason why tests running with Maven cover a smaller scope than tests running in IntelliJ. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] JingGe commented on pull request #18333: [FLINK-25220][test] Write an architectural rule for all IT cases w.r.t. the MiniCluster
JingGe commented on pull request #18333: URL: https://github.com/apache/flink/pull/18333#issuecomment-1012257162 Sure, this is the option I mentioned previously to shrink the scope. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] Airblader commented on pull request #18333: [FLINK-25220][test] Write an architectural rule for all IT cases w.r.t. the MiniCluster
Airblader commented on pull request #18333: URL: https://github.com/apache/flink/pull/18333#issuecomment-1012254365 > some Flink modules customize the test-jar build logic, so it turns out that not all test code can be found in the test jars Yeah, that sounds like it might well be the cause, or at least part of the reason. Do you know which or how many modules? If removing one or two such modules fixes the issue, we could use that as a temporary solution. I'd rather cover less code with the tests than have inconsistent results. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] Airblader commented on pull request #18333: [FLINK-25220][test] Write an architectural rule for all IT cases w.r.t. the MiniCluster
Airblader commented on pull request #18333: URL: https://github.com/apache/flink/pull/18333#issuecomment-1012247801 > It means that Maven and IntelliJ are running in different scopes, i.e. "more violations" will be found by IntelliJ because it uses a bigger scope than Maven Running your branch in IntelliJ adds new violations, but also removes some violations. > […] than working on it within this PR? I'm also OK if you want to resolve this outside of this PR, but then the PR would have to wait until then. If we want to merge this PR, we need to address it here. As long as the test executes in both IntelliJ and Maven, but produces different results in each, I will not be +1, because I think it is too confusing and frustrating for developers. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] JingGe commented on a change in pull request #18333: [FLINK-25220][test] Write an architectural rule for all IT cases w.r.t. the MiniCluster
JingGe commented on a change in pull request #18333: URL: https://github.com/apache/flink/pull/18333#discussion_r784063682 ## File path: flink-architecture-tests/src/test/java/org/apache/flink/architecture/common/Predicates.java ## @@ -50,5 +83,78 @@ && field.getRawType().isEquivalentTo(clazz)); } +/** + * Tests that the given field is {@code public final} and not {@code static} and of the given + * type {@code clazz}. + */ +public static DescribedPredicate<JavaField> arePublicFinalOfTyp(Class<?> clazz) { +return is(ofType(clazz)).and(isPublic()).and(isFinal()).and(isNotStatic()); +} + +/** + * Tests that the given field is {@code public static final} and of the given type {@code clazz}. + */ +public static DescribedPredicate<JavaField> arePublicStaticFinalOfTyp(Class<?> clazz) { +return arePublicStaticOfType(clazz).and(isFinal()); +} + +/** + * Tests that the given field is {@code public final} and of the given type {@code clazz} with + * exactly the given {@code annotationType}. + */ +public static DescribedPredicate<JavaField> arePublicFinalOfTypeWithAnnotation( +Class<?> clazz, Class<? extends Annotation> annotationType) { +return arePublicFinalOfTyp(clazz).and(annotatedWith(annotationType)); +} + +/** + * Tests that the given field is {@code public static final} and of the given type {@code clazz} + * with exactly the given {@code annotationType}. + */ +public static DescribedPredicate<JavaField> arePublicStaticFinalOfTypeWithAnnotation( +Class<?> clazz, Class<? extends Annotation> annotationType) { +return arePublicStaticFinalOfTyp(clazz).and(annotatedWith(annotationType)); +} + +/** + * Tests that the given field is {@code public static final} and of the given type. It must have + * any of the given {@code annotationTypes}. + */ +@SafeVarargs +public static DescribedPredicate<JavaField> arePublicStaticFinalOfTypeWithOneOfAnnotations( Review comment: I have nothing against removing unused code. It was committed to provide convenience for further development. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] matriv commented on a change in pull request #18342: [FLINK-25230][table-planner] Replace RelDataType with LogicalType serialization
matriv commented on a change in pull request #18342: URL: https://github.com/apache/flink/pull/18342#discussion_r784059921 ## File path: flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/plan/nodes/exec/serde/RelDataTypeJsonSerdeTest.java ## @@ -46,83 +35,113 @@ import org.apache.calcite.sql.parser.SqlParserPos; import org.apache.calcite.sql.type.SqlTypeName; import org.junit.Test; -import org.junit.runner.RunWith; -import org.junit.runners.Parameterized; +import org.junit.jupiter.params.ParameterizedTest; +import org.junit.jupiter.params.provider.MethodSource; +import org.junit.runners.Parameterized.Parameters; import java.io.IOException; import java.util.ArrayList; import java.util.Arrays; -import java.util.Collection; import java.util.List; -import static org.junit.Assert.assertEquals; -import static org.junit.Assert.assertSame; +import static org.apache.flink.table.planner.plan.nodes.exec.serde.JsonSerdeMocks.configuredSerdeContext; +import static org.apache.flink.table.planner.plan.nodes.exec.serde.JsonSerdeMocks.toJson; +import static org.apache.flink.table.planner.plan.nodes.exec.serde.JsonSerdeMocks.toObject; +import static org.assertj.core.api.Assertions.assertThat; -/** Tests for serialization/deserialization of {@link RelDataType}. */ -@RunWith(Parameterized.class) +/** Tests for {@link RelDataType} serialization and deserialization. */ public class RelDataTypeJsonSerdeTest { + private static final FlinkTypeFactory FACTORY = FlinkTypeFactory.INSTANCE(); -@Parameterized.Parameters(name = "type = {0}") -public static Collection parameters() { +@ParameterizedTest +@MethodSource("testRelDataTypeSerde") +public void testRelDataTypeSerde(RelDataType relDataType) throws IOException { +final SerdeContext serdeContext = configuredSerdeContext(); + +final String json = toJson(serdeContext, relDataType); +final RelDataType actual = toObject(serdeContext, json, RelDataType.class); + +assertThat(actual).isSameAs(relDataType); +} + +@Test +public void testMissingPrecisionAndScale() { +final SerdeContext serdeContext = configuredSerdeContext(); + +final String json = +toJson( +serdeContext, +FACTORY.createSqlIntervalType( +new SqlIntervalQualifier( +TimeUnit.DAY, TimeUnit.SECOND, SqlParserPos.ZERO))); +final RelDataType actual = toObject(serdeContext, json, RelDataType.class); + +assertThat(actual) +.isSameAs( +FACTORY.createSqlIntervalType( +new SqlIntervalQualifier( +TimeUnit.DAY, + DayTimeIntervalType.DEFAULT_DAY_PRECISION, +TimeUnit.SECOND, + DayTimeIntervalType.DEFAULT_FRACTIONAL_PRECISION, +SqlParserPos.ZERO))); +} + +// +// Test data +// + +@Parameters(name = "{0}") +public static List testRelDataTypeSerde() { // the values in the list do not care about nullable. -List types = +final List types = Arrays.asList( FACTORY.createSqlType(SqlTypeName.BOOLEAN), FACTORY.createSqlType(SqlTypeName.TINYINT), FACTORY.createSqlType(SqlTypeName.SMALLINT), FACTORY.createSqlType(SqlTypeName.INTEGER), FACTORY.createSqlType(SqlTypeName.BIGINT), -FACTORY.createSqlType(SqlTypeName.DECIMAL, 3, 10), -FACTORY.createSqlType(SqlTypeName.DECIMAL, 0, 19), -FACTORY.createSqlType(SqlTypeName.DECIMAL, -1, 19), Review comment: I can't think of such a case, doesn't make sense. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
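The diff above also migrates the test from the JUnit 4 `@Parameterized` runner to JUnit 5's `@ParameterizedTest` with `@MethodSource`; here is a minimal standalone sketch of just that mechanics (the class and test names are illustrative, not from the PR):

```java
import org.junit.jupiter.params.ParameterizedTest;
import org.junit.jupiter.params.provider.MethodSource;

import java.util.List;

import static org.assertj.core.api.Assertions.assertThat;

class ParameterizedMigrationSketch {

    // The @MethodSource name refers to the static factory below; JUnit 5
    // runs the test once per returned element, replacing the JUnit 4
    // @Parameters constructor-injection style shown on the removed lines.
    @ParameterizedTest
    @MethodSource("typeStrings")
    void roundTrip(String typeString) {
        // Stand-in for a real serialize/deserialize round trip.
        String copy = new String(typeString.toCharArray());
        assertThat(copy).isEqualTo(typeString);
    }

    static List<String> typeStrings() {
        return List.of("BOOLEAN", "TINYINT", "DECIMAL(10, 3)");
    }
}
```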
[GitHub] [flink] JingGe edited a comment on pull request #18333: [FLINK-25220][test] Write an architectural rule for all IT cases w.r.t. the MiniCluster
JingGe edited a comment on pull request #18333: URL: https://github.com/apache/flink/pull/18333#issuecomment-1012232986 > Of course technically it seems like IntelliJ actually finds more correct violations, so the IntelliJ result is "more correct", but IntelliJ is not our source of truth. It means that Maven and IntelliJ are running in different scopes, i.e. "more violations" will be found by IntelliJ because it uses a bigger scope than Maven; IMHO, it has nothing to do with correctness. It might be an issue (incorrectness) that ArchUnit should fix. > The problem here are the test-jars and how they're handled; we currently don't have an issue because we do not use JARs for the dependencies in the module. That is what I mentioned: the more complex scope has not been touched. > > When IntelliJ executes the tests, it handles the dependencies differently than Maven, and that is causing the inconsistency. We can fix this by improving the import options to cover both scenarios equally. It sounds more like making the IntelliJ setup compatible with the Maven project w.r.t. dependencies, or having ArchUnit abstract away and hide the hybrid dependency handling across Maven, IntelliJ, Gradle, Bazel, etc., than something to work on within this PR? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
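One possible direction for "improving the import options" (my sketch, not the PR's actual solution): make the imported classpath explicit so Maven and IntelliJ runs resolve the same classes. Note that `ONLY_INCLUDE_TESTS` matches classes by output-directory layout, which is itself part of why the two runners can disagree:

```java
import com.tngtech.archunit.core.domain.JavaClasses;
import com.tngtech.archunit.core.importer.ClassFileImporter;
import com.tngtech.archunit.core.importer.ImportOption;

public class ExplicitImportSketch {
    // Restrict the import to test classes explicitly, instead of relying on
    // however the runner (Maven Surefire vs. IntelliJ) assembles the classpath.
    public static JavaClasses importFlinkTestClasses() {
        return new ClassFileImporter()
                .withImportOption(ImportOption.Predefined.ONLY_INCLUDE_TESTS)
                .importPackages("org.apache.flink");
    }
}
```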