[GitHub] [flink] flinkbot edited a comment on pull request #16513: [FLINK-23389][Formats] Glue schema registry JSON support
flinkbot edited a comment on pull request #16513: URL: https://github.com/apache/flink/pull/16513#issuecomment-881248601 ## CI report: * b83590dcb652ec96755c63e5ec7c9761fc96d87b UNKNOWN * 0ab16329651532d687f521ce9f6e75885f85d575 UNKNOWN * 6c86314c2f42fd10b0492dd8b570563ba0913c4b Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23816) * f2582576a62238f78f5b32d7d1e0917256de4ead UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Closed] (FLINK-23378) ContinuousProcessingTimeTrigger最后一个定时器无法触发
[ https://issues.apache.org/jira/browse/FLINK-23378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] frey closed FLINK-23378. Fix Version/s: (was: 1.12.3) 1.14.1 Resolution: Fixed > ContinuousProcessingTimeTrigger最后一个定时器无法触发 > -- > > Key: FLINK-23378 > URL: https://issues.apache.org/jira/browse/FLINK-23378 > Project: Flink > Issue Type: Bug >Affects Versions: 1.12.3 >Reporter: frey >Priority: Major > Labels: pull-request-available > Fix For: 1.14.1 > > > 使用滚动窗口,时间语义为ProcessingTime时,修改默认触发器为ContinuousProcessingTimeTrigger后,最后一个定时器时间等于下一个窗口的起始时间,所以无法触发最后一个定时器的计算 > 可修改onProcessingTime中time=window.maxTimestamp()时FIRE -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-24197) Streaming File Sink end-to-end test fails with : "RestClientException: [File upload failed.]"
[ https://issues.apache.org/jira/browse/FLINK-24197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17412340#comment-17412340 ] Chesnay Schepler commented on FLINK-24197: -- I believe this to be an exotic bug in netty, and have filed a ticket: https://github.com/netty/netty/issues/11668 > Streaming File Sink end-to-end test fails with : "RestClientException: [File > upload failed.]" > - > > Key: FLINK-24197 > URL: https://issues.apache.org/jira/browse/FLINK-24197 > Project: Flink > Issue Type: Bug > Components: Connectors / FileSystem, Tests >Affects Versions: 1.15.0 >Reporter: Dawid Wysakowicz >Assignee: Chesnay Schepler >Priority: Major > Labels: test-stability > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=23672=logs=c88eea3b-64a0-564d-0031-9fdcd7b8abee=070ff179-953e-5bda-71fa-d6599415701c=11040 > {code} > Caused by: org.apache.flink.util.FlinkException: Failed to execute job > 'StreamingFileSinkProgram'. > at > org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.executeAsync(StreamExecutionEnvironment.java:2056) > at > org.apache.flink.client.program.StreamContextEnvironment.executeAsync(StreamContextEnvironment.java:137) > at > org.apache.flink.client.program.StreamContextEnvironment.execute(StreamContextEnvironment.java:76) > at > org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1917) > at FileSinkProgram.main(FileSinkProgram.java:105) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:355) > ... 8 more > Caused by: org.apache.flink.runtime.client.JobSubmissionException: Failed to > submit JobGraph. > at > org.apache.flink.client.program.rest.RestClusterClient.lambda$submitJob$11(RestClusterClient.java:433) > at > java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:884) > at > java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:866) > at > java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) > at > java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990) > at > org.apache.flink.util.concurrent.FutureUtils.lambda$retryOperationWithDelay$9(FutureUtils.java:373) > at > java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774) > at > java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750) > at > java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) > at > java.util.concurrent.CompletableFuture.postFire(CompletableFuture.java:575) > at > java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:943) > at > java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:456) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: org.apache.flink.runtime.rest.util.RestClientException: [File > upload failed.] > at > org.apache.flink.runtime.rest.RestClient.parseResponse(RestClient.java:532) > at > org.apache.flink.runtime.rest.RestClient.lambda$submitRequest$3(RestClient.java:512) > at > java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:966) > at > java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:940) > ... 4 more > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-24222) Use ContextClassLoaderExtension in ReporterSetupTest
Chesnay Schepler created FLINK-24222: Summary: Use ContextClassLoaderExtension in ReporterSetupTest Key: FLINK-24222 URL: https://issues.apache.org/jira/browse/FLINK-24222 Project: Flink Issue Type: Technical Debt Components: Runtime / Metrics, Tests Reporter: Chesnay Schepler Fix For: 1.15.0 Migrate the ReporterSetupTests to use ContextClassLoaderExtension to generate service entries at runtime, scoped to the test, instead of having them as a test resource. See https://github.com/apache/flink/commit/8081dfbcc2c63dfda385a68f4615ddf5d51ccc26 as an example. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (FLINK-24193) Test service entries cause noise in other tests
[ https://issues.apache.org/jira/browse/FLINK-24193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler closed FLINK-24193. Resolution: Fixed master: Extension added: a5e83afb510aafa05aac31413e2b59704c374aa3 REST tests migrated: 8081dfbcc2c63dfda385a68f4615ddf5d51ccc26 > Test service entries cause noise in other tests > --- > > Key: FLINK-24193 > URL: https://issues.apache.org/jira/browse/FLINK-24193 > Project: Flink > Issue Type: Technical Debt > Components: Tests >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Major > Labels: pull-request-available > Fix For: 1.15.0 > > > At various points in Flink we use the ServiceLoader mechanism to load > implementations, for example filesystems or reporters. > It is thus only natural that we also have some test service implementations > which are used in specific tests. > However, these test implementations are intended to only be used in very > specific tests, but are currently but on the classpath for all tests in that > module (+all that depend on the test-jar of such a module). This causes > confusion (e.g., suddenly there are 5 reporter factories available) or > logging noise (e.g., custom netty handlers being loaded by each MiniCluster). > We should implement a junit extension that runs the test with a customized > classloader, which also as access to a temporary directory containing > generated service entries. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] zentol merged pull request #17184: [FLINK-24193][tests] Add ClassLoaderExtension
zentol merged pull request #17184: URL: https://github.com/apache/flink/pull/17184 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Closed] (FLINK-24209) Jdbc connector uses postgres testcontainers in compile scope
[ https://issues.apache.org/jira/browse/FLINK-24209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler closed FLINK-24209. Resolution: Fixed master/1.14 are not affected. 1.13: c97ddb8181e6ad16510f4e6b558f9441c8f1f1d7 > Jdbc connector uses postgres testcontainers in compile scope > > > Key: FLINK-24209 > URL: https://issues.apache.org/jira/browse/FLINK-24209 > Project: Flink > Issue Type: Bug > Components: Connectors / JDBC >Affects Versions: 1.13.2 >Reporter: Dawid Wysakowicz >Assignee: Chesnay Schepler >Priority: Blocker > Labels: pull-request-available > Fix For: 1.13.3 > > > testcontainer dependencies should only ever be used in the test scope. It has > been fixed at least on 1.14 branch (I have not checked on master) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot edited a comment on pull request #17208: [BP-1.14][FLINK-23773][connector/kafka] Mark empty splits as finished to cleanup states in SplitFetcher
flinkbot edited a comment on pull request #17208: URL: https://github.com/apache/flink/pull/17208#issuecomment-915763233 ## CI report: * e677f27ac5d53678debb68648ac7ddc18dc2fb2c Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23819) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] zentol merged pull request #17197: [FLINK-24209][jdbc][build] Add test scope
zentol merged pull request #17197: URL: https://github.com/apache/flink/pull/17197 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot commented on pull request #17209: FLINK-23378
flinkbot commented on pull request #17209: URL: https://github.com/apache/flink/pull/17209#issuecomment-915772622 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit 23158ac24be163d07459399921135276447f5544 (Thu Sep 09 05:21:19 UTC 2021) **Warnings:** * No documentation files were touched! Remember to keep the Flink docs up to date! * **This pull request references an unassigned [Jira ticket](https://issues.apache.org/jira/browse/FLINK-23378).** According to the [code contribution guide](https://flink.apache.org/contributing/contribute-code.html), tickets need to be assigned before starting with the implementation work. Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into to Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-23378) ContinuousProcessingTimeTrigger最后一个定时器无法触发
[ https://issues.apache.org/jira/browse/FLINK-23378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-23378: --- Labels: pull-request-available (was: ) > ContinuousProcessingTimeTrigger最后一个定时器无法触发 > -- > > Key: FLINK-23378 > URL: https://issues.apache.org/jira/browse/FLINK-23378 > Project: Flink > Issue Type: Bug >Affects Versions: 1.12.3 >Reporter: frey >Priority: Major > Labels: pull-request-available > Fix For: 1.12.3 > > > 使用滚动窗口,时间语义为ProcessingTime时,修改默认触发器为ContinuousProcessingTimeTrigger后,最后一个定时器时间等于下一个窗口的起始时间,所以无法触发最后一个定时器的计算 > 可修改onProcessingTime中time=window.maxTimestamp()时FIRE -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] frey66 opened a new pull request #17209: FLINK-23378
frey66 opened a new pull request #17209: URL: https://github.com/apache/flink/pull/17209 ## What is the purpose of the change *(For example: This pull request makes task deployment go through the blob server, rather than through RPC. That way we avoid re-transferring them on each deployment (during recovery).)* ## Brief change log *(for example:)* - *The TaskInfo is stored in the blob store on job creation time as a persistent artifact* - *Deployments RPC transmits only the blob storage reference* - *TaskManagers retrieve the TaskInfo from the blob cache* ## Verifying this change *(Please pick either of the following options)* This change is a trivial rework / code cleanup without any test coverage. *(or)* This change is already covered by existing tests, such as *(please describe tests)*. *(or)* This change added tests and can be verified as follows: *(example:)* - *Added integration tests for end-to-end deployment with large payloads (100MB)* - *Extended integration test for recovery after master (JobManager) failure* - *Added test that validates that TaskInfo is transferred only once across recoveries* - *Manually verified the change by running a 4 node cluser with 2 JobManagers and 4 TaskManagers, a stateful streaming program, and killing one JobManager and two TaskManagers during the execution, verifying that recovery happens correctly.* ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (yes / no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / no) - The serializers: (yes / no / don't know) - The runtime per-record code paths (performance sensitive): (yes / no / don't know) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / no / don't know) - The S3 file system connector: (yes / no / don't know) ## Documentation - Does this pull request introduce a new feature? (yes / no) - If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot commented on pull request #17208: [BP-1.14][FLINK-23773][connector/kafka] Mark empty splits as finished to cleanup states in SplitFetcher
flinkbot commented on pull request #17208: URL: https://github.com/apache/flink/pull/17208#issuecomment-915763233 ## CI report: * e677f27ac5d53678debb68648ac7ddc18dc2fb2c UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #17207: [FLINK-22901][table-planner-blink] Introduce getUpsertKeys in FlinkRelMetadataQuery
flinkbot edited a comment on pull request #17207: URL: https://github.com/apache/flink/pull/17207#issuecomment-915750810 ## CI report: * 4f8b0f5ad64d4fe7c998042ec2b6b1ecec714be5 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23818) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #16513: [FLINK-23389][Formats] Glue schema registry JSON support
flinkbot edited a comment on pull request #16513: URL: https://github.com/apache/flink/pull/16513#issuecomment-881248601 ## CI report: * b83590dcb652ec96755c63e5ec7c9761fc96d87b UNKNOWN * 0ab16329651532d687f521ce9f6e75885f85d575 UNKNOWN * 6c86314c2f42fd10b0492dd8b570563ba0913c4b Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23816) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] godfreyhe commented on a change in pull request #17118: [FLINK-22603][table-planner]The digest can be produced by SourceAbilitySpec.
godfreyhe commented on a change in pull request #17118: URL: https://github.com/apache/flink/pull/17118#discussion_r704944200 ## File path: flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/planner/plan/schema/TableSourceTable.scala ## @@ -44,8 +43,8 @@ import org.apache.flink.table.types.logical.RowType * @param tableSource The [[DynamicTableSource]] for which is converted to a Calcite Table * @param isStreamingMode A flag that tells if the current table is in stream mode * @param catalogTable Resolved catalog table where this table source table comes from - * @param flinkContext The flink context - * @param abilitySpecs The abilitySpec applied to the source + * @param flinkContext The flink context abilitySpecs use to generate corresponding digests Review comment: @param flinkContext which is used to generate extra digests based on abilitySpecs ? ## File path: flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/plan/nodes/exec/serde/TemporalTableSourceSpecSerdeTest.java ## @@ -59,17 +60,23 @@ public class TemporalTableSourceSpecSerdeTest { private static final FlinkTypeFactory FACTORY = FlinkTypeFactory.INSTANCE(); +private static final FlinkContext flinkContext = createFlinkContext(); + +private static FlinkContext createFlinkContext() { +return new FlinkContextImpl( +false, +TableConfig.getDefault(), +null, +CatalogManagerMocks.createEmptyCatalogManager(), +null); +} Review comment: rivate static final FlinkContext FLINK_CONTEXT = new FlinkContextImpl( false, TableConfig.getDefault(), null, CatalogManagerMocks.createEmptyCatalogManager(), null); ## File path: flink-table/flink-table-planner/src/test/scala/org/apache/flink/table/planner/plan/metadata/MetadataTestUtil.scala ## @@ -247,6 +248,18 @@ object MetadataTestUtil { getMetadataTable(fieldNames, fieldTypes, new FlinkStatistic(tableStats)) } + + private val flinkContext = createFlinkContext + + private def createFlinkContext(): FlinkContext = { +new FlinkContextImpl( + false, + TableConfig.getDefault, + null, + CatalogManagerMocks.createEmptyCatalogManager, + null) + } Review comment: private val flinkContext = new FlinkContextImpl( false, TableConfig.getDefault, null, CatalogManagerMocks.createEmptyCatalogManager, null) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-24149) Make checkpoint self-contained and relocatable
[ https://issues.apache.org/jira/browse/FLINK-24149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feifan Wang updated FLINK-24149: Summary: Make checkpoint self-contained and relocatable (was: Make checkpoint relocatable) > Make checkpoint self-contained and relocatable > -- > > Key: FLINK-24149 > URL: https://issues.apache.org/jira/browse/FLINK-24149 > Project: Flink > Issue Type: Improvement > Components: Runtime / Checkpointing >Reporter: Feifan Wang >Priority: Major > Labels: pull-request-available > Attachments: image-2021-09-08-17-06-31-560.png, > image-2021-09-08-17-10-28-240.png, image-2021-09-08-17-55-46-898.png, > image-2021-09-08-18-01-03-176.png > > > h3. 1. Backgroud > FLINK-5763 proposal make savepoint relocatable, checkpoint has similar > requirements. For example, to migrate jobs to other HDFS clusters, although > it can be achieved through a savepoint, but we prefer to use persistent > checkpoints, especially RocksDBStateBackend incremental checkpoints have > better performance than savepoint during snapshot and restore. > > FLINK-8531 standardized directory layout : > {code:java} > /user-defined-checkpoint-dir > | > + 1b080b6e710aabbef8993ab18c6de98b (job's ID) > | > + --shared/ > + --taskowned/ > + --chk-1/ > + --chk-2/ > + --chk-3/ > ... > {code} > * State backend will create a subdirectory with the job's ID that will > contain the actual checkpoints, such as: > user-defined-checkpoint-dir/1b080b6e710aabbef8993ab18c6de98b/ > * Each checkpoint individually will store all its files in a subdirectory > that includes the checkpoint number, such as: > user-defined-checkpoint-dir/1b080b6e710aabbef8993ab18c6de98b/chk-3/ > * Files shared between checkpoints will be stored in the shared/ directory > in the same parent directory as the separate checkpoint directory, such as: > user-defined-checkpoint-dir/1b080b6e710aabbef8993ab18c6de98b/shared/ > * Similar to shared files, files owned strictly by tasks will be stored in > the taskowned/ directory in the same parent directory as the separate > checkpoint directory, such as: > user-defined-checkpoint-dir/1b080b6e710aabbef8993ab18c6de98b/taskowned/ > h3. Proposal > Since the individually checkpoint directory does not contain complete state > data, we cannot make it relocatable, but its parent directory can. The only > work left is make the metadata file references relative file paths. > I proposal make these changes to _*FsCheckpointStateOutputStream*_ : > * introduce _*checkpointDirectory*_ field, and remove *_allowRelativePaths_* > field > * introduce *_entropyInjecting_* field > * *_closeAndGetHandle()_* return _*RelativeFileStateHandle*_ with relative > path base on _*checkpointDirectory*_ (except entropy injecting file system) > [~yunta], [~trohrmann] , I verified this in our environment , and submitted a > pull request to accomplish this feature. Please help evaluate whether it is > appropriate. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-24149) Make checkpoint relocatable
[ https://issues.apache.org/jira/browse/FLINK-24149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feifan Wang updated FLINK-24149: Description: h3. 1. Backgroud FLINK-5763 proposal make savepoint relocatable, checkpoint has similar requirements. For example, to migrate jobs to other HDFS clusters, although it can be achieved through a savepoint, but we prefer to use persistent checkpoints, especially RocksDBStateBackend incremental checkpoints have better performance than savepoint during snapshot and restore. FLINK-8531 standardized directory layout : {code:java} /user-defined-checkpoint-dir | + 1b080b6e710aabbef8993ab18c6de98b (job's ID) | + --shared/ + --taskowned/ + --chk-1/ + --chk-2/ + --chk-3/ ... {code} * State backend will create a subdirectory with the job's ID that will contain the actual checkpoints, such as: user-defined-checkpoint-dir/1b080b6e710aabbef8993ab18c6de98b/ * Each checkpoint individually will store all its files in a subdirectory that includes the checkpoint number, such as: user-defined-checkpoint-dir/1b080b6e710aabbef8993ab18c6de98b/chk-3/ * Files shared between checkpoints will be stored in the shared/ directory in the same parent directory as the separate checkpoint directory, such as: user-defined-checkpoint-dir/1b080b6e710aabbef8993ab18c6de98b/shared/ * Similar to shared files, files owned strictly by tasks will be stored in the taskowned/ directory in the same parent directory as the separate checkpoint directory, such as: user-defined-checkpoint-dir/1b080b6e710aabbef8993ab18c6de98b/taskowned/ h3. Proposal Since the individually checkpoint directory does not contain complete state data, we cannot make it relocatable, but its parent directory can. The only work left is make the metadata file references relative file paths. I proposal make these changes to _*FsCheckpointStateOutputStream*_ : * introduce _*checkpointDirectory*_ field, and remove *_allowRelativePaths_* field * introduce *_entropyInjecting_* field * *_closeAndGetHandle()_* return _*RelativeFileStateHandle*_ with relative path base on _*checkpointDirectory*_ (except entropy injecting file system) [~yunta], [~trohrmann] , I verified this in our environment , and submitted a pull request to accomplish this feature. Please help evaluate whether it is appropriate. was: h3. Backgroud FLINK-5763 proposal make savepoint relocatable, checkpoint has similar requirements. For example, to migrate jobs to other HDFS clusters, although it can be achieved through a savepoint, but we prefer to use persistent checkpoints, especially RocksDBStateBackend incremental checkpoints have better performance than savepoint during snapshot and restore. FLINK-8531 standardized directory layout : {code:java} /user-defined-checkpoint-dir | + 1b080b6e710aabbef8993ab18c6de98b (job's ID) | + --shared/ + --taskowned/ + --chk-1/ + --chk-2/ + --chk-3/ ... {code} * State backend will create a subdirectory with the job's ID that will contain the actual checkpoints, such as: user-defined-checkpoint-dir/1b080b6e710aabbef8993ab18c6de98b/ * Each checkpoint individually will store all its files in a subdirectory that includes the checkpoint number, such as: user-defined-checkpoint-dir/1b080b6e710aabbef8993ab18c6de98b/chk-3/ * Files shared between checkpoints will be stored in the shared/ directory in the same parent directory as the separate checkpoint directory, such as: user-defined-checkpoint-dir/1b080b6e710aabbef8993ab18c6de98b/shared/ * Similar to shared files, files owned strictly by tasks will be stored in the taskowned/ directory in the same parent directory as the separate checkpoint directory, such as: user-defined-checkpoint-dir/1b080b6e710aabbef8993ab18c6de98b/taskowned/ h3. Proposal Since the individually checkpoint directory does not contain complete state data, we cannot make it relocatable, but its parent directory can. The only work left is make the metadata file references relative file paths. I proposal make these changes to _*FsCheckpointStateOutputStream*_ : * introduce _*checkpointDirectory*_ field, and remove *_allowRelativePaths_* field * introduce *_entropyInjecting_* field * *_closeAndGetHandle()_* return _*RelativeFileStateHandle*_ with relative path base on _*checkpointDirectory*_ (except entropy injecting file system) [~yunta], [~trohrmann] , I verified this in our environment , and submitted a pull request to accomplish this feature. Please help evaluate whether it is appropriate. > Make checkpoint relocatable > --- > > Key: FLINK-24149 > URL: https://issues.apache.org/jira/browse/FLINK-24149 > Project: Flink > Issue Type: Improvement >
[jira] [Commented] (FLINK-24149) Make checkpoint relocatable
[ https://issues.apache.org/jira/browse/FLINK-24149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17412326#comment-17412326 ] Feifan Wang commented on FLINK-24149: - Hi [~pnowojski], remove the current distinction between savepoint and checkpoint looks good to me, but this does not seem to conflict with making the checkpoint to self-contained and relocatable. I will rephrase the description to make it more comprehensible. > Make checkpoint relocatable > --- > > Key: FLINK-24149 > URL: https://issues.apache.org/jira/browse/FLINK-24149 > Project: Flink > Issue Type: Improvement > Components: Runtime / Checkpointing >Reporter: Feifan Wang >Priority: Major > Labels: pull-request-available > Attachments: image-2021-09-08-17-06-31-560.png, > image-2021-09-08-17-10-28-240.png, image-2021-09-08-17-55-46-898.png, > image-2021-09-08-18-01-03-176.png > > > h3. Backgroud > FLINK-5763 proposal make savepoint relocatable, checkpoint has similar > requirements. For example, to migrate jobs to other HDFS clusters, although > it can be achieved through a savepoint, but we prefer to use persistent > checkpoints, especially RocksDBStateBackend incremental checkpoints have > better performance than savepoint during snapshot and restore. > > FLINK-8531 standardized directory layout : > {code:java} > /user-defined-checkpoint-dir > | > + 1b080b6e710aabbef8993ab18c6de98b (job's ID) > | > + --shared/ > + --taskowned/ > + --chk-1/ > + --chk-2/ > + --chk-3/ > ... > {code} > * State backend will create a subdirectory with the job's ID that will > contain the actual checkpoints, such as: > user-defined-checkpoint-dir/1b080b6e710aabbef8993ab18c6de98b/ > * Each checkpoint individually will store all its files in a subdirectory > that includes the checkpoint number, such as: > user-defined-checkpoint-dir/1b080b6e710aabbef8993ab18c6de98b/chk-3/ > * Files shared between checkpoints will be stored in the shared/ directory > in the same parent directory as the separate checkpoint directory, such as: > user-defined-checkpoint-dir/1b080b6e710aabbef8993ab18c6de98b/shared/ > * Similar to shared files, files owned strictly by tasks will be stored in > the taskowned/ directory in the same parent directory as the separate > checkpoint directory, such as: > user-defined-checkpoint-dir/1b080b6e710aabbef8993ab18c6de98b/taskowned/ > h3. Proposal > Since the individually checkpoint directory does not contain complete state > data, we cannot make it relocatable, but its parent directory can. The only > work left is make the metadata file references relative file paths. > I proposal make these changes to _*FsCheckpointStateOutputStream*_ : > * introduce _*checkpointDirectory*_ field, and remove *_allowRelativePaths_* > field > * introduce *_entropyInjecting_* field > * *_closeAndGetHandle()_* return _*RelativeFileStateHandle*_ with relative > path base on _*checkpointDirectory*_ (except entropy injecting file system) > [~yunta], [~trohrmann] , I verified this in our environment , and submitted a > pull request to accomplish this feature. Please help evaluate whether it is > appropriate. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot commented on pull request #17207: [FLINK-22901][table-planner-blink] Introduce getUpsertKeys in FlinkRelMetadataQuery
flinkbot commented on pull request #17207: URL: https://github.com/apache/flink/pull/17207#issuecomment-915750810 ## CI report: * 4f8b0f5ad64d4fe7c998042ec2b6b1ecec714be5 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #16513: [FLINK-23389][Formats] Glue schema registry JSON support
flinkbot edited a comment on pull request #16513: URL: https://github.com/apache/flink/pull/16513#issuecomment-881248601 ## CI report: * b83590dcb652ec96755c63e5ec7c9761fc96d87b UNKNOWN * 0ab16329651532d687f521ce9f6e75885f85d575 UNKNOWN * e007d566345c353e5f8b068649cd398ba7259cb7 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23813) * 6c86314c2f42fd10b0492dd8b570563ba0913c4b UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Comment Edited] (FLINK-24041) [FLIP-171] Generic AsyncSinkBase
[ https://issues.apache.org/jira/browse/FLINK-24041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17412322#comment-17412322 ] Piotr Nowojski edited comment on FLINK-24041 at 9/9/21, 4:18 AM: - I had to revert this as 1702c9e and 16ab8b41af0 because of build failures. Looks like this PR hasn't been rebased and conflicted with another change in the master. https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=23814=logs=52b61abe-a3cc-5bde-cc35-1bbe89bb7df5=54421a62-0c80-5aad-3319-094ff69180bb {noformat} [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.8.0:testCompile (default-testCompile) on project flink-connector-base: Compilation failure [ERROR] /__w/2/s/flink-connectors/flink-connector-base/src/test/java/org/apache/flink/connector/base/sink/writer/AsyncSinkWriterTest.java:[309,20] org.apache.flink.connector.base.sink.writer.AsyncSinkWriterTest.SinkInitContext is not abstract and does not override abstract method getRestoredCheckpointId() in org.apache.flink.api.connector.sink.Sink.InitContext {noformat} was (Author: pnowojski): I had to revert this as 1702c9e because of build failures. Looks like this PR hasn't been rebased and conflicted with another change in the master. https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=23814=logs=52b61abe-a3cc-5bde-cc35-1bbe89bb7df5=54421a62-0c80-5aad-3319-094ff69180bb {noformat} [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.8.0:testCompile (default-testCompile) on project flink-connector-base: Compilation failure [ERROR] /__w/2/s/flink-connectors/flink-connector-base/src/test/java/org/apache/flink/connector/base/sink/writer/AsyncSinkWriterTest.java:[309,20] org.apache.flink.connector.base.sink.writer.AsyncSinkWriterTest.SinkInitContext is not abstract and does not override abstract method getRestoredCheckpointId() in org.apache.flink.api.connector.sink.Sink.InitContext {noformat} > [FLIP-171] Generic AsyncSinkBase > > > Key: FLINK-24041 > URL: https://issues.apache.org/jira/browse/FLINK-24041 > Project: Flink > Issue Type: New Feature > Components: Connectors / Common >Reporter: Zichen Liu >Assignee: Zichen Liu >Priority: Major > Labels: pull-request-available > Fix For: 1.15.0 > > > h2. Motivation > Apache Flink has a rich connector ecosystem that can persist data in various > destinations. Flink natively supports Apache Kafka, Amazon Kinesis Data > Streams, Elasticsearch, HBase, and many more destinations. Additional > connectors are maintained in Apache Bahir or directly on GitHub. The basic > functionality of these sinks is quite similar. They batch events according to > user defined buffering hints, sign requests and send them to the respective > endpoint, retry unsuccessful or throttled requests, and participate in > checkpointing. They primarily just differ in the way they interface with the > destination. Yet, all the above-mentioned sinks are developed and maintained > independently. > We hence propose to create a sink that abstracts away this common > functionality into a generic sink. Adding support for a new destination then > just means creating a lightweight shim that only implements the specific > interfaces of the destination using a client that supports async requests. > Having a common abstraction will reduce the effort required to maintain all > these individual sinks. It will also make it much easier and faster to create > integrations with additional destinations. Moreover, improvements or bug > fixes to the core of the sink will benefit all implementations that are based > on it. > The design of the sink focusses on extensibility and a broad support of > destinations. The core of the sink is kept generic and free of any connector > specific dependencies. The sink is designed to participate in checkpointing > to provide at-least once semantics, but it is limited to destinations that > provide a client that supports async requests. > h2. References > More details to be found > https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot commented on pull request #17208: [BP-1.14][FLINK-23773][connector/kafka] Mark empty splits as finished to cleanup states in SplitFetcher
flinkbot commented on pull request #17208: URL: https://github.com/apache/flink/pull/17208#issuecomment-915749291 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit e677f27ac5d53678debb68648ac7ddc18dc2fb2c (Thu Sep 09 04:17:53 UTC 2021) **Warnings:** * No documentation files were touched! Remember to keep the Flink docs up to date! Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into to Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] PatrickRen opened a new pull request #17208: [BP-1.14][FLINK-23773][connector/kafka] Mark empty splits as finished to cleanup states in SplitFetcher
PatrickRen opened a new pull request #17208: URL: https://github.com/apache/flink/pull/17208 Unchanged backport of #16870 on release-1.14 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-20370) Result is wrong when sink primary key is not the same with query
[ https://issues.apache.org/jira/browse/FLINK-20370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17412323#comment-17412323 ] Jingsong Lee commented on FLINK-20370: -- [~jark] I updated cases. I think UpsertMaterialize can works. For the third case, we can add UpsertMaterialize, but sometimes it may be redundant. > Result is wrong when sink primary key is not the same with query > > > Key: FLINK-20370 > URL: https://issues.apache.org/jira/browse/FLINK-20370 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.12.0 >Reporter: Jark Wu >Priority: Minor > Labels: auto-deprioritized-major > > Both sources are upsert-kafka which synchronizes the changes from MySQL > tables (source_city, source_customer). The sink is another MySQL table which > is in upsert mode with "city_name" primary key. The join key is "city_id". > In this case, the result will be wrong when updating > {{source_city.city_name}} column in MySQL, as the UPDATE_BEFORE is ignored > and the old city_name is retained in the sink table. > {code} > Sink(table=[default_catalog.default_database.sink_kafka_count_city], > fields=[city_name, count_customer, sum_gender], changelogMode=[NONE]) > +- Calc(select=[city_name, CAST(count_customer) AS count_customer, > CAST(sum_gender) AS sum_gender], changelogMode=[I,UA,D]) >+- Join(joinType=[InnerJoin], where=[=(city_id, id)], select=[city_id, > count_customer, sum_gender, id, city_name], > leftInputSpec=[JoinKeyContainsUniqueKey], > rightInputSpec=[JoinKeyContainsUniqueKey], changelogMode=[I,UA,D]) > :- Exchange(distribution=[hash[city_id]], changelogMode=[I,UA,D]) > : +- GlobalGroupAggregate(groupBy=[city_id], select=[city_id, > COUNT_RETRACT(count1$0) AS count_customer, SUM_RETRACT((sum$1, count$2)) AS > sum_gender], changelogMode=[I,UA,D]) > : +- Exchange(distribution=[hash[city_id]], changelogMode=[I]) > :+- LocalGroupAggregate(groupBy=[city_id], select=[city_id, > COUNT_RETRACT(*) AS count1$0, SUM_RETRACT(gender) AS (sum$1, count$2)], > changelogMode=[I]) > : +- Calc(select=[city_id, gender], changelogMode=[I,UB,UA,D]) > : +- ChangelogNormalize(key=[customer_id], > changelogMode=[I,UB,UA,D]) > : +- Exchange(distribution=[hash[customer_id]], > changelogMode=[UA,D]) > :+- MiniBatchAssigner(interval=[3000ms], > mode=[ProcTime], changelogMode=[UA,D]) > : +- TableSourceScan(table=[[default_catalog, > default_database, source_customer]], fields=[customer_id, city_id, age, > gender, update_time], changelogMode=[UA,D]) > +- Exchange(distribution=[hash[id]], changelogMode=[I,UA,D]) > +- ChangelogNormalize(key=[id], changelogMode=[I,UA,D]) > +- Exchange(distribution=[hash[id]], changelogMode=[UA,D]) >+- MiniBatchAssigner(interval=[3000ms], mode=[ProcTime], > changelogMode=[UA,D]) > +- TableSourceScan(table=[[default_catalog, > default_database, source_city]], fields=[id, city_name], changelogMode=[UA,D]) > {code} > We have suggested users to use the same key of the query as the primary key > on sink in the documentation: > https://ci.apache.org/projects/flink/flink-docs-master/dev/table/sql/queries.html#deduplication. > We should make this attention to be more highlight in CREATE TABLE page. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (FLINK-24041) [FLIP-171] Generic AsyncSinkBase
[ https://issues.apache.org/jira/browse/FLINK-24041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17412322#comment-17412322 ] Piotr Nowojski edited comment on FLINK-24041 at 9/9/21, 3:56 AM: - I had to revert this as 1702c9e because of build failures. Looks like this PR hasn't been rebased and conflicted with another change in the master. https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=23814=logs=52b61abe-a3cc-5bde-cc35-1bbe89bb7df5=54421a62-0c80-5aad-3319-094ff69180bb {noformat} [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.8.0:testCompile (default-testCompile) on project flink-connector-base: Compilation failure [ERROR] /__w/2/s/flink-connectors/flink-connector-base/src/test/java/org/apache/flink/connector/base/sink/writer/AsyncSinkWriterTest.java:[309,20] org.apache.flink.connector.base.sink.writer.AsyncSinkWriterTest.SinkInitContext is not abstract and does not override abstract method getRestoredCheckpointId() in org.apache.flink.api.connector.sink.Sink.InitContext {noformat} was (Author: pnowojski): I had to revert this as 1702c9e because of build failures. Looks like this PR hasn't been rebased and conflicted with another change in the master. https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=23814=logs=52b61abe-a3cc-5bde-cc35-1bbe89bb7df5=54421a62-0c80-5aad-3319-094ff69180bb > [FLIP-171] Generic AsyncSinkBase > > > Key: FLINK-24041 > URL: https://issues.apache.org/jira/browse/FLINK-24041 > Project: Flink > Issue Type: New Feature > Components: Connectors / Common >Reporter: Zichen Liu >Assignee: Zichen Liu >Priority: Major > Labels: pull-request-available > Fix For: 1.15.0 > > > h2. Motivation > Apache Flink has a rich connector ecosystem that can persist data in various > destinations. Flink natively supports Apache Kafka, Amazon Kinesis Data > Streams, Elasticsearch, HBase, and many more destinations. Additional > connectors are maintained in Apache Bahir or directly on GitHub. The basic > functionality of these sinks is quite similar. They batch events according to > user defined buffering hints, sign requests and send them to the respective > endpoint, retry unsuccessful or throttled requests, and participate in > checkpointing. They primarily just differ in the way they interface with the > destination. Yet, all the above-mentioned sinks are developed and maintained > independently. > We hence propose to create a sink that abstracts away this common > functionality into a generic sink. Adding support for a new destination then > just means creating a lightweight shim that only implements the specific > interfaces of the destination using a client that supports async requests. > Having a common abstraction will reduce the effort required to maintain all > these individual sinks. It will also make it much easier and faster to create > integrations with additional destinations. Moreover, improvements or bug > fixes to the core of the sink will benefit all implementations that are based > on it. > The design of the sink focusses on extensibility and a broad support of > destinations. The core of the sink is kept generic and free of any connector > specific dependencies. The sink is designed to participate in checkpointing > to provide at-least once semantics, but it is limited to destinations that > provide a client that supports async requests. > h2. References > More details to be found > https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot commented on pull request #17207: [FLINK-22901][table-planner-blink] Introduce getUpsertKeys in FlinkRelMetadataQuery
flinkbot commented on pull request #17207: URL: https://github.com/apache/flink/pull/17207#issuecomment-915741202 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit 4f8b0f5ad64d4fe7c998042ec2b6b1ecec714be5 (Thu Sep 09 03:55:55 UTC 2021) **Warnings:** * No documentation files were touched! Remember to keep the Flink docs up to date! Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into to Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] pnowojski edited a comment on pull request #17068: [FLINK-24041][connectors] Flip 171 - Abstract Async Sink Base
pnowojski edited a comment on pull request #17068: URL: https://github.com/apache/flink/pull/17068#issuecomment-915740659 I had to revert this as 1702c9e4f4b because of build failures. Looks like this PR hasn't been rebased and conflicted with another change in the master. https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=23814=logs=52b61abe-a3cc-5bde-cc35-1bbe89bb7df5=54421a62-0c80-5aad-3319-094ff69180bb -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Comment Edited] (FLINK-24041) [FLIP-171] Generic AsyncSinkBase
[ https://issues.apache.org/jira/browse/FLINK-24041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17412322#comment-17412322 ] Piotr Nowojski edited comment on FLINK-24041 at 9/9/21, 3:54 AM: - I had to revert this as 1702c9e because of build failures. Looks like this PR hasn't been rebased and conflicted with another change in the master. https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=23814=logs=52b61abe-a3cc-5bde-cc35-1bbe89bb7df5=54421a62-0c80-5aad-3319-094ff69180bb was (Author: pnowojski): I had to revert this as 1702c9e because of build failures. Looks like this PR hasn't been rebased and conflicted with another change in the master. > [FLIP-171] Generic AsyncSinkBase > > > Key: FLINK-24041 > URL: https://issues.apache.org/jira/browse/FLINK-24041 > Project: Flink > Issue Type: New Feature > Components: Connectors / Common >Reporter: Zichen Liu >Assignee: Zichen Liu >Priority: Major > Labels: pull-request-available > Fix For: 1.15.0 > > > h2. Motivation > Apache Flink has a rich connector ecosystem that can persist data in various > destinations. Flink natively supports Apache Kafka, Amazon Kinesis Data > Streams, Elasticsearch, HBase, and many more destinations. Additional > connectors are maintained in Apache Bahir or directly on GitHub. The basic > functionality of these sinks is quite similar. They batch events according to > user defined buffering hints, sign requests and send them to the respective > endpoint, retry unsuccessful or throttled requests, and participate in > checkpointing. They primarily just differ in the way they interface with the > destination. Yet, all the above-mentioned sinks are developed and maintained > independently. > We hence propose to create a sink that abstracts away this common > functionality into a generic sink. Adding support for a new destination then > just means creating a lightweight shim that only implements the specific > interfaces of the destination using a client that supports async requests. > Having a common abstraction will reduce the effort required to maintain all > these individual sinks. It will also make it much easier and faster to create > integrations with additional destinations. Moreover, improvements or bug > fixes to the core of the sink will benefit all implementations that are based > on it. > The design of the sink focusses on extensibility and a broad support of > destinations. The core of the sink is kept generic and free of any connector > specific dependencies. The sink is designed to participate in checkpointing > to provide at-least once semantics, but it is limited to destinations that > provide a client that supports async requests. > h2. References > More details to be found > https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (FLINK-20370) Result is wrong when sink primary key is not the same with query
[ https://issues.apache.org/jira/browse/FLINK-20370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17412302#comment-17412302 ] Jingsong Lee edited comment on FLINK-20370 at 9/9/21, 3:54 AM: --- Yes, it seems that the only solution is single parallelism. A upsert sink can accept inputs: # primary key = unique key, it is OK, nothing needs to do. # input is append only, sometimes is OK, We can only assume that users can allow some degree of distributed disorder. # input is change log, primary key != unique key (unique key can be none), the most problematic situation was (Author: lzljs3620320): Yes, it seems that the only solution is single parallelism. A upsert sink can accept inputs: # primary key = unique key, it is OK, nothing needs to do. # input is append only, sometimes is OK, We can only assume that users can allow some degree of distributed disorder. # input is change log, primary key != unique key (unique key can be null), the most problematic situation > Result is wrong when sink primary key is not the same with query > > > Key: FLINK-20370 > URL: https://issues.apache.org/jira/browse/FLINK-20370 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.12.0 >Reporter: Jark Wu >Priority: Minor > Labels: auto-deprioritized-major > > Both sources are upsert-kafka which synchronizes the changes from MySQL > tables (source_city, source_customer). The sink is another MySQL table which > is in upsert mode with "city_name" primary key. The join key is "city_id". > In this case, the result will be wrong when updating > {{source_city.city_name}} column in MySQL, as the UPDATE_BEFORE is ignored > and the old city_name is retained in the sink table. > {code} > Sink(table=[default_catalog.default_database.sink_kafka_count_city], > fields=[city_name, count_customer, sum_gender], changelogMode=[NONE]) > +- Calc(select=[city_name, CAST(count_customer) AS count_customer, > CAST(sum_gender) AS sum_gender], changelogMode=[I,UA,D]) >+- Join(joinType=[InnerJoin], where=[=(city_id, id)], select=[city_id, > count_customer, sum_gender, id, city_name], > leftInputSpec=[JoinKeyContainsUniqueKey], > rightInputSpec=[JoinKeyContainsUniqueKey], changelogMode=[I,UA,D]) > :- Exchange(distribution=[hash[city_id]], changelogMode=[I,UA,D]) > : +- GlobalGroupAggregate(groupBy=[city_id], select=[city_id, > COUNT_RETRACT(count1$0) AS count_customer, SUM_RETRACT((sum$1, count$2)) AS > sum_gender], changelogMode=[I,UA,D]) > : +- Exchange(distribution=[hash[city_id]], changelogMode=[I]) > :+- LocalGroupAggregate(groupBy=[city_id], select=[city_id, > COUNT_RETRACT(*) AS count1$0, SUM_RETRACT(gender) AS (sum$1, count$2)], > changelogMode=[I]) > : +- Calc(select=[city_id, gender], changelogMode=[I,UB,UA,D]) > : +- ChangelogNormalize(key=[customer_id], > changelogMode=[I,UB,UA,D]) > : +- Exchange(distribution=[hash[customer_id]], > changelogMode=[UA,D]) > :+- MiniBatchAssigner(interval=[3000ms], > mode=[ProcTime], changelogMode=[UA,D]) > : +- TableSourceScan(table=[[default_catalog, > default_database, source_customer]], fields=[customer_id, city_id, age, > gender, update_time], changelogMode=[UA,D]) > +- Exchange(distribution=[hash[id]], changelogMode=[I,UA,D]) > +- ChangelogNormalize(key=[id], changelogMode=[I,UA,D]) > +- Exchange(distribution=[hash[id]], changelogMode=[UA,D]) >+- MiniBatchAssigner(interval=[3000ms], mode=[ProcTime], > changelogMode=[UA,D]) > +- TableSourceScan(table=[[default_catalog, > default_database, source_city]], fields=[id, city_name], changelogMode=[UA,D]) > {code} > We have suggested users to use the same key of the query as the primary key > on sink in the documentation: > https://ci.apache.org/projects/flink/flink-docs-master/dev/table/sql/queries.html#deduplication. > We should make this attention to be more highlight in CREATE TABLE page. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Reopened] (FLINK-24041) [FLIP-171] Generic AsyncSinkBase
[ https://issues.apache.org/jira/browse/FLINK-24041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski reopened FLINK-24041: I had to revert this as 1702c9e because of build failures. Looks like this PR hasn't been rebased and conflicted with another change in the master. > [FLIP-171] Generic AsyncSinkBase > > > Key: FLINK-24041 > URL: https://issues.apache.org/jira/browse/FLINK-24041 > Project: Flink > Issue Type: New Feature > Components: Connectors / Common >Reporter: Zichen Liu >Assignee: Zichen Liu >Priority: Major > Labels: pull-request-available > Fix For: 1.15.0 > > > h2. Motivation > Apache Flink has a rich connector ecosystem that can persist data in various > destinations. Flink natively supports Apache Kafka, Amazon Kinesis Data > Streams, Elasticsearch, HBase, and many more destinations. Additional > connectors are maintained in Apache Bahir or directly on GitHub. The basic > functionality of these sinks is quite similar. They batch events according to > user defined buffering hints, sign requests and send them to the respective > endpoint, retry unsuccessful or throttled requests, and participate in > checkpointing. They primarily just differ in the way they interface with the > destination. Yet, all the above-mentioned sinks are developed and maintained > independently. > We hence propose to create a sink that abstracts away this common > functionality into a generic sink. Adding support for a new destination then > just means creating a lightweight shim that only implements the specific > interfaces of the destination using a client that supports async requests. > Having a common abstraction will reduce the effort required to maintain all > these individual sinks. It will also make it much easier and faster to create > integrations with additional destinations. Moreover, improvements or bug > fixes to the core of the sink will benefit all implementations that are based > on it. > The design of the sink focusses on extensibility and a broad support of > destinations. The core of the sink is kept generic and free of any connector > specific dependencies. The sink is designed to participate in checkpointing > to provide at-least once semantics, but it is limited to destinations that > provide a client that supports async requests. > h2. References > More details to be found > https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] pnowojski commented on pull request #17068: [FLINK-24041][connectors] Flip 171 - Abstract Async Sink Base
pnowojski commented on pull request #17068: URL: https://github.com/apache/flink/pull/17068#issuecomment-915740659 I had to revert this as 1702c9e4f4b because of build failures. Looks like this PR hasn't been rebased and conflicted with another change in the master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-24214) A submit job failure crashes the sql client
[ https://issues.apache.org/jira/browse/FLINK-24214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17412321#comment-17412321 ] Jark Wu commented on FLINK-24214: - >From the stack trace, you should be using Flink version <= 1.12, could you >upgrade to 1.13.2 and try again? I checked code in 1.13.2 and exceptions from {{LocalExecutor#executeXxx}} have been wrapped into {{SqlExcutionException}}. The previous reported FLINK-22188 was using 1.11 and the code has been refactored in 1.13. [~twalthr], could we catch all exceptions in SQL CLI? I think it is still very error-prone to only catch {{SqlExecutionException}}. Is there any reason not catch all exception in the initial design? > A submit job failure crashes the sql client > --- > > Key: FLINK-24214 > URL: https://issues.apache.org/jira/browse/FLINK-24214 > Project: Flink > Issue Type: Bug > Components: Table SQL / Client >Affects Versions: 1.13.2 > Environment: Flink 1.13.2 > Ubuntu 21.04 > Java 8 >Reporter: Francesco Guardiani >Priority: Not a Priority > > I've noticed that when executing a valid query, in case there is a "bad" > error when submitting it to the flink cluster, the client is going to crash, > with a misleading beginning of the stacktrace. For example: > {code:java} > Exception in thread "main" org.apache.flink.table.client.SqlClientException: > Unexpected exception. This is a bug. Please consider filing an issue. > at org.apache.flink.table.client.SqlClient.main(SqlClient.java:190) > Caused by: java.lang.RuntimeException: Error running SQL job. > at > org.apache.flink.table.client.gateway.local.LocalExecutor.executeUpdateInternal(LocalExecutor.java:606) > at > org.apache.flink.table.client.gateway.local.LocalExecutor.executeUpdate(LocalExecutor.java:527) > at > org.apache.flink.table.client.cli.CliClient.callInsert(CliClient.java:537) > at > org.apache.flink.table.client.cli.CliClient.callCommand(CliClient.java:299) > at java.util.Optional.ifPresent(Optional.java:159) > at org.apache.flink.table.client.cli.CliClient.open(CliClient.java:200) > at org.apache.flink.table.client.SqlClient.openCli(SqlClient.java:125) > at org.apache.flink.table.client.SqlClient.start(SqlClient.java:104) > at org.apache.flink.table.client.SqlClient.main(SqlClient.java:178) > Caused by: java.util.concurrent.ExecutionException: > org.apache.flink.runtime.client.JobSubmissionException: Failed to submit > JobGraph. > at > java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) > at > java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908) > at > org.apache.flink.table.client.gateway.local.LocalExecutor.executeUpdateInternal(LocalExecutor.java:603) > ... 8 more > Caused by: org.apache.flink.runtime.client.JobSubmissionException: Failed to > submit JobGraph. > at > org.apache.flink.client.program.rest.RestClusterClient.lambda$submitJob$7(RestClusterClient.java:359) > at > java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:884) > at > java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:866) > at > java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) > at > java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990) > at > org.apache.flink.runtime.concurrent.FutureUtils.lambda$retryOperationWithDelay$8(FutureUtils.java:274) > at > java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774) > at > java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750) > at > java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) > at > java.util.concurrent.CompletableFuture.postFire(CompletableFuture.java:575) > at > java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:943) > at > java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:456) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: org.apache.flink.runtime.rest.util.RestClientException: [Internal > server error., org.apache.flink.runtime.client.JobSubmissionException: Failed to submit job. > at > org.apache.flink.runtime.dispatcher.Dispatcher.lambda$internalSubmitJob$3(Dispatcher.java:336) > at > java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:836) > at >
[GitHub] [flink] flinkbot edited a comment on pull request #16513: [FLINK-23389][Formats] Glue schema registry JSON support
flinkbot edited a comment on pull request #16513: URL: https://github.com/apache/flink/pull/16513#issuecomment-881248601 ## CI report: * b83590dcb652ec96755c63e5ec7c9761fc96d87b UNKNOWN * 0ab16329651532d687f521ce9f6e75885f85d575 UNKNOWN * 0dcc76878db9d5f613a144f8b13747e541a7bdbe Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23809) * e007d566345c353e5f8b068649cd398ba7259cb7 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Comment Edited] (FLINK-20370) Result is wrong when sink primary key is not the same with query
[ https://issues.apache.org/jira/browse/FLINK-20370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17412302#comment-17412302 ] Jingsong Lee edited comment on FLINK-20370 at 9/9/21, 3:51 AM: --- Yes, it seems that the only solution is single parallelism. A upsert sink can accept inputs: # primary key = unique key, it is OK, nothing needs to do. # input is append only, sometimes is OK, We can only assume that users can allow some degree of distributed disorder. # input is change log, primary key != unique key (unique key can be null), the most problematic situation was (Author: lzljs3620320): Yes, it seems that the only solution is single parallelism. A upsert sink can accept inputs: # primary key = unique key, it is OK, nothing needs to do. # input is append only, sometimes is OK, We can only assume that users can allow some degree of distributed disorder. # primary key != unique key (unique key can be null), the most problematic situation > Result is wrong when sink primary key is not the same with query > > > Key: FLINK-20370 > URL: https://issues.apache.org/jira/browse/FLINK-20370 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.12.0 >Reporter: Jark Wu >Priority: Minor > Labels: auto-deprioritized-major > > Both sources are upsert-kafka which synchronizes the changes from MySQL > tables (source_city, source_customer). The sink is another MySQL table which > is in upsert mode with "city_name" primary key. The join key is "city_id". > In this case, the result will be wrong when updating > {{source_city.city_name}} column in MySQL, as the UPDATE_BEFORE is ignored > and the old city_name is retained in the sink table. > {code} > Sink(table=[default_catalog.default_database.sink_kafka_count_city], > fields=[city_name, count_customer, sum_gender], changelogMode=[NONE]) > +- Calc(select=[city_name, CAST(count_customer) AS count_customer, > CAST(sum_gender) AS sum_gender], changelogMode=[I,UA,D]) >+- Join(joinType=[InnerJoin], where=[=(city_id, id)], select=[city_id, > count_customer, sum_gender, id, city_name], > leftInputSpec=[JoinKeyContainsUniqueKey], > rightInputSpec=[JoinKeyContainsUniqueKey], changelogMode=[I,UA,D]) > :- Exchange(distribution=[hash[city_id]], changelogMode=[I,UA,D]) > : +- GlobalGroupAggregate(groupBy=[city_id], select=[city_id, > COUNT_RETRACT(count1$0) AS count_customer, SUM_RETRACT((sum$1, count$2)) AS > sum_gender], changelogMode=[I,UA,D]) > : +- Exchange(distribution=[hash[city_id]], changelogMode=[I]) > :+- LocalGroupAggregate(groupBy=[city_id], select=[city_id, > COUNT_RETRACT(*) AS count1$0, SUM_RETRACT(gender) AS (sum$1, count$2)], > changelogMode=[I]) > : +- Calc(select=[city_id, gender], changelogMode=[I,UB,UA,D]) > : +- ChangelogNormalize(key=[customer_id], > changelogMode=[I,UB,UA,D]) > : +- Exchange(distribution=[hash[customer_id]], > changelogMode=[UA,D]) > :+- MiniBatchAssigner(interval=[3000ms], > mode=[ProcTime], changelogMode=[UA,D]) > : +- TableSourceScan(table=[[default_catalog, > default_database, source_customer]], fields=[customer_id, city_id, age, > gender, update_time], changelogMode=[UA,D]) > +- Exchange(distribution=[hash[id]], changelogMode=[I,UA,D]) > +- ChangelogNormalize(key=[id], changelogMode=[I,UA,D]) > +- Exchange(distribution=[hash[id]], changelogMode=[UA,D]) >+- MiniBatchAssigner(interval=[3000ms], mode=[ProcTime], > changelogMode=[UA,D]) > +- TableSourceScan(table=[[default_catalog, > default_database, source_city]], fields=[id, city_name], changelogMode=[UA,D]) > {code} > We have suggested users to use the same key of the query as the primary key > on sink in the documentation: > https://ci.apache.org/projects/flink/flink-docs-master/dev/table/sql/queries.html#deduplication. > We should make this attention to be more highlight in CREATE TABLE page. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (FLINK-20370) Result is wrong when sink primary key is not the same with query
[ https://issues.apache.org/jira/browse/FLINK-20370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17412302#comment-17412302 ] Jingsong Lee edited comment on FLINK-20370 at 9/9/21, 3:50 AM: --- Yes, it seems that the only solution is single parallelism. A upsert sink can accept inputs: # primary key = unique key, it is OK, nothing needs to do. # input is append only, sometimes is OK, We can only assume that users can allow some degree of distributed disorder. # primary key != unique key (unique key can be null), the most problematic situation was (Author: lzljs3620320): Yes, it seems that the only solution is single parallelism. A upsert sink can accept inputs: # primary key = unique key, it is OK # input is append only, sometimes is OK # primary key != unique key, the most problematic situation Maybe the third can be disabled. But maybe it will make many situations difficult to use. We need a flag... > Result is wrong when sink primary key is not the same with query > > > Key: FLINK-20370 > URL: https://issues.apache.org/jira/browse/FLINK-20370 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.12.0 >Reporter: Jark Wu >Priority: Minor > Labels: auto-deprioritized-major > > Both sources are upsert-kafka which synchronizes the changes from MySQL > tables (source_city, source_customer). The sink is another MySQL table which > is in upsert mode with "city_name" primary key. The join key is "city_id". > In this case, the result will be wrong when updating > {{source_city.city_name}} column in MySQL, as the UPDATE_BEFORE is ignored > and the old city_name is retained in the sink table. > {code} > Sink(table=[default_catalog.default_database.sink_kafka_count_city], > fields=[city_name, count_customer, sum_gender], changelogMode=[NONE]) > +- Calc(select=[city_name, CAST(count_customer) AS count_customer, > CAST(sum_gender) AS sum_gender], changelogMode=[I,UA,D]) >+- Join(joinType=[InnerJoin], where=[=(city_id, id)], select=[city_id, > count_customer, sum_gender, id, city_name], > leftInputSpec=[JoinKeyContainsUniqueKey], > rightInputSpec=[JoinKeyContainsUniqueKey], changelogMode=[I,UA,D]) > :- Exchange(distribution=[hash[city_id]], changelogMode=[I,UA,D]) > : +- GlobalGroupAggregate(groupBy=[city_id], select=[city_id, > COUNT_RETRACT(count1$0) AS count_customer, SUM_RETRACT((sum$1, count$2)) AS > sum_gender], changelogMode=[I,UA,D]) > : +- Exchange(distribution=[hash[city_id]], changelogMode=[I]) > :+- LocalGroupAggregate(groupBy=[city_id], select=[city_id, > COUNT_RETRACT(*) AS count1$0, SUM_RETRACT(gender) AS (sum$1, count$2)], > changelogMode=[I]) > : +- Calc(select=[city_id, gender], changelogMode=[I,UB,UA,D]) > : +- ChangelogNormalize(key=[customer_id], > changelogMode=[I,UB,UA,D]) > : +- Exchange(distribution=[hash[customer_id]], > changelogMode=[UA,D]) > :+- MiniBatchAssigner(interval=[3000ms], > mode=[ProcTime], changelogMode=[UA,D]) > : +- TableSourceScan(table=[[default_catalog, > default_database, source_customer]], fields=[customer_id, city_id, age, > gender, update_time], changelogMode=[UA,D]) > +- Exchange(distribution=[hash[id]], changelogMode=[I,UA,D]) > +- ChangelogNormalize(key=[id], changelogMode=[I,UA,D]) > +- Exchange(distribution=[hash[id]], changelogMode=[UA,D]) >+- MiniBatchAssigner(interval=[3000ms], mode=[ProcTime], > changelogMode=[UA,D]) > +- TableSourceScan(table=[[default_catalog, > default_database, source_city]], fields=[id, city_name], changelogMode=[UA,D]) > {code} > We have suggested users to use the same key of the query as the primary key > on sink in the documentation: > https://ci.apache.org/projects/flink/flink-docs-master/dev/table/sql/queries.html#deduplication. > We should make this attention to be more highlight in CREATE TABLE page. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-22901) Introduce getChangeLogUpsertKeys in FlinkRelMetadataQuery
[ https://issues.apache.org/jira/browse/FLINK-22901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee updated FLINK-22901: - Fix Version/s: 1.13.3 > Introduce getChangeLogUpsertKeys in FlinkRelMetadataQuery > - > > Key: FLINK-22901 > URL: https://issues.apache.org/jira/browse/FLINK-22901 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / Planner >Reporter: Jingsong Lee >Assignee: Jingsong Lee >Priority: Major > Labels: pull-request-available > Fix For: 1.14.0, 1.13.3 > > > For fix FLINK-20374, we need to resolve streaming computation disorder. we > need to introduce a change log upsert keys, this is not unique keys. > > {code:java} > /** > * Determines the set of change log upsert minimal keys for this expression. > A key is > * represented as an {@link org.apache.calcite.util.ImmutableBitSet}, where > each bit position > * represents a 0-based output column ordinal. > * > * Different from the unique keys: In distributed streaming computing, one > record may be > * divided into RowKind.UPDATE_BEFORE and RowKind.UPDATE_AFTER. If a key > changing join is > * connected downstream, the two records will be divided into different > tasks, resulting in > * disorder. In this case, the downstream cannot rely on the order of the > original key. So in > * this case, it has unique keys in the traditional sense, but it doesn't > have change log upsert > * keys. > * > * @return set of keys, or null if this information cannot be determined > (whereas empty set > * indicates definitely no keys at all) > */ > public Set getChangeLogUpsertKeys(RelNode rel); > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Reopened] (FLINK-22901) Introduce getChangeLogUpsertKeys in FlinkRelMetadataQuery
[ https://issues.apache.org/jira/browse/FLINK-22901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee reopened FLINK-22901: -- re-open for 1.13 > Introduce getChangeLogUpsertKeys in FlinkRelMetadataQuery > - > > Key: FLINK-22901 > URL: https://issues.apache.org/jira/browse/FLINK-22901 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / Planner >Reporter: Jingsong Lee >Assignee: Jingsong Lee >Priority: Major > Labels: pull-request-available > Fix For: 1.14.0 > > > For fix FLINK-20374, we need to resolve streaming computation disorder. we > need to introduce a change log upsert keys, this is not unique keys. > > {code:java} > /** > * Determines the set of change log upsert minimal keys for this expression. > A key is > * represented as an {@link org.apache.calcite.util.ImmutableBitSet}, where > each bit position > * represents a 0-based output column ordinal. > * > * Different from the unique keys: In distributed streaming computing, one > record may be > * divided into RowKind.UPDATE_BEFORE and RowKind.UPDATE_AFTER. If a key > changing join is > * connected downstream, the two records will be divided into different > tasks, resulting in > * disorder. In this case, the downstream cannot rely on the order of the > original key. So in > * this case, it has unique keys in the traditional sense, but it doesn't > have change log upsert > * keys. > * > * @return set of keys, or null if this information cannot be determined > (whereas empty set > * indicates definitely no keys at all) > */ > public Set getChangeLogUpsertKeys(RelNode rel); > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-20370) Result is wrong when sink primary key is not the same with query
[ https://issues.apache.org/jira/browse/FLINK-20370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17412318#comment-17412318 ] Jark Wu commented on FLINK-20370: - [~lzljs3620320] there is 4th case: input is changelog, but there is no unique key. > Result is wrong when sink primary key is not the same with query > > > Key: FLINK-20370 > URL: https://issues.apache.org/jira/browse/FLINK-20370 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.12.0 >Reporter: Jark Wu >Priority: Minor > Labels: auto-deprioritized-major > > Both sources are upsert-kafka which synchronizes the changes from MySQL > tables (source_city, source_customer). The sink is another MySQL table which > is in upsert mode with "city_name" primary key. The join key is "city_id". > In this case, the result will be wrong when updating > {{source_city.city_name}} column in MySQL, as the UPDATE_BEFORE is ignored > and the old city_name is retained in the sink table. > {code} > Sink(table=[default_catalog.default_database.sink_kafka_count_city], > fields=[city_name, count_customer, sum_gender], changelogMode=[NONE]) > +- Calc(select=[city_name, CAST(count_customer) AS count_customer, > CAST(sum_gender) AS sum_gender], changelogMode=[I,UA,D]) >+- Join(joinType=[InnerJoin], where=[=(city_id, id)], select=[city_id, > count_customer, sum_gender, id, city_name], > leftInputSpec=[JoinKeyContainsUniqueKey], > rightInputSpec=[JoinKeyContainsUniqueKey], changelogMode=[I,UA,D]) > :- Exchange(distribution=[hash[city_id]], changelogMode=[I,UA,D]) > : +- GlobalGroupAggregate(groupBy=[city_id], select=[city_id, > COUNT_RETRACT(count1$0) AS count_customer, SUM_RETRACT((sum$1, count$2)) AS > sum_gender], changelogMode=[I,UA,D]) > : +- Exchange(distribution=[hash[city_id]], changelogMode=[I]) > :+- LocalGroupAggregate(groupBy=[city_id], select=[city_id, > COUNT_RETRACT(*) AS count1$0, SUM_RETRACT(gender) AS (sum$1, count$2)], > changelogMode=[I]) > : +- Calc(select=[city_id, gender], changelogMode=[I,UB,UA,D]) > : +- ChangelogNormalize(key=[customer_id], > changelogMode=[I,UB,UA,D]) > : +- Exchange(distribution=[hash[customer_id]], > changelogMode=[UA,D]) > :+- MiniBatchAssigner(interval=[3000ms], > mode=[ProcTime], changelogMode=[UA,D]) > : +- TableSourceScan(table=[[default_catalog, > default_database, source_customer]], fields=[customer_id, city_id, age, > gender, update_time], changelogMode=[UA,D]) > +- Exchange(distribution=[hash[id]], changelogMode=[I,UA,D]) > +- ChangelogNormalize(key=[id], changelogMode=[I,UA,D]) > +- Exchange(distribution=[hash[id]], changelogMode=[UA,D]) >+- MiniBatchAssigner(interval=[3000ms], mode=[ProcTime], > changelogMode=[UA,D]) > +- TableSourceScan(table=[[default_catalog, > default_database, source_city]], fields=[id, city_name], changelogMode=[UA,D]) > {code} > We have suggested users to use the same key of the query as the primary key > on sink in the documentation: > https://ci.apache.org/projects/flink/flink-docs-master/dev/table/sql/queries.html#deduplication. > We should make this attention to be more highlight in CREATE TABLE page. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] JingsongLi opened a new pull request #17207: [FLINK-22901][table] Introduce getUpsertKeys in FlinkRelMetadataQuery
JingsongLi opened a new pull request #17207: URL: https://github.com/apache/flink/pull/17207 Cherry-pick #16096 ## What is the purpose of the change For fix FLINK-20374, we need to resolve streaming computation disorder. we need to introduce a change log upsert keys, this is not unique keys. ``` /** * Determines the set of change log upsert minimal keys for this expression. A key is * represented as an {@link org.apache.calcite.util.ImmutableBitSet}, where each bit position * represents a 0-based output column ordinal. * * Different from the unique keys: In distributed streaming computing, one record may be * divided into RowKind.UPDATE_BEFORE and RowKind.UPDATE_AFTER. If a key changing join is * connected downstream, the two records will be divided into different tasks, resulting in * disorder. In this case, the downstream cannot rely on the order of the original key. So in * this case, it has unique keys in the traditional sense, but it doesn't have change log upsert * keys. * * @return set of keys, or null if this information cannot be determined (whereas empty set * indicates definitely no keys at all) */ public Set getChangeLogUpsertKeys(RelNode rel); ``` ## Brief change log - Introduce `FlinkRelMdChangeLogUpsertKeys` - Introduce `FlinkRelMdChangeLogUpsertKeysTest` ## Verifying this change `FlinkRelMdChangeLogUpsertKeysTest` ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (yes / no) no - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / no) no - The serializers: (yes / no / don't know) no - The runtime per-record code paths (performance sensitive): (yes / no / don't know) no - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (yes / no / don't know) no - The S3 file system connector: (yes / no / don't know) no ## Documentation - Does this pull request introduce a new feature? (yes / no) no -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-20370) Result is wrong when sink primary key is not the same with query
[ https://issues.apache.org/jira/browse/FLINK-20370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17412317#comment-17412317 ] Jark Wu commented on FLINK-20370: - [~lzljs3620320] could UpsertMaterialize also solve this problem? > Result is wrong when sink primary key is not the same with query > > > Key: FLINK-20370 > URL: https://issues.apache.org/jira/browse/FLINK-20370 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.12.0 >Reporter: Jark Wu >Priority: Minor > Labels: auto-deprioritized-major > > Both sources are upsert-kafka which synchronizes the changes from MySQL > tables (source_city, source_customer). The sink is another MySQL table which > is in upsert mode with "city_name" primary key. The join key is "city_id". > In this case, the result will be wrong when updating > {{source_city.city_name}} column in MySQL, as the UPDATE_BEFORE is ignored > and the old city_name is retained in the sink table. > {code} > Sink(table=[default_catalog.default_database.sink_kafka_count_city], > fields=[city_name, count_customer, sum_gender], changelogMode=[NONE]) > +- Calc(select=[city_name, CAST(count_customer) AS count_customer, > CAST(sum_gender) AS sum_gender], changelogMode=[I,UA,D]) >+- Join(joinType=[InnerJoin], where=[=(city_id, id)], select=[city_id, > count_customer, sum_gender, id, city_name], > leftInputSpec=[JoinKeyContainsUniqueKey], > rightInputSpec=[JoinKeyContainsUniqueKey], changelogMode=[I,UA,D]) > :- Exchange(distribution=[hash[city_id]], changelogMode=[I,UA,D]) > : +- GlobalGroupAggregate(groupBy=[city_id], select=[city_id, > COUNT_RETRACT(count1$0) AS count_customer, SUM_RETRACT((sum$1, count$2)) AS > sum_gender], changelogMode=[I,UA,D]) > : +- Exchange(distribution=[hash[city_id]], changelogMode=[I]) > :+- LocalGroupAggregate(groupBy=[city_id], select=[city_id, > COUNT_RETRACT(*) AS count1$0, SUM_RETRACT(gender) AS (sum$1, count$2)], > changelogMode=[I]) > : +- Calc(select=[city_id, gender], changelogMode=[I,UB,UA,D]) > : +- ChangelogNormalize(key=[customer_id], > changelogMode=[I,UB,UA,D]) > : +- Exchange(distribution=[hash[customer_id]], > changelogMode=[UA,D]) > :+- MiniBatchAssigner(interval=[3000ms], > mode=[ProcTime], changelogMode=[UA,D]) > : +- TableSourceScan(table=[[default_catalog, > default_database, source_customer]], fields=[customer_id, city_id, age, > gender, update_time], changelogMode=[UA,D]) > +- Exchange(distribution=[hash[id]], changelogMode=[I,UA,D]) > +- ChangelogNormalize(key=[id], changelogMode=[I,UA,D]) > +- Exchange(distribution=[hash[id]], changelogMode=[UA,D]) >+- MiniBatchAssigner(interval=[3000ms], mode=[ProcTime], > changelogMode=[UA,D]) > +- TableSourceScan(table=[[default_catalog, > default_database, source_city]], fields=[id, city_name], changelogMode=[UA,D]) > {code} > We have suggested users to use the same key of the query as the primary key > on sink in the documentation: > https://ci.apache.org/projects/flink/flink-docs-master/dev/table/sql/queries.html#deduplication. > We should make this attention to be more highlight in CREATE TABLE page. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] becketqin commented on pull request #17064: [FLINK-24059][Connectors/Common] Allow SourceReaderTestBase.NUM_SPLITS to be overridden
becketqin commented on pull request #17064: URL: https://github.com/apache/flink/pull/17064#issuecomment-915734942 Thanks for the patch. LGTM. Merged to master d4c483fadd3df32045fbb2ee117d0a6eeab9276e -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-10230) Support 'SHOW CREATE VIEW' syntax to print the query of a view
[ https://issues.apache.org/jira/browse/FLINK-10230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17412313#comment-17412313 ] Jark Wu commented on FLINK-10230: - I'm fine with both SHOW CREATE TABLE and SHOW CREATE VIEW. [~twalthr], do you have any preference here? > Support 'SHOW CREATE VIEW' syntax to print the query of a view > -- > > Key: FLINK-10230 > URL: https://issues.apache.org/jira/browse/FLINK-10230 > Project: Flink > Issue Type: New Feature > Components: Table SQL / API, Table SQL / Client >Reporter: Timo Walther >Priority: Minor > Labels: auto-deprioritized-major, auto-unassigned, > pull-request-available > Fix For: 1.14.0 > > > FLINK-10163 added initial support for views in SQL Client. We should add a > command that allows for printing the query of a view for debugging. MySQL > offers {{SHOW CREATE VIEW}} for this. Hive generalizes this to {{SHOW CREATE > TABLE}}. The latter one could be extended to also show information about the > used table factories and properties. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] becketqin merged pull request #17064: [FLINK-24059][Connectors/Common] Allow SourceReaderTestBase.NUM_SPLITS to be overridden
becketqin merged pull request #17064: URL: https://github.com/apache/flink/pull/17064 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] gaoyunhaii commented on a change in pull request #17023: [FLINK-24043][runtime] Reuse the code of 'check savepoint preconditions'.
gaoyunhaii commented on a change in pull request #17023: URL: https://github.com/apache/flink/pull/17023#discussion_r704923725 ## File path: flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/SchedulerBase.java ## @@ -825,22 +824,9 @@ public void updateAccumulators(final AccumulatorSnapshot accumulatorSnapshot) { final CheckpointCoordinator checkpointCoordinator = executionGraph.getCheckpointCoordinator(); -if (checkpointCoordinator == null) { -throw new IllegalStateException( -String.format("Job %s is not a streaming job.", jobGraph.getJobID())); -} else if (targetDirectory == null -&& !checkpointCoordinator.getCheckpointStorage().hasDefaultSavepointLocation()) { -log.info( -"Trying to cancel job {} with savepoint, but no savepoint directory configured.", -jobGraph.getJobID()); - -throw new IllegalStateException( -"No savepoint directory configured. You can either specify a directory " -+ "while cancelling via -s :targetDirectory or configure a cluster-wide " Review comment: Hi @RocMarshal , sorry I might miss this point: since now we share the same message for stop-with-savepoint, normal savepoint and legacy cancelling with savepoint, this description seems not always right: 1. For normal savepoint, the command line format is `bin/flink savepoint ` 2. For stop-with-savepoint, the command line format is `bin/flink stop -p `. 3. For the legacy cancel with savepoint, the command line is indeed `bin/flink cancel -s `. Thus the message here seems not cover all the situations. Although it is not introduced in this PR, perhaps we could also change it to ``` "You can either specify a directory via configure a cluster-wide + "default via key '" + CheckpointingOptions.SAVEPOINT_DIRECTORY.key() + "' or specify a directory in the command line, like + "-s :targetDirectory for cancelling, -p :targetDirectory for stopping or :targetDirectory for " + "purely taking savepoint" ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] gaoyunhaii commented on a change in pull request #17023: [FLINK-24043][runtime] Reuse the code of 'check savepoint preconditions'.
gaoyunhaii commented on a change in pull request #17023: URL: https://github.com/apache/flink/pull/17023#discussion_r704923725 ## File path: flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/SchedulerBase.java ## @@ -825,22 +824,9 @@ public void updateAccumulators(final AccumulatorSnapshot accumulatorSnapshot) { final CheckpointCoordinator checkpointCoordinator = executionGraph.getCheckpointCoordinator(); -if (checkpointCoordinator == null) { -throw new IllegalStateException( -String.format("Job %s is not a streaming job.", jobGraph.getJobID())); -} else if (targetDirectory == null -&& !checkpointCoordinator.getCheckpointStorage().hasDefaultSavepointLocation()) { -log.info( -"Trying to cancel job {} with savepoint, but no savepoint directory configured.", -jobGraph.getJobID()); - -throw new IllegalStateException( -"No savepoint directory configured. You can either specify a directory " -+ "while cancelling via -s :targetDirectory or configure a cluster-wide " Review comment: Hi @RocMarshal , sorry I might miss this point: since now we share the same message for stop-with-savepoint and normal savepoint, this description seems not always right: 1. For normal savepoint, the command line format is `bin/flink savepoint ` 2. For stop-with-savepoint, the command line format is `bin/flink stop -p `. 3. For the legacy cancel with savepoint, the command line is indeed `bin/flink cancel -s `. Thus the message here seems not cover all the situations. Although it is not introduced in this PR, perhaps we could also change it to ``` "You can either specify a directory via configure a cluster-wide + "default via key '" + CheckpointingOptions.SAVEPOINT_DIRECTORY.key() + "' or specify a directory in the command line, like + "-s :targetDirectory for cancelling, -p :targetDirectory for stopping or :targetDirectory for " + "purely taking savepoint" ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #17118: [FLINK-22603][table-planner]The digest can be produced by SourceAbilitySpec.
flinkbot edited a comment on pull request #17118: URL: https://github.com/apache/flink/pull/17118#issuecomment-911539514 ## CI report: * Unknown: [CANCELED](TBD) * 02e870f42aad87379e45b2aaf3fed2e8351fc0a2 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23812) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #16513: [FLINK-23389][Formats] Glue schema registry JSON support
flinkbot edited a comment on pull request #16513: URL: https://github.com/apache/flink/pull/16513#issuecomment-881248601 ## CI report: * b83590dcb652ec96755c63e5ec7c9761fc96d87b UNKNOWN * 0ab16329651532d687f521ce9f6e75885f85d575 UNKNOWN * 0dcc76878db9d5f613a144f8b13747e541a7bdbe Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23809) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-22455) FlinkRelBuilder#windowAggregate will throw ClassCastException when function reuse
[ https://issues.apache.org/jira/browse/FLINK-22455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17412307#comment-17412307 ] xuyangzhong commented on FLINK-22455: - Hi, [~tartarus]. I'm trying to solve this issue recently, and i can't trigger this bug using real case by sql/table api and reproduce this question. It seems like that only using table api will enter this function while using sql api don't use FlinkRelBuilder(you can see the code in Function LogicalWindowAggregateRuleBase and line 100-113). I create some tests like the following: {code:java} // test1 Table result = tenv .from("orders") .window(Tumble.over(lit(4).seconds()).on($("rowtime")).as("mywindow")) .groupBy($("it1"), $("mywindow")) .select($("it1"), $("it2").count().as("b"), $("it2").count().as("b"), $("mywindow").start(), $("mywindow").end()); {code} {code:java} // test2 Table result = tenv .from("orders") .window(Tumble.over(lit(4).seconds()).on($("rowtime")).as("mywindow")) .groupBy($("it1"), $("mywindow")) .select($("it1"), $("it2").count().as("b1"), $("it2").count().as("b2"), $("mywindow").start(), $("mywindow").end());{code} Test1 throws Exception directly because in function validateAndGetUniqueNames in ProjectionOperationFactory, names will be checked to avoid duplication. Test2 enters the function the issue mentioned but _aggCalls_ has one element because in the previous optimization, the $("it2").count will be merged to one expression although they have the different name "b1" and "b2". According to the above, i think this code this issue mentioned perhaps have the logical bug indeed but won't be trigged by the real case by table/sql api. So can you have a test case used by sql/table api to help me solve the issue? > FlinkRelBuilder#windowAggregate will throw ClassCastException when function > reuse > - > > Key: FLINK-22455 > URL: https://issues.apache.org/jira/browse/FLINK-22455 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Reporter: tartarus >Priority: Minor > Labels: auto-deprioritized-major > Attachments: FlinkRelBuilderTest.scala > > > If the input parameter aggCalls of FlinkRelBuilder#windowAggregate contains > the same aggregate function. Then it will throw ClassCastException, because > of the optimization of aggregate function reuse. We did not judge the return > value type, but direct type conversion; > {code:java} > val aggregate = super.transform( > new UnaryOperator[RelBuilder.Config] { > override def apply(t: RelBuilder.Config) > : RelBuilder.Config = t.withPruneInputOfAggregate(false) > }) > .push(build()) > .aggregate(groupKey, aggCalls) > .build() > .asInstanceOf[LogicalAggregate] > {code} > I wrote a test that triggered this problem. > You can use the attached code to reproduce this problem. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot edited a comment on pull request #17206: [FLINK-24199][python] Expose StreamExecutionEnvironment#configure in Python API
flinkbot edited a comment on pull request #17206: URL: https://github.com/apache/flink/pull/17206#issuecomment-915706736 ## CI report: * 169c8d343e362c0244d2c977bab4db6ac2b47ee1 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23808) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #17118: [FLINK-22603][table-planner]The digest can be produced by SourceAbilitySpec.
flinkbot edited a comment on pull request #17118: URL: https://github.com/apache/flink/pull/17118#issuecomment-911539514 ## CI report: * Unknown: [CANCELED](TBD) * 02e870f42aad87379e45b2aaf3fed2e8351fc0a2 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-20370) Result is wrong when sink primary key is not the same with query
[ https://issues.apache.org/jira/browse/FLINK-20370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17412302#comment-17412302 ] Jingsong Lee commented on FLINK-20370: -- Yes, it seems that the only solution is single parallelism. A upsert sink can accept inputs: # primary key = unique key, it is OK # input is append only, sometimes is OK # primary key != unique key, the most problematic situation Maybe the third can be disabled. But maybe it will make many situations difficult to use. We need a flag... > Result is wrong when sink primary key is not the same with query > > > Key: FLINK-20370 > URL: https://issues.apache.org/jira/browse/FLINK-20370 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.12.0 >Reporter: Jark Wu >Priority: Minor > Labels: auto-deprioritized-major > > Both sources are upsert-kafka which synchronizes the changes from MySQL > tables (source_city, source_customer). The sink is another MySQL table which > is in upsert mode with "city_name" primary key. The join key is "city_id". > In this case, the result will be wrong when updating > {{source_city.city_name}} column in MySQL, as the UPDATE_BEFORE is ignored > and the old city_name is retained in the sink table. > {code} > Sink(table=[default_catalog.default_database.sink_kafka_count_city], > fields=[city_name, count_customer, sum_gender], changelogMode=[NONE]) > +- Calc(select=[city_name, CAST(count_customer) AS count_customer, > CAST(sum_gender) AS sum_gender], changelogMode=[I,UA,D]) >+- Join(joinType=[InnerJoin], where=[=(city_id, id)], select=[city_id, > count_customer, sum_gender, id, city_name], > leftInputSpec=[JoinKeyContainsUniqueKey], > rightInputSpec=[JoinKeyContainsUniqueKey], changelogMode=[I,UA,D]) > :- Exchange(distribution=[hash[city_id]], changelogMode=[I,UA,D]) > : +- GlobalGroupAggregate(groupBy=[city_id], select=[city_id, > COUNT_RETRACT(count1$0) AS count_customer, SUM_RETRACT((sum$1, count$2)) AS > sum_gender], changelogMode=[I,UA,D]) > : +- Exchange(distribution=[hash[city_id]], changelogMode=[I]) > :+- LocalGroupAggregate(groupBy=[city_id], select=[city_id, > COUNT_RETRACT(*) AS count1$0, SUM_RETRACT(gender) AS (sum$1, count$2)], > changelogMode=[I]) > : +- Calc(select=[city_id, gender], changelogMode=[I,UB,UA,D]) > : +- ChangelogNormalize(key=[customer_id], > changelogMode=[I,UB,UA,D]) > : +- Exchange(distribution=[hash[customer_id]], > changelogMode=[UA,D]) > :+- MiniBatchAssigner(interval=[3000ms], > mode=[ProcTime], changelogMode=[UA,D]) > : +- TableSourceScan(table=[[default_catalog, > default_database, source_customer]], fields=[customer_id, city_id, age, > gender, update_time], changelogMode=[UA,D]) > +- Exchange(distribution=[hash[id]], changelogMode=[I,UA,D]) > +- ChangelogNormalize(key=[id], changelogMode=[I,UA,D]) > +- Exchange(distribution=[hash[id]], changelogMode=[UA,D]) >+- MiniBatchAssigner(interval=[3000ms], mode=[ProcTime], > changelogMode=[UA,D]) > +- TableSourceScan(table=[[default_catalog, > default_database, source_city]], fields=[id, city_name], changelogMode=[UA,D]) > {code} > We have suggested users to use the same key of the query as the primary key > on sink in the documentation: > https://ci.apache.org/projects/flink/flink-docs-master/dev/table/sql/queries.html#deduplication. > We should make this attention to be more highlight in CREATE TABLE page. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (FLINK-24196) Translate "EXPLAIN Statements" page of "SQL" into Chinese
[ https://issues.apache.org/jira/browse/FLINK-24196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jark Wu closed FLINK-24196. --- Fix Version/s: 1.15.0 Resolution: Fixed Fixed in master: 125cb70a232b6a6e52c099c6a280414e2f896ad5 > Translate "EXPLAIN Statements" page of "SQL" into Chinese > - > > Key: FLINK-24196 > URL: https://issues.apache.org/jira/browse/FLINK-24196 > Project: Flink > Issue Type: Sub-task > Components: Documentation >Reporter: wuguihu >Assignee: wuguihu >Priority: Minor > Labels: pull-request-available > Fix For: 1.15.0 > > > [https://ci.apache.org/projects/flink/flink-docs-master/docs/dev/table/sql/explain/] > > docs/content.zh/docs/dev/table/sql/explain.md -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] wuchong merged pull request #17195: [FLINK-24196][doc]Translate "EXPLAIN Statements" page of "SQL" into C…
wuchong merged pull request #17195: URL: https://github.com/apache/flink/pull/17195 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Comment Edited] (FLINK-24195) Translate "DESCRIBE Statements" page of "SQL" into Chinese
[ https://issues.apache.org/jira/browse/FLINK-24195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17412293#comment-17412293 ] Jark Wu edited comment on FLINK-24195 at 9/9/21, 2:42 AM: -- Fixed in master: 7b436773fc8eda6bb5f3c01b917d15d15aa70521 was (Author: jark): Fixed in maste: 7b436773fc8eda6bb5f3c01b917d15d15aa70521 > Translate "DESCRIBE Statements" page of "SQL" into Chinese > -- > > Key: FLINK-24195 > URL: https://issues.apache.org/jira/browse/FLINK-24195 > Project: Flink > Issue Type: Sub-task > Components: Documentation >Reporter: wuguihu >Assignee: wuguihu >Priority: Minor > Labels: pull-request-available > Fix For: 1.15.0 > > > [https://ci.apache.org/projects/flink/flink-docs-master/docs/dev/table/sql/describe/] > docs/content.zh/docs/dev/table/sql/describe.md -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (FLINK-24195) Translate "DESCRIBE Statements" page of "SQL" into Chinese
[ https://issues.apache.org/jira/browse/FLINK-24195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jark Wu closed FLINK-24195. --- Fix Version/s: 1.15.0 Resolution: Fixed Fixed in maste: 7b436773fc8eda6bb5f3c01b917d15d15aa70521 > Translate "DESCRIBE Statements" page of "SQL" into Chinese > -- > > Key: FLINK-24195 > URL: https://issues.apache.org/jira/browse/FLINK-24195 > Project: Flink > Issue Type: Sub-task > Components: Documentation >Reporter: wuguihu >Assignee: wuguihu >Priority: Minor > Labels: pull-request-available > Fix For: 1.15.0 > > > [https://ci.apache.org/projects/flink/flink-docs-master/docs/dev/table/sql/describe/] > docs/content.zh/docs/dev/table/sql/describe.md -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] wuchong merged pull request #17192: [FLINK-24195][doc]Translate "DESCRIBE Statements" page of "SQL" into …
wuchong merged pull request #17192: URL: https://github.com/apache/flink/pull/17192 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #16513: [FLINK-23389][Formats] Glue schema registry JSON support
flinkbot edited a comment on pull request #16513: URL: https://github.com/apache/flink/pull/16513#issuecomment-881248601 ## CI report: * b83590dcb652ec96755c63e5ec7c9761fc96d87b UNKNOWN * 0ab16329651532d687f521ce9f6e75885f85d575 UNKNOWN * 9f5b8802bfed0310493427a8a24557ca10c271e2 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=22805) * 0dcc76878db9d5f613a144f8b13747e541a7bdbe UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Assigned] (FLINK-24219) Translate "SET Statements" page of "SQL" into Chinese
[ https://issues.apache.org/jira/browse/FLINK-24219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jark Wu reassigned FLINK-24219: --- Assignee: wuguihu > Translate "SET Statements" page of "SQL" into Chinese > - > > Key: FLINK-24219 > URL: https://issues.apache.org/jira/browse/FLINK-24219 > Project: Flink > Issue Type: Sub-task > Components: Documentation >Reporter: wuguihu >Assignee: wuguihu >Priority: Minor > > [https://ci.apache.org/projects/flink/flink-docs-master/docs/dev/table/sql/set/] > docs/content.zh/docs/dev/table/sql/set.md -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (FLINK-24218) Translate "UNLOAD Statements" page of "SQL" into Chinese
[ https://issues.apache.org/jira/browse/FLINK-24218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jark Wu reassigned FLINK-24218: --- Assignee: wuguihu > Translate "UNLOAD Statements" page of "SQL" into Chinese > > > Key: FLINK-24218 > URL: https://issues.apache.org/jira/browse/FLINK-24218 > Project: Flink > Issue Type: Sub-task > Components: Documentation >Reporter: wuguihu >Assignee: wuguihu >Priority: Minor > > [https://ci.apache.org/projects/flink/flink-docs-master/docs/dev/table/sql/unload/] > docs/content.zh/docs/dev/table/sql/unload.md -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (FLINK-24217) Translate "LOAD Statements" page of "SQL" into Chinese
[ https://issues.apache.org/jira/browse/FLINK-24217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jark Wu reassigned FLINK-24217: --- Assignee: wuguihu > Translate "LOAD Statements" page of "SQL" into Chinese > -- > > Key: FLINK-24217 > URL: https://issues.apache.org/jira/browse/FLINK-24217 > Project: Flink > Issue Type: Sub-task > Components: Documentation >Reporter: wuguihu >Assignee: wuguihu >Priority: Minor > > [https://ci.apache.org/projects/flink/flink-docs-master/docs/dev/table/sql/unload/] > docs/content.zh/docs/dev/table/sql/unload.md -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (FLINK-24220) Translate "RESET Statements" page of "SQL" into Chinese
[ https://issues.apache.org/jira/browse/FLINK-24220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jark Wu reassigned FLINK-24220: --- Assignee: wuguihu > Translate "RESET Statements" page of "SQL" into Chinese > --- > > Key: FLINK-24220 > URL: https://issues.apache.org/jira/browse/FLINK-24220 > Project: Flink > Issue Type: Sub-task > Components: Documentation >Reporter: wuguihu >Assignee: wuguihu >Priority: Minor > > [https://ci.apache.org/projects/flink/flink-docs-master/docs/dev/table/sql/reset/] > docs/content.zh/docs/dev/table/sql/reset.md -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (FLINK-24221) Translate "JAR Statements" page of "SQL" into Chinese
[ https://issues.apache.org/jira/browse/FLINK-24221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jark Wu reassigned FLINK-24221: --- Assignee: wuguihu > Translate "JAR Statements" page of "SQL" into Chinese > - > > Key: FLINK-24221 > URL: https://issues.apache.org/jira/browse/FLINK-24221 > Project: Flink > Issue Type: Sub-task > Components: Documentation >Reporter: wuguihu >Assignee: wuguihu >Priority: Minor > > [https://ci.apache.org/projects/flink/flink-docs-master/docs/dev/table/sql/jar/] > docs/content.zh/docs/dev/table/sql/jar.md -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-24168) Rowtime type is not correct for windowTableFunction or OverAggregate follows after Match because the output type does not updated after input rowtime attribute changed
[ https://issues.apache.org/jira/browse/FLINK-24168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17412286#comment-17412286 ] Jark Wu commented on FLINK-24168: - +1 to only fix in master. Actually, we also not fix in master, but just introduce another way (i.e. new feature) for users, I don't think this is applicable for minor versions. We can update 1.13 documentation to note this limitation and let users use TIMESTAMP as rowtime instead. > Rowtime type is not correct for windowTableFunction or OverAggregate follows > after Match because the output type does not updated after input rowtime > attribute changed from rowtime to rowtime_ltz > --- > > Key: FLINK-24168 > URL: https://issues.apache.org/jira/browse/FLINK-24168 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.14.0 >Reporter: JING ZHANG >Assignee: JING ZHANG >Priority: Critical > Labels: pull-request-available > Fix For: 1.15.0 > > > Rowtime type is not correct for windowTableFunction or OverAggregate on Match > because the output type does not updated after input rowtime attribute > changed from rowtime to rowtime_ltz in `RelTimeIndicator`. > The bug could be reproduced by the following two cases: > {code:java} > @Test > def testWindowTVFOnMatchRecognizeOnRowtimeLTZ(): Unit = { > val sqlQuery = > s""" >|SELECT >| * >|FROM Ticker >|MATCH_RECOGNIZE ( >| PARTITION BY symbol >| ORDER BY ts_ltz >| MEASURES >|A.price as price, >|A.tax as tax, >|MATCH_ROWTIME() as matchRowtime >| ONE ROW PER MATCH >| PATTERN (A) >| DEFINE >|A AS A.price > 0 >|) AS T >|""".stripMargin > val table = util.tableEnv.sqlQuery(sqlQuery) > util.tableEnv.registerTable("T", table) > val sqlQuery1 = > s""" >|SELECT * >|FROM TABLE(TUMBLE(TABLE T, DESCRIPTOR(matchRowtime), INTERVAL '3' > second)) >|""".stripMargin > util.verifyRelPlanWithType(sqlQuery1) > } > @Test > def testOverWindowOnMatchRecognizeOnRowtimeLTZ(): Unit = { > val sqlQuery = > s""" >|SELECT >| * >|FROM Ticker >|MATCH_RECOGNIZE ( >| PARTITION BY symbol >| ORDER BY ts_ltz >| MEASURES >|A.price as price, >|A.tax as tax, >|MATCH_ROWTIME() as matchRowtime >| ONE ROW PER MATCH >| PATTERN (A) >| DEFINE >|A AS A.price > 0 >|) AS T >|""".stripMargin > val table = util.tableEnv.sqlQuery(sqlQuery) > util.tableEnv.registerTable("T", table) > val sqlQuery1 = > """ > |SELECT > | symbol, > | price, > | tax, > | matchRowtime, > | SUM(price) OVER ( > |PARTITION BY symbol ORDER BY matchRowtime RANGE UNBOUNDED > PRECEDING) as price_sum > |FROM T > """.stripMargin > util.verifyRelPlanWithType(sqlQuery1) > } > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-24168) Rowtime type is not correct for windowTableFunction or OverAggregate follows after Match because the output type does not updated after input rowtime attribute changed f
[ https://issues.apache.org/jira/browse/FLINK-24168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] godfrey he updated FLINK-24168: --- Fix Version/s: (was: 1.14.0) 1.15.0 > Rowtime type is not correct for windowTableFunction or OverAggregate follows > after Match because the output type does not updated after input rowtime > attribute changed from rowtime to rowtime_ltz > --- > > Key: FLINK-24168 > URL: https://issues.apache.org/jira/browse/FLINK-24168 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.14.0 >Reporter: JING ZHANG >Assignee: JING ZHANG >Priority: Critical > Labels: pull-request-available > Fix For: 1.15.0 > > > Rowtime type is not correct for windowTableFunction or OverAggregate on Match > because the output type does not updated after input rowtime attribute > changed from rowtime to rowtime_ltz in `RelTimeIndicator`. > The bug could be reproduced by the following two cases: > {code:java} > @Test > def testWindowTVFOnMatchRecognizeOnRowtimeLTZ(): Unit = { > val sqlQuery = > s""" >|SELECT >| * >|FROM Ticker >|MATCH_RECOGNIZE ( >| PARTITION BY symbol >| ORDER BY ts_ltz >| MEASURES >|A.price as price, >|A.tax as tax, >|MATCH_ROWTIME() as matchRowtime >| ONE ROW PER MATCH >| PATTERN (A) >| DEFINE >|A AS A.price > 0 >|) AS T >|""".stripMargin > val table = util.tableEnv.sqlQuery(sqlQuery) > util.tableEnv.registerTable("T", table) > val sqlQuery1 = > s""" >|SELECT * >|FROM TABLE(TUMBLE(TABLE T, DESCRIPTOR(matchRowtime), INTERVAL '3' > second)) >|""".stripMargin > util.verifyRelPlanWithType(sqlQuery1) > } > @Test > def testOverWindowOnMatchRecognizeOnRowtimeLTZ(): Unit = { > val sqlQuery = > s""" >|SELECT >| * >|FROM Ticker >|MATCH_RECOGNIZE ( >| PARTITION BY symbol >| ORDER BY ts_ltz >| MEASURES >|A.price as price, >|A.tax as tax, >|MATCH_ROWTIME() as matchRowtime >| ONE ROW PER MATCH >| PATTERN (A) >| DEFINE >|A AS A.price > 0 >|) AS T >|""".stripMargin > val table = util.tableEnv.sqlQuery(sqlQuery) > util.tableEnv.registerTable("T", table) > val sqlQuery1 = > """ > |SELECT > | symbol, > | price, > | tax, > | matchRowtime, > | SUM(price) OVER ( > |PARTITION BY symbol ORDER BY matchRowtime RANGE UNBOUNDED > PRECEDING) as price_sum > |FROM T > """.stripMargin > util.verifyRelPlanWithType(sqlQuery1) > } > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-24168) Rowtime type is not correct for windowTableFunction or OverAggregate follows after Match because the output type does not updated after input rowtime attribute changed
[ https://issues.apache.org/jira/browse/FLINK-24168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17412284#comment-17412284 ] godfrey he commented on FLINK-24168: I would like to fix it only in master, not including 1.13 and 1.14. Because, NO users report such issue before, only MATCH with timestamp_ltz will trigger this bug which is rare. I will change the fix version to master > Rowtime type is not correct for windowTableFunction or OverAggregate follows > after Match because the output type does not updated after input rowtime > attribute changed from rowtime to rowtime_ltz > --- > > Key: FLINK-24168 > URL: https://issues.apache.org/jira/browse/FLINK-24168 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.14.0 >Reporter: JING ZHANG >Assignee: JING ZHANG >Priority: Critical > Labels: pull-request-available > Fix For: 1.14.0 > > > Rowtime type is not correct for windowTableFunction or OverAggregate on Match > because the output type does not updated after input rowtime attribute > changed from rowtime to rowtime_ltz in `RelTimeIndicator`. > The bug could be reproduced by the following two cases: > {code:java} > @Test > def testWindowTVFOnMatchRecognizeOnRowtimeLTZ(): Unit = { > val sqlQuery = > s""" >|SELECT >| * >|FROM Ticker >|MATCH_RECOGNIZE ( >| PARTITION BY symbol >| ORDER BY ts_ltz >| MEASURES >|A.price as price, >|A.tax as tax, >|MATCH_ROWTIME() as matchRowtime >| ONE ROW PER MATCH >| PATTERN (A) >| DEFINE >|A AS A.price > 0 >|) AS T >|""".stripMargin > val table = util.tableEnv.sqlQuery(sqlQuery) > util.tableEnv.registerTable("T", table) > val sqlQuery1 = > s""" >|SELECT * >|FROM TABLE(TUMBLE(TABLE T, DESCRIPTOR(matchRowtime), INTERVAL '3' > second)) >|""".stripMargin > util.verifyRelPlanWithType(sqlQuery1) > } > @Test > def testOverWindowOnMatchRecognizeOnRowtimeLTZ(): Unit = { > val sqlQuery = > s""" >|SELECT >| * >|FROM Ticker >|MATCH_RECOGNIZE ( >| PARTITION BY symbol >| ORDER BY ts_ltz >| MEASURES >|A.price as price, >|A.tax as tax, >|MATCH_ROWTIME() as matchRowtime >| ONE ROW PER MATCH >| PATTERN (A) >| DEFINE >|A AS A.price > 0 >|) AS T >|""".stripMargin > val table = util.tableEnv.sqlQuery(sqlQuery) > util.tableEnv.registerTable("T", table) > val sqlQuery1 = > """ > |SELECT > | symbol, > | price, > | tax, > | matchRowtime, > | SUM(price) OVER ( > |PARTITION BY symbol ORDER BY matchRowtime RANGE UNBOUNDED > PRECEDING) as price_sum > |FROM T > """.stripMargin > util.verifyRelPlanWithType(sqlQuery1) > } > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (FLINK-24168) Rowtime type is not correct for windowTableFunction or OverAggregate follows after Match because the output type does not updated after input rowtime attribute ch
[ https://issues.apache.org/jira/browse/FLINK-24168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17412282#comment-17412282 ] JING ZHANG edited comment on FLINK-24168 at 9/9/21, 2:29 AM: - One more question, the bug already exists in 1.13, the result `MatchRecognizeITCase#testWindowedGroupingAppliedToMatchRecognize` is wrong since 1.13. Do we need to fix the problem in 1.13? There is only a very small probability that a user would suffer the problem because the bug maybe trigger only all following condition satisfied: # Use CEP # Use timestamp_ltz as rowtime attribute # Use 1.13+ was (Author: qingru zhang): One more question, the bug already exists in 1.13, the result `MatchRecognizeITCase#testWindowedGroupingAppliedToMatchRecognize` is wrong since 1.13. Do we need to fix the problem in 1.13? > Rowtime type is not correct for windowTableFunction or OverAggregate follows > after Match because the output type does not updated after input rowtime > attribute changed from rowtime to rowtime_ltz > --- > > Key: FLINK-24168 > URL: https://issues.apache.org/jira/browse/FLINK-24168 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.14.0 >Reporter: JING ZHANG >Assignee: JING ZHANG >Priority: Critical > Labels: pull-request-available > Fix For: 1.14.0 > > > Rowtime type is not correct for windowTableFunction or OverAggregate on Match > because the output type does not updated after input rowtime attribute > changed from rowtime to rowtime_ltz in `RelTimeIndicator`. > The bug could be reproduced by the following two cases: > {code:java} > @Test > def testWindowTVFOnMatchRecognizeOnRowtimeLTZ(): Unit = { > val sqlQuery = > s""" >|SELECT >| * >|FROM Ticker >|MATCH_RECOGNIZE ( >| PARTITION BY symbol >| ORDER BY ts_ltz >| MEASURES >|A.price as price, >|A.tax as tax, >|MATCH_ROWTIME() as matchRowtime >| ONE ROW PER MATCH >| PATTERN (A) >| DEFINE >|A AS A.price > 0 >|) AS T >|""".stripMargin > val table = util.tableEnv.sqlQuery(sqlQuery) > util.tableEnv.registerTable("T", table) > val sqlQuery1 = > s""" >|SELECT * >|FROM TABLE(TUMBLE(TABLE T, DESCRIPTOR(matchRowtime), INTERVAL '3' > second)) >|""".stripMargin > util.verifyRelPlanWithType(sqlQuery1) > } > @Test > def testOverWindowOnMatchRecognizeOnRowtimeLTZ(): Unit = { > val sqlQuery = > s""" >|SELECT >| * >|FROM Ticker >|MATCH_RECOGNIZE ( >| PARTITION BY symbol >| ORDER BY ts_ltz >| MEASURES >|A.price as price, >|A.tax as tax, >|MATCH_ROWTIME() as matchRowtime >| ONE ROW PER MATCH >| PATTERN (A) >| DEFINE >|A AS A.price > 0 >|) AS T >|""".stripMargin > val table = util.tableEnv.sqlQuery(sqlQuery) > util.tableEnv.registerTable("T", table) > val sqlQuery1 = > """ > |SELECT > | symbol, > | price, > | tax, > | matchRowtime, > | SUM(price) OVER ( > |PARTITION BY symbol ORDER BY matchRowtime RANGE UNBOUNDED > PRECEDING) as price_sum > |FROM T > """.stripMargin > util.verifyRelPlanWithType(sqlQuery1) > } > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] xuyangzhong commented on pull request #17118: [FLINK-22603][table-planner]The digest can be produced by SourceAbilitySpec.
xuyangzhong commented on pull request #17118: URL: https://github.com/apache/flink/pull/17118#issuecomment-915709849 @flinkbot run azure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-24168) Rowtime type is not correct for windowTableFunction or OverAggregate follows after Match because the output type does not updated after input rowtime attribute changed
[ https://issues.apache.org/jira/browse/FLINK-24168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17412282#comment-17412282 ] JING ZHANG commented on FLINK-24168: One more question, the bug already exists in 1.13, the result `MatchRecognizeITCase#testWindowedGroupingAppliedToMatchRecognize` is wrong since 1.13. Do we need to fix the problem in 1.13? > Rowtime type is not correct for windowTableFunction or OverAggregate follows > after Match because the output type does not updated after input rowtime > attribute changed from rowtime to rowtime_ltz > --- > > Key: FLINK-24168 > URL: https://issues.apache.org/jira/browse/FLINK-24168 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.14.0 >Reporter: JING ZHANG >Assignee: JING ZHANG >Priority: Critical > Labels: pull-request-available > Fix For: 1.14.0 > > > Rowtime type is not correct for windowTableFunction or OverAggregate on Match > because the output type does not updated after input rowtime attribute > changed from rowtime to rowtime_ltz in `RelTimeIndicator`. > The bug could be reproduced by the following two cases: > {code:java} > @Test > def testWindowTVFOnMatchRecognizeOnRowtimeLTZ(): Unit = { > val sqlQuery = > s""" >|SELECT >| * >|FROM Ticker >|MATCH_RECOGNIZE ( >| PARTITION BY symbol >| ORDER BY ts_ltz >| MEASURES >|A.price as price, >|A.tax as tax, >|MATCH_ROWTIME() as matchRowtime >| ONE ROW PER MATCH >| PATTERN (A) >| DEFINE >|A AS A.price > 0 >|) AS T >|""".stripMargin > val table = util.tableEnv.sqlQuery(sqlQuery) > util.tableEnv.registerTable("T", table) > val sqlQuery1 = > s""" >|SELECT * >|FROM TABLE(TUMBLE(TABLE T, DESCRIPTOR(matchRowtime), INTERVAL '3' > second)) >|""".stripMargin > util.verifyRelPlanWithType(sqlQuery1) > } > @Test > def testOverWindowOnMatchRecognizeOnRowtimeLTZ(): Unit = { > val sqlQuery = > s""" >|SELECT >| * >|FROM Ticker >|MATCH_RECOGNIZE ( >| PARTITION BY symbol >| ORDER BY ts_ltz >| MEASURES >|A.price as price, >|A.tax as tax, >|MATCH_ROWTIME() as matchRowtime >| ONE ROW PER MATCH >| PATTERN (A) >| DEFINE >|A AS A.price > 0 >|) AS T >|""".stripMargin > val table = util.tableEnv.sqlQuery(sqlQuery) > util.tableEnv.registerTable("T", table) > val sqlQuery1 = > """ > |SELECT > | symbol, > | price, > | tax, > | matchRowtime, > | SUM(price) OVER ( > |PARTITION BY symbol ORDER BY matchRowtime RANGE UNBOUNDED > PRECEDING) as price_sum > |FROM T > """.stripMargin > util.verifyRelPlanWithType(sqlQuery1) > } > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot commented on pull request #17206: [FLINK-24199][python] Expose StreamExecutionEnvironment#configure in Python API
flinkbot commented on pull request #17206: URL: https://github.com/apache/flink/pull/17206#issuecomment-915706736 ## CI report: * 169c8d343e362c0244d2c977bab4db6ac2b47ee1 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #17205: [FLINK-24168][table-planner] Update MATCH_ROWTIME function which could receive 0 argument or 1 argument
flinkbot edited a comment on pull request #17205: URL: https://github.com/apache/flink/pull/17205#issuecomment-915449441 ## CI report: * 911c37bbcc7b9ec9603ae72bede52bc2b464642c Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23799) * 7e524508a999c88c22e981f9ec8a6fcdc6cb07bf Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23807) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] beyond1920 commented on pull request #17205: [FLINK-24168][table-planner] Update MATCH_ROWTIME function which could receive 0 argument or 1 argument
beyond1920 commented on pull request #17205: URL: https://github.com/apache/flink/pull/17205#issuecomment-915706498 cc @twalthr @leonardBang @wuchong @godfreyhe . Please have a look, thanks very much. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-24168) Rowtime type is not correct for windowTableFunction or OverAggregate follows after Match because the output type does not updated after input rowtime attribute changed
[ https://issues.apache.org/jira/browse/FLINK-24168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17412276#comment-17412276 ] JING ZHANG commented on FLINK-24168: [~twalthr] [~jark] [~Leonard Xu] Thanks for all suggestions. It would be better if we could fix it in 1.14 because *MATCH_ROWTIME()* with timestamp_ltz is new introduced in Flink 1.13. I open a [PR|https://github.com/apache/flink/pull/17205] , please have a look, thanks a lot. [~twalthr] [~jark] [~Leonard Xu] [~godfreyhe]. > Rowtime type is not correct for windowTableFunction or OverAggregate follows > after Match because the output type does not updated after input rowtime > attribute changed from rowtime to rowtime_ltz > --- > > Key: FLINK-24168 > URL: https://issues.apache.org/jira/browse/FLINK-24168 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.14.0 >Reporter: JING ZHANG >Assignee: JING ZHANG >Priority: Critical > Labels: pull-request-available > Fix For: 1.14.0 > > > Rowtime type is not correct for windowTableFunction or OverAggregate on Match > because the output type does not updated after input rowtime attribute > changed from rowtime to rowtime_ltz in `RelTimeIndicator`. > The bug could be reproduced by the following two cases: > {code:java} > @Test > def testWindowTVFOnMatchRecognizeOnRowtimeLTZ(): Unit = { > val sqlQuery = > s""" >|SELECT >| * >|FROM Ticker >|MATCH_RECOGNIZE ( >| PARTITION BY symbol >| ORDER BY ts_ltz >| MEASURES >|A.price as price, >|A.tax as tax, >|MATCH_ROWTIME() as matchRowtime >| ONE ROW PER MATCH >| PATTERN (A) >| DEFINE >|A AS A.price > 0 >|) AS T >|""".stripMargin > val table = util.tableEnv.sqlQuery(sqlQuery) > util.tableEnv.registerTable("T", table) > val sqlQuery1 = > s""" >|SELECT * >|FROM TABLE(TUMBLE(TABLE T, DESCRIPTOR(matchRowtime), INTERVAL '3' > second)) >|""".stripMargin > util.verifyRelPlanWithType(sqlQuery1) > } > @Test > def testOverWindowOnMatchRecognizeOnRowtimeLTZ(): Unit = { > val sqlQuery = > s""" >|SELECT >| * >|FROM Ticker >|MATCH_RECOGNIZE ( >| PARTITION BY symbol >| ORDER BY ts_ltz >| MEASURES >|A.price as price, >|A.tax as tax, >|MATCH_ROWTIME() as matchRowtime >| ONE ROW PER MATCH >| PATTERN (A) >| DEFINE >|A AS A.price > 0 >|) AS T >|""".stripMargin > val table = util.tableEnv.sqlQuery(sqlQuery) > util.tableEnv.registerTable("T", table) > val sqlQuery1 = > """ > |SELECT > | symbol, > | price, > | tax, > | matchRowtime, > | SUM(price) OVER ( > |PARTITION BY symbol ORDER BY matchRowtime RANGE UNBOUNDED > PRECEDING) as price_sum > |FROM T > """.stripMargin > util.verifyRelPlanWithType(sqlQuery1) > } > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot edited a comment on pull request #17205: [FLINK-24168][table-planner] Update MATCH_ROWTIME function which could receive 0 argument or 1 argument
flinkbot edited a comment on pull request #17205: URL: https://github.com/apache/flink/pull/17205#issuecomment-915449441 ## CI report: * 911c37bbcc7b9ec9603ae72bede52bc2b464642c Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23799) * 7e524508a999c88c22e981f9ec8a6fcdc6cb07bf UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot commented on pull request #17206: [FLINK-24199][python] Expose StreamExecutionEnvironment#configure in Python API
flinkbot commented on pull request #17206: URL: https://github.com/apache/flink/pull/17206#issuecomment-915684813 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit 169c8d343e362c0244d2c977bab4db6ac2b47ee1 (Thu Sep 09 01:19:12 UTC 2021) **Warnings:** * No documentation files were touched! Remember to keep the Flink docs up to date! Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into to Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-24199) Expose StreamExecutionEnvironment#configure in Python API
[ https://issues.apache.org/jira/browse/FLINK-24199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-24199: --- Labels: pull-request-available (was: ) > Expose StreamExecutionEnvironment#configure in Python API > - > > Key: FLINK-24199 > URL: https://issues.apache.org/jira/browse/FLINK-24199 > Project: Flink > Issue Type: Improvement > Components: API / Python >Reporter: Dawid Wysakowicz >Assignee: Nicholas Jiang >Priority: Major > Labels: pull-request-available > Fix For: 1.15.0 > > > There are certain parameters that can be configured only through the > underlying configuration of StreamExecutionEnvironment e.g. > (execution.checkpointing.checkpoints-after-tasks-finish.enabled). > We should be able to set those in the Python API. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] SteNicholas opened a new pull request #17206: [FLINK-24199][python] Expose StreamExecutionEnvironment#configure in Python API
SteNicholas opened a new pull request #17206: URL: https://github.com/apache/flink/pull/17206 ## What is the purpose of the change *There are certain parameters that can be configured only through the underlying configuration of `StreamExecutionEnvironment` e.g. (`execution.checkpointing.checkpoints-after-tasks-finish.enabled`).We should be able to set those in the Python API.* ## Brief change log - *`StreamExecutionEnvironment` adds the `configure` method to configure the parameters through the underlying configuration.* ## Verifying this change - *`StreamExecutionEnvironmentTests` adds the `test_configure` to verify whether to configure the parameters through the underlying configuration in the `StreamExecutionEnvironment`.* ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (yes / **no**) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / **no**) - The serializers: (yes / **no** / don't know) - The runtime per-record code paths (performance sensitive): (yes / **no** / don't know) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / **no** / don't know) - The S3 file system connector: (yes / **no** / don't know) ## Documentation - Does this pull request introduce a new feature? (yes / **no**) - If yes, how is the feature documented? (**not applicable** / docs / JavaDocs / not documented) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] SteNicholas commented on pull request #17105: [FLINK-23704][streaming] FLIP-27 sources are not generating LatencyMarkers
SteNicholas commented on pull request #17105: URL: https://github.com/apache/flink/pull/17105#issuecomment-915681646 > @SteNicholas , do you think we can get this into 1.14? Do you have time to finish it this week? @AHeise , of course yes. I will update this PR today. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-23583) Easy way to remove metadata before serializing row data for connector implementations
[ https://issues.apache.org/jira/browse/FLINK-23583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Flink Jira Bot updated FLINK-23583: --- Labels: pull-request-available stale-assigned (was: pull-request-available) I am the [Flink Jira Bot|https://github.com/apache/flink-jira-bot/] and I help the community manage its development. I see this issue is assigned but has not received an update in 30 days, so it has been labeled "stale-assigned". If you are still working on the issue, please remove the label and add a comment updating the community on your progress. If this issue is waiting on feedback, please consider this a reminder to the committer/reviewer. Flink is a very active project, and so we appreciate your patience. If you are no longer working on the issue, please unassign yourself so someone else may work on it. > Easy way to remove metadata before serializing row data for connector > implementations > - > > Key: FLINK-23583 > URL: https://issues.apache.org/jira/browse/FLINK-23583 > Project: Flink > Issue Type: Improvement > Components: Table SQL / API >Reporter: Ingo Bürk >Assignee: Ingo Bürk >Priority: Minor > Labels: pull-request-available, stale-assigned > > In FLINK-23537 we made JoinedRowData a public API, which helps when > developing source connectors with format + metadata. > However, when implementing a sink connector (with format + metadata), we need > an equivalent. The connector receives a RowData with appended metadata, but > needs to pass only the row data without metadata to the SerializationSchema. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-23684) KafkaITCase.testAutoOffsetRetrievalAndCommitToKafka fails with NoSuchElementException
[ https://issues.apache.org/jira/browse/FLINK-23684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Flink Jira Bot updated FLINK-23684: --- Labels: stale-assigned test-stability (was: test-stability) I am the [Flink Jira Bot|https://github.com/apache/flink-jira-bot/] and I help the community manage its development. I see this issue is assigned but has not received an update in 30 days, so it has been labeled "stale-assigned". If you are still working on the issue, please remove the label and add a comment updating the community on your progress. If this issue is waiting on feedback, please consider this a reminder to the committer/reviewer. Flink is a very active project, and so we appreciate your patience. If you are no longer working on the issue, please unassign yourself so someone else may work on it. > KafkaITCase.testAutoOffsetRetrievalAndCommitToKafka fails with > NoSuchElementException > - > > Key: FLINK-23684 > URL: https://issues.apache.org/jira/browse/FLINK-23684 > Project: Flink > Issue Type: Bug > Components: Connectors / Kafka >Affects Versions: 1.13.2 >Reporter: Xintong Song >Assignee: Fabian Paul >Priority: Major > Labels: stale-assigned, test-stability > Fix For: 1.13.3 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=21740=logs=c5612577-f1f7-5977-6ff6-7432788526f7=53f6305f-55e6-561c-8f1e-3a1dde2c77df=6572 > {code} > Aug 08 22:24:34 [ERROR] Tests run: 23, Failures: 0, Errors: 1, Skipped: 0, > Time elapsed: 184 s <<< FAILURE! - in > org.apache.flink.streaming.connectors.kafka.KafkaITCase > Aug 08 22:24:34 [ERROR] > testAutoOffsetRetrievalAndCommitToKafka(org.apache.flink.streaming.connectors.kafka.KafkaITCase) > Time elapsed: 30.621 s <<< ERROR! > Aug 08 22:24:34 java.util.NoSuchElementException > Aug 08 22:24:34 at java.util.ArrayList$Itr.next(ArrayList.java:864) > Aug 08 22:24:34 at > org.apache.flink.shaded.guava18.com.google.common.collect.Iterators.getOnlyElement(Iterators.java:302) > Aug 08 22:24:34 at > org.apache.flink.shaded.guava18.com.google.common.collect.Iterables.getOnlyElement(Iterables.java:289) > Aug 08 22:24:34 at > org.apache.flink.streaming.connectors.kafka.KafkaConsumerTestBase.runAutoOffsetRetrievalAndCommitToKafka(KafkaConsumerTestBase.java:374) > Aug 08 22:24:34 at > org.apache.flink.streaming.connectors.kafka.KafkaITCase.testAutoOffsetRetrievalAndCommitToKafka(KafkaITCase.java:178) > Aug 08 22:24:34 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) > Aug 08 22:24:34 at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > Aug 08 22:24:34 at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > Aug 08 22:24:34 at java.lang.reflect.Method.invoke(Method.java:498) > Aug 08 22:24:34 at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > Aug 08 22:24:34 at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > Aug 08 22:24:34 at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > Aug 08 22:24:34 at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > Aug 08 22:24:34 at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298) > Aug 08 22:24:34 at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292) > Aug 08 22:24:34 at > java.util.concurrent.FutureTask.run(FutureTask.java:266) > Aug 08 22:24:34 at java.lang.Thread.run(Thread.java:748) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-16556) TopSpeedWindowing should implement checkpointing for its source
[ https://issues.apache.org/jira/browse/FLINK-16556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Flink Jira Bot updated FLINK-16556: --- Labels: auto-deprioritized-major stale-assigned starter (was: auto-deprioritized-major starter) I am the [Flink Jira Bot|https://github.com/apache/flink-jira-bot/] and I help the community manage its development. I see this issue is assigned but has not received an update in 30 days, so it has been labeled "stale-assigned". If you are still working on the issue, please remove the label and add a comment updating the community on your progress. If this issue is waiting on feedback, please consider this a reminder to the committer/reviewer. Flink is a very active project, and so we appreciate your patience. If you are no longer working on the issue, please unassign yourself so someone else may work on it. > TopSpeedWindowing should implement checkpointing for its source > --- > > Key: FLINK-16556 > URL: https://issues.apache.org/jira/browse/FLINK-16556 > Project: Flink > Issue Type: Bug > Components: Examples >Affects Versions: 1.10.0 >Reporter: Nico Kruber >Assignee: Liebing Yu >Priority: Minor > Labels: auto-deprioritized-major, stale-assigned, starter > > {\{org.apache.flink.streaming.examples.windowing.TopSpeedWindowing.CarSource}} > does not implement checkpointing of its state, namely the current speeds and > distances per car. The main problem with this is that the window trigger only > fires if the new distance has increased by at least 50 but after restore, it > will be reset to 0 and could thus not produce output for a while. > > Either the distance calculation could use {{Math.abs}} or the source needs > proper checkpointing. Optionally with allowing the number of cars to > increase/decrease. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot edited a comment on pull request #17188: [FLINK-23864][docs] Add pulsar connector document
flinkbot edited a comment on pull request #17188: URL: https://github.com/apache/flink/pull/17188#issuecomment-914560045 ## CI report: * 2603f6050baeaf45f3b062ae48891d00d4f86a31 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23801) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #17205: [FLINK-24168][table-planner] Update MATCH_ROWTIME function which could receive 0 argument or 1 argument
flinkbot edited a comment on pull request #17205: URL: https://github.com/apache/flink/pull/17205#issuecomment-915449441 ## CI report: * 911c37bbcc7b9ec9603ae72bede52bc2b464642c Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23799) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #17184: [FLINK-24193][tests] Add ClassLoaderExtension
flinkbot edited a comment on pull request #17184: URL: https://github.com/apache/flink/pull/17184#issuecomment-914328876 ## CI report: * 39e7b6d0a0eb8f68fa58b66856694470919491d0 UNKNOWN * 96b93be4eed6db33c1638f5b6e2a0c2f508c0f05 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23800) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] dannycranmer commented on a change in pull request #17189: [FLINK-23528][connectors/kinesis] Improving and reenable FlinkKinesisITCase
dannycranmer commented on a change in pull request #17189: URL: https://github.com/apache/flink/pull/17189#discussion_r704784987 ## File path: flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/internals/KinesisDataFetcher.java ## @@ -819,7 +819,7 @@ public void shutdownFetcher() { LOG.warn("Encountered exception closing record publisher factory", e); } } finally { -shardConsumersExecutor.shutdownNow(); +shardConsumersExecutor.shutdown(); Review comment: Are you sure this is correct? `shutdown()` [will not interrupt active shard consumers](https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ExecutorService.html#shutdown()). How will the running threads get interrupted? ## File path: flink-connectors/flink-connector-kinesis/src/test/java/org/apache/flink/streaming/connectors/kinesis/internals/KinesisDataFetcherTest.java ## @@ -35,36 +40,14 @@ import org.apache.flink.streaming.connectors.kinesis.proxy.KinesisProxyV2Interface; import org.apache.flink.streaming.connectors.kinesis.serialization.KinesisDeserializationSchema; import org.apache.flink.streaming.connectors.kinesis.serialization.KinesisDeserializationSchemaWrapper; -import org.apache.flink.streaming.connectors.kinesis.testutils.AlwaysThrowsDeserializationSchema; -import org.apache.flink.streaming.connectors.kinesis.testutils.FakeKinesisBehavioursFactory; -import org.apache.flink.streaming.connectors.kinesis.testutils.KinesisShardIdGenerator; -import org.apache.flink.streaming.connectors.kinesis.testutils.TestSourceContext; -import org.apache.flink.streaming.connectors.kinesis.testutils.TestUtils; -import org.apache.flink.streaming.connectors.kinesis.testutils.TestableKinesisDataFetcher; -import org.apache.flink.streaming.connectors.kinesis.testutils.TestableKinesisDataFetcherForShardConsumerException; +import org.apache.flink.streaming.connectors.kinesis.testutils.*; import org.apache.flink.streaming.runtime.streamrecord.StreamRecord; import org.apache.flink.util.TestLogger; - -import com.amazonaws.services.kinesis.model.HashKeyRange; -import com.amazonaws.services.kinesis.model.SequenceNumberRange; -import com.amazonaws.services.kinesis.model.Shard; -import org.apache.commons.lang3.mutable.MutableBoolean; -import org.apache.commons.lang3.mutable.MutableLong; import org.junit.Assert; import org.junit.Test; import org.powermock.reflect.Whitebox; -import java.util.ArrayList; -import java.util.Arrays; -import java.util.Collections; -import java.util.HashMap; -import java.util.LinkedList; -import java.util.List; -import java.util.Map; -import java.util.Properties; -import java.util.Random; -import java.util.Set; -import java.util.UUID; +import java.util.*; Review comment: We should [not use wildcard imports](https://flink.apache.org/contributing/code-style-and-quality-formatting.html#imports) ## File path: flink-connectors/flink-connector-kinesis/src/test/java/org/apache/flink/streaming/connectors/kinesis/internals/KinesisDataFetcherTest.java ## @@ -73,14 +56,8 @@ import static java.util.Collections.singletonList; import static org.apache.flink.streaming.connectors.kinesis.config.ConsumerConfigConstants.SHARD_DISCOVERY_INTERVAL_MILLIS; -import static org.junit.Assert.assertEquals; -import static org.junit.Assert.assertFalse; -import static org.junit.Assert.assertNull; -import static org.junit.Assert.assertTrue; -import static org.junit.Assert.fail; -import static org.mockito.Mockito.mock; -import static org.mockito.Mockito.verify; -import static org.mockito.Mockito.when; +import static org.junit.Assert.*; +import static org.mockito.Mockito.*; Review comment: We should [not use wildcard imports](https://flink.apache.org/contributing/code-style-and-quality-formatting.html#imports) ## File path: flink-connectors/flink-connector-kinesis/src/test/java/org/apache/flink/streaming/connectors/kinesis/internals/KinesisDataFetcherTest.java ## @@ -35,36 +40,14 @@ import org.apache.flink.streaming.connectors.kinesis.proxy.KinesisProxyV2Interface; import org.apache.flink.streaming.connectors.kinesis.serialization.KinesisDeserializationSchema; import org.apache.flink.streaming.connectors.kinesis.serialization.KinesisDeserializationSchemaWrapper; -import org.apache.flink.streaming.connectors.kinesis.testutils.AlwaysThrowsDeserializationSchema; -import org.apache.flink.streaming.connectors.kinesis.testutils.FakeKinesisBehavioursFactory; -import org.apache.flink.streaming.connectors.kinesis.testutils.KinesisShardIdGenerator; -import org.apache.flink.streaming.connectors.kinesis.testutils.TestSourceContext; -import org.apache.flink.streaming.connectors.kinesis.testutils.TestUtils; -import org.apache.flink.streaming.connectors.kinesis.testutils.TestableKinesisDataFetcher; -import
[jira] [Commented] (FLINK-23888) Flink azure fs doesn't work
[ https://issues.apache.org/jira/browse/FLINK-23888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17412190#comment-17412190 ] Chesnay Schepler commented on FLINK-23888: -- I'm well aware of the principle behind parent-first classloading. That's also why I know that the parent class loader can only load said class if it is available in lib/, because it doesn't even have access to the plugin jar. If it is not available in lib/ then the plugin class loader will be used instead. > Flink azure fs doesn't work > --- > > Key: FLINK-23888 > URL: https://issues.apache.org/jira/browse/FLINK-23888 > Project: Flink > Issue Type: Bug > Components: FileSystems >Affects Versions: 1.13.1, 1.13.2 >Reporter: Liviu Firu >Priority: Major > Attachments: pom.xml > > > A working pipeline on AWS S3 doesn't work with BlobStorage. > Flink deployed in kubernetes with HA . The following is the configuration : > high-availability: > org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory > high-availability.storageDir: wasbs://*** > state.backend: rocksdb > state.checkpoints.dir: wasb://*** > I always got LinkageError. > I believe the issue is that azure / hadoop classes are shaded with > "org.apache.flink" pattern and always get loaded via parent not via plugin > class loader. > Snapshot of the log : > 2021-08-18 11:56:03,557 INFO org.apache.flink.fs.azurefs.AzureFSFactory > [] - Trying to load and instantiate Azure File System > 2021-08-18 11:56:03,747 INFO > org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - Shutting > StandaloneApplicationClusterEntryPoint down with application status FAILED. > Diagnostics java.lang.LinkageError: loader > org.apache.flink.core.plugin.PluginLoader$PluginClassLoader @61bcbcce > attempted duplicate class definition for > org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.LocalFileSystem. > (org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.LocalFileSystem is > in unnamed module of loader > org.apache.flink.core.plugin.PluginLoader$PluginClassLoader @61bcbcce, parent > loader 'platform') > at java.base/java.lang.ClassLoader.defineClass1(Native Method) > at java.base/java.lang.ClassLoader.defineClass(Unknown Source) > at java.base/java.security.SecureClassLoader.defineClass(Unknown > Source) > at java.base/java.net.URLClassLoader.defineClass(Unknown Source) > at java.base/java.net.URLClassLoader$1.run(Unknown Source) > at java.base/java.net.URLClassLoader$1.run(Unknown Source) > at java.base/java.security.AccessController.doPrivileged(Native > Method) > at java.base/java.net.URLClassLoader.findClass(Unknown Source) > at java.base/java.lang.ClassLoader.loadClass(Unknown Source) > at > org.apache.flink.core.plugin.PluginLoader$PluginClassLoader.loadClass(PluginLoader.java:171) > at java.base/java.lang.ClassLoader.loadClass(Unknown Source) > at > org.apache.flink.fs.azure.common.HadoopConfigLoader.loadHadoopConfigFromFlink(HadoopConfigLoader.java:96) > at > org.apache.flink.fs.azure.common.HadoopConfigLoader.getOrLoadHadoopConfig(HadoopConfigLoader.java:82) > at > org.apache.flink.fs.azurefs.AbstractAzureFSFactory.createInitializedAzureFS(AbstractAzureFSFactory.java:85) > at > org.apache.flink.fs.azurefs.AbstractAzureFSFactory.create(AbstractAzureFSFactory.java:79) > at > org.apache.flink.core.fs.PluginFileSystemFactory.create(PluginFileSystemFactory.java:62) > at > org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:506) > at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:407) > at org.apache.flink.core.fs.Path.getFileSystem(Path.java:274) > at > org.apache.flink.runtime.blob.BlobUtils.createFileSystemBlobStore(BlobUtils.java:89) > at > org.apache.flink.runtime.blob.BlobUtils.createBlobStoreFromConfig(BlobUtils.java:76) > at > org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory.createHAServices(KubernetesHaServicesFactory.java:40) > at > org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createCustomHAServices(HighAvailabilityServicesUtils.java:265) > at > org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createHighAvailabilityServices(HighAvailabilityServicesUtils.java:124) > at > org.apache.flink.runtime.entrypoint.ClusterEntrypoint.createHaServices(ClusterEntrypoint.java:353) > at > org.apache.flink.runtime.entrypoint.ClusterEntrypoint.initializeServices(ClusterEntrypoint.java:311) > at > org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runCluster(ClusterEntrypoint.java:239) > at >
[GitHub] [flink] flinkbot edited a comment on pull request #17204: [FLINK-22942] [sql/planner] Disable UPSERT INTO statement
flinkbot edited a comment on pull request #17204: URL: https://github.com/apache/flink/pull/17204#issuecomment-915424597 ## CI report: * 2b1fb823f7d9f8cd43f1e2a28762807f244bcdc8 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23798) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #17189: [FLINK-23528][connectors/kinesis] Improving and reenable FlinkKinesisITCase
flinkbot edited a comment on pull request #17189: URL: https://github.com/apache/flink/pull/17189#issuecomment-914637509 ## CI report: * a9fe3682e269eafffef3e4f37734996199d59d4b Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23796) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Resolved] (FLINK-24041) [FLIP-171] Generic AsyncSinkBase
[ https://issues.apache.org/jira/browse/FLINK-24041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Cranmer resolved FLINK-24041. --- Resolution: Done > [FLIP-171] Generic AsyncSinkBase > > > Key: FLINK-24041 > URL: https://issues.apache.org/jira/browse/FLINK-24041 > Project: Flink > Issue Type: New Feature > Components: Connectors / Common >Reporter: Zichen Liu >Assignee: Zichen Liu >Priority: Major > Labels: pull-request-available > Fix For: 1.15.0 > > > h2. Motivation > Apache Flink has a rich connector ecosystem that can persist data in various > destinations. Flink natively supports Apache Kafka, Amazon Kinesis Data > Streams, Elasticsearch, HBase, and many more destinations. Additional > connectors are maintained in Apache Bahir or directly on GitHub. The basic > functionality of these sinks is quite similar. They batch events according to > user defined buffering hints, sign requests and send them to the respective > endpoint, retry unsuccessful or throttled requests, and participate in > checkpointing. They primarily just differ in the way they interface with the > destination. Yet, all the above-mentioned sinks are developed and maintained > independently. > We hence propose to create a sink that abstracts away this common > functionality into a generic sink. Adding support for a new destination then > just means creating a lightweight shim that only implements the specific > interfaces of the destination using a client that supports async requests. > Having a common abstraction will reduce the effort required to maintain all > these individual sinks. It will also make it much easier and faster to create > integrations with additional destinations. Moreover, improvements or bug > fixes to the core of the sink will benefit all implementations that are based > on it. > The design of the sink focusses on extensibility and a broad support of > destinations. The core of the sink is kept generic and free of any connector > specific dependencies. The sink is designed to participate in checkpointing > to provide at-least once semantics, but it is limited to destinations that > provide a client that supports async requests. > h2. References > More details to be found > https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-24041) [FLIP-171] Generic AsyncSinkBase
[ https://issues.apache.org/jira/browse/FLINK-24041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Cranmer updated FLINK-24041: -- Fix Version/s: 1.15.0 > [FLIP-171] Generic AsyncSinkBase > > > Key: FLINK-24041 > URL: https://issues.apache.org/jira/browse/FLINK-24041 > Project: Flink > Issue Type: New Feature > Components: Connectors / Common >Reporter: Zichen Liu >Assignee: Zichen Liu >Priority: Major > Labels: pull-request-available > Fix For: 1.15.0 > > > h2. Motivation > Apache Flink has a rich connector ecosystem that can persist data in various > destinations. Flink natively supports Apache Kafka, Amazon Kinesis Data > Streams, Elasticsearch, HBase, and many more destinations. Additional > connectors are maintained in Apache Bahir or directly on GitHub. The basic > functionality of these sinks is quite similar. They batch events according to > user defined buffering hints, sign requests and send them to the respective > endpoint, retry unsuccessful or throttled requests, and participate in > checkpointing. They primarily just differ in the way they interface with the > destination. Yet, all the above-mentioned sinks are developed and maintained > independently. > We hence propose to create a sink that abstracts away this common > functionality into a generic sink. Adding support for a new destination then > just means creating a lightweight shim that only implements the specific > interfaces of the destination using a client that supports async requests. > Having a common abstraction will reduce the effort required to maintain all > these individual sinks. It will also make it much easier and faster to create > integrations with additional destinations. Moreover, improvements or bug > fixes to the core of the sink will benefit all implementations that are based > on it. > The design of the sink focusses on extensibility and a broad support of > destinations. The core of the sink is kept generic and free of any connector > specific dependencies. The sink is designed to participate in checkpointing > to provide at-least once semantics, but it is limited to destinations that > provide a client that supports async requests. > h2. References > More details to be found > https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-24041) [FLIP-171] Generic AsyncSinkBase
[ https://issues.apache.org/jira/browse/FLINK-24041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17412180#comment-17412180 ] Danny Cranmer commented on FLINK-24041: --- [~CrynetLogistics] can you please link the related Async Sink issues to this one? > [FLIP-171] Generic AsyncSinkBase > > > Key: FLINK-24041 > URL: https://issues.apache.org/jira/browse/FLINK-24041 > Project: Flink > Issue Type: New Feature > Components: Connectors / Common >Reporter: Zichen Liu >Assignee: Zichen Liu >Priority: Major > Labels: pull-request-available > > h2. Motivation > Apache Flink has a rich connector ecosystem that can persist data in various > destinations. Flink natively supports Apache Kafka, Amazon Kinesis Data > Streams, Elasticsearch, HBase, and many more destinations. Additional > connectors are maintained in Apache Bahir or directly on GitHub. The basic > functionality of these sinks is quite similar. They batch events according to > user defined buffering hints, sign requests and send them to the respective > endpoint, retry unsuccessful or throttled requests, and participate in > checkpointing. They primarily just differ in the way they interface with the > destination. Yet, all the above-mentioned sinks are developed and maintained > independently. > We hence propose to create a sink that abstracts away this common > functionality into a generic sink. Adding support for a new destination then > just means creating a lightweight shim that only implements the specific > interfaces of the destination using a client that supports async requests. > Having a common abstraction will reduce the effort required to maintain all > these individual sinks. It will also make it much easier and faster to create > integrations with additional destinations. Moreover, improvements or bug > fixes to the core of the sink will benefit all implementations that are based > on it. > The design of the sink focusses on extensibility and a broad support of > destinations. The core of the sink is kept generic and free of any connector > specific dependencies. The sink is designed to participate in checkpointing > to provide at-least once semantics, but it is limited to destinations that > provide a client that supports async requests. > h2. References > More details to be found > https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] dannycranmer merged pull request #17068: [FLINK-24041][connectors] Flip 171 - Abstract Async Sink Base
dannycranmer merged pull request #17068: URL: https://github.com/apache/flink/pull/17068 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] dannycranmer commented on pull request #17068: [FLINK-24041][connectors] Flip 171 - Abstract Async Sink Base
dannycranmer commented on pull request #17068: URL: https://github.com/apache/flink/pull/17068#issuecomment-915550882 LGTM, merging -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #17203: [WIP][FLINK-22944][state] Optimize writing state changelog
flinkbot edited a comment on pull request #17203: URL: https://github.com/apache/flink/pull/17203#issuecomment-915404842 ## CI report: * a4683e8c521180f39b7714b9884cd71520dac1c6 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23795) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #17199: [FLINK-23848][connector/pulsar] Fix the consumer not found error in test. [1.14]
flinkbot edited a comment on pull request #17199: URL: https://github.com/apache/flink/pull/17199#issuecomment-915288286 ## CI report: * 852571fd2ee436ef97c309efd01e4a054abf4426 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23784) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-23888) Flink azure fs doesn't work
[ https://issues.apache.org/jira/browse/FLINK-23888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17412172#comment-17412172 ] Liviu Firu commented on FLINK-23888: h5. classloader.parent-first-patterns.default is set to java.;scala.;org.apache.flink.;com.esotericsoftware.kryo;org.apache.hadoop.;javax.annotation.;org.xml;javax.xml;org.apache.xerces;org.w3c;org.rocksdb.;org.slf4j;org.apache.log4j;org.apache.logging;org.apache.commons.logging;ch.qos.logback" Classes are reloacated to "org.flink" by original pom therefore are loaded by parent classloader. > Flink azure fs doesn't work > --- > > Key: FLINK-23888 > URL: https://issues.apache.org/jira/browse/FLINK-23888 > Project: Flink > Issue Type: Bug > Components: FileSystems >Affects Versions: 1.13.1, 1.13.2 >Reporter: Liviu Firu >Priority: Major > Attachments: pom.xml > > > A working pipeline on AWS S3 doesn't work with BlobStorage. > Flink deployed in kubernetes with HA . The following is the configuration : > high-availability: > org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory > high-availability.storageDir: wasbs://*** > state.backend: rocksdb > state.checkpoints.dir: wasb://*** > I always got LinkageError. > I believe the issue is that azure / hadoop classes are shaded with > "org.apache.flink" pattern and always get loaded via parent not via plugin > class loader. > Snapshot of the log : > 2021-08-18 11:56:03,557 INFO org.apache.flink.fs.azurefs.AzureFSFactory > [] - Trying to load and instantiate Azure File System > 2021-08-18 11:56:03,747 INFO > org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - Shutting > StandaloneApplicationClusterEntryPoint down with application status FAILED. > Diagnostics java.lang.LinkageError: loader > org.apache.flink.core.plugin.PluginLoader$PluginClassLoader @61bcbcce > attempted duplicate class definition for > org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.LocalFileSystem. > (org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.LocalFileSystem is > in unnamed module of loader > org.apache.flink.core.plugin.PluginLoader$PluginClassLoader @61bcbcce, parent > loader 'platform') > at java.base/java.lang.ClassLoader.defineClass1(Native Method) > at java.base/java.lang.ClassLoader.defineClass(Unknown Source) > at java.base/java.security.SecureClassLoader.defineClass(Unknown > Source) > at java.base/java.net.URLClassLoader.defineClass(Unknown Source) > at java.base/java.net.URLClassLoader$1.run(Unknown Source) > at java.base/java.net.URLClassLoader$1.run(Unknown Source) > at java.base/java.security.AccessController.doPrivileged(Native > Method) > at java.base/java.net.URLClassLoader.findClass(Unknown Source) > at java.base/java.lang.ClassLoader.loadClass(Unknown Source) > at > org.apache.flink.core.plugin.PluginLoader$PluginClassLoader.loadClass(PluginLoader.java:171) > at java.base/java.lang.ClassLoader.loadClass(Unknown Source) > at > org.apache.flink.fs.azure.common.HadoopConfigLoader.loadHadoopConfigFromFlink(HadoopConfigLoader.java:96) > at > org.apache.flink.fs.azure.common.HadoopConfigLoader.getOrLoadHadoopConfig(HadoopConfigLoader.java:82) > at > org.apache.flink.fs.azurefs.AbstractAzureFSFactory.createInitializedAzureFS(AbstractAzureFSFactory.java:85) > at > org.apache.flink.fs.azurefs.AbstractAzureFSFactory.create(AbstractAzureFSFactory.java:79) > at > org.apache.flink.core.fs.PluginFileSystemFactory.create(PluginFileSystemFactory.java:62) > at > org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:506) > at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:407) > at org.apache.flink.core.fs.Path.getFileSystem(Path.java:274) > at > org.apache.flink.runtime.blob.BlobUtils.createFileSystemBlobStore(BlobUtils.java:89) > at > org.apache.flink.runtime.blob.BlobUtils.createBlobStoreFromConfig(BlobUtils.java:76) > at > org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory.createHAServices(KubernetesHaServicesFactory.java:40) > at > org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createCustomHAServices(HighAvailabilityServicesUtils.java:265) > at > org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createHighAvailabilityServices(HighAvailabilityServicesUtils.java:124) > at > org.apache.flink.runtime.entrypoint.ClusterEntrypoint.createHaServices(ClusterEntrypoint.java:353) > at > org.apache.flink.runtime.entrypoint.ClusterEntrypoint.initializeServices(ClusterEntrypoint.java:311) > at >
[GitHub] [flink] flinkbot edited a comment on pull request #17201: [FLINK-23944][connector/pulsar] Enable PulsarSourceITCase.testTaskManagerFailure
flinkbot edited a comment on pull request #17201: URL: https://github.com/apache/flink/pull/17201#issuecomment-915343401 ## CI report: * e0146a377cd299771eb4db29143d4fd39390f135 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23788) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #17202: [BP-1.14][FLINK-22889][tests] Increase timeouts in JdbcExactlyOnceSinkE2eTest
flinkbot edited a comment on pull request #17202: URL: https://github.com/apache/flink/pull/17202#issuecomment-915371073 ## CI report: * 565c4103d73e5512a4ef00db59bc3de6ff4b11ae Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23793) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #17200: [FLINK-24161] Fix interplay of stop-with-savepoint w/o drain with final checkpoints
flinkbot edited a comment on pull request #17200: URL: https://github.com/apache/flink/pull/17200#issuecomment-915343275 ## CI report: * 7f7d2b646137038a84fce12b25825e12d80ddf48 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23787) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #17188: [FLINK-23864][docs] Add pulsar connector document
flinkbot edited a comment on pull request #17188: URL: https://github.com/apache/flink/pull/17188#issuecomment-914560045 ## CI report: * f2302fc7d5f4c0f7d46fd9ceb76fa2813554ad90 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23786) * 2603f6050baeaf45f3b062ae48891d00d4f86a31 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=23801) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org