[GitHub] [flink] luoyuxia commented on pull request #21149: [FLINK-29527][formats/parquet] Make unknownFieldsIndices work for single ParquetReader
luoyuxia commented on PR #21149: URL: https://github.com/apache/flink/pull/21149#issuecomment-1336860164 @saLeox Thanks for the contribution. I had a quick look at the changes, and they look reasonable to me. @Tartarus0zm Could you please take another look to check these changes, since you are the author of [FLINK-23715]? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink-table-store] tsreaper commented on a diff in pull request #410: [FLINK-30247] Introduce time travel reading for table store
tsreaper commented on code in PR #410: URL: https://github.com/apache/flink-table-store/pull/410#discussion_r1039215031 ## docs/layouts/shortcodes/generated/core_configuration.html: ## @@ -8,6 +8,18 @@ + +as-of-snapshot +(none) +Long +Read snapshot specific by snapshot id. + + +as-of-timestamp-mills Review Comment: Why introduce a new configuration? Time travel reading is actually very similar to the "from-timestamp" startup mode in streaming jobs. I'm considering supporting startup mode for both batch and streaming jobs. See https://issues.apache.org/jira/browse/FLINK-30294 . ## docs/content/docs/development/query-table.md: ## @@ -126,3 +126,14 @@ SELECT * FROM MyTable$options; +++ 1 rows in set ``` + +## Time Travel reading + +You can read snapshot specific by commit time or snapshot id from a table. +```sql +-- Read snapshot specific by commit time. +SELECT * FROM T /+ OPTIONS('as-of-timestamp-mills'='121230')/; Review Comment: Should be `/*+ OPTIONS(...) */`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Created] (FLINK-30294) Change table property key 'log.scan' to 'startup.mode' and add a default startup mode in Table Store
Caizhi Weng created FLINK-30294: --- Summary: Change table property key 'log.scan' to 'startup.mode' and add a default startup mode in Table Store Key: FLINK-30294 URL: https://issues.apache.org/jira/browse/FLINK-30294 Project: Flink Issue Type: Improvement Components: Table Store Affects Versions: table-store-0.3.0 Reporter: Caizhi Weng Assignee: Caizhi Weng We're introducing time-travel reading of Table Store for batch jobs. However, this reading mode is quite similar to the "from-timestamp" startup mode for streaming jobs, just that "from-timestamp" streaming jobs only consume incremental data but not history data. We can support startup mode for both batch and streaming jobs. For batch jobs, "from-timestamp" startup mode will produce all records from the last snapshot before the specified timestamp. For streaming jobs the behavior doesn't change. Previously, in order to use "from-timestamp" startup mode, users had to specify "log.scan" and also "log.scan.timestamp-millis", which is a little inconvenient. We can introduce a "default" startup mode whose behavior will be based on the execution environment and other configurations. In this way, to use "from-timestamp" startup mode, it is enough for users to specify just "startup.timestamp-millis". -- This message was sent by Atlassian Jira (v8.20.10#820010)
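The "default" startup mode resolution proposed in FLINK-30294 can be sketched in plain Java. This is an illustrative model of the behavior described in the issue, not the Table Store implementation; the enum, method name, and exact mapping are assumptions made for the sketch.

```java
// Hypothetical sketch of resolving a "default" startup mode from the execution
// environment and other configured keys (FLINK-30294). Names are invented.
public class StartupModeResolver {

    public enum StartupMode { DEFAULT, FULL, LATEST, FROM_TIMESTAMP }

    // timestampMillis == null means "startup.timestamp-millis" was not set.
    public static StartupMode resolve(StartupMode configured, boolean streaming, Long timestampMillis) {
        if (configured != StartupMode.DEFAULT) {
            return configured;  // an explicit mode always wins
        }
        // Setting just "startup.timestamp-millis" is enough to opt into
        // from-timestamp scanning, for both batch and streaming jobs.
        if (timestampMillis != null) {
            return StartupMode.FROM_TIMESTAMP;
        }
        // Otherwise: batch reads the full latest snapshot, streaming tails new changes.
        return streaming ? StartupMode.LATEST : StartupMode.FULL;
    }

    public static void main(String[] args) {
        if (resolve(StartupMode.DEFAULT, false, 121230L) != StartupMode.FROM_TIMESTAMP) throw new AssertionError();
        if (resolve(StartupMode.DEFAULT, true, null) != StartupMode.LATEST) throw new AssertionError();
        if (resolve(StartupMode.DEFAULT, false, null) != StartupMode.FULL) throw new AssertionError();
        System.out.println("ok");
    }
}
```

Under this model, a batch job with "from-timestamp" would read all records from the last snapshot before the timestamp, while the streaming behavior stays unchanged.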
[jira] (FLINK-28513) Flink Table API CSV streaming sink throws SerializedThrowable exception
[ https://issues.apache.org/jira/browse/FLINK-28513 ] Samrat Deb deleted comment on FLINK-28513: was (Author: samrat007): org.apache.flink.runtime.JobException: Recovery is suppressed by NoRestartBackoffTimeStrategy at org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.handleFailure(ExecutionFailureHandler.java:139) at org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.getFailureHandlingResult(ExecutionFailureHandler.java:83) at org.apache.flink.runtime.scheduler.DefaultScheduler.recordTaskFailure(DefaultScheduler.java:256) at org.apache.flink.runtime.scheduler.DefaultScheduler.handleTaskFailure(DefaultScheduler.java:247) at org.apache.flink.runtime.scheduler.DefaultScheduler.onTaskFailed(DefaultScheduler.java:240) at org.apache.flink.runtime.scheduler.SchedulerBase.onTaskExecutionStateUpdate(SchedulerBase.java:738) at org.apache.flink.runtime.scheduler.SchedulerBase.updateTaskExecutionState(SchedulerBase.java:715) at org.apache.flink.runtime.scheduler.SchedulerNG.updateTaskExecutionState(SchedulerNG.java:78) at org.apache.flink.runtime.jobmaster.JobMaster.updateTaskExecutionState(JobMaster.java:477) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.lambda$handleRpcInvocation$1(AkkaRpcActor.java:309) at org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:83) at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:307) at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:222) at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:84) at 
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:168) at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:24) at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:20) at scala.PartialFunction.applyOrElse(PartialFunction.scala:123) at scala.PartialFunction.applyOrElse$(PartialFunction.scala:122) at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:20) at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171) at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172) at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172) at akka.actor.Actor.aroundReceive(Actor.scala:537) at akka.actor.Actor.aroundReceive$(Actor.scala:535) at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:220) at akka.actor.ActorCell.receiveMessage(ActorCell.scala:580) at akka.actor.ActorCell.invoke(ActorCell.scala:548) at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:270) at akka.dispatch.Mailbox.run(Mailbox.scala:231) at akka.dispatch.Mailbox.exec(Mailbox.scala:243) at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289) at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056) at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692) at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175) Caused by: java.util.concurrent.ExecutionException: java.lang.UnsupportedOperationException: Cannot sync state to system like S3. Use persist() to create a persistent recoverable intermediate point. 
at java.util.concurrent.FutureTask.report(FutureTask.java:122) at java.util.concurrent.FutureTask.get(FutureTask.java:192) at org.apache.flink.streaming.runtime.tasks.SourceStreamTask$LegacySourceFunctionThread.completeProcessing(SourceStreamTask.java:363) at org.apache.flink.streaming.runtime.tasks.SourceStreamTask$LegacySourceFunctionThread.run(SourceStreamTask.java:335) Caused by: java.lang.UnsupportedOperationException: Cannot sync state to system like S3. Use persist() to create a persistent recoverable intermediate point. at org.apache.flink.core.fs.RefCountedBufferingFileStream.sync(RefCountedBufferingFileStream.java:111) at org.apache.flink.fs.s3.common.writer.S3RecoverableFsDataOutputStream.sync(S3RecoverableFsDataOutputStream.java:129) at org.apache.flink.formats.csv.CsvBulkWriter.finish(CsvBulkWriter.java:111) at org.apache.flink.connector.file.table.FileSystemTableSink$ProjectionBulkFactory$1.finish(FileSystemTableSink.java:645) at
[jira] [Commented] (FLINK-28513) Flink Table API CSV streaming sink throws SerializedThrowable exception
[ https://issues.apache.org/jira/browse/FLINK-28513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17643135#comment-17643135 ] Samrat Deb commented on FLINK-28513: org.apache.flink.runtime.JobException: Recovery is suppressed by NoRestartBackoffTimeStrategy at org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.handleFailure(ExecutionFailureHandler.java:139) at org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.getFailureHandlingResult(ExecutionFailureHandler.java:83) at org.apache.flink.runtime.scheduler.DefaultScheduler.recordTaskFailure(DefaultScheduler.java:256) at org.apache.flink.runtime.scheduler.DefaultScheduler.handleTaskFailure(DefaultScheduler.java:247) at org.apache.flink.runtime.scheduler.DefaultScheduler.onTaskFailed(DefaultScheduler.java:240) at org.apache.flink.runtime.scheduler.SchedulerBase.onTaskExecutionStateUpdate(SchedulerBase.java:738) at org.apache.flink.runtime.scheduler.SchedulerBase.updateTaskExecutionState(SchedulerBase.java:715) at org.apache.flink.runtime.scheduler.SchedulerNG.updateTaskExecutionState(SchedulerNG.java:78) at org.apache.flink.runtime.jobmaster.JobMaster.updateTaskExecutionState(JobMaster.java:477) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.lambda$handleRpcInvocation$1(AkkaRpcActor.java:309) at org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:83) at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:307) at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:222) at 
org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:84) at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:168) at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:24) at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:20) at scala.PartialFunction.applyOrElse(PartialFunction.scala:123) at scala.PartialFunction.applyOrElse$(PartialFunction.scala:122) at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:20) at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171) at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172) at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172) at akka.actor.Actor.aroundReceive(Actor.scala:537) at akka.actor.Actor.aroundReceive$(Actor.scala:535) at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:220) at akka.actor.ActorCell.receiveMessage(ActorCell.scala:580) at akka.actor.ActorCell.invoke(ActorCell.scala:548) at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:270) at akka.dispatch.Mailbox.run(Mailbox.scala:231) at akka.dispatch.Mailbox.exec(Mailbox.scala:243) at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289) at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056) at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692) at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175) Caused by: java.util.concurrent.ExecutionException: java.lang.UnsupportedOperationException: Cannot sync state to system like S3. Use persist() to create a persistent recoverable intermediate point. 
at java.util.concurrent.FutureTask.report(FutureTask.java:122) at java.util.concurrent.FutureTask.get(FutureTask.java:192) at org.apache.flink.streaming.runtime.tasks.SourceStreamTask$LegacySourceFunctionThread.completeProcessing(SourceStreamTask.java:363) at org.apache.flink.streaming.runtime.tasks.SourceStreamTask$LegacySourceFunctionThread.run(SourceStreamTask.java:335) Caused by: java.lang.UnsupportedOperationException: Cannot sync state to system like S3. Use persist() to create a persistent recoverable intermediate point. at org.apache.flink.core.fs.RefCountedBufferingFileStream.sync(RefCountedBufferingFileStream.java:111) at org.apache.flink.fs.s3.common.writer.S3RecoverableFsDataOutputStream.sync(S3RecoverableFsDataOutputStream.java:129) at org.apache.flink.formats.csv.CsvBulkWriter.finish(CsvBulkWriter.java:111) at org.apache.flink.connector.file.table.FileSystemTableSink$ProjectionBulkFactory$1.finish(FileSystemTableSink.java:645)
[jira] [Updated] (FLINK-30293) Create an enumerator for static (batch)
[ https://issues.apache.org/jira/browse/FLINK-30293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-30293: --- Labels: pull-request-available (was: ) > Create an enumerator for static (batch) > --- > > Key: FLINK-30293 > URL: https://issues.apache.org/jira/browse/FLINK-30293 > Project: Flink > Issue Type: Improvement > Components: Table Store >Reporter: Jingsong Lee >Priority: Major > Labels: pull-request-available > Fix For: table-store-0.3.0 > > > In FLINK-30207, we created an enumerator for continuous reading. > We should have an enumerator for static (batch) reading. > For example, for the current read-compacted, time traveling may specify the > commit time to read snapshots in the future. > I think these capabilities need to be in the core, but should they be in > scan? (It seems that they should not)
[GitHub] [flink-table-store] zjureel opened a new pull request, #421: [FLINK-30293] Add static snapshot enumerator
zjureel opened a new pull request, #421: URL: https://github.com/apache/flink-table-store/pull/421 Add a `StaticSnapshotEnumerator` to generate snapshots for `StaticFileStoreSource` in scan
[jira] [Commented] (FLINK-30254) Sync Pulsar updates to external Pulsar connector repository
[ https://issues.apache.org/jira/browse/FLINK-30254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17643129#comment-17643129 ] Martijn Visser commented on FLINK-30254: [~syhily] This will probably be done today or tomorrow at the latest > Sync Pulsar updates to external Pulsar connector repository > --- > > Key: FLINK-30254 > URL: https://issues.apache.org/jira/browse/FLINK-30254 > Project: Flink > Issue Type: Sub-task > Components: Connectors / Pulsar >Reporter: Martijn Visser >Assignee: Martijn Visser >Priority: Major > > Currently the external Pulsar repository contains the code from the > {release-1.16} branch. This should be synced with the changes that have been merged > into {master} since then.
[jira] [Commented] (FLINK-30254) Sync Pulsar updates to external Pulsar connector repository
[ https://issues.apache.org/jira/browse/FLINK-30254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17643128#comment-17643128 ] Yufan Sheng commented on FLINK-30254: - [~martijnvisser] I want to migrate all the pending PRs to the new repository. When can we get this resolved? > Sync Pulsar updates to external Pulsar connector repository > --- > > Key: FLINK-30254 > URL: https://issues.apache.org/jira/browse/FLINK-30254 > Project: Flink > Issue Type: Sub-task > Components: Connectors / Pulsar >Reporter: Martijn Visser >Assignee: Martijn Visser >Priority: Major > > Currently the external Pulsar repository contains the code from the > {release-1.16} branch. This should be synced with the changes that have been merged > into {master} since then.
[GitHub] [flink-ml] jiangxin369 commented on a diff in pull request #183: [FLINK-29603] Add Transformer for StopWordsRemover
jiangxin369 commented on code in PR #183: URL: https://github.com/apache/flink-ml/pull/183#discussion_r1039164531 ## docs/content/docs/operators/feature/stopwordsremover.md: ## @@ -0,0 +1,165 @@ +--- +title: "StopWordsRemover" +weight: 1 +type: docs +aliases: +- /operators/feature/stopwordsremover.html +--- + + + +## StopWordsRemover + +A feature transformer that filters out stop words from input. + +Note: null values from input array are preserved unless adding null to stopWords +explicitly. + +See Also: http://en.wikipedia.org/wiki/Stop_words;>Stop words +(Wikipedia) + +### Input Columns + +| Param name | Type | Default | Description | +|:---|:-|:|:---| +| inputCols | String[] | `null` | Arrays of strings containing stop words to remove. | + +### Output Columns + +| Param name | Type | Default | Description | +|:---|:-|:|:---| +| outputCols | String[] | `null` | Arrays of strings with stop words removed. | + +### Parameters + +| Key | Default| Type | Required | Description | +|---||--|--|| +| inputCols | `null` | String[] | yes | Input column names. | +| outputCols| `null` | String[] | yes | Output column name. | +| stopWords | `StopWordsRemover.loadDefaultStopWords("english")` | String[] | no | The words to be filtered out. | +| caseSensitive | `false`| Boolean | no | Whether to do a case-sensitive comparison over the stop words. | +| locale| `StopWordsRemover.getDefaultOrUS().toString()` | String | no | Locale of the input for case insensitive matching. Ignored when caseSensitive is true. | Review Comment: But we always write the markdown docs in Java style instead of Python, just like the default value of locale `StopWordsRemover.getDefaultOrUS().toString()`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] fsk119 commented on pull request #21133: [FLINK-29732][sql-gateway] support configuring session with SQL statement.
fsk119 commented on PR #21133: URL: https://github.com/apache/flink/pull/21133#issuecomment-1336779615 I have added a commit to improve the code. Could you take some time to look, @yuzelin?
[jira] (FLINK-27176) FLIP-220: Binary Sorted State
[ https://issues.apache.org/jira/browse/FLINK-27176 ] Echo Lee deleted comment on FLINK-27176: -- was (Author: echo lee): Hi [~danderson] Is there any progress on this feature? We want to optimize temporal join operator. I think this idea is very good. I tried your POC, but found that the performance is lower than before. What do you think? > FLIP-220: Binary Sorted State > - > > Key: FLINK-27176 > URL: https://issues.apache.org/jira/browse/FLINK-27176 > Project: Flink > Issue Type: Improvement > Components: Runtime / State Backends >Reporter: David Anderson >Assignee: David Anderson >Priority: Major > > Task for implementing FLIP-220: Binary Sorted State: > [https://cwiki.apache.org/confluence/x/Xo_FD]
[GitHub] [flink] flinkbot commented on pull request #21453: [hotfix][table-planner] fix tests for FLINK-28988 merged(no code conflict with latest master but a related test case failed)
flinkbot commented on PR #21453: URL: https://github.com/apache/flink/pull/21453#issuecomment-1336746345 ## CI report: * 812fe5920a60dfc085c592b4da3802f88201605a UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build
[GitHub] [flink] lincoln-lil opened a new pull request, #21453: [hotfix][table-planner] fix tests for FLINK-28988 merged(no code conflict with latest master but a related test case failed)
lincoln-lil opened a new pull request, #21453: URL: https://github.com/apache/flink/pull/21453 FLINK-28988 fixed an incorrect filter pushdown issue, but when merged into master it broke a related test case (added recently by FLINK-29781); this PR fixes the failing case.
[GitHub] [flink-table-store] zjureel commented on pull request #410: [FLINK-30247] Introduce time travel reading for table store
zjureel commented on PR #410: URL: https://github.com/apache/flink-table-store/pull/410#issuecomment-1336743159 > Pretty good! I will fix https://issues.apache.org/jira/browse/FLINK-30293 first, please assign it to me, THX
[GitHub] [flink] flinkbot commented on pull request #21452: [FLINK-30282] Fix Logical type ROW lost inner field's nullability after converting to RelDataType
flinkbot commented on PR #21452: URL: https://github.com/apache/flink/pull/21452#issuecomment-1336741875 ## CI report: * c3bdb35efe0e90698a18302794ed37e9449b5279 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build
[GitHub] [flink] LadyForest opened a new pull request, #21452: [FLINK-30282] Fix Logical type ROW lost inner field's nullability after converting to RelDataType
LadyForest opened a new pull request, #21452: URL: https://github.com/apache/flink/pull/21452

## What is the purpose of the change

This pull request fixes the issue that Flink SQL's logical type `Row` lost the inner field's nullability after converting to `RelDataType`. E.g. `a Row` is converted to `a Row`

## Brief change log

- Let `ExtendedSqlRowTypeNameSpec` pass the inner field's nullability when deriving the type.
- Specify `StructKind.PEEK_FIELDS_NO_EXPAND` to create a struct type. This is required because in `FlinkTypeFactory#createTypeWithNullability`, only `PEEK_FIELDS_NO_EXPAND` preserves the inner field's nullability.

## Verifying this change

This change is already covered by the existing test `SqlToOperationConverterTest#testCreateTableWithFullDataTypes`. The nullability check has been tweaked. However, the issue that the precision of the `TIME` type and the `Row` type's inner comment are lost still exists, so the `TODO` is not removed. Beyond this, the `table.q` file also covers the change.

## Does this pull request potentially affect one of the following parts:

- Dependencies (does it add or upgrade a dependency): no
- The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no
- The serializers: no
- The runtime per-record code paths (performance sensitive): no
- Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
- The S3 file system connector: no

## Documentation

- Does this pull request introduce a new feature? no
- If yes, how is the feature documented? not applicable
[jira] [Updated] (FLINK-30282) Logical type ROW lost inner field's nullability after convert to RelDataType
[ https://issues.apache.org/jira/browse/FLINK-30282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-30282: --- Labels: pull-request-available (was: ) > Logical type ROW lost inner field's nullability after convert to RelDataType > > > Key: FLINK-30282 > URL: https://issues.apache.org/jira/browse/FLINK-30282 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / API >Affects Versions: 1.16.0, 1.16.1 >Reporter: Jane Chan >Priority: Major > Labels: pull-request-available > > h3. Issue History > This is not a new issue, FLINK-13604 has tracked it before, and FLINK-16344 > spared efforts to fix it (but did not tweak the ut case mentioned in > FLINK-13604, i.e. > SqlToOperationConverterTest#testCreateTableWithFullDataTypes). Nevertheless, > the FunctionITCase added by FLINK-16344, which validates the fix, has been > removed in FLINK-16377. > h3. How to Reproduce > c.c2 lost nullability > {code:java} > Flink SQL> create table dummy (a array not null, b array not null>, c row) with ('connector' = 'datagen'); > [INFO] Execute statement succeed. > Flink SQL> desc dummy; > +--++---+-++---+ > | name | type | null | key | extras | watermark | > +--++---+-++---+ > | a | ARRAY | FALSE | | | | > | b | ARRAY | TRUE | | | | > | c | ROW<`c1` INT, `c2` DOUBLE> | TRUE | | | | > +--++---+-++---+ > 3 rows in set > {code} > h3. Root Cause > Two places are causing this problem in ExtendedSqlRowTypeNameSpec. > 1. dt.deriveType should also pass dt's nullability as well. See > [https://github.com/apache/flink/blob/fb27e6893506006b9a3b1ac3e9b878fb6cad061a/flink-table/flink-sql-parser/src/main/java/org/apache/flink/sql/parser/type/ExtendedSqlRowTypeNameSpec.java#L159] > > 2. 
StructKind should be PEEK_FIELDS_NO_EXPAND instead of FULLY_QUALIFIED (see > [https://github.com/apache/calcite/blob/main/core/src/main/java/org/apache/calcite/rel/type/StructKind.java]), > so that FlinkTypeFactory#createTypeWithNullability will not fall back to > the super implementation. See > [https://github.com/apache/flink/blob/fb27e6893506006b9a3b1ac3e9b878fb6cad061a/flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/planner/calcite/FlinkTypeFactory.scala#L417]
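The expectation behind this issue — toggling the outer ROW's nullability must never strip a NOT NULL constraint from an inner field — can be modeled without any Flink or Calcite classes. A minimal sketch with invented types, purely to illustrate the contract:

```java
import java.util.List;

// Toy model (not Flink/Calcite code) of the nullability contract in FLINK-30282:
// outer nullability is independent of the fields' own nullability.
public class RowNullability {

    public static final class Field {
        public final String name;
        public final String type;
        public final boolean nullable;

        public Field(String name, String type, boolean nullable) {
            this.name = name;
            this.type = type;
            this.nullable = nullable;
        }
    }

    // Renders a ROW type; fields keep their NOT NULL regardless of the outer flag.
    public static String describe(List<Field> fields, boolean outerNullable) {
        StringBuilder sb = new StringBuilder("ROW<");
        for (int i = 0; i < fields.size(); i++) {
            Field f = fields.get(i);
            if (i > 0) sb.append(", ");
            sb.append(f.name).append(' ').append(f.type);
            if (!f.nullable) sb.append(" NOT NULL");
        }
        sb.append('>');
        return outerNullable ? sb.toString() : sb + " NOT NULL";
    }

    public static void main(String[] args) {
        // c1 keeps NOT NULL even though the outer row is nullable.
        String s = describe(List.of(new Field("c1", "INT", false), new Field("c2", "DOUBLE", true)), true);
        if (!s.equals("ROW<c1 INT NOT NULL, c2 DOUBLE>")) throw new AssertionError(s);
        System.out.println(s);  // ROW<c1 INT NOT NULL, c2 DOUBLE>
    }
}
```

The reported bug is exactly a violation of this contract: `createTypeWithNullability` rebuilt the struct and lost the inner `NOT NULL` along the way.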
[GitHub] [flink] maosuhan commented on a diff in pull request #21436: [FLINK-30093] [formats] Protobuf Timestamp Compile Error
maosuhan commented on code in PR #21436: URL: https://github.com/apache/flink/pull/21436#discussion_r1039121017 ## flink-formats/flink-protobuf/src/test/java/org/apache/flink/formats/protobuf/Pb3ToRowTest.java: ## @@ -85,6 +88,8 @@ public void testDeserialization() throws Exception { assertEquals(1, rowData.getInt(0)); assertEquals(2L, rowData.getLong(1)); + +assertEquals(ts.getNanos(), row.getTimestamp(11, 3).getNanoOfMillisecond()); Review Comment: @hdulay Yes, I agree with your idea. A RowType must be able to reflect a protobuf Timestamp type. If you want to make it easier to use, you can treat `google.protobuf.Timestamp` as a special type and manually convert Timestamp to TimestampData in code.
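The manual conversion suggested in this thread boils down to splitting a protobuf-style (seconds, nanos) pair into epoch milliseconds plus a sub-millisecond nanos remainder, which is the shape Flink's TimestampData expects. A sketch of just that arithmetic, assuming non-negative inputs and using no protobuf or Flink classes:

```java
// Sketch: split (seconds, nanos) into (epochMillis, nanoOfMillisecond).
// Assumes non-negative seconds/nanos; negative timestamps need floor division.
public class TimestampSplit {

    // Returns {epochMillis, nanoOfMillisecond}.
    public static long[] toMillisAndNanos(long seconds, int nanos) {
        long epochMillis = seconds * 1000L + nanos / 1_000_000;  // whole millis
        long nanoOfMillisecond = nanos % 1_000_000;              // leftover nanos
        return new long[] {epochMillis, nanoOfMillisecond};
    }

    public static void main(String[] args) {
        long[] r = toMillisAndNanos(2L, 123_456_789);
        if (r[0] != 2123L || r[1] != 456_789L) throw new AssertionError();
        System.out.println(r[0] + " ms + " + r[1] + " ns");  // 2123 ms + 456789 ns
    }
}
```

The resulting pair could then be fed to a constructor like Flink's `TimestampData.fromEpochMillis(millis, nanoOfMillisecond)`.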
[GitHub] [flink] flinkbot commented on pull request #21451: [hotfix][docs] Update the link for `concepts'
flinkbot commented on PR #21451: URL: https://github.com/apache/flink/pull/21451#issuecomment-1336707580 ## CI report: * 811579374aa63e4a411ce2b1d552758cdd0b48b2 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build
[GitHub] [flink] ChenZhongPu opened a new pull request, #21451: [hotfix][docs] Update the link for `concepts'
ChenZhongPu opened a new pull request, #21451: URL: https://github.com/apache/flink/pull/21451 - The `concepts` link in try-flink/local_installation#Summary is broken, and it leads to an empty page. (ref "docs/concepts" -> ref "docs/concepts/overview") - For consistency, the first letter should be in upper case. (concepts -> Concepts)
[jira] (FLINK-29527) Make unknownFieldsIndices work for single ParquetReader
[ https://issues.apache.org/jira/browse/FLINK-29527 ] Sun Shun deleted comment on FLINK-29527: -- was (Author: JIRAUSER283602): [~lirui] could you please help take a look at this PR when you are free, thanks > Make unknownFieldsIndices work for single ParquetReader > --- > > Key: FLINK-29527 > URL: https://issues.apache.org/jira/browse/FLINK-29527 > Project: Flink > Issue Type: Bug > Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile) >Affects Versions: 1.16.0 >Reporter: Sun Shun >Assignee: Sun Shun >Priority: Major > Labels: pull-request-available > > Currently, from the improvement FLINK-23715, Flink uses a collection named > `unknownFieldsIndices` to track the nonexistent fields; it is kept inside > the `ParquetVectorizedInputFormat` and applied to all parquet files under the > given path. > However, some fields may be nonexistent only in some of the historical > parquet files, while existing in the latest ones. Based on > `unknownFieldsIndices`, Flink will always skip these fields, even though > they exist in the later parquet files. > As a result, the value of these fields will become empty when they are > nonexistent in some historical parquet files. -- This message was sent by Atlassian Jira (v8.20.10#820010)
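The fix described in the issue amounts to scoping the unknown-field bookkeeping to a single file's schema instead of accumulating one set across all files under the path. A dependency-free sketch of the per-file computation (the names here are illustrative, not the actual `ParquetVectorizedInputFormat` internals):

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class UnknownFieldsPerFile {

    // Recompute the unknown-field indices for each parquet file from that
    // file's own schema, instead of caching them across all files. A field
    // absent from an old file but present in a newer one is then skipped
    // only while reading the old file.
    public static Set<Integer> unknownFieldsIndices(
            List<String> projectedFields, Set<String> fileFields) {
        Set<Integer> unknown = new HashSet<>();
        for (int i = 0; i < projectedFields.size(); i++) {
            if (!fileFields.contains(projectedFields.get(i))) {
                unknown.add(i); // unknown for this file only, not globally
            }
        }
        return unknown;
    }
}
```

With projection `[a, b, c]`, a historical file containing only `{a, b}` yields `{2}`, while a newer file containing `{a, b, c}` yields an empty set, so `c` is read again as soon as it appears.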
[GitHub] [flink-ml] yunfengzhou-hub commented on a diff in pull request #183: [FLINK-29603] Add Transformer for StopWordsRemover
yunfengzhou-hub commented on code in PR #183: URL: https://github.com/apache/flink-ml/pull/183#discussion_r1039104387 ## docs/content/docs/operators/feature/stopwordsremover.md: ## @@ -0,0 +1,165 @@ +--- +title: "StopWordsRemover" +weight: 1 +type: docs +aliases: +- /operators/feature/stopwordsremover.html +--- + + + +## StopWordsRemover + +A feature transformer that filters out stop words from input. + +Note: null values from input array are preserved unless adding null to stopWords +explicitly. + +See Also: Stop words (Wikipedia): http://en.wikipedia.org/wiki/Stop_words + +### Input Columns + +| Param name | Type | Default | Description | |:---|:-|:|:---| | inputCols | String[] | `null` | Arrays of strings containing stop words to remove. | + +### Output Columns + +| Param name | Type | Default | Description | |:---|:-|:|:---| | outputCols | String[] | `null` | Arrays of strings with stop words removed. | + +### Parameters + +| Key | Default| Type | Required | Description | |---||--|--|| | inputCols | `null` | String[] | yes | Input column names. | | outputCols| `null` | String[] | yes | Output column names. | | stopWords | `StopWordsRemover.loadDefaultStopWords("english")` | String[] | no | The words to be filtered out. | | caseSensitive | `false`| Boolean | no | Whether to do a case-sensitive comparison over the stop words. | | locale| `StopWordsRemover.getDefaultOrUS().toString()` | String | no | Locale of the input for case insensitive matching. Ignored when caseSensitive is true. | Review Comment: We should avoid requiring Python users to learn Java. Can we solve this problem by providing `StopWordsRemover.get_available_locales()`?
[GitHub] [flink-ml] yunfengzhou-hub commented on a diff in pull request #183: [FLINK-29603] Add Transformer for StopWordsRemover
yunfengzhou-hub commented on code in PR #183: URL: https://github.com/apache/flink-ml/pull/183#discussion_r1039103892 ## flink-ml-lib/src/main/resources/org/apache/flink/ml/feature/stopwords/README: ## @@ -0,0 +1,12 @@ +Stopwords Corpus Review Comment: When the number of words is relatively small, I think plain text has equivalent or better readability. Existing practice can be found at https://github.com/apache/flink/blob/master/flink-dist/src/main/flink-bin/README.txt.
[GitHub] [flink] dupen01 closed pull request #16947: update hive_function.md
dupen01 closed pull request #16947: update hive_function.md URL: https://github.com/apache/flink/pull/16947
[GitHub] [flink] godfreyhe merged pull request #20745: [FLINK-28988] Don't push filters down into the right table for temporal join
godfreyhe merged PR #20745: URL: https://github.com/apache/flink/pull/20745
[GitHub] [flink-ml] yunfengzhou-hub commented on pull request #171: [FLINK-29911] Improve performance of AgglomerativeClustering
yunfengzhou-hub commented on PR #171: URL: https://github.com/apache/flink-ml/pull/171#issuecomment-1336653558 Thanks for the update! LGTM.
[GitHub] [flink-table-store] JingsongLi commented on a diff in pull request #414: [FLINK-30263] Introduce schemas table in table store
JingsongLi commented on code in PR #414: URL: https://github.com/apache/flink-table-store/pull/414#discussion_r1039094224 ## flink-table-store-core/src/main/java/org/apache/flink/table/store/table/metadata/SchemasTable.java: ## @@ -0,0 +1,258 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.flink.table.store.table.metadata; + +import org.apache.flink.core.fs.Path; +import org.apache.flink.table.data.GenericArrayData; +import org.apache.flink.table.data.GenericMapData; +import org.apache.flink.table.data.GenericRowData; +import org.apache.flink.table.data.RowData; +import org.apache.flink.table.data.StringData; +import org.apache.flink.table.store.file.predicate.Predicate; +import org.apache.flink.table.store.file.schema.SchemaManager; +import org.apache.flink.table.store.file.schema.TableSchema; +import org.apache.flink.table.store.file.utils.IteratorRecordReader; +import org.apache.flink.table.store.file.utils.RecordReader; +import org.apache.flink.table.store.file.utils.SerializationUtils; +import org.apache.flink.table.store.table.Table; +import org.apache.flink.table.store.table.source.Split; +import org.apache.flink.table.store.table.source.TableRead; +import org.apache.flink.table.store.table.source.TableScan; +import org.apache.flink.table.store.utils.ProjectedRowData; +import org.apache.flink.table.types.logical.ArrayType; +import org.apache.flink.table.types.logical.BigIntType; +import org.apache.flink.table.types.logical.IntType; +import org.apache.flink.table.types.logical.MapType; +import org.apache.flink.table.types.logical.RowType; + +import org.apache.flink.shaded.guava30.com.google.common.collect.Iterators; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.Collections; +import java.util.HashMap; +import java.util.Iterator; +import java.util.LinkedHashMap; +import java.util.List; +import java.util.Map; +import java.util.Objects; + +import static org.apache.flink.table.store.file.catalog.Catalog.METADATA_TABLE_SPLITTER; + +/** A {@link Table} for showing schemas of table. 
*/ +public class SchemasTable implements Table { + +private static final long serialVersionUID = 1L; + +public static final String SCHEMAS = "schemas"; + +public static final RowType TABLE_TYPE = +new RowType( +Arrays.asList( +new RowType.RowField("schema_id", new BigIntType(false)), +new RowType.RowField( +"fields", +new ArrayType( +new RowType( +Arrays.asList( +new RowType.RowField( +"id", new IntType(false)), +new RowType.RowField( +"name", + SerializationUtils + .newStringType(false)), +new RowType.RowField( +"type", + SerializationUtils + .newStringType(false)), +new RowType.RowField( + "description", + SerializationUtils + .newStringType( + true)), +new RowType.RowField( +"partition_keys", +
[jira] [Closed] (FLINK-30207) Move split initialization and discovery logic fully into SnapshotEnumerator in Table Store
[ https://issues.apache.org/jira/browse/FLINK-30207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee closed FLINK-30207. Resolution: Fixed master: 23dad789879735249651992ab9fe23b986ef1564 > Move split initialization and discovery logic fully into SnapshotEnumerator > in Table Store > -- > > Key: FLINK-30207 > URL: https://issues.apache.org/jira/browse/FLINK-30207 > Project: Flink > Issue Type: Sub-task > Components: Table Store >Affects Versions: table-store-0.3.0 >Reporter: Caizhi Weng >Assignee: Caizhi Weng >Priority: Major > Labels: pull-request-available > Fix For: table-store-0.3.0 > > > It is possible that the separated compact job is started long after the write > jobs, so compact job sources need a special split initialization logic: we > will find the latest COMPACT snapshot and start compacting right after this > snapshot. > However, the split initialization logic is currently coded into > {{FileStoreSource}}. We should extract this logic into > {{SnapshotEnumerator}} so that we can create our special > {{SnapshotEnumerator}} for compact sources.
[GitHub] [flink-table-store] JingsongLi merged pull request #411: [FLINK-30207] Move split initialization and discovery logic fully into SnapshotEnumerator in Table Store
JingsongLi merged PR #411: URL: https://github.com/apache/flink-table-store/pull/411
[GitHub] [flink-table-store] JingsongLi commented on a diff in pull request #410: [FLINK-30247] Introduce time travel reading for table store
JingsongLi commented on code in PR #410: URL: https://github.com/apache/flink-table-store/pull/410#discussion_r1039089223 ## flink-table-store-core/src/main/java/org/apache/flink/table/store/file/operation/AbstractFileStoreScan.java: ## @@ -184,10 +203,14 @@ public Plan plan() { Long snapshotId = specifiedSnapshotId; if (manifests == null) { if (snapshotId == null) { -snapshotId = -readCompacted -? snapshotManager.latestCompactedSnapshotId() -: snapshotManager.latestSnapshotId(); +if (timestampMills == null) { +snapshotId = +readCompacted +? snapshotManager.latestCompactedSnapshotId() +: snapshotManager.latestSnapshotId(); +} else { +snapshotId = snapshotManager.earlierThanTimeMills(timestampMills); Review Comment: Maybe we should use `earlier Than or equals`?
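The "earlier than or equals" semantics suggested in the review can be sketched as a binary search over snapshots ordered by commit time. This is a standalone illustration under the assumption that snapshot ids and commit times are both ascending; the actual `SnapshotManager` implementation may differ:

```java
public class SnapshotLookup {

    // Return the id of the latest snapshot whose commit time is <= timestampMills,
    // or null if every snapshot is newer than the requested time.
    // Assumes snapshotIds[i] was committed at commitTimes[i], both ascending.
    public static Long earlierOrEqualTimeMills(
            long[] snapshotIds, long[] commitTimes, long timestampMills) {
        int lo = 0;
        int hi = commitTimes.length - 1;
        Long result = null;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (commitTimes[mid] <= timestampMills) {
                result = snapshotIds[mid]; // candidate; try to find a later one
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return result;
    }
}
```

The `<=` comparison is what makes a query for exactly a snapshot's commit time return that snapshot rather than its predecessor, which is the behavior the comment asks for.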
[jira] [Created] (FLINK-30293) Create an enumerator for static (batch)
Jingsong Lee created FLINK-30293: Summary: Create an enumerator for static (batch) Key: FLINK-30293 URL: https://issues.apache.org/jira/browse/FLINK-30293 Project: Flink Issue Type: Improvement Components: Table Store Reporter: Jingsong Lee Fix For: table-store-0.3.0 In FLINK-30207, we created an enumerator for continuous reading. We should have an enumerator for static (batch) reading as well. For example, for the current read-compacted, time traveling may specify the commit time to read snapshots in the future. I think these capabilities need to be in the core, but should they be in scan? (It seems they should not.)
[GitHub] [flink-table-store] JingsongLi commented on a diff in pull request #416: [FLINK-30271] Introduce Table.copy from dynamic options
JingsongLi commented on code in PR #416: URL: https://github.com/apache/flink-table-store/pull/416#discussion_r1039087564 ## flink-table-store-core/src/main/java/org/apache/flink/table/store/table/AbstractFileStoreTable.java: ## @@ -40,6 +47,43 @@ public AbstractFileStoreTable(Path path, TableSchema tableSchema) { protected abstract FileStore store(); +protected abstract FileStoreTable copy(TableSchema newTableSchema); + +@Override +public FileStoreTable copy(Map dynamicOptions) { +// check option is not immutable +Map options = tableSchema.options(); +dynamicOptions.forEach( +(k, v) -> { +if (!Objects.equals(v, options.get(k))) { Review Comment: Immutable fields may exist in the dynamic options. For example, for Flink, the planner just passes options merged from the table options and the dynamic options.
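The point of the review is that an immutable key may legitimately appear in the dynamic options with an unchanged value, because the planner hands over a map merged from table and dynamic options; only a *changed* value should be rejected. A self-contained sketch of that check (names are hypothetical, not the actual `AbstractFileStoreTable` code):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

public class DynamicOptions {

    // Merge dynamic options into the table options. An immutable key is
    // tolerated as long as its value equals the current one (the planner may
    // pass the full merged option map), but changing it is an error.
    public static Map<String, String> merge(
            Map<String, String> tableOptions,
            Map<String, String> dynamicOptions,
            Set<String> immutableKeys) {
        Map<String, String> merged = new HashMap<>(tableOptions);
        dynamicOptions.forEach(
                (k, v) -> {
                    if (immutableKeys.contains(k) && !Objects.equals(v, tableOptions.get(k))) {
                        throw new IllegalArgumentException(
                                "Cannot modify immutable option: " + k);
                    }
                    merged.put(k, v);
                });
        return merged;
    }
}
```

Comparing against the current value rather than rejecting every immutable key outright is exactly what distinguishes "the planner echoed an option back" from "the user tried to change it".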
[GitHub] [flink] Myracle commented on pull request #21347: [FLINK-29640][Client/Job Submission]Enhance the function configured by execution.shutdown-on-attached-exi…
Myracle commented on PR #21347: URL: https://github.com/apache/flink/pull/21347#issuecomment-1336607971 @flinkbot run azure
[jira] [Created] (FLINK-30292) Better support for conversion between DataType and TypeInformation
Yunfeng Zhou created FLINK-30292: Summary: Better support for conversion between DataType and TypeInformation Key: FLINK-30292 URL: https://issues.apache.org/jira/browse/FLINK-30292 Project: Flink Issue Type: Improvement Components: Table SQL / API Affects Versions: 1.15.3 Reporter: Yunfeng Zhou In Flink 1.15, we have the following ways to convert a DataType to a TypeInformation. Each of them has some disadvantages. * `TypeConversions.fromDataTypeToLegacyInfo` It might lead to precision loss for some data types like timestamp. It has been deprecated. * `ExternalTypeInfo.of` It cannot be used to get detailed type information like `RowTypeInfo`. It might bring some serialization overhead. Given that neither of the ways mentioned above is perfect, Flink SQL should provide a better API to support DataType-TypeInformation conversions, and thus better support Table-DataStream conversions.
[GitHub] [flink] flinkbot commented on pull request #21450: [FLINK-30291][Connector/DynamoDB] Update docs to render DynamoDB conn…
flinkbot commented on PR #21450: URL: https://github.com/apache/flink/pull/21450#issuecomment-1336574515 ## CI report: * c89ac59bba265345ff5d3757492ea9bd8698e3df UNKNOWN
[GitHub] [flink] dannycranmer opened a new pull request, #21450: [FLINK-30291][Connector/DynamoDB] Update docs to render DynamoDB conn…
dannycranmer opened a new pull request, #21450: URL: https://github.com/apache/flink/pull/21450 ## What is the purpose of the change Integrate flink-connector-aws into Flink docs Related https://github.com/apache/flink/pull/21449 ## Brief change log - Add a new short_code (`sql_connector_download_table`) to render SQL connector info for externalized connectors - Integrate AWS connectors, flink-connector-dynamodb ## Verifying this change Rendered web pages locally ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): no - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no - The serializers: no - The runtime per-record code paths (performance sensitive): no - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no - The S3 file system connector: no ## Documentation - Does this pull request introduce a new feature? no - If yes, how is the feature documented? not applicable
[GitHub] [flink] flinkbot commented on pull request #21449: [FLINK-30291][Connector/DynamoDB] Update docs to render DynamoDB conn…
flinkbot commented on PR #21449: URL: https://github.com/apache/flink/pull/21449#issuecomment-1336572479 ## CI report: * 73129e07b3e7076ed0b0bd2dbe9c89ac8b860972 UNKNOWN
[jira] [Updated] (FLINK-30291) Integrate flink-connector-aws into Flink docs
[ https://issues.apache.org/jira/browse/FLINK-30291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-30291: --- Labels: pull-request-available (was: ) > Integrate flink-connector-aws into Flink docs > - > > Key: FLINK-30291 > URL: https://issues.apache.org/jira/browse/FLINK-30291 > Project: Flink > Issue Type: Technical Debt > Components: Connectors / AWS, Documentation >Reporter: Danny Cranmer >Assignee: Danny Cranmer >Priority: Major > Labels: pull-request-available > Fix For: 1.17.0, 1.16.1 > > > Update the docs render to integrate {{{}flink-connector-aws{}}}. > Add a new shortcode to handle rendering the SQL connector correctly
[GitHub] [flink] dannycranmer opened a new pull request, #21449: [FLINK-30291][Connector/DynamoDB] Update docs to render DynamoDB conn…
dannycranmer opened a new pull request, #21449: URL: https://github.com/apache/flink/pull/21449 ## What is the purpose of the change Integrate flink-connector-aws into Flink docs ## Brief change log - Add a new short_code (`sql_connector_download_table`) to render SQL connector info for externalized connectors - Integrate AWS connectors, flink-connector-dynamodb ## Verifying this change Rendered web pages locally ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): no - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no - The serializers: no - The runtime per-record code paths (performance sensitive): no - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no - The S3 file system connector: no ## Documentation - Does this pull request introduce a new feature? no - If yes, how is the feature documented? not applicable
[jira] [Assigned] (FLINK-30291) Integrate flink-connector-aws into Flink docs
[ https://issues.apache.org/jira/browse/FLINK-30291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Cranmer reassigned FLINK-30291: - Assignee: Danny Cranmer > Integrate flink-connector-aws into Flink docs > - > > Key: FLINK-30291 > URL: https://issues.apache.org/jira/browse/FLINK-30291 > Project: Flink > Issue Type: Technical Debt > Components: Connectors / AWS, Documentation >Reporter: Danny Cranmer >Assignee: Danny Cranmer >Priority: Major > Fix For: 1.17.0, 1.16.1 > > > Update the docs render to integrate {{{}flink-connector-aws{}}}. > Add a new shortcode to handle rendering the SQL connector correctly
[jira] [Created] (FLINK-30291) Integrate flink-connector-aws into Flink docs
Danny Cranmer created FLINK-30291: - Summary: Integrate flink-connector-aws into Flink docs Key: FLINK-30291 URL: https://issues.apache.org/jira/browse/FLINK-30291 Project: Flink Issue Type: Technical Debt Components: Connectors / AWS, Documentation Reporter: Danny Cranmer Fix For: 1.17.0, 1.16.1 Update the docs render to integrate {{{}flink-connector-aws{}}}. Add a new shortcode to handle rendering the SQL connector correctly
[GitHub] [flink-connector-aws] dannycranmer opened a new pull request, #35: [FLINK-29907] Externalize Kinesis/Firehose documentation
dannycranmer opened a new pull request, #35: URL: https://github.com/apache/flink-connector-aws/pull/35 Copy over documentation for Kinesis and Firehose from the Flink repo
[jira] [Resolved] (FLINK-29109) Checkpoint path conflict with stateless upgrade mode
[ https://issues.apache.org/jira/browse/FLINK-29109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Weise resolved FLINK-29109. -- Resolution: Fixed > Checkpoint path conflict with stateless upgrade mode > > > Key: FLINK-29109 > URL: https://issues.apache.org/jira/browse/FLINK-29109 > Project: Flink > Issue Type: Bug > Components: Kubernetes Operator >Affects Versions: kubernetes-operator-1.1.0 >Reporter: Thomas Weise >Assignee: Thomas Weise >Priority: Major > Labels: pull-request-available > Fix For: kubernetes-operator-1.3.0, kubernetes-operator-1.2.0 > > > A stateful job with stateless upgrade mode (yes, there are such use cases) > fails with checkpoint path conflict due to constant jobId and FLINK-19358 > (applies to Flink < 1.16x). Since with stateless upgrade mode the checkpoint > id resets on restart the job is going to write to previously used locations > and fail. The workaround is to rotate the jobId on every redeploy when the > upgrade mode is stateless. While this can be worked around externally it is > best done in the operator itself because reconciliation resolves when a > restart is actually required while rotating jobId externally may trigger > unnecessary restarts.
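The mitigation described in the issue — rotate the jobId on redeploy when the upgrade mode is stateless, keep it stable otherwise — can be sketched in a few lines. This is an illustrative standalone helper, not the operator's actual reconciliation code:

```java
import java.util.UUID;

public class JobIdRotation {

    // With a STATELESS upgrade mode the checkpoint counter resets on restart,
    // so reusing the previous jobId would reuse old checkpoint paths and
    // collide. Rotating the id on every deploy avoids the conflict; stateful
    // modes keep a stable jobId so state can be resumed.
    public static String jobIdForDeploy(String upgradeMode, String previousJobId) {
        if (previousJobId == null || "stateless".equalsIgnoreCase(upgradeMode)) {
            // 32 hex chars, the usual textual JobID shape
            return UUID.randomUUID().toString().replace("-", "");
        }
        return previousJobId;
    }
}
```

Doing this inside the operator, as the issue argues, means the id only rotates when reconciliation decides a restart is actually needed, instead of on every external spec change.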
[jira] [Updated] (FLINK-29109) Checkpoint path conflict with stateless upgrade mode
[ https://issues.apache.org/jira/browse/FLINK-29109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Weise updated FLINK-29109: - Fix Version/s: kubernetes-operator-1.3.0 > Checkpoint path conflict with stateless upgrade mode > > > Key: FLINK-29109 > URL: https://issues.apache.org/jira/browse/FLINK-29109 > Project: Flink > Issue Type: Bug > Components: Kubernetes Operator >Affects Versions: kubernetes-operator-1.1.0 >Reporter: Thomas Weise >Assignee: Thomas Weise >Priority: Major > Labels: pull-request-available > Fix For: kubernetes-operator-1.2.0, kubernetes-operator-1.3.0 > > > A stateful job with stateless upgrade mode (yes, there are such use cases) > fails with checkpoint path conflict due to constant jobId and FLINK-19358 > (applies to Flink < 1.16x). Since with stateless upgrade mode the checkpoint > id resets on restart the job is going to write to previously used locations > and fail. The workaround is to rotate the jobId on every redeploy when the > upgrade mode is stateless. While this can be worked around externally it is > best done in the operator itself because reconciliation resolves when a > restart is actually required while rotating jobId externally may trigger > unnecessary restarts.
[jira] [Updated] (FLINK-30287) Configmaps get cleaned up when upgrading standalone Flink cluster
[ https://issues.apache.org/jira/browse/FLINK-30287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-30287: - Affects Version/s: (was: 1.2) > Configmaps get cleaned up when upgrading standalone Flink cluster > - > > Key: FLINK-30287 > URL: https://issues.apache.org/jira/browse/FLINK-30287 > Project: Flink > Issue Type: Bug > Components: Kubernetes Operator >Reporter: Steven Zhang >Priority: Major > Labels: pull-request-available > > I started up a Standalone session Flink cluster and ran one job on it. I > checked the configMaps and see the HA data. > > {code:java} > kubectl get configmaps -n flink-operator > NAME > DATA AGE > flink-config-sql-example-deployment-s3-testing > 2 5m41s > flink-operator-config > 3 42h > kube-root-ca.crt > 1 42h > sql-example-deployment-s3-testing-3f57cd5f0002-config-map > 0 11m > sql-example-deploymnt-s3-testing-cluster-config-map > 5 12m > {code} > > I then update the FlinkDep image field and the Flink cluster gets restarted. > The HA configmap for the job is now gone. > {code:java} > kubectl get configmaps -n flink-operator > NAME DATA AGE > flink-config-sql-example-deployment-s3-testing 2 18m > flink-operator-config 3 43h > kube-root-ca.crt 1 43h > sql-example-deployment-s3-testing-cluster-config-map 3 31m {code} > > I think this is due to a race condition where the TM first terminates which > causes the JM to interpret the Job entering a failed state which causes it to > clean up the configmaps.
[jira] [Created] (FLINK-30290) IteratorSourceReaderBase should report END_OF_INPUT sooner
Chesnay Schepler created FLINK-30290: Summary: IteratorSourceReaderBase should report END_OF_INPUT sooner Key: FLINK-30290 URL: https://issues.apache.org/jira/browse/FLINK-30290 Project: Flink Issue Type: Technical Debt Components: API / Core Affects Versions: 1.17.0 Reporter: Chesnay Schepler Assignee: Chesnay Schepler Fix For: 1.17.0 The iterator source reader base does not report end_of_input when the last value was emitted, but instead requires an additional call to pollNext to be made. This is fine functionality-wise, and allowed by the source reader API contracts, but it's not intuitive behavior and leaks into tests for the datagen source.
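The behavior the ticket asks for can be illustrated with a toy reader: report end-of-input together with the final record instead of on the following poll, which removes the extra call. `Status` here is a local stand-in for Flink's `InputStatus`, and the whole class is a sketch, not the actual `IteratorSourceReaderBase`:

```java
import java.util.Iterator;
import java.util.List;

public class EagerEndOfInput {

    // Local stand-in for Flink's InputStatus.
    public enum Status { MORE_AVAILABLE, END_OF_INPUT }

    // Emit the next record and, if the iterator is now exhausted, report
    // END_OF_INPUT in the same call rather than requiring one more poll.
    public static <T> Status pollNext(Iterator<T> iterator, List<T> output) {
        if (!iterator.hasNext()) {
            return Status.END_OF_INPUT;
        }
        output.add(iterator.next());
        return iterator.hasNext() ? Status.MORE_AVAILABLE : Status.END_OF_INPUT;
    }
}
```

For a two-element source, the second call both emits the last record and signals END_OF_INPUT; the original behavior would have needed a third call that emits nothing.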
[jira] [Created] (FLINK-30289) RateLimitedSourceReader uses wrong signal for checkpoint rate-limiting
Chesnay Schepler created FLINK-30289: Summary: RateLimitedSourceReader uses wrong signal for checkpoint rate-limiting Key: FLINK-30289 URL: https://issues.apache.org/jira/browse/FLINK-30289 Project: Flink Issue Type: Bug Components: API / Core Affects Versions: 1.17.0 Reporter: Chesnay Schepler Assignee: Chesnay Schepler Fix For: 1.17.0 The checkpoint rate limiter is notified when the checkpoint is complete, but since this signal comes at some point in the future (or not at all) it can result in no records being emitted for a checkpoint, or more records than expected being emitted.
[jira] [Closed] (FLINK-30202) RateLimitedSourceReader may emit one more record than permitted
[ https://issues.apache.org/jira/browse/FLINK-30202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler closed FLINK-30202. Fix Version/s: 1.17.0 Resolution: Fixed master: 81ed6c649e1a219c457628c54a7165d75b803474 > RateLimitedSourceReader may emit one more record than permitted > --- > > Key: FLINK-30202 > URL: https://issues.apache.org/jira/browse/FLINK-30202 > Project: Flink > Issue Type: Bug > Components: Connectors / Common >Affects Versions: 1.17.0 >Reporter: Matthias Pohl >Assignee: Chesnay Schepler >Priority: Major > Labels: pull-request-available, test-stability > Fix For: 1.17.0 > > > [This > build|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=43483=logs=5cae8624-c7eb-5c51-92d3-4d2dacedd221=5acec1b4-945b-59ca-34f8-168928ce5199=24747] > failed due to a failed assertion in > {{{}DataGeneratorSourceITCase.testGatedRateLimiter{}}}: > {code:java} > Nov 25 03:26:45 org.opentest4j.AssertionFailedError: > Nov 25 03:26:45 > Nov 25 03:26:45 expected: 2 > Nov 25 03:26:45 but was: 1 {code}
[GitHub] [flink] zentol merged pull request #21416: [FLINK-30202][tests] Do not assert on checkpointId
zentol merged PR #21416: URL: https://github.com/apache/flink/pull/21416
[jira] [Updated] (FLINK-27184) Optimize IntervalJoinOperator by using binary sorted state
[ https://issues.apache.org/jira/browse/FLINK-27184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Anderson updated FLINK-27184: --- Summary: Optimize IntervalJoinOperator by using binary sorted state (was: Optimize IntervalJoinOperator by using temporal state) > Optimize IntervalJoinOperator by using binary sorted state > -- > > Key: FLINK-27184 > URL: https://issues.apache.org/jira/browse/FLINK-27184 > Project: Flink > Issue Type: Sub-task >Reporter: David Anderson >Priority: Major > > The performance of interval joins on RocksDB can be optimized by using > temporal state.
[jira] [Updated] (FLINK-27184) Optimize IntervalJoinOperator by using binary sorted state
[ https://issues.apache.org/jira/browse/FLINK-27184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Anderson updated FLINK-27184: --- Description: The performance of interval joins on RocksDB can be optimized by using binary sorted state. (was: The performance of interval joins on RocksDB can be optimized by using temporal state.) > Optimize IntervalJoinOperator by using binary sorted state > -- > > Key: FLINK-27184 > URL: https://issues.apache.org/jira/browse/FLINK-27184 > Project: Flink > Issue Type: Sub-task >Reporter: David Anderson >Priority: Major > > The performance of interval joins on RocksDB can be optimized by using binary > sorted state. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-27182) Optimize RowTimeSortOperator by using binary sorted state
[ https://issues.apache.org/jira/browse/FLINK-27182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Anderson updated FLINK-27182: --- Summary: Optimize RowTimeSortOperator by using binary sorted state (was: Optimize RowTimeSortOperator by using temporal state) > Optimize RowTimeSortOperator by using binary sorted state > - > > Key: FLINK-27182 > URL: https://issues.apache.org/jira/browse/FLINK-27182 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / Runtime >Reporter: David Anderson >Priority: Major > > The performance of the RowTimeSortOperator can be significantly improved by > using temporal state. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-27183) Optimize CepOperator by using temporal state
[ https://issues.apache.org/jira/browse/FLINK-27183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Anderson updated FLINK-27183: --- Description: The performance of CEP on RocksDB can be significantly improved by having it use binary sorted state. (was: The performance of CEP on RocksDB can be significantly improved by having it use temporal state.) > Optimize CepOperator by using temporal state > > > Key: FLINK-27183 > URL: https://issues.apache.org/jira/browse/FLINK-27183 > Project: Flink > Issue Type: Sub-task > Components: Library / CEP >Reporter: David Anderson >Priority: Major > > The performance of CEP on RocksDB can be significantly improved by having it > use binary sorted state. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-27183) Optimize CepOperator by using binary sorted state
[ https://issues.apache.org/jira/browse/FLINK-27183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Anderson updated FLINK-27183: --- Summary: Optimize CepOperator by using binary sorted state (was: Optimize CepOperator by using temporal state) > Optimize CepOperator by using binary sorted state > - > > Key: FLINK-27183 > URL: https://issues.apache.org/jira/browse/FLINK-27183 > Project: Flink > Issue Type: Sub-task > Components: Library / CEP >Reporter: David Anderson >Priority: Major > > The performance of CEP on RocksDB can be significantly improved by having it > use binary sorted state. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-27182) Optimize RowTimeSortOperator by using binary sorted state
[ https://issues.apache.org/jira/browse/FLINK-27182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Anderson updated FLINK-27182: --- Description: The performance of the RowTimeSortOperator can be significantly improved by using binary sorted state. (was: The performance of the RowTimeSortOperator can be significantly improved by using temporal state.) > Optimize RowTimeSortOperator by using binary sorted state > - > > Key: FLINK-27182 > URL: https://issues.apache.org/jira/browse/FLINK-27182 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / Runtime >Reporter: David Anderson >Priority: Major > > The performance of the RowTimeSortOperator can be significantly improved by > using binary sorted state. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-27181) Optimize TemporalRowTimeJoinOperator by using temporal state
[ https://issues.apache.org/jira/browse/FLINK-27181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Anderson updated FLINK-27181: --- Description: The throughput of the TemporalRowTimeJoinOperator can be significantly improved by using binary sorted state in its implementation. (was: The throughput of the TemporalRowTimeJoinOperator can be significantly improved by using temporal state in its implementation.) > Optimize TemporalRowTimeJoinOperator by using temporal state > > > Key: FLINK-27181 > URL: https://issues.apache.org/jira/browse/FLINK-27181 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / Runtime >Reporter: David Anderson >Priority: Major > > The throughput of the TemporalRowTimeJoinOperator can be significantly > improved by using binary sorted state in its implementation. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-27181) Optimize TemporalRowTimeJoinOperator by using binary sorted state
[ https://issues.apache.org/jira/browse/FLINK-27181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Anderson updated FLINK-27181: --- Summary: Optimize TemporalRowTimeJoinOperator by using binary sorted state (was: Optimize TemporalRowTimeJoinOperator by using temporal state) > Optimize TemporalRowTimeJoinOperator by using binary sorted state > - > > Key: FLINK-27181 > URL: https://issues.apache.org/jira/browse/FLINK-27181 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / Runtime >Reporter: David Anderson >Priority: Major > > The throughput of the TemporalRowTimeJoinOperator can be significantly > improved by using binary sorted state in its implementation. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-27179) Update training example to use temporal state
[ https://issues.apache.org/jira/browse/FLINK-27179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Anderson closed FLINK-27179. -- Resolution: Won't Do > Update training example to use temporal state > - > > Key: FLINK-27179 > URL: https://issues.apache.org/jira/browse/FLINK-27179 > Project: Flink > Issue Type: Sub-task > Components: Documentation / Training >Reporter: David Anderson >Priority: Major > > [https://nightlies.apache.org/flink/flink-docs-master/docs/learn-flink/event_driven/#example] > is a good use case for temporal state (this example is doing windowing in a > keyed process function). -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-27180) Docs for binary sorted state
[ https://issues.apache.org/jira/browse/FLINK-27180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Anderson updated FLINK-27180: --- Description: Update [https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/fault-tolerance/state/#using-keyed-state] to include binary sorted state. was: Update [https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/fault-tolerance/state/#using-keyed-state] to include temporal state. > Docs for binary sorted state > > > Key: FLINK-27180 > URL: https://issues.apache.org/jira/browse/FLINK-27180 > Project: Flink > Issue Type: Sub-task > Components: Documentation >Reporter: David Anderson >Priority: Major > > Update > [https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/fault-tolerance/state/#using-keyed-state] > > to include binary sorted state. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-27180) Docs for binary sorted state
[ https://issues.apache.org/jira/browse/FLINK-27180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Anderson updated FLINK-27180: --- Summary: Docs for binary sorted state (was: Docs for temporal state) > Docs for binary sorted state > > > Key: FLINK-27180 > URL: https://issues.apache.org/jira/browse/FLINK-27180 > Project: Flink > Issue Type: Sub-task > Components: Documentation >Reporter: David Anderson >Priority: Major > > Update > [https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/fault-tolerance/state/#using-keyed-state] > > to include temporal state. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-27178) create examples that use binary sorted state
[ https://issues.apache.org/jira/browse/FLINK-27178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Anderson updated FLINK-27178: --- Summary: create examples that use binary sorted state (was: create examples that use temporal state) > create examples that use binary sorted state > > > Key: FLINK-27178 > URL: https://issues.apache.org/jira/browse/FLINK-27178 > Project: Flink > Issue Type: Sub-task > Components: Examples >Reporter: David Anderson >Priority: Major > > Add examples showing how to use temporal state. E.g., sorting and/or a > temporal join. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-27178) create examples that use binary sorted state
[ https://issues.apache.org/jira/browse/FLINK-27178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Anderson updated FLINK-27178: --- Description: Add examples showing how to use binary sorted state. E.g., for custom windowing, or sorting. (was: Add examples showing how to use temporal state. E.g., sorting and/or a temporal join.) > create examples that use binary sorted state > > > Key: FLINK-27178 > URL: https://issues.apache.org/jira/browse/FLINK-27178 > Project: Flink > Issue Type: Sub-task > Components: Examples >Reporter: David Anderson >Priority: Major > > Add examples showing how to use binary sorted state. E.g., for custom > windowing, or sorting. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-27177) Implement Binary Sorted State
[ https://issues.apache.org/jira/browse/FLINK-27177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Anderson updated FLINK-27177: --- Summary: Implement Binary Sorted State (was: Implement Temporal State) > Implement Binary Sorted State > - > > Key: FLINK-27177 > URL: https://issues.apache.org/jira/browse/FLINK-27177 > Project: Flink > Issue Type: Sub-task > Components: Runtime / State Backends >Reporter: David Anderson >Priority: Major > > Following the plan in > [FLIP-220|https://cwiki.apache.org/confluence/x/Xo_FD]: > * add methods to the RuntimeContext and KeyedStateStore interfaces for > registering TemporalValueState and TemporalListState > * implement TemporalValueState and TemporalListState
[jira] [Updated] (FLINK-27177) Implement Binary Sorted State
[ https://issues.apache.org/jira/browse/FLINK-27177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Anderson updated FLINK-27177: --- Description: Following the plan in [FLIP-220|https://cwiki.apache.org/confluence/x/Xo_FD] (was: Following the plan in [FLIP-220|https://cwiki.apache.org/confluence/x/Xo_FD]: * add methods to the RuntimeContext and KeyedStateStore interfaces for registering TemporalValueState and TemporalListState * implement TemporalValueState and TemporalListState) > Implement Binary Sorted State > - > > Key: FLINK-27177 > URL: https://issues.apache.org/jira/browse/FLINK-27177 > Project: Flink > Issue Type: Sub-task > Components: Runtime / State Backends >Reporter: David Anderson >Priority: Major > > Following the plan in > [FLIP-220|https://cwiki.apache.org/confluence/x/Xo_FD]
[jira] [Updated] (FLINK-27176) FLIP-220: Binary Sorted State
[ https://issues.apache.org/jira/browse/FLINK-27176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Anderson updated FLINK-27176: --- Description: Task for implementing FLIP-220: Binary Sorted State: [https://cwiki.apache.org/confluence/x/Xo_FD] (was: Task for implementing FLIP-220: Temporal State: [https://cwiki.apache.org/confluence/x/Xo_FD]) > FLIP-220: Binary Sorted State > - > > Key: FLINK-27176 > URL: https://issues.apache.org/jira/browse/FLINK-27176 > Project: Flink > Issue Type: Improvement > Components: Runtime / State Backends >Reporter: David Anderson >Assignee: David Anderson >Priority: Major > > Task for implementing FLIP-220: Binary Sorted State: > [https://cwiki.apache.org/confluence/x/Xo_FD] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-27176) FLIP-220: Binary Sorted State
[ https://issues.apache.org/jira/browse/FLINK-27176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Anderson updated FLINK-27176: --- Release Note: Added BinarySortedState to optimize state accesses and updates used for sorting, temporal joins, and custom windowing. (was: Added TemporalValueState and TemporalListState to optimize state accesses and updates used for sorting, joins, and windowing. ) > FLIP-220: Binary Sorted State > - > > Key: FLINK-27176 > URL: https://issues.apache.org/jira/browse/FLINK-27176 > Project: Flink > Issue Type: Improvement > Components: Runtime / State Backends >Reporter: David Anderson >Assignee: David Anderson >Priority: Major > > Task for implementing FLIP-220: Temporal State: > [https://cwiki.apache.org/confluence/x/Xo_FD] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-27176) FLIP-220: Binary Sorted State
[ https://issues.apache.org/jira/browse/FLINK-27176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Anderson updated FLINK-27176: --- Summary: FLIP-220: Binary Sorted State (was: FLIP-220: Temporal State) > FLIP-220: Binary Sorted State > - > > Key: FLINK-27176 > URL: https://issues.apache.org/jira/browse/FLINK-27176 > Project: Flink > Issue Type: Improvement > Components: Runtime / State Backends >Reporter: David Anderson >Priority: Major > > Task for implementing FLIP-220: Temporal State: > [https://cwiki.apache.org/confluence/x/Xo_FD] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-27176) FLIP-220: Binary Sorted State
[ https://issues.apache.org/jira/browse/FLINK-27176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Anderson reassigned FLINK-27176: -- Assignee: David Anderson > FLIP-220: Binary Sorted State > - > > Key: FLINK-27176 > URL: https://issues.apache.org/jira/browse/FLINK-27176 > Project: Flink > Issue Type: Improvement > Components: Runtime / State Backends >Reporter: David Anderson >Assignee: David Anderson >Priority: Major > > Task for implementing FLIP-220: Temporal State: > [https://cwiki.apache.org/confluence/x/Xo_FD] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink] hdulay commented on a diff in pull request #21436: [FLINK-30093] [formats] Protobuf Timestamp Compile Error
hdulay commented on code in PR #21436: URL: https://github.com/apache/flink/pull/21436#discussion_r1039021869 ## flink-formats/flink-protobuf/src/test/java/org/apache/flink/formats/protobuf/Pb3ToRowTest.java: ## @@ -85,6 +88,8 @@ public void testDeserialization() throws Exception { assertEquals(1, rowData.getInt(0)); assertEquals(2L, rowData.getLong(1)); + +assertEquals(ts.getNanos(), row.getTimestamp(11, 3).getNanoOfMillisecond()); Review Comment: @MartijnVisser Hi. I'm suggesting adding support for a third-party protobuf datatype, which needs a mapping to either a RowType or a SimpleType in the protobuf codegen components. I'm assuming this is the correct approach. Thanks
[GitHub] [flink] MartijnVisser commented on a diff in pull request #21436: [FLINK-30093] [formats] Protobuf Timestamp Compile Error
MartijnVisser commented on code in PR #21436: URL: https://github.com/apache/flink/pull/21436#discussion_r1039012360 ## flink-formats/flink-protobuf/src/test/java/org/apache/flink/formats/protobuf/Pb3ToRowTest.java: ## @@ -85,6 +88,8 @@ public void testDeserialization() throws Exception { assertEquals(1, rowData.getInt(0)); assertEquals(2L, rowData.getLong(1)); + +assertEquals(ts.getNanos(), row.getTimestamp(11, 3).getNanoOfMillisecond()); Review Comment: Just to step in: adding a new datatype requires a FLIP and a discussion. Why can't any of the existing datatypes be reused?
[jira] [Closed] (FLINK-30186) Bump Flink version to 1.15.3
[ https://issues.apache.org/jira/browse/FLINK-30186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gyula Fora closed FLINK-30186. -- Resolution: Fixed Merged to main 3c5ec6cd86de61752c708eba385646a7c1164880 > Bump Flink version to 1.15.3 > > > Key: FLINK-30186 > URL: https://issues.apache.org/jira/browse/FLINK-30186 > Project: Flink > Issue Type: Improvement > Components: Kubernetes Operator >Affects Versions: kubernetes-operator-1.3.0 >Reporter: Jeff Yang >Assignee: Jeff Yang >Priority: Major > Labels: pull-request-available > Fix For: kubernetes-operator-1.3.0 > > > A new stable version of Flink has been released; bump the Flink version to 1.15.3.
[GitHub] [flink-kubernetes-operator] gyfora merged pull request #451: [FLINK-30186] Bump Flink version to 1.15.3
gyfora merged PR #451: URL: https://github.com/apache/flink-kubernetes-operator/pull/451
[GitHub] [flink] hdulay commented on a diff in pull request #21436: [FLINK-30093] [formats] Protobuf Timestamp Compile Error
hdulay commented on code in PR #21436: URL: https://github.com/apache/flink/pull/21436#discussion_r1039003583 ## flink-formats/flink-protobuf/src/test/java/org/apache/flink/formats/protobuf/Pb3ToRowTest.java: ## @@ -85,6 +88,8 @@ public void testDeserialization() throws Exception { assertEquals(1, rowData.getInt(0)); assertEquals(2L, rowData.getLong(1)); + +assertEquals(ts.getNanos(), row.getTimestamp(11, 3).getNanoOfMillisecond()); Review Comment: @maosuhan Yes, as I stepped through the ProtobufSQLITCaseTest to add a timestamp field, I realized this issue goes beyond just source generation and compilation errors. Is there an example of how to add a new data type? I'm assuming I'll need to somehow map it to the TimestampType logical type. Thanks
[GitHub] [flink-table-store] zjureel commented on a diff in pull request #416: [FLINK-30271] Introduce Table.copy from dynamic options
zjureel commented on code in PR #416: URL: https://github.com/apache/flink-table-store/pull/416#discussion_r1038981579 ## flink-table-store-core/src/main/java/org/apache/flink/table/store/table/AbstractFileStoreTable.java: ## @@ -40,6 +47,43 @@ public AbstractFileStoreTable(Path path, TableSchema tableSchema) { protected abstract FileStore store(); +protected abstract FileStoreTable copy(TableSchema newTableSchema); + +@Override +public FileStoreTable copy(Map dynamicOptions) { +// check option is not immutable +Map options = tableSchema.options(); +dynamicOptions.forEach( +(k, v) -> { +if (!Objects.equals(v, options.get(k))) { Review Comment: Should the immutable field exist in the dynamic parameters? Maybe it's better to call `SchemaManager.checkAlterTableOption(k)` without comparing the value.
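The review suggestion above — reject an immutable key outright rather than comparing its value against the current option map — can be sketched as follows. This is a hypothetical standalone version, not the table store's actual `AbstractFileStoreTable.copy`; the option names in `IMMUTABLE` are placeholders, as the real set lives in `SchemaManager`.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class CopyCheck {
    // Placeholder set of immutable option keys (assumption for illustration).
    static final Set<String> IMMUTABLE = new HashSet<>(Arrays.asList("bucket", "merge-engine"));

    // Reject any dynamic option whose key is immutable, regardless of whether
    // the supplied value differs from the current one; otherwise merge.
    static Map<String, String> mergeOptions(Map<String, String> current, Map<String, String> dynamic) {
        for (String key : dynamic.keySet()) {
            if (IMMUTABLE.contains(key)) {
                throw new IllegalArgumentException("Cannot change immutable option: " + key);
            }
        }
        Map<String, String> merged = new HashMap<>(current);
        merged.putAll(dynamic);
        return merged;
    }
}
```

Checking the key alone keeps the behavior independent of whether the caller happens to pass the same value the table already has.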
[GitHub] [flink] maosuhan commented on a diff in pull request #21436: [FLINK-30093] [formats] Protobuf Timestamp Compile Error
maosuhan commented on code in PR #21436: URL: https://github.com/apache/flink/pull/21436#discussion_r1038980183 ## flink-formats/flink-protobuf/src/test/java/org/apache/flink/formats/protobuf/Pb3ToRowTest.java: ## @@ -85,6 +88,8 @@ public void testDeserialization() throws Exception { assertEquals(1, rowData.getInt(0)); assertEquals(2L, rowData.getLong(1)); + +assertEquals(ts.getNanos(), row.getTimestamp(11, 3).getNanoOfMillisecond()); Review Comment: @hdulay I think there is still a gap between protobuf and Flink's internal memory layout. I tried to build a concrete Timestamp like `Timestamp ts = Timestamp.newBuilder().setSeconds(xxx).setNanos(yyy).build()` in a unit test, and you can easily find the difference.
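The layout gap mentioned above comes down to how the sub-second part is split: protobuf's well-known `Timestamp` carries `(seconds, nanos)`, while Flink's `TimestampData` exposes `(epochMillis, nanoOfMillisecond)`. A minimal stdlib-only sketch of the conversion (an illustration of the arithmetic, not the converter the format would actually ship):

```java
public class TimestampGap {
    // Convert a protobuf-style (seconds, nanos-of-second) pair into epoch
    // milliseconds: whole milliseconds from the nanos field fold into millis.
    static long toEpochMillis(long seconds, int nanos) {
        return seconds * 1000L + nanos / 1_000_000;
    }

    // The remainder below one millisecond is what getNanoOfMillisecond() holds.
    static int toNanoOfMillisecond(int nanos) {
        return nanos % 1_000_000;
    }
}
```

For example, `seconds = 1, nanos = 2_000_123` maps to `epochMillis = 1002` and `nanoOfMillisecond = 123` — comparing `ts.getNanos()` directly against `getNanoOfMillisecond()`, as the diff above does, conflates the two representations.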
[GitHub] [flink-kubernetes-operator] gyfora commented on pull request #457: [docs] Add controller flow docs
gyfora commented on PR #457: URL: https://github.com/apache/flink-kubernetes-operator/pull/457#issuecomment-1336426415 I will remove them once I'm back from holiday; this is not urgent. Thank you!
[GitHub] [flink-table-store] zjureel commented on a diff in pull request #418: [FLINK-30273] Introduce RecordReaderUtils.transform to transform RecordReader
zjureel commented on code in PR #418: URL: https://github.com/apache/flink-table-store/pull/418#discussion_r1038979729 ## flink-table-store-core/src/main/java/org/apache/flink/table/store/file/utils/RecordReaderUtils.java: ## @@ -44,4 +45,49 @@ public static void forEachRemaining( reader.close(); } } + +/** + * Returns a {@link RecordReader} that applies {@code function} to each element of {@code + * fromReader}. + */ +public static <T, R> RecordReader<R> transform( +RecordReader<T> fromReader, Function<T, R> function) { +return new RecordReader<R>() { +@Override +public RecordIterator<R> readBatch() throws IOException { +RecordIterator<T> iterator = fromReader.readBatch(); +if (iterator == null) { +return null; +} +return transform(iterator, function); +} + +@Override +public void close() throws IOException { +fromReader.close(); +} +}; +} + +/** + * Returns an iterator that applies {@code function} to each element of {@code fromIterator}. + */ +public static <T, R> RecordReader.RecordIterator<R> transform( +RecordReader.RecordIterator<T> fromIterator, Function<T, R> function) { +return new RecordReader.RecordIterator<R>() { +@Override +public R next() throws IOException { Review Comment: Add @Nullable
[GitHub] [flink-table-store] zjureel commented on a diff in pull request #418: [FLINK-30273] Introduce RecordReaderUtils.transform to transform RecordReader
zjureel commented on code in PR #418: URL: https://github.com/apache/flink-table-store/pull/418#discussion_r1038979666 ## flink-table-store-core/src/main/java/org/apache/flink/table/store/file/utils/RecordReaderUtils.java: ## @@ -44,4 +45,49 @@ public static void forEachRemaining( reader.close(); } } + +/** + * Returns a {@link RecordReader} that applies {@code function} to each element of {@code + * fromReader}. + */ +public static <T, R> RecordReader<R> transform( +RecordReader<T> fromReader, Function<T, R> function) { +return new RecordReader<R>() { +@Override +public RecordIterator<R> readBatch() throws IOException { Review Comment: Add @Nullable?
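The null-propagation behavior behind both `@Nullable` review comments can be shown with a stripped-down stand-in for the PR's `RecordReader.RecordIterator` (the interface below is a simplified sketch, not the table store's real API): `next()` returns null when the batch is exhausted, and the transforming wrapper must pass that marker through unchanged rather than apply the function to it.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Function;

public class TransformSketch {
    // Simplified stand-in: next() returns null at end of batch,
    // which is exactly why the review asks for @Nullable annotations.
    interface BatchIterator<T> {
        T next();
    }

    // Adapt a plain java.util.Iterator to the null-terminated protocol.
    static <T> BatchIterator<T> fromIterator(Iterator<T> it) {
        return () -> it.hasNext() ? it.next() : null;
    }

    // Minimal version of the PR's transform: apply `function` to each record,
    // propagating the null end-of-batch marker without touching it.
    static <T, R> BatchIterator<R> transform(BatchIterator<T> from, Function<T, R> function) {
        return () -> {
            T record = from.next();
            return record == null ? null : function.apply(record);
        };
    }
}
```

The same pattern applies one level up in `readBatch()`: a null batch from the wrapped reader must surface as a null batch to the caller.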
[jira] [Closed] (FLINK-28766) UnalignedCheckpointStressITCase.runStressTest failed with NoSuchFileException
[ https://issues.apache.org/jira/browse/FLINK-28766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski closed FLINK-28766. -- Resolution: Fixed Merged to master as 1ddaa2a460d and 9e1cefe2509 > UnalignedCheckpointStressITCase.runStressTest failed with NoSuchFileException > - > > Key: FLINK-28766 > URL: https://issues.apache.org/jira/browse/FLINK-28766 > Project: Flink > Issue Type: Bug > Components: Runtime / Checkpointing >Affects Versions: 1.16.0, 1.17.0 >Reporter: Huang Xingbo >Assignee: Anton Kalashnikov >Priority: Critical > Labels: pull-request-available, test-stability > Fix For: 1.17.0 > > > {code:java} > 2022-08-01T01:36:16.0563880Z Aug 01 01:36:16 [ERROR] > org.apache.flink.test.checkpointing.UnalignedCheckpointStressITCase.runStressTest > Time elapsed: 12.579 s <<< ERROR! > 2022-08-01T01:36:16.0565407Z Aug 01 01:36:16 java.io.UncheckedIOException: > java.nio.file.NoSuchFileException: > /tmp/junit1058240190382532303/f0f99754a53d2c4633fed75011da58dd/chk-7/61092e4a-5b9a-4f56-83f7-d9960c53ed3e > 2022-08-01T01:36:16.0566296Z Aug 01 01:36:16 at > java.nio.file.FileTreeIterator.fetchNextIfNeeded(FileTreeIterator.java:88) > 2022-08-01T01:36:16.0566972Z Aug 01 01:36:16 at > java.nio.file.FileTreeIterator.hasNext(FileTreeIterator.java:104) > 2022-08-01T01:36:16.0567600Z Aug 01 01:36:16 at > java.util.Iterator.forEachRemaining(Iterator.java:115) > 2022-08-01T01:36:16.0568290Z Aug 01 01:36:16 at > java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801) > 2022-08-01T01:36:16.0569172Z Aug 01 01:36:16 at > java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482) > 2022-08-01T01:36:16.0569877Z Aug 01 01:36:16 at > java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472) > 2022-08-01T01:36:16.0570554Z Aug 01 01:36:16 at > java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708) > 2022-08-01T01:36:16.0571371Z Aug 01 01:36:16 at > 
java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) > 2022-08-01T01:36:16.0572417Z Aug 01 01:36:16 at > java.util.stream.ReferencePipeline.reduce(ReferencePipeline.java:546) > 2022-08-01T01:36:16.0573618Z Aug 01 01:36:16 at > org.apache.flink.test.checkpointing.UnalignedCheckpointStressITCase.discoverRetainedCheckpoint(UnalignedCheckpointStressITCase.java:289) > 2022-08-01T01:36:16.0575187Z Aug 01 01:36:16 at > org.apache.flink.test.checkpointing.UnalignedCheckpointStressITCase.runAndTakeExternalCheckpoint(UnalignedCheckpointStressITCase.java:262) > 2022-08-01T01:36:16.0576540Z Aug 01 01:36:16 at > org.apache.flink.test.checkpointing.UnalignedCheckpointStressITCase.runStressTest(UnalignedCheckpointStressITCase.java:158) > 2022-08-01T01:36:16.0577684Z Aug 01 01:36:16 at > sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > 2022-08-01T01:36:16.0578546Z Aug 01 01:36:16 at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > 2022-08-01T01:36:16.0579374Z Aug 01 01:36:16 at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > 2022-08-01T01:36:16.0580298Z Aug 01 01:36:16 at > java.lang.reflect.Method.invoke(Method.java:498) > 2022-08-01T01:36:16.0581243Z Aug 01 01:36:16 at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > 2022-08-01T01:36:16.0582029Z Aug 01 01:36:16 at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > 2022-08-01T01:36:16.0582766Z Aug 01 01:36:16 at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > 2022-08-01T01:36:16.0583488Z Aug 01 01:36:16 at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > 2022-08-01T01:36:16.0584203Z Aug 01 01:36:16 at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > 2022-08-01T01:36:16.0585087Z Aug 01 01:36:16 at > 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > 2022-08-01T01:36:16.0585778Z Aug 01 01:36:16 at > org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) > 2022-08-01T01:36:16.0586482Z Aug 01 01:36:16 at > org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) > 2022-08-01T01:36:16.0587155Z Aug 01 01:36:16 at > org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45) > 2022-08-01T01:36:16.0587809Z Aug 01 01:36:16 at > org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61) > 2022-08-01T01:36:16.0588434Z Aug 01 01:36:16 at > org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > 2022-08-01T01:36:16.0589203Z Aug 01 01:36:16 at >
[GitHub] [flink] pnowojski merged pull request #21440: [FLINK-28766][tests] Safe iterate over checkpoint files in order to avoid exception during parallel savepoint deletion
pnowojski merged PR #21440: URL: https://github.com/apache/flink/pull/21440
[GitHub] [flink-kubernetes-operator] morhidi commented on pull request #457: [docs] Add controller flow docs
morhidi commented on PR #457: URL: https://github.com/apache/flink-kubernetes-operator/pull/457#issuecomment-1336415811 > > +1 I was unable to render the pictures on this PR. > > Sorry @morhidi , I should have posted some screenshots. Unfortunately I don’t have the laptop on me Hm, it is supposed to work without screenshots. I checked, and the SVGs are not formatted properly; the `` tag is duplicated in them.
[GitHub] [flink-kubernetes-operator] morhidi merged pull request #420: FLINK-29536 - Add WATCH_NAMESPACE env var to operator
morhidi merged PR #420: URL: https://github.com/apache/flink-kubernetes-operator/pull/420
[GitHub] [flink-kubernetes-operator] gyfora commented on pull request #457: [docs] Add controller flow docs
gyfora commented on PR #457: URL: https://github.com/apache/flink-kubernetes-operator/pull/457#issuecomment-1336413490

> +1 I was unable to render the pictures on this PR.

Sorry @morhidi , I should have posted some screenshots. Unfortunately I don’t have the laptop on me

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink-kubernetes-operator] morhidi merged pull request #469: [FLINK-29109] Generate random jobId for stateless upgrade mode irrespective of Flink version
morhidi merged PR #469: URL: https://github.com/apache/flink-kubernetes-operator/pull/469 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink-kubernetes-operator] tagarr commented on a diff in pull request #420: FLINK-29536 - Add WATCH_NAMESPACE env var to operator
tagarr commented on code in PR #420: URL: https://github.com/apache/flink-kubernetes-operator/pull/420#discussion_r1038965294

## helm/flink-kubernetes-operator/templates/flink-operator.yaml:

@@ -79,7 +79,9 @@
   spec:
 {{- end }}
       env:
         - name: OPERATOR_NAMESPACE

Review Comment:
   So it is a more consistent way of defining the operator namespace and works for all install methods. That's why I made the change.

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
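For context, the usual install-method-independent way to obtain the operator's own namespace is the Kubernetes downward API. A sketch of what the rendered manifest might look like — illustrative values only, not the PR's exact helm template:

```yaml
env:
  - name: OPERATOR_NAMESPACE
    valueFrom:
      fieldRef:
        fieldPath: metadata.namespace   # resolved by Kubernetes when the pod starts
  - name: WATCH_NAMESPACE
    value: "ns1,ns2"                    # example: namespaces the operator should watch
```

Because `metadata.namespace` is filled in by the kubelet, the value stays correct whether the operator is installed via helm, plain manifests, or any other method.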
[jira] [Created] (FLINK-30288) Use visitor to convert predicate for orc
Shammon created FLINK-30288:
-------------------------------

             Summary: Use visitor to convert predicate for orc
                 Key: FLINK-30288
                 URL: https://issues.apache.org/jira/browse/FLINK-30288
             Project: Flink
          Issue Type: Improvement
          Components: Table Store
    Affects Versions: table-store-0.3.0
            Reporter: Shammon

Use `PredicateVisitor` to convert `Predicate` in table store for orc

-- This message was sent by Atlassian Jira (v8.20.10#820010)
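The visitor approach the ticket proposes — walking a predicate tree and emitting an ORC-side filter per node — can be sketched in Python. Class and method names below are hypothetical stand-ins, not the table store's actual `PredicateVisitor` API:

```python
class Predicate:
    """Base node of a predicate tree; dispatches to a visitor."""
    def accept(self, visitor):
        raise NotImplementedError

class Equal(Predicate):
    def __init__(self, field, literal):
        self.field, self.literal = field, literal
    def accept(self, visitor):
        return visitor.visit_equal(self)

class And(Predicate):
    def __init__(self, left, right):
        self.left, self.right = left, right
    def accept(self, visitor):
        return visitor.visit_and(self)

class OrcFilterVisitor:
    """Converts a Predicate tree into an ORC-style filter expression.
    Each node type gets its own visit method, so adding a new target
    format only requires a new visitor, not changes to the tree."""
    def visit_equal(self, p):
        return f"{p.field} = {p.literal!r}"
    def visit_and(self, p):
        return f"({p.left.accept(self)} AND {p.right.accept(self)})"

pred = And(Equal("dt", "2022-12-04"), Equal("hh", "12"))
print(pred.accept(OrcFilterVisitor()))  # (dt = '2022-12-04' AND hh = '12')
```

The benefit over ad-hoc `isinstance` chains is that conversion logic for each backend lives in one visitor class, which is presumably why the ticket suggests it.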
[GitHub] [flink-ml] jiangxin369 commented on a diff in pull request #183: [FLINK-29603] Add Transformer for StopWordsRemover
jiangxin369 commented on code in PR #183: URL: https://github.com/apache/flink-ml/pull/183#discussion_r1038916503

## flink-ml-python/pyflink/ml/lib/feature/stopwordsremover.py:

@@ -0,0 +1,135 @@
+
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import typing
+from typing import Tuple
+
+from pyflink.java_gateway import get_gateway
+from pyflink.ml.core.param import Param, StringArrayParam, BooleanParam, StringParam
+from pyflink.ml.core.wrapper import JavaWithParams
+from pyflink.ml.lib.feature.common import JavaFeatureTransformer
+from pyflink.ml.lib.param import HasInputCols, HasOutputCols
+
+
+def _load_default_stop_words(language: str):
+    return get_gateway().jvm.org.apache.flink.ml.feature. \
+        stopwordsremover.StopWordsRemover.loadDefaultStopWords(language)
+
+
+def _get_default_or_us():
+    return get_gateway().jvm.org.apache.flink.ml.feature. \
+        stopwordsremover.StopWordsRemover.getDefaultOrUS()
+
+
+class _StopWordsRemoverParams(
+        JavaWithParams,
+        HasInputCols,
+        HasOutputCols
+):
+    """
+    Params for :class:`StopWordsRemover`.
+    """
+
+    STOP_WORDS: Param[Tuple[str, ...]] = StringArrayParam(
+        "stop_words",
+        "The words to be filtered out.",
+        _load_default_stop_words('english'))
+
+    CASE_SENSITIVE: Param[bool] = BooleanParam(
+        "case_sensitive",
+        "Whether to do a case-sensitive comparison over the stop words.",
+        False
+    )
+
+    LOCALE: Param[str] = StringParam(
+        "locale",
+        "Locale of the input for case insensitive matching. Ignored when caseSensitive is true.",
+        _get_default_or_us())
+
+    def __init__(self, java_params):
+        super(_StopWordsRemoverParams, self).__init__(java_params)
+
+    def set_stop_words(self, value: Tuple[str, ...]):

Review Comment:
   Let's replace the param with `*value: str` to set `StringArrayParam`.

## flink-ml-lib/src/main/resources/org/apache/flink/ml/feature/stopwords/README:

@@ -0,0 +1,12 @@
+Stopwords Corpus

Review Comment:
   Would it be better to write this file in markdown syntax like `## Stopwords Corpus` for better readability?

## docs/content/docs/operators/feature/stopwordsremover.md:

@@ -0,0 +1,165 @@
+---
+title: "StopWordsRemover"
+weight: 1
+type: docs
+aliases:
+- /operators/feature/stopwordsremover.html
+---
+
+## StopWordsRemover
+
+A feature transformer that filters out stop words from input.
+
+Note: null values from input array are preserved unless adding null to stopWords
+explicitly.
+
+See Also: <a href="http://en.wikipedia.org/wiki/Stop_words">Stop words
+(Wikipedia)</a>
+
+### Input Columns
+
+| Param name | Type     | Default | Description                                        |
+|:-----------|:---------|:--------|:---------------------------------------------------|
+| inputCols  | String[] | `null`  | Arrays of strings containing stop words to remove. |
+
+### Output Columns
+
+| Param name | Type     | Default | Description                                |
+|:-----------|:---------|:--------|:-------------------------------------------|
+| outputCols | String[] | `null`  | Arrays of strings with stop words removed. |
+
+### Parameters
+
+| Key        | Default | Type     | Required | Description         |
+|------------|---------|----------|----------|---------------------|
+| inputCols  | `null`  | String[] | yes      | Input column names. |
+| outputCols | `null`  | String[] | yes      | Output column name. |
+| stopWords  | `StopWordsRemover.loadDefaultStopWords("english")` | String[] |
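The transformer's core semantics — case-insensitive matching by default, and null entries preserved unless null is added to the stop words — can be illustrated in plain Python. This is a sketch of the documented behavior, not the Flink ML implementation:

```python
def remove_stop_words(tokens, stop_words, case_sensitive=False):
    """Filter stop words from a token list.

    Matching is case-insensitive by default; None entries are kept
    unless None is listed in stop_words explicitly.
    """
    if case_sensitive:
        stops = set(stop_words)
        return [t for t in tokens if t not in stops]
    # Case-insensitive: compare lowercased forms, handling None separately.
    stops = {w.lower() for w in stop_words if w is not None}
    keep_none = None not in stop_words
    return [t for t in tokens
            if (t is None and keep_none)
            or (t is not None and t.lower() not in stops)]

# e.g. remove_stop_words(["I", "saw", "the", "red", "balloon"], ["i", "the"])
# -> ["saw", "red", "balloon"]
```

A fuller version would also apply the `locale` parameter when lowercasing, since lowercasing rules differ between locales (the reason the Java side exposes it).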