Re: Using Flink k8s operator on OKD

2023-10-05 Thread Krzysztof Chmielewski
operator? I haven't found it in the operator docs - is it not there, or have I missed it? Thanks. Krzysztof Chmielewski Wed, 20 Sep 2023 at 22:32 Krzysztof Chmielewski < krzysiek.chmielew...@gmail.com> wrote: > Thank you Zach, > our flink-operator and flink deployments are in same namespa

Re: Using Flink k8s operator on OKD

2023-09-20 Thread Krzysztof Chmielewski
###* [1] https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-release-1.6/docs/operations/rbac/ Thanks, Krzysztof Chmielewski Wed, 20 Sep 2023 at 05:40 Zach Lorimer wrote: > I haven’t used OKD but it sounds like OLM. If that’s the case, I’m > assuming the operator w

Test message

2023-09-20 Thread Krzysztof Chmielewski
Community, please forgive me for this message. This is a test, because all day my replies to my other user thread have been rejected by the email server. Sincere apologies, Krzysztof

Using Flink k8s operator on OKD

2023-09-19 Thread Krzysztof Chmielewski
ndleDispatch(ReconciliationDispatcher.java:89)
    at io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.handleExecution(ReconciliationDispatcher.java:62)
    at io.javaoperatorsdk.operator.processing.event.EventProcessor$ReconcilerExecutor.run(EventProcessor.java:414)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.base/java.lang.Thread.run(Unknown Source)
The deployment we are trying to run is this:
    apiVersion: flink.apache.org/v1beta1
    kind: FlinkDeployment
    metadata:
      namespace: flink
      name: test
    spec:
      mode: native
      image: flink:1.17
      flinkVersion: v1_17
      flinkConfiguration:
        taskmanager.numberOfTaskSlots: "2"
      serviceAccount: flink
      jobManager:
        resource:
          memory: "2048m"
          cpu: 1
      taskManager:
        resource:
          memory: "2048m"
          cpu: 1
Regards, Krzysztof Chmielewski [1] https://lists.apache.org/thread/07d46txb6vttw7c8oyr6z4n676vgqh28

Re: HA in k8s operator

2023-09-17 Thread Krzysztof Chmielewski
Hi Chen, I now see what you were trying to tell me. The problem was on my end... sorry for that. The job I was using for the session cluster had NoRestart() set as its Restart Strategy, whereas the Application Cluster was executing a job with some "proper" restart strategy. Thanks. Krzysztof Chmielew

Re: HA in k8s operator

2023-09-17 Thread Krzysztof Chmielewski
ource/overview/#session-cluster-deployments Sun, 17 Sep 2023 at 10:32 Krzysztof Chmielewski < krzysiek.chmielew...@gmail.com> wrote: > Thank you, > so in other words to have TM HA on k8s I have to configure [1] correct? > > [1] > https://nightlies.apache.org/fl

Re: HA in k8s operator

2023-09-17 Thread Krzysztof Chmielewski
ails. > > [1] > https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/ops/state/task_failure_recovery/#restart-strategies > > Best, > Zhanghao Chen > ---------- > *From:* Krzysztof Chmielewski > *Sent:* September 17, 2023 7:58 > *To:* user >

HA in k8s operator

2023-09-16 Thread Krzysztof Chmielewski
ocs-main/docs/operations/configuration/#leader-election-and-high-availability Regards, Krzysztof Chmielewski
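
For reference, a minimal sketch of the Kubernetes HA settings for the Flink cluster itself (as opposed to the operator's own leader election linked above), expressed via Flink's Java Configuration API. With the k8s operator the same keys would normally go into the FlinkDeployment's flinkConfiguration map; the storage path below is a hypothetical placeholder.

    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.configuration.HighAvailabilityOptions;

    public class HaConfigSketch {
        public static Configuration kubernetesHa() {
            Configuration conf = new Configuration();
            // Enable Kubernetes-based leader election / HA services for the JobManager.
            conf.set(HighAvailabilityOptions.HA_MODE, "kubernetes");
            // Durable storage for HA metadata (job graphs, checkpoint pointers); placeholder path.
            conf.set(HighAvailabilityOptions.HA_STORAGE_PATH, "s3://my-bucket/flink-ha");
            return conf;
        }
    }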

Re: using CheckpointedFunction on a keyed state

2023-09-08 Thread Krzysztof Chmielewski
on that is the right place to do this, > > you just wouldn’t want to store values in the state from within the open() > function. > > > > InitializeState() and snapshotState() are mainly used to initialize > operator state, not keyed state … refer to the relevant documentat

Re: updating keyed state in open method.

2023-09-07 Thread Krzysztof Chmielewski
Thanks, that helped. Regards, Krzysztof Chmielewski Thu, 7 Sep 2023 at 09:52 Schwalbe Matthias wrote: > Hi Krzysztof, > > > > You cannot access keyed state in open(). > > Keyed state has a value per key. > > In theory you would have to initialize per p

using CheckpointedFunction on a keyed state

2023-09-07 Thread Krzysztof Chmielewski
Hi, I have a toy Flink job [1] where I have a KeyedProcessFunction implementation [2] that also implements CheckpointedFunction. My stream definition has a .keyBy(...) call, as you can see in [1]. However, when I'm trying to run this toy job I'm getting an exception from
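
A minimal sketch of the split described in the reply above: keyed state is obtained in open() of the KeyedProcessFunction, while CheckpointedFunction's initializeState()/snapshotState() are reserved for operator state. This is illustrative only (types and class name are assumed), not the code from [2].

    import org.apache.flink.api.common.state.ValueState;
    import org.apache.flink.api.common.state.ValueStateDescriptor;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.runtime.state.FunctionInitializationContext;
    import org.apache.flink.runtime.state.FunctionSnapshotContext;
    import org.apache.flink.streaming.api.checkpoint.CheckpointedFunction;
    import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
    import org.apache.flink.util.Collector;

    public class KeyCounterSketch extends KeyedProcessFunction<Long, Long, Long>
            implements CheckpointedFunction {

        private transient ValueState<Long> count; // keyed state, one value per key

        @Override
        public void open(Configuration parameters) {
            // The keyed state handle is obtained here (not in initializeState()),
            // but its value can only be read/updated inside processElement().
            count = getRuntimeContext().getState(
                    new ValueStateDescriptor<>("count", Long.class));
        }

        @Override
        public void processElement(Long value, Context ctx, Collector<Long> out) throws Exception {
            Long current = count.value();
            count.update(current == null ? 1L : current + 1L);
            out.collect(count.value());
        }

        @Override
        public void initializeState(FunctionInitializationContext context) {
            // Operator (non-keyed) state would be initialized here, e.g. via
            // context.getOperatorStateStore().getListState(...).
        }

        @Override
        public void snapshotState(FunctionSnapshotContext context) {
            // Snapshot hook for operator state; keyed state is snapshotted automatically.
        }
    }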

updating keyed state in open method.

2023-09-07 Thread Krzysztof Chmielewski
Hi, I'm having a problem with my toy Flink job where I would like to access a ValueState of a keyed stream. The job setup can be found here [1]; it is fairly simple: env .addSource(new CheckpointCountingSource(100, 60)) .keyBy(value -> value) .process(new KeyCounter()) .addSink(new ConsoleSink());

k8s operator - clearing operator state

2023-09-04 Thread Krzysztof Chmielewski
Hi community, I would like to ask how one can clear Flink's k8s operator state. I have a sandbox k8s cluster with the Flink k8s operator where I've deployed a Flink session cluster with a few session jobs. After some playing around, and breaking a few things here and there, I see this log: 023-09-04

Re: K8s operator - Stop Job with savepoint on session cluster via Java API

2023-09-01 Thread Krzysztof Chmielewski
able/sqlclient/#terminating-a-job > > Best, > Shammon FY > > On Fri, Sep 1, 2023 at 6:35 AM Krzysztof Chmielewski < > krzysiek.chmielew...@gmail.com> wrote: > >> Hi community, >> I would like to ask what is the recommended way to stop Flink job with >> sa

K8s operator - Stop Job with savepoint on session cluster via Java API

2023-08-31 Thread Krzysztof Chmielewski
-master/docs/ops/state/savepoints/#stopping-a-job-with-savepoint Thanks, Krzysztof Chmielewski
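
One option (not necessarily the recommended API asked about in this thread) is Flink's REST API: POST /jobs/<jobID>/stop triggers a stop-with-savepoint on a session cluster. The sketch below assumes the JobManager's REST endpoint is reachable from the Java service; host, job id and target directory are placeholders and error handling is omitted.

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class StopWithSavepointSketch {
        public static void main(String[] args) throws Exception {
            // Placeholders: REST endpoint of the session cluster's JobManager and the job id.
            String rest = "http://flink-session-rest.flink:8081";
            String jobId = "00000000000000000000000000000000";

            // POST /jobs/:jobid/stop triggers a stop-with-savepoint.
            String body = "{\"targetDirectory\": \"s3://my-bucket/savepoints\", \"drain\": false}";
            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create(rest + "/jobs/" + jobId + "/stop"))
                    .header("Content-Type", "application/json")
                    .POST(HttpRequest.BodyPublishers.ofString(body))
                    .build();

            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());

            // The response carries a trigger id; completion can then be polled via
            // GET /jobs/:jobid/savepoints/:triggerid.
            System.out.println(response.body());
        }
    }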

Re: flink k8s operator - problem with patching session cluster

2023-08-31 Thread Krzysztof Chmielewski
any case the native mode is probably a better fit for your use-case. > > Gyula > > On Thu, Aug 31, 2023 at 2:42 AM Krzysztof Chmielewski < > krzysiek.chmielew...@gmail.com> wrote: > >> Just want to bring this up in case it was missed in the other >> messages/queries :)

Re: flink k8s operator - problem with patching session cluster

2023-08-30 Thread Krzysztof Chmielewski
Just want to bring this up in case it was missed in the other messages/queries :) TL;DR How to add a TM to a Flink Session cluster via the Java K8s client if the Session Cluster has running jobs? Thanks, Krzysztof Fri, 25 Aug 2023 at 23:48 Krzysztof Chmielewski < krzysiek.chmielew...@gmail.com> n

flink k8s operator - problem with patching session cluster

2023-08-25 Thread Krzysztof Chmielewski
Hi community, I have a use case where I would like to add an extra TM to a running Flink session cluster that has Flink jobs deployed. Session cluster creation, job submission and cluster patching are done using the Flink k8s operator Java API. The details of this are presented here [1]. I would like

Re: How to use pipeline.jobvertex-parallelism-overrides property.

2023-08-25 Thread Krzysztof Chmielewski
ely. >> >> [1] >> https://github.com/apache/flink/blob/b3fb1421fe86129a4e0b10bf3a46704b7132e775/flink-runtime/src/main/java/org/apache/flink/runtime/dispatcher/Dispatcher.java#L1591 >> >> Best, >> Ron >> >> Krzysztof Chmielewski on Thu, Aug 24, 2023 >>

How to use pipeline.jobvertex-parallelism-overrides property.

2023-08-24 Thread Krzysztof Chmielewski
Hi, has anyone used the pipeline.jobvertex-parallelism-overrides [1] property? I wonder what the key should actually be here? The operator name? What if my operators are chained and I want to override only one of them, for example Source -> (Map1 chained with Map2) -> Sink. Can I override Map2
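
For what it's worth, the option is documented as a map keyed by JobVertexID (not operator name), and chained operators such as (Map1 chained with Map2) form a single job vertex, so an individual chained operator cannot be overridden on its own. A minimal sketch, with a placeholder vertex id; real ids can be read from the Web UI or REST API of a running job.

    import org.apache.flink.configuration.Configuration;

    public class ParallelismOverrideSketch {
        public static Configuration overrides() {
            Configuration conf = new Configuration();
            // Map of jobVertexId -> parallelism; the 32-char hex id below is only a placeholder.
            conf.setString(
                    "pipeline.jobvertex-parallelism-overrides",
                    "bc764cd8ddf7a0cff126f51c16239658:4");
            return conf;
        }
    }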

Flink k8s operator - managed from Java microservice

2023-08-16 Thread Krzysztof Chmielewski
Hi, I have a use case where I would like to run Flink jobs using the Apache Flink k8s operator, where actions like job submission (new and from savepoint), job cancellation with savepoint and cluster creation will be triggered from a Java-based microservice. Is there any recommended/dedicated Java API for
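
As one possible starting point (not necessarily the dedicated API asked about here), the operator's FlinkDeployment custom resources can be manipulated from Java with a generic Kubernetes client. The sketch below uses the fabric8 client with placeholder namespace/name; the operator also ships Java model classes for its CRDs that can be used with the same client.

    import io.fabric8.kubernetes.api.model.GenericKubernetesResource;
    import io.fabric8.kubernetes.client.KubernetesClient;
    import io.fabric8.kubernetes.client.KubernetesClientBuilder;
    import io.fabric8.kubernetes.client.dsl.base.ResourceDefinitionContext;

    public class OperatorClientSketch {
        public static void main(String[] args) {
            // Describes the operator's CRD so the generic client can address it.
            ResourceDefinitionContext flinkDeployments = new ResourceDefinitionContext.Builder()
                    .withGroup("flink.apache.org")
                    .withVersion("v1beta1")
                    .withKind("FlinkDeployment")
                    .withPlural("flinkdeployments")
                    .withNamespaced(true)
                    .build();

            try (KubernetesClient client = new KubernetesClientBuilder().build()) {
                // Read an existing FlinkDeployment; namespace and name are placeholders.
                GenericKubernetesResource deployment = client
                        .genericKubernetesResources(flinkDeployments)
                        .inNamespace("flink")
                        .withName("test")
                        .get();
                // The spec/status end up in the generic resource's additional properties.
                System.out.println(deployment.getAdditionalProperties().get("spec"));
            }
        }
    }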

Re: Custom catalog implementation - getting table schema for computed columns

2023-01-20 Thread Krzysztof Chmielewski
Ok, so now I see that the runtime type of the "table" parameter is ResolvedCatalogTable, which has the method getResolvedSchema. So I guess my question is, can I assume that ResolvedCatalogTable will always be the runtime type? Fri, 20 Jan 2023 at 19:27 Krzysztof Chmielewski < krzysiek.chmielew
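
A small sketch of the defensive check implied above, assuming the planner hands the catalog a resolved table (which is the case when createTable is invoked through SQL DDL); the helper and class names are made up.

    import org.apache.flink.table.catalog.CatalogBaseTable;
    import org.apache.flink.table.catalog.ResolvedCatalogTable;
    import org.apache.flink.table.catalog.ResolvedSchema;

    public class CatalogSchemaSketch {
        // Mirrors what a custom Catalog#createTable(ObjectPath, CatalogBaseTable, boolean)
        // can do with the incoming table argument.
        static ResolvedSchema resolvedSchemaOf(CatalogBaseTable table) {
            if (table instanceof ResolvedCatalogTable) {
                // Fully resolved column names and data types, including computed columns.
                return ((ResolvedCatalogTable) table).getResolvedSchema();
            }
            throw new UnsupportedOperationException(
                    "Expected a ResolvedCatalogTable but got: " + table.getClass().getName());
        }
    }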

Custom catalog implementation - getting table schema for computed columns

2023-01-20 Thread Krzysztof Chmielewski
Hi, I'm implementing a custom Catalog where for "create table" I need to get the table's schema, both column names and types, from the DDL. Now the Catalog's createTable method has a "CatalogBaseTable table" argument. The CatalogBaseTable has a deprecated "getSchema" and suggests using getUnresolvedSchema

Detect Table options override by Query Dynamic options

2022-12-29 Thread Krzysztof Chmielewski
/hints/#dynamic-table-options Regards, Krzysztof Chmielewski

Re: Sending descriptive alerts from Flink

2022-12-28 Thread Krzysztof Chmielewski
/flink/flink-docs-master/docs/dev/datastream/side_output/ [2] https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/operators/overview/#union [3] https://github.com/getindata/flink-http-connector Regards, Krzysztof Chmielewski Wed, 28 Dec 2022 at 18:43 Pratham Kumar P via user

SQL Lookup join on nested field

2022-10-18 Thread Krzysztof Chmielewski
org/thread/o3fc5lrqf6dbkl9pm0rp2mqyt7mcnsv3 Best Regards, Krzysztof Chmielewski SQL join on nested field

Re: question about Async IO

2022-10-14 Thread Krzysztof Chmielewski
e were able to use keyed operators with access to keyed state after Async operators. Hope that helped. [1] https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/dev/datastream/experimental/ Regards, Krzysztof Chmielewski Fri, 14 Oct 2022 at 19:11 Galen Warren wrote: > I have a
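
A minimal sketch of the experimental API referenced in [1], assuming the async step leaves each record's key (here Tuple2.f0) untouched; the wrapper method and types are made up.

    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.datastream.DataStreamUtils;
    import org.apache.flink.streaming.api.datastream.KeyedStream;

    public class ReinterpretSketch {
        // Given the output of an async operator whose records still carry their key in f0,
        // declare the existing partitioning instead of paying for another keyBy shuffle.
        static KeyedStream<Tuple2<String, Long>, String> rekey(DataStream<Tuple2<String, Long>> enriched) {
            return DataStreamUtils.reinterpretAsKeyedStream(enriched, t -> t.f0);
        }
    }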

Re: Flink FaultTolerant at operator level

2022-10-05 Thread Krzysztof Chmielewski
I had a similar use case. What we did is decide that the data for enrichment must be versioned; for example, our enrichment data was "refreshed" once a day and we kept the old data. During the enrichment process we look up data for a given version based on the record's metadata. Regards.

Access to Table environment properties/Job arguments from DynamicTableFactory

2022-09-09 Thread Krzysztof Chmielewski
Hi, is there a way to access a Table Environment configuration or Job arguments from a DynamicTableFactory (Sink/Source)? I'm guessing no, but I just want to double check that I'm not missing anything here. From my understanding we have access only to the Table definition. Thanks, Krzysztof Chmielewski

Passing Dynamic Table Options to Catalog's getTable()

2022-08-16 Thread Krzysztof Chmielewski
Hi, I'm working on my own Catalog implementation and I'm wondering if there is any way to pass Dynamic Table Options [1] to the Catalog's getTable(...) method. [1] https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/queries/hints/#dynamic-table-options Regards, Krzysztof
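
For context, this is how the dynamic table options from [1] are attached at query time; whether they then reach Catalog#getTable() is exactly the open question of this thread. The table name and option key below are hypothetical, and on some Flink versions the hint must first be enabled via table.dynamic-table-options.enabled.

    import org.apache.flink.table.api.EnvironmentSettings;
    import org.apache.flink.table.api.TableEnvironment;

    public class DynamicOptionsSketch {
        public static void main(String[] args) {
            TableEnvironment tEnv = TableEnvironment.create(
                    EnvironmentSettings.newInstance().inStreamingMode().build());
            // Per-query override of a table option via the OPTIONS hint; the key/value
            // are hypothetical and connector-specific, and 'my_table' is assumed to be registered.
            tEnv.executeSql(
                    "SELECT * FROM my_table /*+ OPTIONS('scan.mode'='latest') */");
        }
    }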

Re: Flink 1.13 with GCP Object storage

2022-06-29 Thread Krzysztof Chmielewski
Ok, the problem was that I had a few conflicting versions of the hadoop-common lib on the classpath. When I fixed the dependencies, it started working. Cheers. Wed, 29 Jun 2022 at 18:20 Krzysztof Chmielewski < krzysiek.chmielew...@gmail.com> wrote: > Hi, > I'm trying to read data from GCP

Flink 1.13 with GCP Object storage

2022-06-29 Thread Krzysztof Chmielewski
Hi, I'm trying to read data from GCP Object Store with Flink's File source 1.13.6. I followed the instructions from [1], but when I started my job on the Flink cluster I got this exception: No FileSystem for scheme: gs Any ideas? Cheers [1]

length value for some classes extending LogicalType.

2022-05-25 Thread Krzysztof Chmielewski
ult length value 1 but the actual length of the data will be bigger than 1? For example: RowType.of("col1", new CharType()) <- this will use default length value 1. Regards, Krzysztof Chmielewski
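
A small sketch contrasting the no-arg constructor's default length with an explicit one; the class and method names are from flink-table's LogicalType hierarchy, the chosen lengths are arbitrary.

    import org.apache.flink.table.types.logical.CharType;
    import org.apache.flink.table.types.logical.VarCharType;

    public class LengthSketch {
        public static void main(String[] args) {
            // The no-arg constructor falls back to length 1, so data longer than
            // one character would not fit the declared type.
            CharType defaultChar = new CharType();
            // Declaring the length explicitly avoids the problem.
            CharType char32 = new CharType(32);
            VarCharType string = new VarCharType(VarCharType.MAX_LENGTH);
            System.out.println(defaultChar.asSummaryString()); // CHAR(1)
            System.out.println(char32.asSummaryString());      // CHAR(32)
            System.out.println(string.asSummaryString());      // VARCHAR(2147483647), i.e. STRING
        }
    }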

Re: ParquetColumnarRowInputFormat - parameter description

2022-02-02 Thread Krzysztof Chmielewski
Thank you Fabian, I have one follow-up question. You wrote: *isUtcTimestamp denotes whether timestamps should be represented as SQL UTC timestamps.* Question: so, if *isUtcTimestamp* is set to false, how are timestamps represented? Regards, Krzysztof Chmielewski Tue, 25 Jan 2022 at 11:56 Fabian

Re: ParquetColumnarRowInputFormat - parameter description

2022-01-24 Thread Krzysztof Chmielewski
timestamp flag does with some examples? Regards, Krzysztof Chmielewski Mon, 10 Jan 2022 at 14:59 Krzysztof Chmielewski < krzysiek.chmielew...@gmail.com> wrote: > Hi, > I would like to ask for some more details regarding > three ParquetColumnarRowInputFormat contruc

Re: Hive Source - SplitEnumeratorContext and its callAsync method - possible bug

2022-01-19 Thread Krzysztof Chmielewski
of this task takes longer than its period, then subsequent executions may start late, but will not execute concurrently. So I guess we are safe here; it will be executed task by task and sequentially passed back to the handleNewSplits method. Wed, 19 Jan 2022 at 16:16 Krzysztof Chmielewski

Hive Source - SplitEnumeratorContext and its callAsync method - possible bug

2022-01-19 Thread Krzysztof Chmielewski
assignment. Regards, Krzysztof Chmielewski

Sorting/grouping keys and State management in BATCH mode

2022-01-11 Thread Krzysztof Chmielewski
ORT, SPILL? Regards, Krzysztof Chmielewski [1] https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/execution_mode/ [2] https://cwiki.apache.org/confluence/display/FLINK/FLIP-140%3A+Introduce+batch-style+execution+for+bounded+keyed+streams#FLIP140:Introducebatchstyleexecutionfo

Re: StaticFileSplitEnumerator - keeping track of already processed files

2022-01-11 Thread Krzysztof Chmielewski
. Thanks, Krzysztof Chmielewski Thu, 6 Jan 2022 at 09:29 Fabian Paul wrote: > Hi, > > I think your analysis is correct. One thing to note here is that I > guess when implementing the StaticFileSplitEnumerator we only thought > about the batch case where no checkpoints exist [

Re: RowType for complex types in Parquet File

2022-01-10 Thread Krzysztof Chmielewski
Hi, Isn't this actually already implemented and planned for version 1.15? https://issues.apache.org/jira/browse/FLINK-17782 Regards, Krzysztof Chmielewski Fri, 7 Jan 2022 at 16:20 Jing Ge wrote: > Hi Meghajit, > > like the exception described, parquet schema with neste

ParquetColumnarRowInputFormat - parameter description

2022-01-10 Thread Krzysztof Chmielewski
you provide me some information about the batching process and other two boolean flags? Regards, Krzysztof Chmielewski

Re: StaticFileSplitEnumerator - keeping track of already processed files

2022-01-06 Thread Krzysztof Chmielewski
will reprocess files that were already used for Split creation. In case of checkpoint restoration it does not check whether that file was already processed. Regards, Krzysztof Chmielewski Thu, 6 Jan 2022, 03:21 user Caizhi Weng wrote: > Hi! > > Do you mean the pathsAlreadyProc

StaticFileSplitEnumerator - keeping track of already processed files

2022-01-05 Thread Krzysztof Chmielewski
there will be a Job/cluster restart then we will process same files again. Regards, Krzysztof Chmielewski

Re: Operator state in New Source API

2021-12-23 Thread Krzysztof Chmielewski
-tolerance/broadcast_state/#important-considerations [2] https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/datastream/fault-tolerance/state/#stateful-source-functions [3] https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/sources/ Thanks, Krzysztof Chmielewski

Operator state in New Source API

2021-12-22 Thread Krzysztof Chmielewski
splits in batches, periodically adding new files to the alreadyProcessedPaths collection. Regards, Krzysztof Chmielewski [1] https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/sources/

Defining a RowType for FileSource - Parquet

2021-12-17 Thread Krzysztof Chmielewski
() }; I'm wondering what the recommended pattern is for cases where a Parquet row has many columns, for example 100. Do we have to define them all by hand? Regards, Krzysztof Chmielewski [1] https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/datastream/formats/parquet/
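
One way to avoid hand-writing 100 columns is to build the two arrays that RowType.of(...) expects programmatically, for example from a list of column names; the names and types below are made up for illustration.

    import java.util.List;
    import java.util.stream.IntStream;
    import org.apache.flink.table.types.logical.DoubleType;
    import org.apache.flink.table.types.logical.LogicalType;
    import org.apache.flink.table.types.logical.RowType;
    import org.apache.flink.table.types.logical.VarCharType;

    public class RowTypeSketch {
        public static void main(String[] args) {
            // Column names could equally be read from the Parquet footer or a config file.
            List<String> names = List.of("id", "name", "score"); // ... up to 100 entries
            LogicalType[] types = IntStream.range(0, names.size())
                    .mapToObj(i -> (LogicalType)
                            (i == 0 ? new VarCharType(VarCharType.MAX_LENGTH) : new DoubleType()))
                    .toArray(LogicalType[]::new);
            RowType rowType = RowType.of(types, names.toArray(new String[0]));
            System.out.println(rowType.asSummaryString());
        }
    }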

Unified Source Interface in Flink 1.12

2021-12-16 Thread Krzysztof Chmielewski
for using a Source interface implementation for a Table Source as it is for version 1.14, or was it added in post-1.12 versions? After my very quick research based on the FileSystemTable source it seems it is possible in version 1.12. Please correct me if I'm wrong. Regards, Krzysztof Chmielewski [1] https

Antlr usage in Flink

2021-12-16 Thread Krzysztof Chmielewski
Hi, I was cruising through Flink's source code and I noticed that one of the modules contains lexer and parser g4 files for Java. I'm fairly familiar with Antlr4, btw, and I was wondering what Flink uses Antlr4 and the Java g4 files for. Regards, Krzysztof Chmielewski

Re: UDF and Broadcast State Pattern

2021-12-15 Thread Krzysztof Chmielewski
Thank you very much for the clarification, Seth. Best Regards Krzysztof Chmielewski Wed, 15 Dec 2021, 16:12 user Seth Wiesman wrote: > Hi Krzysztof, > > There is a difference in semantics here between yourself and Caizhi. SQL > UDFs can be used statefully - see Aggre

Re: UDF and Broadcast State Pattern

2021-12-15 Thread Krzysztof Chmielewski
to change it? Regards, Krzysztof Chmielewski Wed, 15 Dec 2021 at 02:36 Caizhi Weng wrote: > Hi! > > Currently you can't use broadcast state in Flink SQL UDF because UDFs are > all stateless. > > However you mentioned your use case that you want to control the logic in

Re: FileSource with Parquet Format - parallelism level

2021-12-14 Thread Krzysztof Chmielewski
? Regards, Krzysztof Chmielewski Fri, 10 Dec 2021 at 15:29 Arvid Heise wrote: > Yes, Parquet files can be read in splits (=in parallel). Which enumerator > is used is determined here [1]. > > [1] > https://github.com/apache/flink/blob/master/flink-formats/flink-parquet/src/main/

UDF and Broadcast State Pattern

2021-12-14 Thread Krzysztof Chmielewski
Hi, Is there a way to build a UDF [1] for Flink SQL that can be used with the Broadcast State Pattern [2]? I have a use case where I would like to be able to use a broadcast control stream to change logic in a UDF. Regards, Krzysztof Chmielewski [1] https://nightlies.apache.org/flink/flink-docs

Re: FileSource with Parquet Format - parallelism level

2021-12-10 Thread Krzysztof Chmielewski
dering if BlockSplittingRecursiveEnumerator can be used for Parquet files. Actually, does the Parquet format support reading a file in blocks by different threads? Do those blocks have to be "merged" later, or can I just read them row by row? Regards, Krzysztof Chmielewski Fri, 10 Dec 2021 at 09:27 R

FileSource with Parquet Format - parallelism level

2021-12-09 Thread Krzysztof Chmielewski
Hi, can I have a File DataStream Source that will work with Parquet Format and have parallelism level higher than one? Is it possible to read Parquet file in chunks by multiple threads? Regards, Krzysztof Chmielewski

Creating custom connector lib - dependency scope

2021-12-07 Thread Krzysztof Chmielewski
in the connector's pom.xml or it does not matter if they are in default scope? Regards, Krzysztof Chmielewski

Re: Access to GlobalJobParameters From DynamicTableSourceFactory

2021-11-10 Thread Krzysztof Chmielewski
use some kind of "mock" implementation in a test env. It's just an example. Ideally you would not want to change the code of the SQL or the table definition, or even expose such a parameter to the end user. Best, Krzysztof Chmielewski Wed, 10 Nov 2021 at 02:24 Caizhi Weng wrote: > Hi!

Re: Access to GlobalJobParameters From DynamicTableSourceFactory

2021-11-09 Thread Krzysztof Chmielewski
have access only to the Table definition fields. Regards, Krzysztof Chmielewski Tue, 9 Nov 2021 at 19:02 Francesco Guardiani wrote: > Have you tried this? > > > context.getConfiguration().get(org.apache.flink.configuration.PipelineOptions.GLOBAL_JOB_PARAMETERS) > > > >

Access to GlobalJobParameters From DynamicTableSourceFactory

2021-11-09 Thread Krzysztof Chmielewski
has access only to ReadableConfig object which does not have GlobalParameters. Cheers, Krzysztof Chmielewski

Re: Dependency injection for TypeSerializer?

2021-11-09 Thread Krzysztof Chmielewski
ded. Because of that, we were not able to use things like AWS EMR or Kinesis. Cheers, Krzysztof Chmielewski Tue, 9 Nov 2021 at 06:46 Thomas Weise wrote: > Hi, > > I was looking into a problem that requires a configurable type > serializer for communication with a schema registry. The

Re: Implementing a Custom Source Connector for Table API and SQL

2021-11-05 Thread Krzysztof Chmielewski
Thanks Fabian, I was looking forward to using the unified Source interface in my use case. The implementation was very intuitive with this new design. I will try with TableFunction then. Best. Krzysztof Chmielewski Fri, 5 Nov 2021 at 14:20 Fabian Paul wrote: > Hi Krzysztof, > >

Re: Implementing a Custom Source Connector for Table API and SQL

2021-11-05 Thread Krzysztof Chmielewski
Ok, I think there is some misunderstanding here. As it is presented in [1] for implementing Custom Source Connector for Table API and SQL: *"You first need to have a source connector which can be used in Flink’s runtime system (...)* *For complex connectors, you may want to implement the

Re: Implementing a Custom Source Connector for Table API and SQL

2021-11-05 Thread Krzysztof Chmielewski
Thank you Fabian, what if I rewrote my custom Source to use the old RichSourceFunction instead of the unified Source Interface? Would it work as a Lookup then? Best, Krzysztof Fri, 5 Nov 2021 at 11:18 Fabian Paul wrote: > Hi Krzysztof, > > The unified Source is meant to be used for the DataStream

Re: Implementing a Custom Source Connector for Table API and SQL

2021-11-05 Thread Krzysztof Chmielewski
BTW, @Ingo Burk you wrote that "*the new, unified Source interface can only work as a scan source.*" Is there any special design reason behind it, or is it simply not yet implemented? Thanks, Krzysztof Chmielewski Thu, 4 Nov 2021 at 16:27 Krzysztof Chmielewski < krzy

Re: Implementing a Custom Source Connector for Table API and SQL

2021-11-04 Thread Krzysztof Chmielewski
> you intend to have it work as a lookup source? > > > Best > Ingo > > On Thu, Nov 4, 2021 at 4:11 PM Krzysztof Chmielewski < > krzysiek.chmielew...@gmail.com> wrote: > >> Thanks Fabian and Ingo, >> yes I forgot to add the reference links, so here they are: &

Re: Implementing a Custom Source Connector for Table API and SQL

2021-11-04 Thread Krzysztof Chmielewski
/datastream/sources/ In my case I would really need a LookupTableSource and not a ScanTableSource, since in my use-case the source will get data for given parameters and I don't need to scan the entire resource. Cheers, Thu, 4 Nov 2021 at 15:48 Krzysztof Chmielewski < krzysiek.chmielew...@gmail.

Implementing a Custom Source Connector for Table API and SQL

2021-11-04 Thread Krzysztof Chmielewski
or ScanTableSource like it is presented in [1][2]. It seems I need to have a SourceFunction object to be able to use ScanRuntimeProvider or LookupRuntimeProvider. In other words, how can I use a Source interface implementation in a TableSource? Regards, Krzysztof Chmielewski

Re: SSL configuration - default behaviour

2020-02-10 Thread Krzysztof Chmielewski
Thanks Robert, just a small suggestion: maybe change the documentation a little bit. I'm not sure if it's only my impression, but from the sentence *"All internal connections are SSL authenticated and encrypted"* I initially thought that this is the default configuration. Thanks, Krzysztof Mon.,

Re: Running Flink on java 11

2020-01-10 Thread Krzysztof Chmielewski
Hi, Thank you for your answer. Btw it seems that you sent the reply only to my address and not to the mailing list :) I'm looking forward to trying out 1.10-rc then. Regarding the second thing you wrote, that *"on Java 11, all the tests (including end to end tests) would be run with the Java 11 profile