Re: StopWithSavepoint() method doesn't work in Java based flink native k8s operator

2021-04-30 Thread Fuyao Li
Hello Community, Yang, I have one more question, about logging. I also noticed that if I execute the kubectl logs command against the JM, the pods provisioned by the operator can’t print out the internal Flink logs in the kubectl logs output. I can only get something like the logs below. No actual flink logs is

Re: Protobuf support with Flink SQL and Kafka Connector

2021-04-30 Thread Fuyao Li
Hello Shipeng, I am not an expert in Flink, just want to share some of my thoughts. Maybe others can give you better ideas. I think there is no directly available Protobuf support for Flink SQL. However, you can write a user-defined format to support it [1]. If you use DataStream API, you can le

StopWithSavepoint() method doesn't work in Java based flink native k8s operator

2021-04-30 Thread Fuyao Li
Hello Community, Yang, I am trying to extend the flink native Kubernetes operator by adding some new features based on the repo [1]. I wrote a method to release the image update functionality. [2] I added the triggerImageUpdate(oldFlinkApp, flinkApp, effectiveConfig); under the existing method.

Re: Contiguity in SQL vs CEP

2021-04-30 Thread tbud
Hi, In that case what's the difference between reluctant quantifier like (B*?) in SQL and relaxed contiguity in CEP ? -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Protobuf support with Flink SQL and Kafka Connector

2021-04-30 Thread Shipeng Xie
Hi, In https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/table/connectors/formats/, it does not mention protobuf format. Does Flink SQL support protobuf format? If not, is there any plan to support it in the near future? Thanks!

remote task manager netty exception

2021-04-30 Thread Sihan You
Hi, We are experiencing some netty issues with our Flink cluster, whose cause we couldn't figure out. Below are the stack traces of the exceptions from the TMs' and JM's perspectives. We have 85 TMs and one JM in HA mode. The strange thing is that only 23 of the TMs are complaining about the connection issu

Re: Contiguity and state storage in CEP library

2021-04-30 Thread tbud
Thanks Dawid. As I see in the code, the storage buffered between watermarks is stored in the MapState elementQueueState variable in class CepOperator. My question is, if we use RocksDB or some other state backend, would this state be stored there and checkpointed? Or is it always in the h

Re: Question about snapshot file

2021-04-30 Thread Abdullah bin Omar
Thank you so much for your reply. I apologise for not mentioning multiple savepoint files in my last question. I understand that part; I did not ask the question exactly (it was only about one savepoint file). When we run a job, we obviously have many savepoint files (by using a manual command repeatedly)

Re: Question about snapshot file

2021-04-30 Thread David Anderson
> > So, can't we extract all previous savepoint data by using > ExistingSavepoint? You can extract all of the data from any specific savepoint. Or nearly all data, anyway. There is at least one corner case that isn't covered -- ListCheckpointed state -- which has been deprecated and isn't suppor

Re: Writing ARRAY type through JDBC:PostgreSQL

2021-04-30 Thread fgahan
Hi Timo, I'm attaching the stacktrace. I did try the array syntax with a few different options (string, varchar, character varying..) and they all end up with the same error. thanks... Federico flink_stacktrace.log

Re: Question about snapshot file

2021-04-30 Thread Abdullah bin Omar
Hi, So, can't we extract all previous savepoint data by using ExistingSavepoint? Thank you On Fri, Apr 30, 2021 at 10:25 AM David Anderson wrote: > Abdullah, > > The example you are studying -- the one using the state processor API -- > can be used with any retained checkpoint or savepo

Re: "myuid" in snapshot.readingstate

2021-04-30 Thread Abu Bakar Siddiqur Rahman Rocky
Thank you, David. On Fri, Apr 30, 2021 at 10:31 AM David Anderson wrote: > What is dependency (in pom.xml) for the org.apache.flink.training? > > > We don't publish artifacts for this repository. > > David > > On Fri, Apr 30, 2021 at 5:19 PM Abu Bakar Siddiqur Rahman Rocky < > bakar121...@gmail

Re: "myuid" in snapshot.readingstate

2021-04-30 Thread David Anderson
> > What is dependency (in pom.xml) for the org.apache.flink.training? We don't publish artifacts for this repository. David On Fri, Apr 30, 2021 at 5:19 PM Abu Bakar Siddiqur Rahman Rocky < bakar121...@gmail.com> wrote: > Hi David, > > A quick question more > > I am trying to *import* > org.

Re: Question about snapshot file

2021-04-30 Thread David Anderson
Abdullah, The example you are studying -- the one using the state processor API -- can be used with any retained checkpoint or savepoint created while running the RidesAndFaresSolution job. But this is a very special use of checkpoints and savepoints that shows how to extract data from them. Norm

Re: "myuid" in snapshot.readingstate

2021-04-30 Thread Abu Bakar Siddiqur Rahman Rocky
I am asking the previous question on behalf of my colleagues, though we have a couple of questions that were in another email. On Fri, Apr 30, 2021 at 10:18 AM Abu Bakar Siddiqur Rahman Rocky < bakar121...@gmail.com> wrote: > Hi David, > > A quick question more > > I am trying to *import* > org.a

Re: "myuid" in snapshot.readingstate

2021-04-30 Thread Abu Bakar Siddiqur Rahman Rocky
Hi David, A quick question more I am trying to *import* org.apache.flink.training.exercises.common.sources.TaxiFareGenerator; However, it can not resolve. What is dependency (in pom.xml) for the org.apache.flink.training? Thank you On Fri, Apr 30, 2021 at 10:12 AM David Anderson wrote: > Yo

Re: "myuid" in snapshot.readingstate

2021-04-30 Thread David Anderson
You can read about assigning unique IDs to stateful operators in the docs [1][2]. What the uid() method does is to establish a stable and unique identifier for a stateful operator. Then as you evolve your application, this helps ensure that future versions of your job will be able to restore savepo
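The explanation above can be sketched in plain Java (this is a conceptual illustration, not the Flink API; the class and method names here are hypothetical): a savepoint can be thought of as a map from operator uid to that operator's state, and restore matches state to operators by uid, which is why the uid must stay stable as the job evolves.

```java
import java.util.Map;

// Conceptual sketch only: state in a savepoint is looked up by operator uid.
// Renaming or reordering operators is safe as long as each uid is unchanged.
class UidRestoreSketch {
    // Returns the state recorded for the given uid, or a marker when the uid
    // is not found (in Flink, an unmatched uid fails the restore unless
    // non-restored state is explicitly allowed).
    static String restoreState(Map<String, String> savepoint, String operatorUid) {
        return savepoint.getOrDefault(operatorUid, "<empty>");
    }
}
```

The takeaway is that a stable `uid()` is the lookup key; without it Flink falls back to auto-generated identifiers that change when the topology changes.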

"myuid" in snapshot.readingstate

2021-04-30 Thread Abdullah bin Omar
Hi, when we read the state of a savepoint, we use "myuid" as an argument of the function. For example, DataSet keyedState = savepoint.readKeyedState("my-uid", new ReaderFunction()); *Question 1:* In [1] (line no 79), we get the "uid" with the datastream. Then in [2] (line no 45), *how can we use the

Re: Data type serialization and testing

2021-04-30 Thread Dave Maughan
Hi Timo, Thanks for your suggestions. I did notice the chaining option. I'll give them a try. Is there an established pattern for running tests against a local cluster? Or any examples you could point me to? I did notice a FlinkContainer (testcontainers) but it appears to be in a module that is n

Writing ARRAY type through JDBC:PostgreSQL

2021-04-30 Thread fgahan
Hi, I'm trying to write to a postgres table but the job fails on a column of type varchar[]. I get the following error: Caused by: java.lang.IllegalStateException: Writing ARRAY type is not yet supported in JDBC:PostgreSQL. After getting data from a kafka topic, my code looks like this: tableE

Re: Data type serialization and testing

2021-04-30 Thread Timo Walther
Hi Dave, maybe it would be better to execute your tests against a local cluster instead of the mini cluster. Also object reuse should be disabled and chaining should be disabled to force serialization. Maybe others have better ideas. Regards, Timo On 30.04.21 10:25, Dave Maughan wrote: Hi,

Re: Question about snapshot file

2021-04-30 Thread Abdullah bin Omar
Hi, Please let me know whether my understanding below is correct, and please answer the directly asked questions. *Question no 1 (about dependency):* *What is the dependency (in pom.xml) for org.apache.flink.training?* I am trying to *import* org.apache.flink.training.exercise

Re: TypeSerializer Example

2021-04-30 Thread Timo Walther
I also found these pages: https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/schema_evolution.html https://flink.apache.org/news/2020/04/15/flink-serialization-tuning-vol-1.html#avro I hope this helps. Regards, Timo On 30.04.21 13:20, Sandeep khanzode wrote: Hi Timo, T

Re: Writing ARRAY type through JDBC:PostgreSQL

2021-04-30 Thread Timo Walther
Hi Federico, could you also share the full stack trace with us? According to the docs, the ARRAY type should be supported: https://ci.apache.org/projects/flink/flink-docs-stable/dev/table/connectors/jdbc.html#data-type-mapping Can you also try to use `cities ARRAY` in your CREATE TABLE, maybe

Re: TypeSerializer Example

2021-04-30 Thread Sandeep khanzode
Hi Timo, Thanks! I will take a look at the links. Can you please share if you have any simple (or complex) example of Avro state data structures? Thanks, Sandeep > On 30-Apr-2021, at 4:46 PM, Timo Walther wrote: > > Hi Sandeep, > > did you have a chance to look at this documentation page? >

Re: TypeSerializer Example

2021-04-30 Thread Timo Walther
Hi Sandeep, did you have a chance to look at this documentation page? https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/custom_serialization.html The interfaces might not be easy to implement but are very powerful to address compatibility issues. You can also look into Fl

Re: Guaranteeing event order in a KeyedProcessFunction

2021-04-30 Thread Timo Walther
Hi Miguel, your initial idea sounds not too bad but why do you want to key by timestamp? Usually, you can simply key your stream by a custom key and store the events in a ListState until a watermark comes in. But if you really want to have some kind of global event-time order, you have two c
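The buffering pattern described above (hold events in state until a watermark arrives, then emit them in event-time order) can be sketched in plain Java. This is a conceptual illustration, not Flink code: inside a real KeyedProcessFunction the `TreeMap` below would be a `MapState<Long, List<T>>` and `onWatermark` would be an event-time timer callback.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Sketch: buffer events keyed by timestamp; release everything at or
// below the watermark, in ascending timestamp order.
class EventTimeBuffer {
    // timestamp -> events with that timestamp (stands in for Flink MapState)
    private final TreeMap<Long, List<String>> buffer = new TreeMap<>();

    void add(long timestamp, String event) {
        buffer.computeIfAbsent(timestamp, t -> new ArrayList<>()).add(event);
    }

    // Emit all buffered events with timestamp <= watermark, oldest first.
    List<String> onWatermark(long watermark) {
        List<String> ready = new ArrayList<>();
        Map<Long, List<String>> head = buffer.headMap(watermark, true);
        head.values().forEach(ready::addAll);
        head.clear(); // headMap is a view, so this removes them from buffer
        return ready;
    }
}
```

Because the watermark guarantees no earlier events will arrive (modulo lateness), everything released here is in event-time order.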

Setup of Scala/Flink project using Bazel

2021-04-30 Thread Salva Alcántara
I am trying to setup a simple flink application from scratch using Bazel. I've bootstrapped the project by running ``` sbt new tillrohrmann/flink-project.g8 ``` and after that I have added some files in order for Bazel to take control of the building (i.e., migrate from sbt). This is how the `WOR

Guaranteeing event order in a KeyedProcessFunction

2021-04-30 Thread Miguel Araújo
Hi everyone, I have a KeyedProcessFunction whose events I would like to process in event-time order. My initial idea was to use a Map keyed by timestamp and, when a new event arrives, iterate over the Map to process events older than the current watermark. The issue is that I obviously can't use

Re: How to implement a window that emits events at regular intervals and on specific events

2021-04-30 Thread Till Rohrmann
Hi Tim, The way session windows work is by first creating a new window for every incoming event and then merging overlapping windows. That's why you see that the end time of a window increases with every new incoming event. I hope this explains what you are seeing. Apart from that, I think the Ses
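The merging behaviour Till describes can be sketched in plain Java (a conceptual illustration, not the Flink windowing API): every event gets its own `[ts, ts + gap)` window, and overlapping windows are merged, which is exactly why the session's end time keeps moving forward while events keep arriving within the gap.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: assign one window per event, merge overlaps. Assumes the
// timestamps are already sorted ascending, which a real session window
// operator does not require (it merges in any arrival order).
class SessionMergeSketch {
    static List<long[]> assignAndMerge(long[] sortedTimestamps, long gap) {
        List<long[]> windows = new ArrayList<>();
        for (long ts : sortedTimestamps) {
            long[] w = {ts, ts + gap};                  // per-event window
            if (!windows.isEmpty()) {
                long[] last = windows.get(windows.size() - 1);
                if (w[0] <= last[1]) {                  // overlap: merge,
                    last[1] = Math.max(last[1], w[1]);  // end time grows
                    continue;
                }
            }
            windows.add(w);                             // gap exceeded: new session
        }
        return windows;
    }
}
```

A gap of 10 with events at 0, 5, and 20 yields two sessions, the first ending at 15 rather than 10 because the event at 5 extended it.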

Re: How to create a java unit test for Flink SQL with nested field and watermark?

2021-04-30 Thread Svend
Thanks for the feedback. The CSV is a good idea and will make my tests more readable, I'll use that. Looking forward to Flink 1.13 ! Svend On Fri, 30 Apr 2021, at 9:09 AM, Timo Walther wrote: > Hi, > > there are multiple ways to create a table for testing: > > - use the datagen connecto

Data type serialization and testing

2021-04-30 Thread Dave Maughan
Hi, I recently encountered a scenario where the data type being passed between operators in my streaming job was modified such that it broke serialization. This was due to a non-Avro top-level data type containing an Avro field. The existing integration test (mini cluster) continued to work and un

Re: How to implement a window that emits events at regular intervals and on specific events

2021-04-30 Thread Tim Josefsson
Thanks! I've managed to implement a working solution with the trigger API, but I'm not exactly sure why it works. I'm doing the following: DataStream summaries = env .addSource(kafkaConsumer, "playerEvents(Kafka)") .name("EP - Read player events from Kafka") .uid("EP - Read

Re: KafkaSourceBuilder causing invalid negative offset on checkpointing

2021-04-30 Thread Till Rohrmann
Thanks for the stack traces Lars. With them I could confirm that the problem should be fixed with FLINK-20114 [1]. The fixes will be contained in the 1.12.4 and 1.13.0 release. Sorry for the inconveniences. [1] https://issues.apache.org/jira/browse/FLINK-20114 Cheers, Till On Thu, Apr 29, 2021 a

Re: How to create a java unit test for Flink SQL with nested field and watermark?

2021-04-30 Thread Timo Walther
Hi, there are multiple ways to create a table for testing: - use the datagen connector - use the filesystem connector with CSV data - and beginning from Flink 1.13 your code snippets becomes much simpler Regards, Timo On 29.04.21 20:35, Svend wrote: I found an answer to my own question! For
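The first option above (the datagen connector) can be sketched as a minimal DDL. The schema, field names, and rate here are made-up placeholders, and the connector options should be checked against the Flink documentation for your version:

```sql
-- Hypothetical test source: random rows at a bounded rate, with a
-- watermark so event-time operators can be exercised in tests.
CREATE TABLE test_source (
  user_id BIGINT,
  event_time TIMESTAMP(3),
  WATERMARK FOR event_time AS event_time - INTERVAL '5' SECOND
) WITH (
  'connector' = 'datagen',
  'rows-per-second' = '10'
);
```

For deterministic assertions, the filesystem connector with CSV data (Timo's second suggestion) is usually the better fit, since datagen produces random values.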