[Avro] Re: TypeSerializer Example

2021-05-05 Thread Sandeep khanzode
Hi, Is there a working example somewhere that I can refer for writing Avro entities in Flink state as well as Avro serializaition in KafkaConsumer/Producer? I tried to use Avro entities directly but there is an issue beyond Apache Avro 1.7.7 in that the entities created have a

Read kafka offsets from checkpoint - state processor

2021-05-05 Thread bat man
Hi Users, Is there a way that Flink 1.9 the checkpointed data can be read using the state processor api. Docs [1] says - When reading operator state, users specify the operator uid, the state name, and the type information. What is the type for the kafka operator, which needs to be specified

Re: [ANNOUNCE] Apache Flink 1.13.0 released

2021-05-05 Thread Xintong Song
Thanks Dawid & Guowei as the release managers, and everyone who has contributed to this release. Thank you~ Xintong Song On Thu, May 6, 2021 at 9:51 AM Leonard Xu wrote: > Thanks Dawid & Guowei for the great work, thanks everyone involved. > > Best, > Leonard > > 在 2021年5月5日,17:12,Theo

Re: savepoint command in code

2021-05-05 Thread Yun Tang
Hi, You could trigger savepoint via rest API [1] or refer to SavepointITCase[2] to see how to trigger savepoint in test code. [1] https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/ops/rest_api/#jobs-jobid-savepoints [2]

Re: How to tell between a local mode run vs. remote mode run?

2021-05-05 Thread Yik San Chan
Hi Xingbo, Thank you! On Thu, May 6, 2021 at 10:01 AM Xingbo Huang wrote: > Hi Yik San, > You can check whether the execution environment used is > `LocalStreamEnvironment` and you can get the class object corresponding to > the corresponding java object through py4j in PyFlink. You can take a

Re: Define rowtime on intermediate table field

2021-05-05 Thread Yun Gao
Hi Sumeet, I think you might first convert the table back to the DataStream [1], then define the timestamp and watermark with `assignTimestampsAndWatermarks(...)`, and then convert it back to table[2]. Best, Yun [1]

Re: Flink : Caused by: java.lang.IllegalStateException: No ExecutorFactory found to execute the application.

2021-05-05 Thread Yun Gao
Hi Ragini, How did you submit your job ? The exception here is mostly cuased that the `flink-client` is not included in the classpath at the client side. If the job is submitted via the flink cli, namely `flink run -c xx.jar`, it should be included by default, and if some programming way is

Re: Interacting with flink-jobmanager via CLI in separate pod

2021-05-05 Thread Yang Wang
It seems that you are using the NodePort to expose the rest service. If you only want to access the Flink UI/rest in the K8s cluster, then I would suggest to set "kubernetes.rest-service.exposed.type" to "ClusterIP". Because we are using the K8s master node to construct the JobManager rest

Re: Protobuf support with Flink SQL and Kafka Connector

2021-05-05 Thread Jark Wu
Hi Shipeng, Matthias is correct. FLINK-18202 should address this topic. There is already a pull request there which is in good shape. You can also download the PR and build the format jar yourself, and then it should work with Flink 1.12. Best, Jark On Mon, 3 May 2021 at 21:41, Matthias Pohl

Re: How to tell between a local mode run vs. remote mode run?

2021-05-05 Thread Xingbo Huang
Hi Yik San, You can check whether the execution environment used is `LocalStreamEnvironment` and you can get the class object corresponding to the corresponding java object through py4j in PyFlink. You can take a look at the example I wrote below, I hope it will help you ``` from pyflink.table

Re: [ANNOUNCE] Apache Flink 1.13.0 released

2021-05-05 Thread Leonard Xu
Thanks Dawid & Guowei for the great work, thanks everyone involved. Best, Leonard > 在 2021年5月5日,17:12,Theo Diefenthal 写道: > > Thanks for managing the release. +1. I like the focus on improving operations > with this version. > > Von: "Matthias Pohl" > An: "Etienne Chauchot" > CC: "dev" ,

Re: Upsert kafka 作为 source 的几个问题

2021-05-05 Thread macdoor
我也想知道 flink 在对 kafka 消息进行 join 时,是否对按主键分区有要求,KSQL有强制性的要求 -- Sent from: http://apache-flink.147419.n8.nabble.com/

Re: Session Windows - not working as expected

2021-05-05 Thread Swagat Mishra
This seems to be working fine in processing time but doesn't work in event time. Is there an issue with the way the water mark is defined or do we need to set up timers? Please advise. WORKING: customerStream .keyBy((KeySelector) Customer::getIdentifier)

Re: Session Windows - not working as expected

2021-05-05 Thread Sam
Adding the code for CustomWatermarkGenerator . @Override public void onEvent(Customer customer, long l, WatermarkOutput watermarkOutput) { currentMaxTimestamp = Math.max(currentMaxTimestamp, customer.getEventTime() ); } @Override public void onPeriodicEmit(WatermarkOutput

Re: Flink Event specific window

2021-05-05 Thread Swagat Mishra
Hi Arvid, I sent a separate mail titled - Session Windows - not working as expected ( to the user community ) All other details are here if you need, closing this thread. Please have a look when you have a few minutes, much appreciated. Regards, Swagat On Thu, May 6, 2021 at 1:50 AM Swagat

Re: Flink Event specific window

2021-05-05 Thread Swagat Mishra
Hi Arvid, I sent a separate mail titled - Session Windows - not working as expected closing this thread. Please have a look when you have a few minutes, much appreciated. Regards, Swagat On Wed, May 5, 2021 at 7:24 PM Swagat Mishra wrote: > Hi Arvid, > > Tried a small POC to reproduce the

Re: Flink auto-scaling feature and documentation suggestions

2021-05-05 Thread vishalovercome
Yes. While back-pressure would eventually ensure high throughput, hand tuning parallelism became necessary because the job with high source parallelism would immediately bring down our internal services - not giving enough time to flink to adjust the in-rate. Plus running all operators at such a

Session Windows - not working as expected

2021-05-05 Thread Swagat Mishra
Hi, Bit of background, I have a stream of customers who have purchased some product, reading these transactions on a KAFKA topic. I want to aggregate the number of products the customer has purchased in a particular duration ( say 10 seconds ) and write to a sink. I am using session windows to

Re: org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 1 action: IOException: 1 time,

2021-05-05 Thread Robert Metzger
Hi Ragini, Since this exception is coming from the Hbase client, I assume the issue has nothing to do with Flink directly. I would recommend carefully studying the HBase client configuration parameters, maybe setup a simple Java application that "hammers" data into Hbase at a maximum rate to

Re: Interacting with flink-jobmanager via CLI in separate pod

2021-05-05 Thread Robert Metzger
Okay, it appears to have resolved 10.43.0.1:30081 as the address of the JobManager. Most likely, the container can not access this address. Can you validate this from within the container? If I understand the Flink documentation correctly, you should be able to manually specify rest.address,

Re: Flink auto-scaling feature and documentation suggestions

2021-05-05 Thread Ken Krugler
Hi Vishal, WRT “bring down our internal services” - a common pattern with making requests to external services is to measure latency, and throttle (delay) requests in response to increased latency. You’ll see this discussed frequently on web crawling forums as an auto-tuning approach.

Re: Flink auto-scaling feature and documentation suggestions

2021-05-05 Thread David Anderson
Well, I was thinking you could have avoided overwhelming your internal services by using something like Flink's async i/o operator, tuned to limit the total number of concurrent requests. That way the pipeline could have uniform parallelism without overwhelming those services, and then you'd rely

Re: Interacting with flink-jobmanager via CLI in separate pod

2021-05-05 Thread Robert Cullen
Thanks for the reply. Here is an updated exception with DEBUG on. It appears to be timing out: 2021-05-05 16:56:19,700 DEBUG org.apache.flink.kubernetes.kubeclient.FlinkKubeClientFactory [] - Setting namespace of Kubernetes client to cmdaa 2021-05-05 16:56:19,700 DEBUG

Re: NPE when aggregate window.

2021-05-05 Thread Arvid Heise
Hi, I'm assuming it's just a workaround for changing fields. The string representation happens to be stable while the underlying values change. It's best practice to use completely immutable types if you have similar issues, you should double-check that nothing can be changed in your data type

Re: NPE when aggregate window.

2021-05-05 Thread tuk
Can some provide a bit more explanation why replacing /com.google.common.base.Objects.hashCode with toString().hashCode(),/ with /toString().hashCode()/ making it work? -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

[NOTICE] Flink 1.12.3 artifacts for Scala 2.12 were built against Scala 2.11

2021-05-05 Thread Chesnay Schepler
To all Scala 2.12 users, Due to a mistake during the release process of Flink 1.12.3 jars intended to be built against Scala 2.12 were actually built against Scala 2.11 . This affects all jars published to maven central; the convenience binaries are not affected. Scala 2.12 users are

Re: remote task manager netty exception

2021-05-05 Thread Roman Khachatryan
Hi, > Could it be somehow partition info isn't up to date on TM when job is > restarting? Partition info should be up to date or become so eventually - but this is assuming that JM is able to detect the failure. The latter may not be the case, as Sihan You wrote previously: > The strange thing

Re: Presence of Jars in Flink reg security

2021-05-05 Thread Chesnay Schepler
One of these (plexus-utils) is afaik used by maven, so the scanner is potentially scanning the wrong thing. Or you are scanning all dependencies downloaded during the build of Flink, including everything used by various plugins of the build process & maven itself. On 5/5/2021 11:08 AM, Till

Re: Interacting with flink-jobmanager via CLI in separate pod

2021-05-05 Thread Robert Metzger
Hi, can you check the client log in the "log/" directory? The Flink client will try to access the K8s API server to retrieve the endpoint of the jobmanager. For that, the pod needs to have permissions (through a service account) to make such calls to K8s. My hope is that the logs or previous

Re: Question about state processor data outputs

2021-05-05 Thread Chen-Che Huang
Hi Robert, Due to the performance issue of using state processor, I probably would like to give up state processor and am trying StreamingFileSink in a streaming manner. However, I need to store the files on GCS. However, I encountered the error below. It looks like Flink hasn't support GCS

Re: PyFlink: Split input table stream using filter()

2021-05-05 Thread Dian Fu
Hi Sumeet, Yes, this approach also works in Table API. Could you share which API you use to execute the job? For jobs with multiple sinks, StatementSet should be used. You could refer to [1] for more details on this. Regards, Dian [1]

PyFlink: Split input table stream using filter()

2021-05-05 Thread Sumeet Malhotra
Hi, I would like to split streamed data from Kafka into 2 streams based on some filter criteria, using PyFlink Table API. As described here [1], a way to do this is to use .filter() which should split the stream for parallel processing. Does this approach work in Table API as well? I'm doing the

Re: Is keyed state supported in PyFlink?

2021-05-05 Thread Sumeet Malhotra
Thanks Dian. Yes, I hadn't looked at the 1.13.0 documentation earlier. On Wed, May 5, 2021 at 1:46 PM Dian Fu wrote: > Hi Sumeet, > > This feature is supported in 1.13.0 which was just released and so there > is no documentation about it in 1.12. > > Regards, > Dian > > 2021年5月4日 上午2:09,Sumeet

Re: [ANNOUNCE] Apache Flink 1.13.0 released

2021-05-05 Thread Theo Diefenthal
Thanks for managing the release. +1. I like the focus on improving operations with this version. Von: "Matthias Pohl" An: "Etienne Chauchot" CC: "dev" , "Dawid Wysakowicz" , "user" , annou...@apache.org Gesendet: Dienstag, 4. Mai 2021 21:53:31 Betreff: Re: [ANNOUNCE] Apache Flink 1.13.0

Re: Presence of Jars in Flink reg security

2021-05-05 Thread Till Rohrmann
Hi Prasanna, in the latest Flink version (1.13.0) I couldn't find these dependencies. Which version of Flink are you looking at? What you could check is whether one of these dependencies is contained in one of Flink's shaded dependencies [1]. [1] https://github.com/apache/flink-shaded Cheers,

Re: Is keyed state supported in PyFlink?

2021-05-05 Thread Dian Fu
Hi Sumeet, This feature is supported in 1.13.0 which was just released and so there is no documentation about it in 1.12. Regards, Dian > 2021年5月4日 上午2:09,Sumeet Malhotra 写道: > > Hi, > > Is keyed state [1] supported by PyFlink yet? I can see some code for it in > the Flink master branch,

Re: Flink auto-scaling feature and documentation suggestions

2021-05-05 Thread David Anderson
Interesting. So if I understand correctly, basically you limited the parallelism of the sources in order to avoid running the job with constant backpressure, and then scaled up the windows to maximize throughput. On Tue, May 4, 2021 at 11:23 PM vishalovercome wrote: > In one of my jobs,