Continuous Reading of File using FileSource does not process the existing files in version 1.17

2024-01-05 Thread Prasanna kumar
Hi Flink Community, I hope this email finds you well. I am currently in the process of migrating my Flink application from version 1.12.7 to 1.17.2 and have encountered a behavior issue with the FileSource while reading data from an S3 bucket. In the previous version (1.12.7), I was utilizing

Java 21 for flink

2023-07-07 Thread Prasanna kumar
Hi all, Java 21 plans to support light weight thread called fiber based on Project LOOM which will increase the concurrency to great extent. Is there any plan for flink to leverage it? Thanks, Prasanna.

Snappy Compression for Checkpoints

2023-01-05 Thread Prasanna kumar
Hello Flink Community , We are running Jobs in flink version 1.12.7 which reads from Kafka , apply some rules(stored in broadcast state) and then writes to kafka. This is a very low latency and high throughput and we have set up at least one semantics. Checkpoint Configuration Used 1. We

Regarding Flink Upgrades

2022-11-02 Thread Prasanna kumar
Hi Community, Currently we are using version 1.12.7 and it is running without any issue. And we see that version 1.17 is set to release early next year. That means we would be 5 versions behind. 1) So how far can we lag behind the current flink version ? 2) If we face any issues like log4j

Re: [Security] - Critical OpenSSL Vulnerability

2022-11-01 Thread Prasanna kumar
Could we also get an emergency patch to 1.12 version as well , because upgrading flink to a newer version on production in a short time would be high in effort and longer in duration as well . Thanks, Prasanna On Tue, Nov 1, 2022 at 11:30 AM Prasanna kumar < prasannakumarram...@gmail.com>

Re: [Security] - Critical OpenSSL Vulnerability

2022-11-01 Thread Prasanna kumar
If flink version 1.12 also affected ? Thanks, Prasanna. On Tue, Nov 1, 2022 at 10:40 AM Mason Chen wrote: > Hi Tamir and Martjin, > > We have also noticed this internally. So far, we have found that the > *latest* Flink Java 11/Scala 2.12 docker images *1.14, 1.15, and 1.16* > are affected,

Re: Question About Histograms

2022-04-04 Thread Prasanna kumar
Anil, Flink Histograms are actually summaries .. You need to override the Prometheus Histogram class provided to write it into different buckets to Prometheus .. Then you can write prom queries to calculate different quantiles accordingly ... Checkpointing The histograms is not a recommended

Re: CVE-2021-44228 - Log4j2 vulnerability

2021-12-13 Thread Prasanna kumar
Chesnay Thank you for the clarification. On Mon, Dec 13, 2021 at 6:55 PM Chesnay Schepler wrote: > The flink-shaded-zookeeper jars do not contain log4j. > > On 13/12/2021 14:11, Prasanna kumar wrote: > > Does Zookeeper have this vulnerability dependency ? I see references to &g

Re: CVE-2021-44228 - Log4j2 vulnerability

2021-12-13 Thread Prasanna kumar
Does Zookeeper have this vulnerability dependency ? I see references to log4j in Shaded Zookeeper jar included as part of the flink distribution. On Mon, Dec 13, 2021 at 1:40 PM Timo Walther wrote: > While we are working to upgrade the affected dependencies of all > components, we recommend

Re: Kryo Serialization issues in Flink Jobs.

2021-11-01 Thread Prasanna kumar
Any thoughts on these ? Thanks, Prasanna. On Sat, Oct 30, 2021 at 7:25 PM Prasanna kumar < prasannakumarram...@gmail.com> wrote: > Hi , > > We have the following Flink Job that processes records from kafka based on > the rules we get from S3 files into broadcasted state. >

Kryo Serialization issues in Flink Jobs.

2021-10-30 Thread Prasanna kumar
Hi , We have the following Flink Job that processes records from kafka based on the rules we get from S3 files into broadcasted state. Earlier we were able to spin a job with any number of task parallelism without any issues. Recently we made changes to the Broadcast state Structure and it is

Re: Flink support for Kafka versions

2021-10-26 Thread Prasanna kumar
8 for better >> visibility. >> >> Best, >> Austin >> >> [1]: https://issues.apache.org/jira/browse/FLINK-13414 >> [2]: https://issues.apache.org/jira/browse/FLINK-20845 >> >> On Tue, Apr 20, 2021 at 9:08 AM Prasanna kumar < >> prasannakumarram..

Re: How to refresh topics to ingest with KafkaSource?

2021-10-14 Thread Prasanna kumar
Yes you are right. We tested recently to find that the flink jobs do not pick up the new topics that got created with the same pattern provided to flink kafka consumer. The topics are set only during the start of the jobs. Prasanna. On Fri, 15 Oct 2021, 05:44 Preston Price, wrote: > Okay so

Does Flink 1.12.2 support Zookeeper version 3.6+

2021-09-30 Thread Prasanna kumar
Hi , Does Flink 1.12.2 support Zookeeper version 3.6+ ? If we add zookeeper version 3.6 jar in the flink image ,would it be able to connect ? The following link mentions only zk 3.5 or 3.4

Exploring Flink for a HTTP delivery service.

2021-08-14 Thread Prasanna kumar
Hi, Aim: Building an event delivery service Scale : Peak load 50k messages/sec. Average load 5k messages/sec Expected to grow every passing month Unique Customer Endpoints : 10k+ Unique events(kafka topics) : 500+ Unique

Re: Topic assignment across Flink Kafka Consumer

2021-08-04 Thread Prasanna kumar
ing your data in your Flink job, and this is causing the > data distribution issues you are observing? > > > On Wed, Aug 4, 2021 at 4:00 PM Prasanna kumar < > prasannakumarram...@gmail.com> wrote: > >> Robert >> >> When we apply a rebalance method to the kafka consumer,

Re: Topic assignment across Flink Kafka Consumer

2021-08-04 Thread Prasanna kumar
Robert When we apply a rebalance method to the kafka consumer, it is assigning partitions of various topics evenly. But my only concern is that the rebalance method might have a performance impact . Thanks, Prasanna. On Wed, Aug 4, 2021 at 5:55 PM Prasanna kumar wrote: > Robert, > &

Re: Topic assignment across Flink Kafka Consumer

2021-08-04 Thread Prasanna kumar
the name of the topics or increase the parallelism of > your consumer. > > > > > On Tue, Jul 20, 2021 at 7:53 AM Prasanna kumar < > prasannakumarram...@gmail.com> wrote: > >> Hi, >> >> We have a Flink job reading from multiple Kafka topics based on a regex >&g

Topic assignment across Flink Kafka Consumer

2021-07-19 Thread Prasanna kumar
Hi, We have a Flink job reading from multiple Kafka topics based on a regex pattern. What we have found out is that the topics are not shared between the kafka consumers in an even manner . Example if there are 8 topics and 4 kafka consumer operators . 1 consumer is assigned 6 topics , 2

Re: Metric counter gets reset when leader jobmanager changes in Flink native K8s HA solution

2021-06-14 Thread Prasanna kumar
amit, This is expected behaviour from counter . If the total count irrespective of the restarts needed to be found, aggregate functions need to be applied on the counter . Example sum(Rate(counter)) https://prometheus.io/docs/prometheus/latest/querying/functions/ Prasanna. On Tue, Jun 15, 2021

Sometimes Counter Metrics getting Stuck and not increasing

2021-05-21 Thread Prasanna kumar
Hi, We are publishing around 200 kinds of events for 15000 customers. Source Kafka Topics , Sink Amazon SNS Topic. We are collecting metrics in the following combination [Event , Consumer, PublishResult]. (Publish Result could be published or error). So Metrics count is in the order of

Presence of Jars in Flink reg security

2021-05-04 Thread Prasanna kumar
Hi Flinksters, Our repo which is a maven based java project(flink) went through SCA scan using WhiteSource tool and following are the HIGH severity issues reported. The target vulnerable jar is not found when we build the dependency tree of the project. Could any one let us know if flink uses

Flink support for Kafka versions

2021-04-20 Thread Prasanna kumar
Hi Flinksters, We are researching about if we could use the latest version of kafka (2.6.1 or 2.7.0) Since we are using Flink as a processor , we came across this https://issues.apache.org/jira/browse/FLINK-19168. It says that it does not support version 2.5.0 and beyond. That was created 8

Flink Metrics

2021-02-28 Thread Prasanna kumar
Hi flinksters, Scenario: We have cdc messages from our rdbms(various tables) flowing to Kafka. Our flink job reads the CDC messages and creates events based on certain rules. I am using Prometheus and grafana. Following are there metrics that i need to calculate A) Number of CDC messages wrt

Re: Using Prometheus Client Metrics in Flink

2021-02-27 Thread Prasanna kumar
Rion, Regarding the second question , you can aggregate by using sum function sum(metric_name{jobb_name="JOBNAME"}) . This works is you are using the metric counter. Prasanna. On Sat, Feb 27, 2021 at 9:01 PM Rion Williams wrote: > Hi folks, > > I’ve just recently started working with Flink

Re: Routing events to different kafka topics dynamically

2020-12-03 Thread Prasanna kumar
description of the problem this > should actually be pretty straightforward as you can deduce the topic from > the message. Hence, you just need to create the ProducerRecord with the > right target topic you extracted from the record/message. > > Cheers, > Till > > On Wed, D

Routing events to different kafka topics dynamically

2020-12-02 Thread Prasanna kumar
Hi, Events need to be routed to different kafka topics dynamically based upon some info in the message. We have implemented using KeyedSerializationSchema similar to https://stackoverflow.com/questions/49508508/apache-flink-how-to-sink-events-to-different-kafka-topics-depending-on-the-even. But

Re: Caching

2020-11-26 Thread Prasanna kumar
Navneeth, Thanks for posting this question. This looks like our future scenario where we might end up with. We are working on a Similar problem statement with two differences. 1) The cache items would not change frequently say max of once per month or few times per year and the number of

Dynamic Kafka Source

2020-09-26 Thread Prasanna kumar
Hi, My requirement has been captured by the following stack overflow question. https://stackoverflow.com/questions/61876849/custom-kafka-source-on-apache-flink Could anyone take a shot at it ? Thanks, Prasanna.

Re: SDK vs Connectors

2020-08-23 Thread Prasanna kumar
Thanks for the Reply Yun, I see that when I publish the messages to SNS from map operator, in case of any errors I find the checkpointing mechanism takes care of "no data loss". One scenario I could not replicate is that, the method from SDK unable to send messages to SNS but remains silent not

Re: AWS EMR deployment error : NoClassDefFoundError org/apache/flink/api/java/typeutils/ResultTypeQueryable

2020-08-21 Thread Prasanna kumar
Manas, One option you could try is to set the scope in the dependencies as compile for the required artifacts rather than provided. Prasanna. On Fri, Aug 21, 2020 at 1:47 PM Chesnay Schepler wrote: > If this class cannot be found on the classpath then chances are Flink is > completely

SDK vs Connectors

2020-08-21 Thread Prasanna kumar
Hi Team, Following is the pipeline Kafka => Processing => SNS Topics . Flink Does not provide a SNS connector out of the box. a) I implemented the above by using AWS SDK and published the messages in the Map operator itself. The pipeline is working well. I see messages flowing to SNS topics.

Flink Sinks

2020-07-17 Thread Prasanna kumar
Hi , I did not find out of box flink sink connector for http and SQS mechanism. Has anyone implemented it? Wanted to know if we are writing a custom sink function , whether it would affect semantic exactly one guarantees ? Thanks , Prasanna

Re: Performance test Flink vs Storm

2020-07-16 Thread Prasanna kumar
give you larger JVM heap > space, thus lesser GC pressure. > > Thank you~ > > Xintong Song > > > > On Thu, Jul 16, 2020 at 10:38 PM Prasanna kumar < > prasannakumarram...@gmail.com> wrote: > >> >> Xintong Song, >> >> >>- Which ve

Re: Performance test Flink vs Storm

2020-07-16 Thread Prasanna kumar
you~ > > Xintong Song > > > [1] > https://ci.apache.org/projects/flink/flink-docs-release-1.11/ops/memory/mem_tuning.html#heap-state-backend > > On Thu, Jul 16, 2020 at 10:35 AM Prasanna kumar < > prasannakumarram...@gmail.com> wrote: > >> Hi >> >>

Performance test Flink vs Storm

2020-07-15 Thread Prasanna kumar
Hi, We are testing flink and storm for our streaming pipelines on various features. In terms of Latency,i see the flink comes up short on storm even if more CPU is given to it. Will Explain in detail. *Machine*. t2.large 4 core 16 gb. is used for Used for flink task manager and storm supervisor

Check pointing for simple pipeline

2020-07-07 Thread Prasanna kumar
Hi , I have pipeline. Source-> Map(JSON transform)-> Sink.. Both source and sink are Kafka. What is the best checkpoint ing mechanism? Is setting checkpoints incremental a good option? What should be careful of? I am running it on aws emr. Will checkpoint slow the speed? Thanks, Prasanna.

Flink Parallelism for various type of transformation

2020-07-06 Thread Prasanna kumar
Hi , I used t2.medium machines for the task manager nodes. It has 2 CPU and 4GB memory. But the task manager screen shows that there are 4 slots. Generally we should match the number of slots to the number of cores. [image: image.png] Our pipeline is Source -> Simple Transform -> Sink. What

Re: Is Flink HIPAA certified

2020-07-01 Thread Prasanna kumar
ecome HIPAA-compliant with Flink [1]. > > Marta > > [1] > https://docs.aws.amazon.com/kinesisanalytics/latest/java/akda-java-compliance.html > > On Sat, Jun 27, 2020 at 9:41 AM Prasanna kumar < > prasannakumarram...@gmail.com> wrote: > >> Hi Community , >

Is Flink HIPAA certified

2020-06-27 Thread Prasanna kumar
Hi Community , Could anyone let me know if Flink is used in US healthcare tech space ? Thanks, Prasanna.

Re: Dynamic rescaling in Flink

2020-06-14 Thread Prasanna kumar
tes). AFAIK, this is still in > the design discussion. > > Thank you~ > > Xintong Song > > > > On Wed, Jun 10, 2020 at 2:44 AM Prasanna kumar < > prasannakumarram...@gmail.com> wrote: > >> Hi all, >> >> Does flink support dynamic scaling. Say try t

Dynamic rescaling in Flink

2020-06-09 Thread Prasanna kumar
Hi all, Does flink support dynamic scaling. Say try to add/reduce nodes based upon incoming load. Because our use case is such that we get peak loads for 4 hours and then medium loads for 8 hours and then light to no load for rest 2 hours. Or peak load would be atleast 5 times the medium load.

Re: Multiple Sinks for a Single Soure

2020-06-03 Thread Prasanna kumar
reate an array/map/collection of OutputTags corresponding to > the the sinks/topics combinations. One OutputTag per sink(/topic) and use > this array/map/collection inside your process function? > > Piotrek > > On 2 Jun 2020, at 13:49, Prasanna kumar > wrote: > > Hi , > > I have a Ev

NoResourceAvailableException and JobNotFound Errors

2020-06-02 Thread Prasanna kumar
Hi , I am running flink locally in my machine with following configurations. # The RPC port where the JobManager is reachable. jobmanager.rpc.port: 6123 # The heap size for the JobManager JVM jobmanager.heap.size: 1024m # The heap size for the TaskManager JVM taskmanager.heap.size: 1024m

Re: Multiple Sinks for a Single Soure

2020-06-02 Thread Prasanna kumar
emit data to side output ctx.output(OutputTag, value); } }); for (eventRouterRegistry record : registryList) { System.out.print(record.getEventType() + " <==> " + record.getOutputTopic()) ; FlinkKafkaProducer011 fkp = new FlinkKafkaProducer011<>(record.getOutputTopic(), new Si

Re: Creating Kafka Topic dynamically in Flink

2020-06-01 Thread Prasanna kumar
> https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kafka-0.11/src/test/java/org/apache/flink/streaming/connectors/kafka/Kafka011ITCase.java#L126 > > > 在 2020年6月1日,15:35,Prasanna kumar 写道: > > Hi, > > I have Use Case where i read events from a Sin

Creating Kafka Topic dynamically in Flink

2020-06-01 Thread Prasanna kumar
Hi, I have Use Case where i read events from a Single kafka Stream comprising of JSON messages. Requirement is to split the stream into multiple output streams based on some criteria say based on Type of Event or Based on Type and Customer associated with the event. We could achieve the

Re: Need Help on Flink suitability to our usecase

2020-05-29 Thread Prasanna kumar
you'll be able to recover in < 500 milliseconds, > but within a few seconds. > I don't think that the other frameworks you are looking at are going to be > much better at this. > > Best, > Robert > > On Tue, May 19, 2020 at 1:28 PM Prasanna kumar < > prasannakuma

Re: Multiple Sinks for a Single Soure

2020-05-28 Thread Prasanna kumar
Flink Forward <https://flink-forward.org/> - The Apache Flink > Conference > > Stream Processing | Event Driven | Real Time > > > On Tue, May 26, 2020 at 2:57 PM Prasanna kumar < > prasannakumarram...@gmail.com> wrote: > >> Piotr, >> >> There is a

Re: Multiple Sinks for a Single Soure

2020-05-26 Thread Prasanna kumar
"topicname": "USERTOPIC" } }, { "customername": "c4", "method": "Kafka", "methodparams": { "topicname": &

Re: Multiple Sinks for a Single Soure

2020-05-26 Thread Prasanna kumar
the records, while sink1 only a > portion of them. > > Piotrek > > > On 26 May 2020, at 06:45, Prasanna kumar > wrote: > > Piotr, > > Thanks for the reply. > > There is one other case, where some events have to be written to multiple > sinks and while other

Re: Multiple Sinks for a Single Soure

2020-05-25 Thread Prasanna kumar
tream` would be passed to each of the sinks. > > Piotrek > > > On 24 May 2020, at 19:34, Prasanna kumar > wrote: > > > > Hi, > > > > There is a single source of events for me in my system. > > > > I need to process and send the events to multiple destin

Multiple Sinks for a Single Soure

2020-05-24 Thread Prasanna kumar
Hi, There is a single source of events for me in my system. I need to process and send the events to multiple destination/sink at the same time.[ kafka topic, s3, kinesis and POST to HTTP endpoint ] I am able send to one sink. By adding more sink stream to the source stream could we achieve it

Need Help on Flink suitability to our usecase

2020-05-19 Thread Prasanna kumar
Hi, I have the following usecase to implement in my organization. Say there is huge relational database(1000 tables for each of our 30k customers) in our monolith setup We want to reduce the load on the DB and prevent the applications from hitting it for latest events. So an extract is done

flink setup errors

2020-05-17 Thread Prasanna kumar
I tried to setup flink locally as mentioned in the link https://ci.apache.org/projects/flink/flink-docs-stable/dev/projectsetup/java_api_quickstart.html . I ended getting the following error [INFO] Generating project in Interactive mode [WARNING] No archetype found in remote catalog. Defaulting