Re: [External Sender] Re: [Question] Apache Beam Pipeline AWS Credentials

2024-01-10 Thread Sachin Mittal
Hi, Check this class org.apache.beam.sdk.io.aws2.options.AwsOptions It tells you how to set AwsCredentialsProvider used to configure AWS service clients. For example you can run beam application with following application properties: --awsCredentialsProvider={"@type":

Re: [Question] Does SnowflakeIO connector support for AWS and Azure?

2023-12-04 Thread Sachin Mittal
5sbnjPf7WA+9Vbz6zfwmnz8GDChNgkPhf0Wzk9MRBBNl9KqsjHDDB/Jt1r4=; > Proxy: null), S3 Extended Request ID: > foh3wogVMGT7+kQ+5sbnjPf7WA+9Vbz6zfwmnz8GDChNgkPhf0Wzk9MRBBNl9KqsjHDDB/Jt1r4= > > I would appreciate it if you could share with me the steps that you > followed to do the test using the SnowflakeIO connector and

Re: [Question] Does SnowflakeIO connector support for AWS and Azure?

2023-12-02 Thread Sachin Mittal
it? I would really > appreciate it. Thanks. > > > Regards, > Xinmin > > > On Thu, Oct 26, 2023 at 9:42 AM mybeam wrote: > >> Hi Sachin, >> >> Thanks for your information. >> >> On Tue, Oct 24, 2023 at 11:02 PM Sachin Mittal >> wrot

Re: Pipeline Stalls at GroupByKey Step

2023-11-16 Thread Sachin Mittal
Do you add time stamp to every record you output in ConvertFromKafkaRecord step or any step before that. On Fri, 17 Nov 2023 at 4:07 AM, Sigalit Eliazov wrote: > Hi, > > In our pipeline, we've encountered an issue with the GroupByKey step. > After some time of running, it seems that messages

Re: [Question] Does SnowflakeIO connector support for AWS and Azure?

2023-10-24 Thread Sachin Mittal
I think AWS is supported and I was able to configure snowflake io with S3 buckets. On Wed, 25 Oct 2023 at 9:05 AM, mybeam wrote: > Hello, > > As per the javadoc below, only the GCP bucket is supported currently by > SnowflakeIO connector? Can you please confirm that AWS and Azure are >

Re: implementing cool down?

2023-10-19 Thread Sachin Mittal
You can try approach mentioned in here: https://beam.apache.org/blog/timely-processing/ On Thu, Oct 19, 2023 at 12:36 PM Balogh, György wrote: > Hi, > I'm using beam from java. My pipeline generates alerts. > I'd like to filter alerts with a cool down period: if an alert is emitted, > disable

Re: EFO KinesisIO watermarking doubt

2023-10-02 Thread Sachin Mittal
s > > Best Regards, > Pavel Solomin > > Tel: +351 962 950 692 | Skype: pavel_solomin | Linkedin > <https://www.linkedin.com/in/pavelsolomin> > > > > > > On Fri, 21 Jul 2023 at 07:19, Sachin Mittal wrote: > >> Hi, >> We are implementing EFO

Re: Issue with growing state/checkpoint size

2023-09-01 Thread Sachin Mittal
on my > control . > > On Tue, Aug 29, 2023 at 9:29 AM Sachin Mittal wrote: > >> So for the smaller size of collection which does not grow with size for >> certain keys we stored the data in redis and instead of beam join in our >> DoFn we just did the lookup and got the data

Re: Issue with growing state/checkpoint size

2023-08-29 Thread Sachin Mittal
eply, Any strategy you followed to avoid joins when you > rewrite your pipeline? > > > > On Tue, Aug 29, 2023 at 9:15 AM Sachin Mittal wrote: > >> Yes even we faced the same issue when trying to run a pipeline involving >> join of two collections. It was deployed us

Re: Issue with growing state/checkpoint size

2023-08-29 Thread Sachin Mittal
Yes even we faced the same issue when trying to run a pipeline involving join of two collections. It was deployed using AWS KDA, which uses flink runner. The source was kinesis streams. Looks like join operations are not very efficient in terms of size management when run on flink. We had to

How can we get multiple side inputs from a single pipeline ?

2023-08-28 Thread Sachin Mittal
Hi, I was checking the code for side input patterns : https://beam.apache.org/documentation/patterns/side-inputs/ Basically I need multiple side inputs from a Slowly updating global window side inputs. So as per example pipeline is something like this: PCollectionView map =

Can we use RedisIO to write records from an unbounded collection

2023-07-21 Thread Sachin Mittal
Hi, I was planning to use the RedisIO write/writeStreams function in a streaming pipeline. https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/redis/RedisIO.html The pipeline would read an unbounded collection from Kinesis and update redis. It will update data for which key

EFO KinesisIO watermarking doubt

2023-07-21 Thread Sachin Mittal
Hi, We are implementing EFO Kinesis IO reader provided by apache beam. I see that in code that for implementation of getCurrentTimestamp we always return getApproximateArrivalTimestamp and not the event time which we may have set for that record using withCustomWatermarkPolicy. Please refer:

Can anyone tell me difference between these transformations

2023-06-09 Thread Sachin Mittal
1. Window.>into(FixedWindows.of(windowSize)) .triggering(Never.ever()) .withAllowedLateness(allowedLateness, Window.ClosingBehavior.FIRE_ALWAYS) .discardingFiredPanes() 2. Window.>into(FixedWindows.of(windowSize))

Re: Difference between sdk.io.aws2.kinesis.KinesisIO vs sdk.io.kinesis.KinesisIO

2023-05-22 Thread Sachin Mittal
e to support for the new v2 modules, but I > personally didn’t have time to start looking into it so far. > > https://github.com/apache/beam/issues/19967 > > > > Don’t hesitate to reach out in case you are facing any issues when > migrating! > > > > Kind regards, >

Re: Can we shutdown a pipeline based on some condition

2023-05-13 Thread Sachin Mittal
kype: pavel_solomin | Linkedin > <https://www.linkedin.com/in/pavelsolomin> > > > > > > On Thu, 4 May 2023 at 18:20, Sachin Mittal wrote: > >> Hi, >> I am kind of building a batch/streaming hybrid beam application. >> Data is fed into a kinesis stream

Re: Can we write to and read from and then write to same kinesis stream using KinesisIO

2023-05-12 Thread Sachin Mittal
only in direct runner? Or Flink runner behaves similarly? > > Best Regards, > Pavel Solomin > > Tel: +351 962 950 692 | Skype: pavel_solomin | Linkedin > <https://www.linkedin.com/in/pavelsolomin> > > > > > > On Fri, 12 May 2023 at 16:43, Sachin Mittal

Re: Is there a way to generated bounded sequence emitted at a particular rate

2023-05-12 Thread Sachin Mittal
;https://www.linkedin.com/in/pavelsolomin> > > > > > > On Fri, 12 May 2023 at 15:04, Sachin Mittal wrote: > >> Hi, >> I want to emit a bounded sequence of numbers from 0 to n but downstream >> to receive this sequence at a given rate. >> &g

Re: Can we write to and read from and then write to same kinesis stream using KinesisIO

2023-05-12 Thread Sachin Mittal
, 2 pipelines should work too: > > - writePipeline.run().waitUntilFinish() > - readAndWritePipeline.run().waitUntilFinish() > > Best Regards, > Pavel Solomin > > Tel: +351 962 950 692 | Skype: pavel_solomin | Linkedin > <https://www.linkedin.com/in/pavelsolomin> > >

Is there a way to generated bounded sequence emitted at a particular rate

2023-05-12 Thread Sachin Mittal
Hi, I want to emit a bounded sequence of numbers from 0 to n but downstream to receive this sequence at a given rate. This is needed so that we can rate limit the HTTP request downstream. Say if we generate sequence from 1 - 100 then downstream would make 100 such requests almost at the same

Re: Can we write to and read from and then write to same kinesis stream using KinesisIO

2023-05-10 Thread Sachin Mittal
ses/javadoc/current/org/apache/beam/sdk/io/aws2/kinesis/KinesisIO.html > > Best Regards, > Pavel Solomin > > Tel: +351 962 950 692 | Skype: pavel_solomin | Linkedin > <https://www.linkedin.com/in/pavelsolomin> > > > > > > On Wed, 10 May 2023 at 10:50,

Can we write to and read from and then write to same kinesis stream using KinesisIO

2023-05-10 Thread Sachin Mittal
Hi, I am using aws beam sdk1 to read from and write to a kinesis stream. *org.apache.beam.sdk.io.kinesis.KinesisIO* My pipeline is something like this: (*note the kinesis stream used to write to and then again read from is empty before starting the app*)

Re: How to identify what objects in your code have to be serialized

2023-05-08 Thread Sachin Mittal
; https://beam.apache.org/documentation/programming-guide/#requirements-for-writing-user-code-for-beam-transforms > > > Best, > Bruno > > > > > > > > > > On Tue, May 9, 2023 at 12:30 AM Sachin Mittal wrote: > >> I am trying to c

How to identify what objects in your code have to be serialized

2023-05-08 Thread Sachin Mittal
I am trying to create a pipeline where I query paginated data from some external service via a client and join them into a PCollectionList and flatten it to get the final collection of data items. The data class is encoded using a ProtoCoder Here is my code:

Can we shutdown a pipeline based on some condition

2023-05-04 Thread Sachin Mittal
Hi, I am kind of building a batch/streaming hybrid beam application. Data is fed into a kinesis stream and the beam pipeline is run. I want to stop the pipeline if no new data is fed into the stream for a certain period of time, say 5 minutes. Is there a way of achieving this ? Right now I only

Difference between sdk.io.aws2.kinesis.KinesisIO vs sdk.io.kinesis.KinesisIO

2022-08-31 Thread Sachin Mittal
Hello folks, We are consuming from a kinesis stream in our beam application. So far we were using *sdk.io *.kinesis.KinesisIO to read from kinesis. What I understand is that this library does not use Enhanced Fan-Out Consumer provided by AWS. Recently I saw this library