Hi Team,

I was working on a problem statement and came across Beam. Being very new
to Beam, I am not sure whether my use case can be solved with it. Could you
please help me here?

Use case:

I have CSV and JSON files arriving every minute in Google Cloud Storage.
File sizes range from KB to GB. I need to parse each file and process its
records independently of the other files: the records from file 1 should be
parsed, enriched, and stored in one output location, and the records from
file 2 should go to a different location.

I started by launching a separate Dataflow job for each file, but that is
overkill for small files. So I thought I could batch the files every 15
minutes and process them together in a single job, but I still need to
maintain the per-file boundary described above.
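
To make the boundary requirement concrete, here is a minimal plain-Python
sketch (not Beam code) of the behaviour I am after for one batch. The bucket
names, the output naming scheme, the `enrich` step, and the assumption that
JSON files are newline-delimited are all hypothetical placeholders, not part
of my real setup.

```python
import csv
import io
import json
import posixpath


def output_location(input_path, output_root="gs://example-output-bucket"):
    """Map one input file to its own output prefix (hypothetical naming scheme)."""
    stem, _ext = posixpath.splitext(posixpath.basename(input_path))
    return f"{output_root}/{stem}/"


def parse_records(input_path, text):
    """Parse CSV or newline-delimited JSON content based on the file extension."""
    if input_path.endswith(".csv"):
        return list(csv.DictReader(io.StringIO(text)))
    if input_path.endswith(".json"):
        # Assumption: one JSON object per line.
        return [json.loads(line) for line in text.splitlines() if line.strip()]
    raise ValueError(f"unsupported file type: {input_path}")


def enrich(record, input_path):
    """Placeholder enrichment: tag each record with the file it came from."""
    return {**record, "source_file": input_path}


def process_batch(files):
    """Process a batch of {path: content}, keeping each file's records separate."""
    results = {}
    for path, text in files.items():
        records = [enrich(r, path) for r in parse_records(path, text)]
        # Each input file gets its own output location; records never mix.
        results[output_location(path)] = records
    return results


batch = {
    "gs://example-input-bucket/file1.csv": "id,name\n1,a\n2,b\n",
    "gs://example-input-bucket/file2.json": '{"id": 3}\n{"id": 4}\n',
}
for location, records in process_batch(batch).items():
    print(location, len(records))
```

What I would like to know is whether a single Beam job can give me this
same per-file separation for every file in a batch.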


Can anyone please tell me whether there is a solution to my problem, or
whether Beam is not meant for this kind of problem?

Thanks in advance.

Looking forward to a reply.

Regards,
Subham Agarwal
