Hi Team,

I was working on a problem and came across Beam. Being very new to Beam, I am not sure whether my use case can be solved with it. Could you please help me here?
Use case: I have CSV and JSON files arriving every minute in Google Cloud Storage. File sizes range from KB to GB. I need to parse each file and process its records independently of the other files: the records from file 1 should be parsed, enriched, and stored in one output location, and the records from file 2 should go to a different location.

I started by launching a separate Dataflow job for each file, but that is overkill for small files. So I thought of batching the files every 15 minutes and processing them together in a single job, while still maintaining the per-file boundary described above.

Can anyone tell me whether there is a solution to my problem, or whether Beam is not meant for this kind of use case?

Thanks in advance. Looking forward to a reply.

Regards,
Subham Agarwal
