Hey folks, really simple question here. I currently have an ETL pipeline that reads from S3 and saves the data to an end store.
I have to read from a list of keys in S3, but right now I'm doing a raw extract and then saving, one key at a time. Only some of the extracts need a simple transformation, and overall the code looks the same, so I abstracted this logic into a method that takes an S3 path, applies the common transformations, and saves to the end store. The problem is the job takes about 10 minutes because I'm iterating down the list of keys sequentially. Is it possible to do this asynchronously?

FYI, I'm using spark.read.json to read from S3 because it infers my schema.

Regards,
Sam
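Edit: here's a self-contained sketch of the shape of the loop I mean, and the threaded version I'm imagining. The Spark call is replaced by a placeholder function (`process_key` and the bucket/key names are made up) so the snippet runs on its own; in the real job each call would be `spark.read.json(path)`, the common transformations, then the save. My understanding is that submitting jobs to one SparkSession from multiple threads is safe, but I'd appreciate confirmation.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the real per-key work. In the actual job this
# would be: spark.read.json(key) -> common transformations -> save to store.
def process_key(key):
    return f"processed:{key}"

keys = ["s3://bucket/a.json", "s3://bucket/b.json", "s3://bucket/c.json"]

# Current approach: one key at a time, which is why the job takes ~10 min.
sequential = [process_key(k) for k in keys]

# What I'm asking about: submit the per-key work concurrently from a
# thread pool, so several Spark jobs can be in flight at once.
with ThreadPoolExecutor(max_workers=4) as pool:
    concurrent_results = list(pool.map(process_key, keys))

# Both approaches should produce the same set of results.
assert sorted(sequential) == sorted(concurrent_results)
```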