Hi,

We have a use case where it would be beneficial to select multiple files for processing by a regex pattern (or by some loop-like functionality that dynamically adjusts which files are picked). We have files of different types, and within one type there are versions: each new version adds data to the records, but never removes any. Since files of the same type are very similar, combining them would be a UNION. The files are stored in one directory and look like:
type-A-v1—1.avro
type-A-v1—2.avro
type-A-v1—3.avro
type-A-v1—4.avro
type-A-v2—1.avro
type-A-v2—2.avro
type-A-v2—3.avro
type-A-v2—4.avro
type-A-v2—5.avro
type-B-v1—1.avro
type-B-v1—2.avro
type-B-v1—3.avro
…

Same with C etc… As you can guess, "v1" stands for version 1, so a higher version will have new fields in it. Different types contain different data. It would be great if it were possible to address only certain files (for example, aggregate all files of type "A" for "v1" and "v2"). What would be the technique of choice here? The aim is to be able to increment the version (adding fields to the records dynamically) without changing the aggregation itself. Of course, the new fields will just be ignored.

Thanks,
Dennis
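
P.S. To make the selection I have in mind concrete, here is a minimal Python sketch. It is not tied to any processing framework and is only meant to illustrate the idea: the pattern, the function names `select_files` and `select_from_directory`, and the choice to match the separator between version and part number loosely are all assumptions made for illustration.

```python
import re
from pathlib import Path

# Assumed naming scheme, read off the listing above:
#   type-<TYPE>-v<VERSION><separator><PART>.avro
# The separator between version and part is matched loosely
# (any run of non-digit characters), since its exact form is an assumption.
FILE_PATTERN = re.compile(
    r"^type-(?P<type>[A-Za-z]+)-v(?P<version>\d+)\D+(?P<part>\d+)\.avro$"
)


def select_files(names, file_type, versions):
    """Return names of the given type whose version is in `versions`,
    ordered by (version, part). Non-matching names are skipped."""
    wanted = set(versions)
    selected = []
    for name in names:
        match = FILE_PATTERN.match(name)
        if match is None:
            continue  # not one of our data files
        version = int(match.group("version"))
        if match.group("type") == file_type and version in wanted:
            selected.append((version, int(match.group("part")), name))
    return [name for _, _, name in sorted(selected)]


def select_from_directory(directory, file_type, versions):
    """Apply select_files to the plain files found in `directory`."""
    names = [p.name for p in Path(directory).iterdir() if p.is_file()]
    return select_files(names, file_type, versions)
```

With this, something like `select_from_directory("/data/in", "A", [1, 2])` (the path is made up) would return only the type "A" files for v1 and v2. Adding a v3 later would only mean extending the version list, not changing the aggregation.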
