Hello,

We are using FileSource <https://nightlies.apache.org/flink/flink-docs-release-1.14/api/java/> to process Parquet files and have a few questions about it. We would really appreciate it if somebody could help answer them:

1. For a given file, does FileSource read its contents in order? In other words, in what order are the file splits generated from the contents of the file?

2. We want to provide a GCS bucket URL to the FileSource so that it can read Parquet files from there. The bucket contains multiple Parquet files. In what order will the files be picked up and processed by the FileSource? Can we provide an ordering strategy ourselves, for example, processing files by creation time?

3. Is it possible, and is it good practice, to apply checkpointing and watermarking to a bounded source like FileSource?

--
*Regards,*
*Meghajit*