We are almost sure that there is no affinity. On Mon, Mar 18, 2019, 12:31 PM Vishal Santoshi <[email protected]> wrote:
> If a flink pipeline with a kafka source was to be re stored from a > checkpoint, would a partition be assigned to a the same sub task index ? It > does not seem so, but wanted to confirm. We are trying to simulate an edge > case where we have dangling files. We create these files by forcing 2 or > more restores between 2 consecutive checkpoints. These files are created > after the first checkpoint but are rendered redundant by another restore > before the next checkpoint. We want to be sure that all data in these files > are accounted for by checking for the records in resolved files but were > not sure we should check the ones with the same task index or all sub > indexes... > > > for example, if we have 2 sub task index ids > if _in-progress-part-1-1 is a dangling/redundant file would that data be > in part-1-2 ( the next finalized file but with the same sub task id ) or > could it be in part-2-2 plus for example ? If the partitions were to > maintain affinity, we would imagine that it would be former but wanted to > confirm. > > Regards. >
