Hi all, I have a question about how Structured Streaming does checkpointing. I'm noticing that Spark is not resuming from the max / latest offset it has seen. For example, in HDFS I can see it stored offset file 30, which contains (partition: offset) {1: 2000}.
But after stopping the job and restarting it, I see it instead reads from offset file 9, which contains {1: 1000}. Can someone explain why Spark doesn't resume from the max offset? Thanks.

--
Cheers,
Ruijing Li
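For context, here is a small sketch of how I'm reading the offset files under the checkpoint directory's offsets/ folder. The layout shown (a version line, a JSON metadata line, then one JSON line of per-partition offsets per source) is my assumption from eyeballing the files in HDFS; the sample contents and the parse_offset_file helper are illustrative, not Spark code:

```python
import json

# Illustrative contents of an offset file such as <checkpointDir>/offsets/30.
# Assumed layout: a version line, a JSON metadata line, and one JSON line
# of {partition: offset} per source.
OFFSET_FILE = """v1
{"batchWatermarkMs":0,"batchTimestampMs":1554882900000}
{"1":2000}"""

def parse_offset_file(text):
    """Split an offset file into (version, metadata, per-source offsets)."""
    lines = text.splitlines()
    version = lines[0]
    metadata = json.loads(lines[1])
    sources = [json.loads(line) for line in lines[2:]]
    return version, metadata, sources

version, metadata, sources = parse_offset_file(OFFSET_FILE)
print(version)     # -> v1
print(sources[0])  # -> {'1': 2000}, i.e. partition 1 at offset 2000
```

Reading file 30 this way gives me partition 1 at offset 2000, yet after restart the job appears to start from the offsets in file 9.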