Hey,

It seems like both interfaces are pretty much capable of doing the same thing but work on slightly different assumptions.
Isn't there a way that the Kafka source can work with the interruptions? I think the reachedEnd()/next() interface is slightly easier to grasp than run() with the locks. But in any case I would slightly prefer having only one of them if they can technically do the same thing. Also, adding a new interface means we add a new streamtask implementation, and the number of those is also getting to be slightly too much.

What is your opinion on this?

Gyula

On Fri, May 29, 2015 at 6:51 PM, Aljoscha Krettek <aljos...@apache.org> wrote:

> Hi All,
> after finishing my pull request that should fix the problems with the
> synchronisation of checkpoints and element emission (the reason for
> the faulty results of the exactly-once tests) I discovered that the
> Kafka source does not deal well with being interrupted. We recently
> changed the SourceFunction to the reachedEnd()/next() interface, with
> the contract that the source must be interruptible to be able to
> perform checkpoints. Now this doesn't seem to work with Kafka. I added
> another Source interface in my PR
> (https://github.com/apache/flink/pull/742). This is similar to the old
> interface of run()/cancel(), with the addition that the source must
> acquire a lock before updating state and emitting elements. The update
> of state and the emission of elements must happen in the same
> synchronized block to ensure consistency. This seems to solve the
> problem but now we have two source interfaces.
>
> The question is now. What do you think about the two interfaces?
> Should we keep both? Remove one?
>
> Cheers,
> Aljoscha
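
To make the lock contract from the quoted mail concrete, here is a minimal sketch in plain Java of a run()/cancel()-style source that updates its state and emits an element inside the same synchronized block, while a second thread takes "checkpoints" under the same lock. All names here (LockedSourceSketch, LockingSource, Collector, CountingSource, runWithCheckpoints) are made up for illustration and are not the interfaces from the PR; the real signatures may well differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these types are the actual Flink API.
final class LockedSourceSketch {

    /** Stand-in for an output collector. */
    interface Collector<T> {
        void collect(T element);
    }

    /** run()/cancel()-style source that is handed the checkpoint lock. */
    interface LockingSource<T> {
        void run(Object checkpointLock, Collector<T> out);
        void cancel();
    }

    /** Emits 0..limit-1; its only state is the next offset to emit. */
    static final class CountingSource implements LockingSource<Long> {
        private final long limit;
        private volatile boolean running = true;
        private long offset = 0; // guarded by the checkpoint lock

        CountingSource(long limit) {
            this.limit = limit;
        }

        @Override
        public void run(Object checkpointLock, Collector<Long> out) {
            while (running) {
                // Emission and state update happen in one synchronized block,
                // so a checkpoint can never observe one without the other.
                synchronized (checkpointLock) {
                    if (offset >= limit) {
                        return;
                    }
                    out.collect(offset);
                    offset++;
                }
            }
        }

        @Override
        public void cancel() {
            running = false;
        }

        /** Caller must hold the checkpoint lock. */
        long snapshotOffset() {
            return offset;
        }
    }

    /**
     * Runs the source to completion while another thread repeatedly takes
     * snapshots under the same lock. Returns true if every snapshot matched
     * the number of elements emitted so far and all elements were emitted.
     */
    static boolean runWithCheckpoints(long limit) {
        final Object lock = new Object();
        final List<Long> emitted = new ArrayList<>(); // guarded by lock
        final CountingSource source = new CountingSource(limit);
        final boolean[] consistent = {true};

        Thread checkpointer = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                synchronized (lock) {
                    if (source.snapshotOffset() != emitted.size()) {
                        consistent[0] = false;
                    }
                }
                Thread.yield();
            }
        });
        checkpointer.start();
        source.run(lock, emitted::add);
        checkpointer.interrupt();
        try {
            checkpointer.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return consistent[0] && emitted.size() == limit;
    }

    public static void main(String[] args) {
        System.out.println(runWithCheckpoints(100_000));
    }
}
```

The point of the sketch is that the checkpointing thread never has to interrupt the source: it only waits for the lock, which may be why this style is friendlier to a client like Kafka's that does not tolerate interruption.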