lhyundeadsoul commented on issue #2127:
URL:
https://github.com/apache/incubator-seatunnel/issues/2127#issuecomment-1175918277
> > > > org.apache.seatunnel.api.source.SourceReader#snapshotState looks
similar to
org.apache.seatunnel.api.source.SourceSplitEnumerator#snapshotState, but
they return different types (List and StateT). The comment says it returns the
split checkpoint state, while it actually returns a List; these are not the
same in my mind. Could we unify the snapshot behavior?
> > >
> > >
> > > `SourceSplitEnumerator#snapshotState` and `SourceReader#snapshotState`
are different. `SourceSplitEnumerator` assumes the role of coordinator, which
may require information beyond the snapshot split. `SourceReader` is designed
to only need splits to run, so the snapshot returns `List<SplitT>`.
> >
> >
> > In my opinion, `SplitT` is the type of `SourceSplit`, not the state of a
`checkpoint`. I wonder if `List<StateT>` is more appropriate?
>
> The `List<SplitT>` returned by `SourceReader#snapshotState` is passed to
`SourceSplitEnumerator#addSplitsBack`. This lets the job keep running normally
when the parallelism of the source changes, so the state of the reader must be
convertible to splits.
>
> To avoid ambiguity, we can add
>
> ```java
> public interface SourceSplitState {
> SourceSplit toSplit();
> }
> ```
>
> and have `SourceReader#snapshotState` return `List<SourceSplitState>`.
>
> `org.apache.seatunnel.api.source.SourceSplitEnumerator.Context`
`#assignSplit` and `#signalNoMoreSplits` are behaviors that belong to
`SourceSplitEnumerator`. They live in the context to hide their internal
implementation, and I don't have a better way to implement this.
Now I see what you mean.
You also think `#assignSplit` and `#signalNoMoreSplits` should belong to
`SourceSplitEnumerator`, but you don't want every subclass of
`SourceSplitEnumerator` to have to implement `#assignSplit` and
`#signalNoMoreSplits` (because they share some common logic).
Am I right?
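
To check my understanding, here is a minimal sketch of that split of responsibilities. Everything in it is hypothetical and simplified: `EnumeratorContext`, `RecordingContext` and `RoundRobinEnumerator` are stand-ins I made up for illustration, not the real SeaTunnel interfaces, and the real `Context` may have different signatures. The point is only that the enumerator decides which reader gets which split, while the delivery logic lives once in the context.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-in for SourceSplitEnumerator.Context.
interface EnumeratorContext<SplitT> {
    void assignSplit(int subtaskId, List<SplitT> splits);

    void signalNoMoreSplits(int subtaskId);
}

// Engine-side implementation, written once. Here it only records the calls;
// a real engine would ship the splits to the readers.
class RecordingContext<SplitT> implements EnumeratorContext<SplitT> {
    final Map<Integer, List<SplitT>> assigned = new HashMap<>();
    final List<Integer> finished = new ArrayList<>();

    @Override
    public void assignSplit(int subtaskId, List<SplitT> splits) {
        assigned.computeIfAbsent(subtaskId, k -> new ArrayList<>()).addAll(splits);
    }

    @Override
    public void signalNoMoreSplits(int subtaskId) {
        finished.add(subtaskId);
    }
}

// Connector-side enumerator: it only decides the assignment and delegates
// the delivery to the context, so it does not re-implement the common logic.
class RoundRobinEnumerator<SplitT> {
    private final EnumeratorContext<SplitT> context;
    private final int parallelism;
    private int next = 0;

    RoundRobinEnumerator(EnumeratorContext<SplitT> context, int parallelism) {
        this.context = context;
        this.parallelism = parallelism;
    }

    void run(List<SplitT> splits) {
        for (SplitT split : splits) {
            context.assignSplit(next, Collections.singletonList(split));
            next = (next + 1) % parallelism;
        }
        // Tell every reader that nothing more is coming.
        for (int subtask = 0; subtask < parallelism; subtask++) {
            context.signalNoMoreSplits(subtask);
        }
    }
}

class ContextSketch {
    public static void main(String[] args) {
        RecordingContext<String> context = new RecordingContext<>();
        RoundRobinEnumerator<String> enumerator = new RoundRobinEnumerator<>(context, 2);
        enumerator.run(Arrays.asList("split-0", "split-1", "split-2"));
        System.out.println(context.assigned);
        System.out.println(context.finished);
    }
}
```

With this shape, a new enumerator subclass only needs a reference to the context. It never has to implement `#assignSplit` or `#signalNoMoreSplits` itself.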