On Fri, Oct 7, 2016 at 4:55 PM, Amit Sela <[email protected]> wrote:

>    3. Support reading of Kafka partitions that were added to topic/s while
>    a Pipeline reads from them - BEAM-727
>    <https://issues.apache.org/jira/browse/BEAM-727> was filed.
>

I think this is doable (assuming some caveats about the generateInitialSplits()
contract). How important is this feature?
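To make the contract question concrete, here is a minimal, self-contained sketch of the kind of partition-to-split assignment an UnboundedSource-style Kafka source performs inside generateInitialSplits(): round-robin the topic's partitions across the desired number of splits. The class and method names here (SplitSketch, assignPartitions) are illustrative, not Beam's actual API; partitions are represented as plain integers.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: round-robin Kafka partitions across
// desiredNumSplits readers, as a source might do when the runner
// calls generateInitialSplits(). Not Beam's real implementation.
public class SplitSketch {

    // Returns one list of partition ids per split. If there are fewer
    // partitions than desired splits, some splits are simply not created.
    static List<List<Integer>> assignPartitions(int numPartitions, int desiredNumSplits) {
        int numSplits = Math.min(desiredNumSplits, numPartitions);
        List<List<Integer>> splits = new ArrayList<>();
        for (int i = 0; i < numSplits; i++) {
            splits.add(new ArrayList<>());
        }
        // Round-robin: partition p goes to split (p mod numSplits).
        for (int p = 0; p < numPartitions; p++) {
            splits.get(p % numSplits).add(p);
        }
        return splits;
    }

    public static void main(String[] args) {
        // e.g. 5 partitions spread across 2 desired splits
        System.out.println(assignPartitions(5, 2)); // prints [[0, 2, 4], [1, 3]]
    }
}
```

The caveat in question: if this assignment is computed only once, partitions added to the topic afterwards are never assigned to any reader, which is exactly the gap BEAM-727 describes.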

Some basic questions about the Spark runner:

   - Does the number of partitions stay the same during the life of a
   long-running Beam streaming job?
   - Will generateInitialSplits() be called more than once during the life
   of a job?
   - When a job is restarted, is generateInitialSplits() invoked again?
      - If yes, do you expect 'desiredNumSplits' for
      generateInitialSplits() to stay the same as in the previous run?
      - If no, are the readers instantiated from previous runs?
