Hi Jason,

Yes, this is the idea. The connector assigns a subset of files to each
task.

Each task stores the file size, the byte offset, and the byte size of the
last sent record as its source offset.
A file is finished when recordBytesOffsets + recordBytesSize =
fileBytesSize.
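
To make the completion rule concrete, here is a minimal sketch in plain Java. The class FileProgress and its methods are hypothetical names used only for illustration; the map keys simply reuse the identifiers above. Values are read as Number on the assumption that offsets read back from storage may not keep their original numeric type.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Kafka Connect.
final class FileProgress {
    static final String FILE_BYTES_SIZE = "fileBytesSize";
    static final String RECORD_BYTES_OFFSETS = "recordBytesOffsets";
    static final String RECORD_BYTES_SIZE = "recordBytesSize";

    private FileProgress() { }

    // Builds the offset map a task could attach to each record it sends.
    static Map<String, Long> sourceOffset(long fileSize, long recordOffset, long recordSize) {
        Map<String, Long> offset = new HashMap<>();
        offset.put(FILE_BYTES_SIZE, fileSize);
        offset.put(RECORD_BYTES_OFFSETS, recordOffset);
        offset.put(RECORD_BYTES_SIZE, recordSize);
        return offset;
    }

    // A file is finished when the last sent record ends exactly at the end of the file.
    static boolean isFinished(Map<String, ?> offset) {
        if (offset == null) {
            return false; // no offset stored yet for this file
        }
        long fileSize = ((Number) offset.get(FILE_BYTES_SIZE)).longValue();
        long recordOffset = ((Number) offset.get(RECORD_BYTES_OFFSETS)).longValue();
        long recordSize = ((Number) offset.get(RECORD_BYTES_SIZE)).longValue();
        return recordOffset + recordSize == fileSize;
    }
}
```

For example, a 100-byte file whose last sent record starts at byte 90 and is 10 bytes long counts as finished, while the same file with a last record at byte 40 does not.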

The connector should be able to start a background thread to track the
offsets of each assigned file.
When all tasks have finished, the connector can stop the tasks or assign new
files by requesting a task reconfiguration.
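
A rough sketch of what that background tracking could look like. Everything here is an assumption for illustration: FileProgressMonitor is a hypothetical class, the offsetLookup function stands in for the offset storage access that the connector does not have today, and the onAllFinished callback stands in for a call to ConnectorContext.requestTaskReconfiguration().

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Function;

// Hypothetical monitor a connector could create in start() and close in stop().
final class FileProgressMonitor implements AutoCloseable {
    private final List<String> assignedFiles;
    // Stand-in for an offset storage lookup: file path -> last stored offset, or null.
    private final Function<String, Map<String, Long>> offsetLookup;
    // Stand-in for requesting a task reconfiguration.
    private final Runnable onAllFinished;
    private final AtomicBoolean notified = new AtomicBoolean(false);
    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "file-progress-monitor");
            t.setDaemon(true);
            return t;
        });

    FileProgressMonitor(List<String> assignedFiles,
                        Function<String, Map<String, Long>> offsetLookup,
                        Runnable onAllFinished) {
        this.assignedFiles = assignedFiles;
        this.offsetLookup = offsetLookup;
        this.onAllFinished = onAllFinished;
    }

    // Starts polling in the background at a fixed period.
    void start(long periodMs) {
        scheduler.scheduleAtFixedRate(this::checkOnce, periodMs, periodMs, TimeUnit.MILLISECONDS);
    }

    // One polling pass; returns true when every assigned file is fully sent.
    // The callback is invoked at most once.
    boolean checkOnce() {
        for (String file : assignedFiles) {
            if (!isFinished(offsetLookup.apply(file))) {
                return false;
            }
        }
        if (notified.compareAndSet(false, true)) {
            onAllFinished.run();
        }
        return true;
    }

    static boolean isFinished(Map<String, Long> offset) {
        if (offset == null) {
            return false; // nothing stored yet for this file
        }
        long end = offset.get("recordBytesOffsets") + offset.get("recordBytesSize");
        return end == offset.get("fileBytesSize");
    }

    // Convenience builder for the three-field offset described above.
    static Map<String, Long> offset(long fileSize, long recordOffset, long recordSize) {
        Map<String, Long> m = new HashMap<>();
        m.put("fileBytesSize", fileSize);
        m.put("recordBytesOffsets", recordOffset);
        m.put("recordBytesSize", recordSize);
        return m;
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

The lookup and the callback are passed in as plain functions so that the polling logic stays independent of how the connector eventually gets at the offsets.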

Another advantage of monitoring source offsets from the connector is that it
can detect slow or failed tasks and, if necessary, restart all tasks.
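
One possible way to spot a slow or failed task, sketched under assumptions: StallDetector is a hypothetical class, a "position" is taken to be recordBytesOffsets + recordBytesSize for a file, and the caller supplies the current time in milliseconds so that the logic does not depend on a real clock.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stall detector: flags a file whose stored position
// has not advanced for longer than a timeout.
final class StallDetector {
    private final long timeoutMs;
    private final Map<String, Long> lastPosition = new HashMap<>();
    private final Map<String, Long> lastChangeMs = new HashMap<>();

    StallDetector(long timeoutMs) {
        this.timeoutMs = timeoutMs;
    }

    // Records the latest position seen for a file and returns true if it
    // has stayed unchanged for more than timeoutMs.
    boolean observe(String file, long position, long nowMs) {
        Long previous = lastPosition.put(file, position);
        if (previous == null || previous != position) {
            lastChangeMs.put(file, nowMs); // first sighting or progress was made
            return false;
        }
        return nowMs - lastChangeMs.get(file) > timeoutMs;
    }
}
```

A finished file also stops advancing, so the caller would need to skip files that already satisfy the completion rule before treating a stall as a slow or failed task.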

Thanks,

2017-02-18 6:47 GMT+01:00 Jason Gustafson <ja...@confluent.io>:

> Hey Florian,
>
> Can you explain a bit more how having access to the offset storage from the
> connector helps in your use case? I guess you are planning to use offsets
> to be able to tell when a task has finished a file?
>
> Thanks,
> Jason
>
> On Fri, Feb 17, 2017 at 4:45 AM, Florian Hussonnois <fhussonn...@gmail.com>
> wrote:
>
> > Hi Kafka Team,
> >
> > I'm developing a connector which needs to monitor the progress of its
> > tasks in order to be able to request a task reconfiguration in some
> > situations.
> >
> > Our connector is pretty simple. It's used to stream thousands of files
> > into Kafka. The connector scans directories, then schedules each task
> > with a set of assigned files.
> > When tasks are no longer required or new files are detected, the connector
> > requests a reconfiguration.
> >
> > In addition, files are stored in a shared storage which is accessible
> > from each Connect worker. That way, we can distribute file streaming.
> >
> > For that purpose, it would be very convenient to have access to an
> > offsetStorageReader instance from either the Connector class or the
> > ConnectorContext class.
> >
> > I found a similar question:
> > https://www.mail-archive.com/dev@kafka.apache.org/msg50579.html
> >
> > Do you think this improvement could be considered? I can contribute to
> > it.
> >
> > Thanks,
> >
> > --
> > Florian HUSSONNOIS
> >
>



-- 
Florian HUSSONNOIS
