You can micro-batch the Kafka contents into a file that's replicated (e.g.
on HDFS) and then ack all of the input tuples once the file has been closed.
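
A minimal sketch of that pattern, in plain Java with no Storm dependency so
it can be run on its own. The class name MicroBatchWriter and the ack
callback are hypothetical; in a real bolt the callback would presumably be
collector.ack(tuple), and the file would live on replicated storage rather
than on local disk.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/**
 * Buffers records into one file and acks their ids only after the file
 * has been closed. ID stands in for whatever identifies an input tuple.
 */
class MicroBatchWriter<ID> {
    private final Consumer<ID> ack;            // hypothetical stand-in for collector.ack(tuple)
    private final BufferedWriter out;
    private final List<ID> pending = new ArrayList<>();

    MicroBatchWriter(Path file, Consumer<ID> ack) {
        this.ack = ack;
        try {
            this.out = Files.newBufferedWriter(file, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Appends one record and remembers its id; nothing is acked yet. */
    void write(ID id, String record) {
        try {
            out.write(record);
            out.newLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        pending.add(id);
    }

    int pendingCount() {
        return pending.size();
    }

    /**
     * Closes the file, then acks every buffered id. If close() throws,
     * no id is acked, so the un-acked tuples can time out and be replayed.
     */
    void closeAndAck() {
        try {
            out.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        pending.forEach(ack);
        pending.clear();
    }
}
```

Because nothing is acked until the close succeeds, a failure mid-batch
leaves every tuple of that batch un-acked, and the spout replays them. The
trade-off is that a replayed batch may contain records already written to
an earlier, partially written file.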

On Wed, May 11, 2016 at 3:43 PM, Milind Vaidya <[email protected]> wrote:

> If a file fails to upload, or disk corruption leads to loss of the file,
> we have only the current offset in the Kafka spout and no record of which
> offsets were in the lost file and need to be replayed. So these offsets
> could be stored externally in ZooKeeper and used to account for lost
> data. To save them in ZK, they need to be available in a bolt.
>
> On Wed, May 11, 2016 at 11:10 AM, Nathan Leung <[email protected]> wrote:
>
>> Why not just ack the tuple once it's been written to a file? If your
>> topology fails, the data will be re-read from Kafka. The Kafka spout
>> already does this for you. Uploading files to S3 is then the
>> responsibility of another job, for example a Storm topology that monitors
>> the output folder.
>>
>> Monitoring the data from Kafka all the way out to S3 seems unnecessary.
>>
>> On Wed, May 11, 2016 at 1:50 PM, Milind Vaidya <[email protected]> wrote:
>>
>>> It does not matter, in the sense that I am ready to upgrade if this is
>>> on the roadmap.
>>>
>>> Nonetheless:
>>>
>>> kafka_2.9.2-0.8.1.1 apache-storm-0.9.4
>>>
>>>
>>>
>>>
>>> On Wed, May 11, 2016 at 5:53 AM, Abhishek Agarwal <[email protected]>
>>> wrote:
>>>
>>>> Which version of storm-kafka are you using?
>>>>
>>>> On Wed, May 11, 2016 at 12:29 AM, Milind Vaidya <[email protected]>
>>>> wrote:
>>>>
>>>>> Anybody? Anything about this?
>>>>>
>>>>> On Wed, May 4, 2016 at 11:31 AM, Milind Vaidya <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Is there any way I can know which Kafka offset corresponds to the
>>>>>> tuple I am currently processing in a bolt?
>>>>>>
>>>>>> Use case: I need to batch events from Kafka, persist them to a local
>>>>>> file and eventually upload it to S3. To manage failure cases, I need
>>>>>> to know the Kafka offset for a message, so that it can be persisted to
>>>>>> ZooKeeper and used when writing / uploading the file.
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Regards,
>>>> Abhishek Agarwal
>>>>
>>>>
>>>
>>
>
