You must set both: forceFromStart to true, and startOffsetTime to -1
(latest) or -2 (earliest). With forceFromStart left at false, the spout
ignores startOffsetTime whenever an offset is already stored in ZooKeeper,
which is why your current combination keeps resuming from the old position.
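
A minimal sketch, assuming the storm-kafka Trident spout of this era (the
ZooKeeper connect string and topic name below are placeholders for your
own):

    import storm.kafka.BrokerHosts;
    import storm.kafka.ZkHosts;
    import storm.kafka.trident.OpaqueTridentKafkaSpout;
    import storm.kafka.trident.TridentKafkaConfig;

    BrokerHosts hosts = new ZkHosts("zkhost1:2181");
    TridentKafkaConfig config = new TridentKafkaConfig(hosts, "my-topic");

    // Ignore any consumer offset already committed to ZooKeeper...
    config.forceFromStart = true;
    // ...and start at the tail of the topic (-1). Use
    // kafka.api.OffsetRequest.EarliestTime() (-2) to replay everything
    // the broker still retains.
    config.startOffsetTime = kafka.api.OffsetRequest.LatestTime();

    OpaqueTridentKafkaSpout spout = new OpaqueTridentKafkaSpout(config);

If I remember the spout internals correctly, forceFromStart takes effect on
a fresh topology submission; workers restarting within the same run still
resume from the offset in ZooKeeper.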

On Thu, May 29, 2014 at 12:23 AM, Raphael Hsieh <[email protected]> wrote:

> I'm setting both tridentKafkaConfig.forceFromStart = false; and
> tridentKafkaConfig.startOffsetTime = -1;
>
> Neither is working for me. Looking at my Nimbus UI, I still see a large
> spike in processed data before it levels off and seems to stop processing
> anything.
>
>
>
> On Wed, May 28, 2014 at 3:17 PM, Shaikh Riyaz <[email protected]> wrote:
>
>> I think you can use kafkaConfig.forceFromStart = *false*;
>>
>> We have implemented this and it's working fine.
>>
>> Regards,
>> Riyaz
>>
>>
>>
>> On Thu, May 29, 2014 at 1:02 AM, Raphael Hsieh <[email protected]> wrote:
>>
>>> This is still not working for me. I've set the offset to -1 and it is
>>> still backfilling data.
>>> Is there any documentation on the start offsets that I could take a look
>>> at? Or even documentation on kafka.api.OffsetRequest.LatestTime()?
>>>
>>>
>>> On Wed, May 28, 2014 at 1:01 PM, Raphael Hsieh <[email protected]> wrote:
>>>
>>>> Would the Trident version of this be
>>>> tridentKafkaConfig.startOffsetTime?
>>>>
>>>>
>>>> On Wed, May 28, 2014 at 12:23 PM, Danijel Schiavuzzi <[email protected]> wrote:
>>>>
>>>>> By default, the Kafka spout resumes consuming from where it last left
>>>>> off. That offset is stored in ZooKeeper.
>>>>>
>>>>> You can set startOffsetTime to -2 to start consuming from the earliest
>>>>> available offset, or -1 to start consuming from the latest available
>>>>> offset.
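>>>>>
>>>>> In code, those magic numbers map to the Kafka API constants. A small
>>>>> sketch (assuming the kafka.api.OffsetRequest class shipped with Kafka
>>>>> 0.8, and "config" being your TridentKafkaConfig instance):
>>>>>
>>>>>     // -2: replay from the oldest offset the broker still retains
>>>>>     config.startOffsetTime = kafka.api.OffsetRequest.EarliestTime();
>>>>>     // -1: consume only messages that arrive after the spout starts
>>>>>     config.startOffsetTime = kafka.api.OffsetRequest.LatestTime();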
>>>>>
>>>>>
>>>>> On Wednesday, May 28, 2014, Raphael Hsieh <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> If I don't tell Trident to start consuming data from the beginning of
>>>>>> the Kafka stream, where does it start from?
>>>>>> If I were to do:
>>>>>>    tridentKafkaConfig.forceFromStart = true;
>>>>>> then that would tell the spout to start consuming from the start of
>>>>>> the stream. If that is not set, where does it start consuming from?
>>>>>> And how might I go about telling it to start consuming from the very
>>>>>> end of the stream?
>>>>>>
>>>>>> If a disaster were to happen and all my hosts died, then when I start
>>>>>> my cluster back up it might start consuming from where it left off. I
>>>>>> would rather process that old data manually and have my Storm system
>>>>>> start processing the live data.
>>>>>>
>>>>>> Thanks
>>>>>> --
>>>>>> Raphael Hsieh
>>>>>
>>>>>
>>>>> --
>>>>> Danijel Schiavuzzi
>>>>
>>>> --
>>>> Raphael Hsieh
>>>
>>> --
>>> Raphael Hsieh
>>> Amazon.com
>>> Software Development Engineer I
>>> (978) 764-9014
>>
>> --
>> Regards,
>>
>> Riyaz
>
> --
> Raphael Hsieh

-- 
Danijel Schiavuzzi

E: [email protected]
W: www.schiavuzzi.com
T: +385989035562
Skype: danijels7
