So in that scenario the stream name would be the same, but how do sequence IDs 
get generated?  If I tailed the same log file 24 hours after doing it the 
first time, would the chunks have the same seq ID?

On Mar 18, 2010, at 11:24 AM, Ariel Rabkin wrote:

> Howdy,
> 
> Chukwa does duplicate detection as follows: Each Chunk of data comes
> with a stream name (such as the name of a log file) and a sequence ID.
> If two chunks have the same name and ID, they're duplicates.  The
> content isn't inspected.
> 
> So in your example, the former will be treated as a duplicate, not the latter.
> 
> --Ari
> 
> On Thu, Mar 18, 2010 at 8:59 AM, Corbin Hoenes <cor...@tynt.com> wrote:
>> Does anyone have more information about how chukwa removes duplicates during 
>> demux? How does it decide what is a duplicate?  There are two cases I am 
>> thinking of...
>> 
>> 1 - we send the same log file to chukwa 2x
>> 2 - we have the exact same line in a log file 2x
> 
> 
> 
> -- 
> Ari Rabkin asrab...@gmail.com
> UC Berkeley Computer Science Department
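
For illustration, the check Ari describes can be sketched roughly as below. This is not Chukwa's actual code: the Chunk tuple, the drop_duplicates function, the file path and the hand-assigned sequence IDs are all hypothetical, and the sketch assumes only what the message states, namely that two chunks are duplicates when stream name and sequence ID both match and that content is never inspected. How Chukwa actually assigns sequence IDs is not covered here.

```python
from typing import Iterable, List, NamedTuple, Set, Tuple


class Chunk(NamedTuple):
    """Hypothetical stand-in for a chunk: a stream name, a sequence ID, a payload."""
    stream_name: str
    seq_id: int
    data: str


def drop_duplicates(chunks: Iterable[Chunk]) -> List[Chunk]:
    """Keep the first chunk seen for each (stream_name, seq_id) pair.

    The payload is deliberately ignored, mirroring the statement that
    the content isn't inspected.
    """
    seen: Set[Tuple[str, int]] = set()
    unique: List[Chunk] = []
    for chunk in chunks:
        key = (chunk.stream_name, chunk.seq_id)
        if key in seen:
            continue  # same name and ID: treated as a duplicate
        seen.add(key)
        unique.append(chunk)
    return unique


# Case 1: the same log file is sent twice. Assuming the resend carries
# the same stream name and sequence IDs, the second copy is dropped.
first_send = [
    Chunk("/var/log/app.log", 1, "line A"),
    Chunk("/var/log/app.log", 2, "line B"),
]
resend = list(first_send)
case1 = drop_duplicates(first_send + resend)

# Case 2: the exact same line appears twice in one file. The two chunks
# are assumed to have different sequence IDs, so both are kept even
# though their content is identical.
case2 = drop_duplicates([
    Chunk("/var/log/app.log", 1, "same line"),
    Chunk("/var/log/app.log", 2, "same line"),
])
```

Under those assumptions, case 1 collapses to two chunks and case 2 keeps both, which matches "the former will be treated as a duplicate, not the latter."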
