[ 
https://issues.apache.org/jira/browse/CHUKWA-349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12731203#action_12731203
 ] 

Jerome Boulon commented on CHUKWA-349:
--------------------------------------

Jiaqi, be careful with the ordering. In Chukwa there's no guarantee that you 
will process events from T0 before events from T0+xxx.
Even if we make sure that we never send T0+xxx before T0, if a client fails over 
to another collector after sending its T0 events, then everything depends on when 
the failover collector closes its file.

Example:
Collector 1 rotates every minute at xx:31 sec
Collector 2 rotates every minute at xx:52 sec

Collector 2 writes T0 events from Client1 to its file.
There's a network issue, so Client1 switches over to Collector 1.
Collector 1 writes T0+xxx events from Client1 to its file.

Since Collector 1 will close its file before Collector 2, there's a possibility 
that demux will start before Collector 2 has had a chance to close its file, so 
the T0+xxx events would be processed before the T0 events.
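
To make the scenario concrete, here is a minimal sketch of the timing, not actual Chukwa code. The Collector class, the demux_order helper, and the write times (xx:10 and xx:20) are assumptions made up for illustration; only the rotation offsets (31 and 52) come from the example above.

```python
from dataclasses import dataclass


@dataclass
class Collector:
    """Hypothetical stand-in for a collector; not a Chukwa class."""
    name: str
    rotate_sec: int  # second within each minute at which the sink file is closed

    def next_close(self, t: int) -> int:
        """Return the first rotation time (in seconds) strictly after t."""
        base = t - (t % 60) + self.rotate_sec
        return base if base > t else base + 60


def demux_order(writes):
    """Order event batches by when their sink file is closed.

    writes: list of (label, write_time_sec, collector). A batch only becomes
    visible to demux once the collector that received it closes its file.
    """
    closed = [(c.next_close(t), label) for label, t, c in writes]
    return [label for _, label in sorted(closed)]


c1 = Collector("collector1", 31)
c2 = Collector("collector2", 52)

# Assumed timeline: Client1 sends T0 events to Collector 2 at xx:10,
# fails over, then sends T0+xxx events to Collector 1 at xx:20.
order = demux_order([("T0", 10, c2), ("T0+xxx", 20, c1)])
print(order)  # ['T0+xxx', 'T0']
```

The later events land in the file that closes first (xx:31), so a demux run that starts between xx:31 and xx:52 sees T0+xxx without T0.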
 




> State-machine generation across split files
> -------------------------------------------
>
>                 Key: CHUKWA-349
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-349
>             Project: Hadoop Chukwa
>          Issue Type: Improvement
>          Components: Data Processors
>    Affects Versions: 0.3.0
>            Reporter: Jiaqi Tan
>            Assignee: Jiaqi Tan
>             Fix For: 0.3.0
>
>
> Current SALSA state-machine generation assumes input files contain all starts 
> and ends of all states; this may not be the case if the input data is sliced 
> across Demux boundaries. There is a need to track incomplete data across 
> multiple runs of the FSMBuilder and to expire and purge state once it has been 
> kept past a certain duration. 
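
One way to picture the carry-over described in the issue, while also tolerating the out-of-order arrival discussed in the comment, is the sketch below. It is not the FSMBuilder implementation: the PendingStates class, its method names, the event tuple layout, and the use of the newest event timestamp as the expiry clock are all assumptions chosen for illustration.

```python
class PendingStates:
    """Hypothetical helper that carries unmatched starts/ends between runs."""

    def __init__(self, max_age_sec):
        self.max_age_sec = max_age_sec
        self.pending = {}  # state_id -> (kind, timestamp)

    def process_run(self, events):
        """Process one run's events and return the states completed by it.

        events: iterable of (state_id, kind, timestamp), kind is 'start' or 'end'.
        Returns a list of (state_id, start_ts, end_ts). A start and its end are
        matched in whichever order they arrive.
        """
        completed = []
        newest = None
        for state_id, kind, ts in events:
            newest = ts if newest is None else max(newest, ts)
            other = self.pending.get(state_id)
            if other is not None and other[0] != kind:
                # The matching half was seen earlier (possibly in a previous run).
                del self.pending[state_id]
                if kind == "end":
                    completed.append((state_id, other[1], ts))
                else:
                    completed.append((state_id, ts, other[1]))
            else:
                self.pending[state_id] = (kind, ts)
        if newest is not None:
            self.purge(newest)
        return completed

    def purge(self, now):
        """Drop pending entries older than max_age_sec relative to now."""
        expired = [k for k, (_, ts) in self.pending.items()
                   if now - ts > self.max_age_sec]
        for k in expired:
            del self.pending[k]
        return expired


tracker = PendingStates(max_age_sec=600)
# Run 1 sees the end of task_1 before its start, plus an unmatched start.
run1 = tracker.process_run([("task_1", "end", 130), ("task_2", "start", 100)])
# Run 2 picks up the file that closed late and completes task_1.
run2 = tracker.process_run([("task_1", "start", 70)])
# Run 3 is far enough in the future that task_2 is purged.
run3 = tracker.process_run([("task_3", "start", 1000)])
print(run1, run2, run3, sorted(tracker.pending))
```

Keeping both unmatched starts and unmatched ends means a run does not need its input in time order, at the cost of holding state until it either matches or expires.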

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
