[ https://issues.apache.org/jira/browse/CHUKWA-349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12731203#action_12731203 ]
Jerome Boulon commented on CHUKWA-349:
--------------------------------------
Jiaqi, be careful with the ordering. In Chukwa there's no guarantee that you
will process Events from T0 before Events from T0+xxx.
Even if we make sure that we never send T0+xxx before T0, if you fail over
to another collector after the Events from T0, then everything depends on when
the failover collector closes its file.
Ex:
Collector 1 rotates every minute at xx:31 sec
Collector 2 rotates every minute at xx:52 sec
Collector 2 writes T0 events from Client1 to its file
There's a network issue, so Client1 switches over to Collector 1
Collector 1 writes T0+xxx events from Client1 to its file
Since Collector 1 will close its file before Collector 2, there's a possibility
that demux will start before Collector 2 has had a chance to close its file, so
the T0+xxx events get processed before the T0 events.
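
To make the scenario concrete, here is a rough sketch of it. This is not Chukwa code: SinkFile, files_visible_to_demux, and all the numbers (T0 = 10, T0+xxx = 20, a demux run starting at second 40) are made up for illustration, assuming demux only picks up files that are already closed when it starts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SinkFile:
    """Hypothetical stand-in for a collector's output file."""
    collector: str
    close_time: int      # second at which the collector rotates (closes) the file
    event_times: tuple   # timestamps of the Client1 events written to this file


def files_visible_to_demux(files, demux_start):
    """Assumption: a demux run only sees files closed before it starts."""
    return [f for f in files if f.close_time <= demux_start]


# Illustrative numbers only: T0 = 10, T0+xxx = 20.
files = [
    SinkFile("Collector 2", close_time=52, event_times=(10,)),  # T0 events
    SinkFile("Collector 1", close_time=31, event_times=(20,)),  # T0+xxx events
]

# A demux run starting at second 40 sees only Collector 1's file.
first_run = files_visible_to_demux(files, demux_start=40)

# A later run picks up whatever was not processed before.
second_run = [
    f for f in files_visible_to_demux(files, demux_start=100)
    if f not in first_run
]

# Order in which the event timestamps reach the processing code.
processing_order = [
    t for run in (first_run, second_run) for f in run for t in f.event_times
]
print(processing_order)
```

With these numbers the first run contains only Collector 1's file, so the later event (20) is processed before the earlier one (10).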
> State-machine generation across split files
> -------------------------------------------
>
> Key: CHUKWA-349
> URL: https://issues.apache.org/jira/browse/CHUKWA-349
> Project: Hadoop Chukwa
> Issue Type: Improvement
> Components: Data Processors
> Affects Versions: 0.3.0
> Reporter: Jiaqi Tan
> Assignee: Jiaqi Tan
> Fix For: 0.3.0
>
>
> Current SALSA state-machine generation assumes input files contain all starts
> and ends of all states; this may not be the case if the input data is sliced
> across Demux boundaries. There is a need to track incomplete data across
> multiple runs of the FSMBuilder and to expire and purge state once it has
> been kept past a certain duration.
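
As a rough illustration of what the description asks for, the sketch below carries unmatched starts from one run to the next and purges them after a maximum age. It is not the FSMBuilder implementation: process_run, the (state_id, kind, timestamp) event shape, and the max_age parameter are all hypothetical. An end with no known start is simply dropped here, so events arriving out of order would need extra handling.

```python
def process_run(events, pending, now, max_age):
    """Pair start/end events, carrying unmatched starts across runs.

    events:  iterable of (state_id, kind, timestamp); kind is "start" or "end"
    pending: dict of state_id -> start timestamp left over from earlier runs
    Returns (completed, pending), where completed is a list of
    (state_id, start_timestamp, end_timestamp).
    """
    pending = dict(pending)  # do not mutate the caller's state
    completed = []
    for state_id, kind, ts in events:
        if kind == "start":
            pending[state_id] = ts
        elif state_id in pending:
            completed.append((state_id, pending.pop(state_id), ts))
        # An "end" whose start was never seen is dropped in this sketch.

    # Expire and purge starts that have been kept longer than max_age.
    pending = {s: ts for s, ts in pending.items() if now - ts <= max_age}
    return completed, pending


# Run 1: two states start, neither ends within this run.
done1, carry = process_run(
    [("map_1", "start", 100), ("map_2", "start", 105)],
    {}, now=110, max_age=600,
)

# Run 2: map_1 ends; map_2 never ends and is older than max_age by now.
done2, carry = process_run([("map_1", "end", 130)], carry, now=800, max_age=600)
```

In this example map_1 is completed in the second run from state carried over from the first, while map_2 is purged because its start is more than max_age old.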
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.