[ 
https://issues.apache.org/jira/browse/METRON-516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15675444#comment-15675444
 ] 

Matt Foley edited comment on METRON-516 at 11/18/16 1:58 AM:
-------------------------------------------------------------

Concrete design:
* add "*batchTimeout" methods and member variables as appropriate wherever 
there are "*batchSize" methods and/or member variables, in the 8 modules cited 
above.  In EnrichmentConfigFunctions, extend SET_BATCH to include timeout.
* default to 0, meaning use default calculation
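
A minimal Java sketch of how a batchTimeout of 0 could resolve to the calculated default, following the rules in the issue description quoted below. The class name, method names, and the 0.5-second delta are assumptions for illustration only. The description does not define delta; 0.5 is used here only because it reproduces the quoted defaults of 14 sec and 7 sec. Treating negative values the same as 0 is also an assumption.

```java
final class BatchTimeoutSketch {
    // Assumed value: the issue says "minus delta" without defining delta.
    static final double ASSUMED_DELTA_SECS = 0.5;

    // divisor is 2 for general topologies and 4 for the Enrichment topology,
    // per the issue description.
    static int defaultTimeoutSecs(int messageTimeoutSecs, int divisor) {
        double raw = (double) messageTimeoutSecs / divisor - ASSUMED_DELTA_SECS;
        return (int) Math.floor(raw);
    }

    // 0 means "use the default calculation". A request larger than the
    // default is ignored, so the default acts as an upper bound.
    static int effectiveTimeoutSecs(int requestedSecs, int messageTimeoutSecs, int divisor) {
        int dflt = defaultTimeoutSecs(messageTimeoutSecs, divisor);
        if (requestedSecs <= 0 || requestedSecs > dflt) {
            return dflt;
        }
        return requestedSecs;
    }
}
```

With "topology.message.timeout.secs" at 30, effectiveTimeoutSecs(0, 30, 2) yields 14 and effectiveTimeoutSecs(0, 30, 4) yields 7; a request of 5 is honored, while a request of 60 falls back to the default.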



was (Author: mattf):
Concrete design:
* add "*batchTimeout" methods and member variables as appropriate wherever 
there are "*batchSize" methods and/or member variables, in the 8 modules cited 
above.
* default to 0, meaning use default calculation


> Add batchTimeout parameters with every use of batchSize parameters, 
> defaulting to 0
> -----------------------------------------------------------------------------------
>
>                 Key: METRON-516
>                 URL: https://issues.apache.org/jira/browse/METRON-516
>             Project: Metron
>          Issue Type: Sub-task
>            Reporter: Matt Foley
>            Assignee: Matt Foley
>
> This is the control mechanism for METRON-322, timeout-based flushing of batch 
> queues.  The behavior of the batchTimeout parameter is as follows:
> * The default value will be 1/2 the value of "topology.message.timeout.secs", 
> minus delta, except in the Enrichment topology where the default value will 
> be 1/4 the value of "topology.message.timeout.secs" (due to the possibility 
> of having an enricher bolt and a threat intel bolt, both with batch queues, 
> daisy-chained one after the other, in the same Enrichment topology).
> ** The default value of "topology.message.timeout.secs" is 30 sec, so the 
> general default batchTimeout will be 14 sec, or 7 sec for Enrichment 
> topologies.
> * Because there is a good default setting, the user does not need to set 
> batchTimeout unless they wish to force more frequent flushing.
> * If the user sets a batchTimeout interval larger than the default based on 
> "topology.message.timeout.secs", it will be ignored.  You can flush more 
> frequently than the default, but not less frequently.  If you want less 
> frequent flushing, you must set a larger "topology.message.timeout.secs" at 
> the Storm level.
> ** Question: should we provide a Metron-level ZK parameter to override the 
> Storm-level "topology.message.timeout.secs"?
> * The value of batchTimeout is not a guaranteed flush interval:
> ** The bolt will attempt to set "topology.tick.tuple.freq.secs" to the 
> smallest interval used by any of that bolt's queues' batchTimeout.  
> ** -The actual interval of Tick Tuple receipt by all bolts within a topology 
> will be the smallest "topology.tick.tuple.freq.secs" interval specified for 
> any bolt in the topology.  (This is a Storm behavior.)- (Tentatively striking 
> as incorrect, but still seeking clarification from Taylor G.)
> ** Each bolt, when it receives a Tick Tuple, will decide whether its queue(s) 
> need flushing, based on whether each queue is actually older than that 
> queue's batchTimeout.
> ** Altogether, this means that each queue will flush at an indeterminate time 
> when its age is between (i) the batchTimeout interval for that queue and 
> (ii) the sum of that batchTimeout interval plus the actual Tick Tuple receipt 
> interval -- unless it flushes first due to batchSize.
> Bolts that use batch queues will be modified to capture the value of 
> "topology.message.timeout.secs" and set the value of 
> "topology.tick.tuple.freq.secs" in their getComponentConfiguration() 
> method.  Because this apparently only happens at topology startup:
> * If the Metron administrator wishes to change 
> "topology.message.timeout.secs" from the default value of 30 sec, she must do 
> it in storm.yaml or in the CLI, and *must not attempt to change it* via 
> custom java code in topology submission.  Attempted changes at topology 
> submission time (except in the CLI) will not be seen in time to correctly set 
> "topology.tick.tuple.freq.secs" in response.  
> * -Authors of custom bolts may change "topology.message.timeout.secs" in the 
> getComponentConfiguration() method at the same time as setting 
> "topology.tick.tuple.freq.secs".- (Need to confirm whether this will really 
> work at the topology level.)
> * "topology.tick.tuple.freq.secs" cannot be changed during runtime; it 
> requires restarting the topology (unless there are Storm GUI actions to 
> change it in runtime, which has not yet been investigated).  As a result, 
> changing batchTimeout during runtime will have this result:  The affected 
> queue will flush at an indeterminate time when its age is between (i) the 
> new batchTimeout interval, and (ii) the sum of the new batchTimeout plus the 
> unchanged Tick Tuple receipt interval -- noting that the unchanged Tick Tuple 
> receipt interval may be much larger than the new batchTimeout, if the revised 
> batchTimeout was made much smaller.
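
The flush behavior described above can be sketched in plain Java, without Storm dependencies. This is an illustrative model under stated assumptions, not the Metron implementation: TimedBatchQueue, onTick, and tickFrequencySecs are hypothetical names, and the clock is passed in explicitly so the timing is easy to follow. The queue flushes on batchSize when an item is added, or on a tick once its oldest item has reached the batchTimeout age.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class TimedBatchQueue<T> {
    private final int batchSize;
    private final long batchTimeoutMs;
    private final List<T> items = new ArrayList<>();
    private long oldestEnqueuedAtMs;

    TimedBatchQueue(int batchSize, int batchTimeoutSecs) {
        this.batchSize = batchSize;
        this.batchTimeoutMs = batchTimeoutSecs * 1000L;
    }

    // Adds an item; returns the flushed batch if batchSize was reached, else empty.
    List<T> add(T item, long nowMs) {
        if (items.isEmpty()) {
            oldestEnqueuedAtMs = nowMs;
        }
        items.add(item);
        if (items.size() >= batchSize) {
            return drain();
        }
        return Collections.emptyList();
    }

    // Called on each tick; flushes only if the queue's age has reached batchTimeout.
    List<T> onTick(long nowMs) {
        if (!items.isEmpty() && nowMs - oldestEnqueuedAtMs >= batchTimeoutMs) {
            return drain();
        }
        return Collections.emptyList();
    }

    private List<T> drain() {
        List<T> batch = new ArrayList<>(items);
        items.clear();
        return batch;
    }

    // Tick frequency a bolt would request: the smallest batchTimeout among its queues.
    static int tickFrequencySecs(int... batchTimeoutsSecs) {
        if (batchTimeoutsSecs.length == 0) {
            throw new IllegalArgumentException("at least one batchTimeout is required");
        }
        int min = batchTimeoutsSecs[0];
        for (int t : batchTimeoutsSecs) {
            min = Math.min(min, t);
        }
        return min;
    }
}
```

For example, with a batchTimeout of 7 sec and ticks arriving every 5 sec, an item added at time 0 is not flushed by the tick at 5 sec but is flushed by the tick at 10 sec, which falls inside the window between batchTimeout and batchTimeout plus the tick interval.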



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)