The "Pluggable MessageChooser" page has been changed by ChrisRiccomini:
https://wiki.apache.org/samza/Pluggable%20MessageChooser?action=diff&rev1=6&rev2=7

  
  In cases where two streams have the same priority, we need to implement a 
strategy (either time-aligned, round robin, or some proxy for time-aligned). I 
haven't come up with an opinion on which strategy we should use yet.
  
+ ==== Batching ====
+ 
+ I mentioned batching at the beginning of this document, but haven't returned 
to it since. I think we can make the DefaultChooser batch simply by giving it 
an affinity for the last SSP it picked in cases where all envelopes have the 
same priority.
+ 
+ For example, if two streams, X and Y, are of the same priority, and each has 
an envelope available, the chooser would execute some tie-breaking logic 
(time-aligned, round robin, etc.) to choose the next envelope. The next time 
choose is called, if streams X and Y both have envelopes again, the chooser 
doesn't execute the same tie-breaking logic; it simply picks the envelope from 
the SSP that it picked last time. It can do this up to the batch size, at which 
point it re-executes the tie-breaking logic and resets its batch counter.
+ 
+ The advantage of batching is that it will lead to smaller replay log messages 
(read X, read Y, read X, read Y vs. read 2 X, read 2 Y).
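
The batching behaviour described above can be sketched roughly as follows. 
This is a minimal, standalone illustration, not Samza's actual DefaultChooser: 
the class name, the use of plain strings for SSPs and envelopes, and the 
round-robin tie-break are all assumptions made for the example, and it assumes 
every stream has the same priority.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a batching chooser; not Samza's real API.
// SSPs and envelopes are plain strings, and all streams share one priority.
class BatchingChooserSketch {
    private final int batchSize;
    // Pending envelopes per SSP, kept in registration order for round robin.
    private final Map<String, Deque<String>> pending = new LinkedHashMap<>();
    private String lastPicked = null; // SSP chosen on the previous call
    private int pickedInBatch = 0;    // envelopes taken in the current batch

    BatchingChooserSketch(int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be at least 1");
        }
        this.batchSize = batchSize;
    }

    // Hands the chooser a new envelope for the given SSP.
    void update(String ssp, String envelope) {
        pending.computeIfAbsent(ssp, k -> new ArrayDeque<>()).addLast(envelope);
    }

    // Returns the next envelope, or null if nothing is available.
    String choose() {
        Deque<String> last = lastPicked == null ? null : pending.get(lastPicked);
        String ssp;
        if (last != null && !last.isEmpty() && pickedInBatch < batchSize) {
            // Affinity: stay on the last SSP until the batch is full.
            ssp = lastPicked;
        } else {
            // Batch is full (or the SSP ran dry): re-run the tie-break
            // and reset the batch counter.
            ssp = breakTie();
            if (ssp == null) {
                return null;
            }
            pickedInBatch = 0;
        }
        lastPicked = ssp;
        pickedInBatch++;
        return pending.get(ssp).pollFirst();
    }

    // Placeholder tie-break (round robin): the first non-empty SSP after
    // the one picked last. A time-aligned strategy could be swapped in here.
    private String breakTie() {
        List<String> ssps = new ArrayList<>(pending.keySet());
        int start = lastPicked == null ? 0 : ssps.indexOf(lastPicked) + 1;
        for (int i = 0; i < ssps.size(); i++) {
            String candidate = ssps.get((start + i) % ssps.size());
            if (!pending.get(candidate).isEmpty()) {
                return candidate;
            }
        }
        return null;
    }
}
```

With a batch size of 2 and three envelopes queued on each of X and Y, this 
sketch picks two from X, two from Y, then the remaining one from each, rather 
than alternating on every call.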
+ 
  === MessageChooser interface ===
  
  I think the !MessageChooser interface is fine as it is. Initially, I wanted 
to add register/start/stop methods to it.
  
  The main motivation for start/stop is that it allows developers to set up a 
client that queries some outside service to make picking decisions. I'm not 
saying that this is advisable, but I know people will try to do it. Without 
stop, there's no way to shut down the client when the service stops. If we 
assume the service never stops, then this isn't a problem, but if there is a 
definite "end" to the processor (i.e. !TaskCoordinator.shutdown), then the 
chooser needs a graceful shutdown.
  
- The motivation for register is that there are situations where you want to 
initialize your data structures (or whatever) on startup before any messages 
are received. Letting the !MessageChooser know which !SystemStreamPartitions 
it's going to be receiving messages from just seems like a good idea..
+ The motivation for register is that there are situations where you want to 
initialize your data structures (or whatever) on startup before any messages 
are received. Letting the !MessageChooser know which !SystemStreamPartitions 
it's going to be receiving messages from just seems like a good idea.
  
  I have since backed off on this idea because I can't come up with a good 
concrete example of why we need these methods, and none of the reference 
implementations we've written so far would need them.
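
For reference, the register/start/stop lifecycle discussed above might look 
roughly like the sketch below. The interface and class names, the method 
signatures, and the string-based SSP identifier are assumptions for 
illustration only; the boolean flag stands in for a real client connection to 
an outside service.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical lifecycle hooks for a chooser; names and signatures are
// assumptions for this sketch, not a confirmed Samza interface.
interface LifecycleChooserSketch {
    // Called once per SSP before any messages arrive.
    void register(String systemStreamPartition);

    // Called after all SSPs are registered; open clients here.
    void start();

    // Called on shutdown; release clients here.
    void stop();
}

// A chooser that would consult an outside service. The boolean flag
// stands in for an actual client connection.
class ExternalServiceChooserSketch implements LifecycleChooserSketch {
    final List<String> registered = new ArrayList<>();
    boolean clientOpen = false;

    @Override
    public void register(String systemStreamPartition) {
        if (clientOpen) {
            throw new IllegalStateException("register must be called before start");
        }
        // Initialize per-SSP data structures up front.
        registered.add(systemStreamPartition);
    }

    @Override
    public void start() {
        clientOpen = true;
    }

    @Override
    public void stop() {
        // Without this hook the client could never be shut down gracefully.
        clientOpen = false;
    }
}
```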
  
