On 11/Jul/11 08:29, Hal Murray wrote:
>> I've been wondering about such a period.  If we use the record's
>> Time-To-Live (TTL) we can specify stateless reporting like so:
>
> Piggybacking on the TTL field seems like a bad idea.  A big system
> might be loafing at X reports per second while the same load could
> kill a small system or saturate a smaller link.  So you have to
> distribute something like a scale factor.  I'm assuming that would
> be done over DNS.  At that point you might as well distribute the
> real data.

I assume you mean "/receiving/ X reports per second".  I reckon X is
proportional to the size of the domain anyway: a large domain sends
much mail, part of which may cause authentication failures, and it is
more heavily phished than a small domain.

The current ri parameter provides for setting a /linear/ scale factor,
whereby a domain can say it is only interested in getting, say, 25% of
failure reports.  Such linear behavior can be achieved in a stateless
manner by drawing a random R in [0, 1] and sending only if R <= 0.25.
Linear and exponential cutoffs don't have to be mutually exclusive.
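For concreteness, the stateless linear cutoff fits in a couple of
lines of Python (the function name is made up for illustration):

    import random

    def should_report(scale):
        # Stateless linear cutoff: each reporter independently draws
        # R in [0, 1] and sends only if R <= scale, so a domain that
        # publishes scale = 0.25 receives roughly 25% of all failure
        # reports without anyone keeping state.
        return random.random() <= scale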
> --------
>
>> On diagnosing a failure, the agent generates a random number R in
>> the interval [0, 1] (or sets R=0.5).  It then computes a value P
>> [...]  If P >= R, then the agent generates and sends the report,
>> otherwise does nothing.
>
>> P may be computed so as to be near to 1 for newly retrieved
>> records and then decreasing more or less rapidly, according to
>> the value of ri.
>
> What are the goals of this section?

Since the reporter does not know which failures might be important for
a domain, varying the criteria by which reports are discarded may
improve the chances of reporting something useful.

> I assume the main idea is to avoid overloading (DoS) the receiving
> system.  There are two parts to that.  How many reports are coming
> from each system, and how many systems are contributing to the
> overall load.

Yes.  Each contributing system looks up the DNS record when a message
arrives.  Larger systems will use the cached copy of that record
repeatedly until it expires, since they receive a nearly continuous
stream of messages; a small system may use it just once.  A sharp
Gaussian probability centered on the TTL can thus increase the
relative visibility of the latter.
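As a sketch of one possible shape for P (the width and the exact curve
are guesses of mine, not anything the draft pins down), read "centered
on the TTL" as: P peaks when the cached record's remaining lifetime
still equals the full TTL, i.e. right after retrieval, and falls off
sharply as the copy ages:

    import math
    import random
    import time

    def report_probability(fetched_at, ttl, sigma=None):
        if sigma is None:
            sigma = 0.2 * ttl  # "sharp": narrow relative to the TTL
        # Remaining lifetime of the cached record, in seconds.
        remaining = ttl - (time.time() - fetched_at)
        # Gaussian centered on the full TTL: P is near 1 for a newly
        # retrieved record and drops quickly as the cached copy ages,
        # so a small system that fetches the record afresh for almost
        # every message stays visible, while a busy system reusing an
        # old cached copy is throttled.
        return math.exp(-0.5 * ((remaining - ttl) / sigma) ** 2)

    def should_report(fetched_at, ttl):
        # As in the quoted text: draw R in [0, 1] and send the report
        # only if P >= R.
        return report_probability(fetched_at, ttl) >= random.random()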
> I like the idea of an exponential backoff.  What are the appropriate
> parameters?  What data would the sending system need in order to do
> the right thing?

We should do some simulations to ascertain that.

> Should this type of reporting be moved to a separate socket or
> separate IP address?  (so a TCP level reject/timeout can be used to
> trigger the backoff)

I don't think so; that would be a rather blind trigger.

> ---------
>
> Would it help to batch the data (at the report stage)?  If you are
> the receiving system, what fraction of your CPU/whatever resources
> are spent processing the connection vs processing the data for a
> "report" transmitted over that connection?  If I have 100 reports
> per hour, would you like to get them batched in one message rather
> than 100 separate messages?

Yes, I certainly would.  Indeed, this is what is currently being
specified.  However, exactly one message has to be attached to the
report.  For the other failures the reporter can only supply a count,
implying they are "similar" in some sense.  Does that mean they all
had the same Auth-Failure type?  The same local-part?  Did each
failure occur today?  How are multiple recipients counted?  I'd never
know.

Computing a probability, however complicated it may seem, can be done
in a few lines of code and yields coherent results across most modern
systems.  Batching requires more implementation effort and more CPU
cycles for the reporter, as it implies maintaining tables indexed on
domain names.  I think that mandating such behavior would exclude more
contributing sites than the stateless approach would.
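To illustrate the kind of state batching implies, here is a rough
sketch of a per-domain table (all names are hypothetical, and the
flush policy is just one possible choice):

    import time
    from collections import defaultdict

    class BatchTable:
        """Per-domain batches: one attached sample message plus a
        count of 'similar' failures, flushed after a fixed interval."""

        def __init__(self, flush_interval=3600):
            self.flush_interval = flush_interval
            self.batches = defaultdict(
                lambda: {"sample": None, "count": 0, "first_seen": None})

        def add_failure(self, domain, message):
            batch = self.batches[domain]
            if batch["sample"] is None:
                # Only one message can be attached to the report.
                batch["sample"] = message
                batch["first_seen"] = time.time()
            # Every other failure is reduced to a bare count.
            batch["count"] += 1

        def flush_due(self):
            # Yield (domain, sample, count) for batches old enough to
            # send; the caller formats and mails the aggregated report.
            now = time.time()
            for domain, batch in list(self.batches.items()):
                if now - batch["first_seen"] >= self.flush_interval:
                    yield domain, batch["sample"], batch["count"]
                    del self.batches[domain]

Even this toy version has to keep the table in memory and decide what
"similar" means, which is the implementation effort I was referring to.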