Elizabeth,

An excellent point: counting the messages not reported would be both easier
and more valuable than counting the records not reported. So I ask, should
this information be provided? If so, should it appear at the top of the
report alongside begin/end, or at the end of the report in a new section
called something like "summary"?

Armed with such information, one could then decide whether to process the
report, ignore it, or interpolate it.
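
To make that decision concrete, here is a rough sketch in Python. It assumes
the report carried two hypothetical counts, messages_reported and
messages_omitted; no such fields exist in aggregate reports today. The two
thresholds are arbitrary placeholders, not recommendations.

```python
def triage_report(messages_reported, messages_omitted,
                  ignore_above=0.10, interpolate_above=0.001):
    """Return (action, omitted_fraction) for one aggregate report.

    messages_reported and messages_omitted are hypothetical counts that a
    report generator would have to supply. The thresholds are assumptions
    chosen only for illustration.
    """
    total = messages_reported + messages_omitted
    if total == 0:
        # Nothing was seen at all, so nothing can be missing.
        return "process", 0.0

    omitted_fraction = messages_omitted / total
    if omitted_fraction > ignore_above:
        # Too much mail is unaccounted for to trust the report.
        return "ignore", omitted_fraction
    if omitted_fraction > interpolate_above:
        # Usable, but the counts should be adjusted for the missing mail.
        return "interpolate", omitted_fraction
    # The omission is too small to matter.
    return "process", omitted_fraction


if __name__ == "__main__":
    for omitted in (20, 100_000, 405_671):
        action, fraction = triage_report(1_000_000, omitted)
        print(f"{omitted:>8} omitted: {action} ({fraction:.4%} of mail missing)")
```

How to interpolate is left open on purpose: if the omitted mail is mostly
botnet traffic, simply scaling the reported counts up would overstate the
valid mail.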

Best Regards,
--Bryan Costales

  >------------
  > Quoting Elizabeth Zwicky <[email protected]>
  > Subject: Re: [dmarc-discuss] Aggregate Report Missing Data
  >------------
  > 
  > Note that when you're talking about aggregate reports, records are per 
  >     sending ip + disposition + reason + authentication results
  > 
  > (Theoretically, recipient domain also enters into the equation, but at the
  > moment nobody generates a report in which it varies.)
  > 
  > That could be a number of records per sending IP, but in practice it isn't;
  > most sending IPs get a single record. So 1,000,000 records will usually
  > be at least 500,000 sending IPs. Even big senders with lots of mail going
  > through forwarders and mailing lists come nowhere near that in valid mail.
  > The result is that truncated reports consist overwhelmingly of abusive
  > mail, and in particular of abusive mail sent from botnets.
  > 
  > Because these abusive IPs send few messages per IP, and most valid IPs
  > send large numbers of messages per IP, you can easily truncate half or more
  > of the records and still end up with a report that represents the majority
  > of the actual mail. So in order to determine statistical significance, you
  > probably don't want to know how many report records are omitted, but how
  > many pieces of mail are omitted.
  > 
  > There are two reasons that reports are truncated. First, there's the effort
  > to keep the reports to a size that people can effectively receive and
  > process. That's easily achieved by truncating reports after generation.
  > Second, report generators need to keep the resource consumption of the
  > reports to a manageable level. That generally requires just ignoring
  > things at some point. Therefore, as a report generator, I'm reluctant to
  > volunteer to count up anything about the data I'm not reporting on.
  > However, if I were going to count something, it would be messages, not
  > report lines -- messages are both more useful for calculating significance
  > and cheaper to count.
  > 
  >     Elizabeth Zwicky 
  > 
  > On Aug 18, 2012, at 6:50 AM, <[email protected]> wrote:
  > 
  > > It is possible for some sites to choose an arbitrary Aggregate Report
  > > size and to truncate reports at that size. Google currently truncates at
  > > 1,000,000 records. The problem is that without knowing how many records
  > > are missing, we do not know if we can trust the sent data. For example,
  > > if 1,000,000 records are reported for example.com, and 20 were omitted,
  > > that is not statistically significant enough to worry about, but if
  > > 100,000 were omitted the actually reported data may be misleading and
  > > should probably not be used.
  > > I suggest that the Aggregate Reports should contain an indication
  > > of the data included and omitted. Perhaps:
  > > 
  > >   <records_available>1405671</records_available>
  > >   <records_reported>1000000</records_reported>
  > > 
  > > Best Regards,
  > > --Bryan Costales
  > > _______________________________________________
  > > dmarc-discuss mailing list
  > > [email protected]
  > > http://www.dmarc.org/mailman/listinfo/dmarc-discuss
  > > 
  > > NOTE: Participating in this list means you agree to the DMARC Note Well
  > > terms (http://www.dmarc.org/note_well.html)
  > 
  > 

