My apologies for the late response; I was away for the Thanksgiving break
(happy holidays!).

We tried hard not to cause any potential GDPR issues and not to fall into
the human subject research category, as Tobias and Laura already explained
(thanks!). For example, we intentionally did not use Google Surveys because
we were not sure whether Google Forms is GDPR compliant, whereas
SurveyMonkey has a Data Processing Agreement
(https://www.surveymonkey.com/curiosity/surveymonkey-committed-to-gdpr-compliance/).
We also revised the questionnaire with our IRB so that it does not fall into
the human subject research category and does not collect any private
information. But again, we (and the IRB) are not perfect. That is why we
made all questions optional (after the first page).

> The proper limit of DNS queries for SPF lookups is ten, per RFC 7208, and
> should be baked into any code library used by an operator that is doing SPF
> validation on inbound mail, so I don't see them facing challenges here.
> On the other hand, staying under the limit of ten for domains publishing
> SPF records can be quite a challenge for complex organizations using
> multiple services to send their email, and while there are known ways to
> skin that cat, there isn't a universally adopted method for doing so.

Thanks for pointing that out; it is why we see many domains whose SPF
records contain many referrals, causing many DNS lookups. Our focus is more
on the SPF validation side: we have been testing many open-source milters,
plugins, libraries, and so on, and most of them enforce a limit of 10 or 20.
But because SMTP servers often install multiple plugins, libraries, or spam
filters, we find that the actual number of lookups can be amplified. This
may cause some security concerns, which we are investigating now; once our
analyses are complete, we will share the results here.
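
As a rough illustration of how lookups add up on the validation side, the
sketch below counts the terms that RFC 7208 (section 4.6.4) charges against
the limit of ten: the "include", "a", "mx", "ptr", and "exists" mechanisms
and the "redirect" modifier. It is not taken from any of the milters or
libraries mentioned above. The RECORDS table, the domain names, and the
helpers count_lookups and worst_case_total are all hypothetical, and the
table stands in for real DNS TXT queries.

```python
# Illustrative sketch only; not the code of any real SPF library.
# RECORDS is a hypothetical stand-in for DNS TXT lookups.

LOOKUP_LIMIT = 10  # limit on DNS-querying terms from RFC 7208, section 4.6.4
LOOKUP_MECHANISMS = {"include", "a", "mx", "ptr", "exists"}

RECORDS = {
    "example.org": "v=spf1 include:_spf.esp.example include:crm.example mx a -all",
    "_spf.esp.example": "v=spf1 ip4:192.0.2.0/24 include:_spf2.esp.example ~all",
    "_spf2.esp.example": "v=spf1 ip4:198.51.100.0/24 ~all",
    "crm.example": "v=spf1 exists:%{i}.crm.example redirect=_spf.crm.example",
    "_spf.crm.example": "v=spf1 ip6:2001:db8::/32 -all",
}


class TooManyLookups(Exception):
    """Raised when the number of lookup-causing terms exceeds the limit."""


def count_lookups(domain, records, limit=LOOKUP_LIMIT):
    """Count lookup-causing terms reachable from `domain`.

    This walks every term (a worst case); a real evaluator stops at the
    first mechanism that matches the sender.
    """
    total = 0

    def visit(name):
        nonlocal total
        # Skip the leading "v=spf1" version tag.
        for term in records.get(name, "").split()[1:]:
            term = term.lstrip("+-~?")  # drop the qualifier, if any
            if term.lower().startswith("redirect="):
                kind, target = "redirect", term.split("=", 1)[1]
            else:
                kind, _, target = term.partition(":")
                kind = kind.split("/", 1)[0].lower()  # "a/24" -> "a"
                if kind not in LOOKUP_MECHANISMS:
                    continue  # ip4, ip6, all: no DNS query
            total += 1
            if total > limit:
                raise TooManyLookups(f"more than {limit} lookups for {domain}")
            if kind in ("include", "redirect"):
                visit(target)  # referred records count against the same budget

    visit(domain)
    return total


def worst_case_total(limits):
    """Assumption: each installed filter validates SPF independently,
    so in the worst case their individual limits simply add up."""
    return sum(limits)


print(count_lookups("example.org", RECORDS))  # 7
print(worst_case_total([10, 20, 10]))         # 40
```

Under the stated assumption that three filters with limits of 10, 20, and 10
each evaluate the same message independently, the worst case is 40
lookup-causing terms for one message, although resolver caching may reduce
the number of queries that actually leave the host.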

> Is the survey investigating problems faced by operators doing SPF
> validation or operators publishing SPF records or both?

The survey itself is more focused on the operators doing SPF validation.
But I think it would have been better if it also covered the other
perspective as you said :-)

> While we wait for Tijay's reply, let me anticipate that he works on a
> "DNSSECFixer" project, which leverages machine learning techniques to
> automatically fix incorrectly configured and insecure domains.[*]
> As one of the few who took part in the survey, my guess is that it aims at a
> bird's eye view of email operators' involvement in the configuration of
> security features available in SMTP servers.  Correct?

Yes. As Tobias explained, we can observe certain phenomena in the data
(e.g., some mail servers look up SPF records more than 100 times), but we
don't know *why* they happen. We have interviewed around 5 to 10 mail
operators individually, which was great but hard to scale, so we thought
that a survey would be useful. So far, I have benefited a lot from the
mailop and dnssec communities this way: last year our team got a lot of
help from the mailop community in understanding why deploying and managing
DANE is sometimes hard
(https://list.mailop.org/private/mailop/2021-June/019414.html; we shared
our findings after the survey at
https://list.mailop.org/private/mailop/2021-November/020617.html), as well
as help from DNS registrars in understanding the challenges of deploying
DNSSEC.

Thanks again for your help.

Taejoong "Tijay" Chung, Assistant Professor
Virginia Tech  |  Computer Science
Knowledge Works II, RM 2228
2202 Kraft Drive, Blacksburg, VA 24060
(540) 231-0667| ti...@vt.edu




On Wed, Nov 23, 2022 at 7:54 AM Tobias Fiebig <tob...@fiebig.nl> wrote:

> Heho,
> (Excuse the footnotes; But there is a lot of tangential stuff worth
> mentioning, but not necessarily core to the thread.)
>
> Let me weigh in here and provide some context as Tijay listed me as a
> collaborator, and he seems to be a bit delayed in replies. Tijay is a
> Professor at VT, and works on a variety of topics at the intersection of
> security and measurement. This means that he usually supervises students in
> pursuit of their PhD*.
>
> One of the areas a PhD student currently works on is SPF. This started
> with them prodding around and finding that a) The _defined_ limits for SPF
> lookups seem to be impractical (using Bulk ESPs, Salesforce here, marketing
> there,... lots of includes), and b) Some SPF implementations are really not
> good at dealing with _very_ deep SPF trees.
>
> This is, for better or worse, how ideas for research papers usually come
> together in this specific field of research**.
>
> So, what is this research about now: The group found that odd behavior,
> and what you usually do first then is run some measurements. That didn't go
> too well, and shut off a couple of mailservers***, and they are currently
> working really hard at getting that measurement setup robust and in line; I
> am helping there a bit, as we built a similar setup for our mail-delivery
> scans (even though that does no potentially intrusive things, which makes
> stuff a lot easier in terms of consent etc.).
>
> Anyway, apart from that, to get a scientific**** exploration of the topic,
> the question also is _why_ did this problem occur in the first place.
> Hence, there is now a survey which asks people (broadly) a) how they are
> implementing SPF/DMARC/TLS-RPT in their setup (software, limits), b)
> demographics about their setup [number of users, domains etc.], and whether
> they are aware of the (TBH, obviously outdated) limits from the standards.
> The survey does not contain mandatory fields, and tells people what to
> expect and that they can quit at any time, and can of course also have
> their incomplete replies (or complete replies if they change their mind
> later on!) deleted.***** I think Laura also explained _very_ well why this
> is not 'human subject research'; I still see the issue there in terms of
> the IRB terminology (which is often even prescribed in that unhandy way;
> And IRBs also usually have 'limited experience and knowledge' when it comes
> to 'Internet things', especially network measurements; But for that, see
> **.)
>
> For the survey and measurements combined, the idea is to then have
> empirical data on how things are implemented (network measurements, even
> though there is ongoing debate on how usable those earlier measurements are
> in a scientific sense), _why_ they are implemented that way (technical
> level; part about tools and implementations in the survey), and _if_ and
> _how_ best practices/RFCs should be amended to capture the actual needs of
> the email ecosystem (second part of the survey). That together then can
> form a publication on the observed challenges in SPF/DMARC reporting.
>
> As such, the survey is for everyone involved in Email; There is no harm to
> [the research], if you are not sure if the survey is for you and you agree
> to participate, go through the survey and see if you'd be able to provide
> meaningful input on the questions.
>
> If there are further questions, or you want to further discuss individual
> points, please let me know.
>
> With best regards,
> Tobias
>
> * This is also the reason why I'd argue that it might be somewhat good if
> the communication style with researchers tries to keep in mind that, in the
> end, there is a human on the other side. People tend to make rather strong
> and frustratedly sounding statements from time to time, but these might be
> ultimately read by a PhD student, i.e., a very junior person. A bit of
> kindness can go a long way in communication, and for me personally is also
> an important point of being an engineer.
>
> ** We can have an awful lot of discussions about this, and there is a lot
> going on; Besides the obvious 'is it good or not' and 'is this really
> science?', we essentially deal with 'science' with all its incentives
> (publish or perish); This means trying to do research on Internet protocols
> that have been around for longer than the combined age of the PI and PhD in
> many cases; PhDs fall out of their masters before getting into such a
> topic, and then essentially have a year to know all the ins and outs of a
> protocol that grew over the past 40 years... because after the first year
> they _should_ basically have their first paper under submission. Certainly
> people that haven't run their own mailserver/authNSes for more than a
> decade. Yes, recipe for disaster.
>
> Currently, I am actually looking into building an 'AS for Measurement
> Research', especially 'higher level protocols' (mail, DNS,... ) that is
> open to researchers; That should a) have its own ASN, /47, /23, and
> delegated zones, so people can _really_ easily block/filter it, b) Maintain
> a global opt-out list across experiments, c) Have an 'Internet measurement
> ethics board' that vets experiments before they are run, so that they are
> checked for possible harm and correctness of implementation, and d) Have
> somewhat experienced people run 24/7 abuse, so _when_ things go wrong,
> stuff is mitigated quickly. Hence, it would make measurement more
> accessible (not all [smaller] universities have the infrastructure in place
> to do proper active measurements), while also having mechanics in place to
> make sure 'oopsies' due to inexperience (which, tbh, quite commonly come out
> of the science bubble [I am also part of]) do not have as much impact. But
> I am currently stuck at 'getting a /23', which is surprisingly difficult
> without $30k to blow... so if one of you has some spare v4, I wouldn't say
> no. ;-)
>
> *** There were a lot of unkind words around that here; Then again, it is
> probably the same voices that are also quick to proclaim that it is your
> own fault if your system breaks due to your inexperience (RTFM etc.)... and
> well, there is a lot of noise on the net, and my own stuff was hit by
> something similar recently [0]; So... bigger discussion again, I'd argue.
>
> **** Again, what is 'the science of security' may very well be the most
> debated topic in that scientific community. Happy to discuss that if we
> cross paths at RIPE/NANOG/IETF or sth. somewhen.
>
> ***** I mean, how else should you take consent? You can decide at every
> step of the process. If the questions were presented along with the consent
> request, they would be presented without collecting consent to present the
> questions, which kind of defeats the point of consent in the first place.
>
> [0]
> https://doing-stupid-things.as59645.net/mail/opensmtpd/mysql/2022/08/30/receiving-an-email.html
>
>
> -----Original Message-----
> From: mailop <mailop-boun...@mailop.org> On Behalf Of Alessandro Vesely
> via mailop
> Sent: Wednesday, 23 November 2022 12:17
> To: mailop@mailop.org
> Cc: ti...@vt.edu
> Subject: Re: [mailop] SPF (and other email security protocols) Survey
>
> On Tue 22/Nov/2022 16:41:44 +0100 Todd Herr via mailop wrote:
> > On Mon, Nov 21, 2022 at 2:00 PM Taejoong (tijay) Chung via mailop wrote:
> >>
> >> The Sender Policy Framework (SPF) is an easy way to check whether the
> >> sender is authorized to send emails – however, it may cause some
> >> security holes if it causes too many DNS lookups.
> >>
> >> Together with researchers from Virginia Tech and Max-Planck-Institut
> >> für Informatik, we would like to understand which challenges
> >> operators face when configuring the proper limit of DNS queries for
> >> SPF lookups and when deploying other email security protocols.
> >
> > I'm not quite sure I understand the premise behind the survey, and
> > since I don't manage email for any domain, I can't realistically take
> > part in the survey to learn the premise, so I'll try here.
> >
> > The proper limit of DNS queries for SPF lookups is ten, per RFC 7208,
> > and *should* be baked into any code library used by an operator that is
> > doing SPF validation on inbound mail, so I don't see them facing
> > challenges here.
>
>
> On my MTA the (default) limit is 20.  That looks consistent with Postel's
> principle.
>
>
> > On the other hand, staying under the limit of ten for domains publishing
> > SPF records can be quite a challenge for complex organizations using
> > multiple services to send their email, and while there are known ways to
> > skin that cat, there isn't a universally adopted method for doing so.
> >
> > Is the survey investigating problems faced by operators doing SPF
> > validation or operators publishing SPF records or both?
>
>
> While we wait for Tijay's reply, let me anticipate that he works on a
> "DNSSECFixer" project, which leverages machine learning techniques to
> automatically fix incorrectly configured and insecure domains.[*]
>
> As one of the few who took part in the survey, my guess is that it aims at a
> bird's eye view of email operators' involvement in the configuration of
> security features available in SMTP servers.  Correct?
>
> As the survey asks for the domain name where such features are configured, I
> understand that that might sound as intelligence gathering for a targeted
> attack.  However, I don't reckon the survey collects any sensitive data.
>
>
> Best
> Ale
> --
>
> [*] https://cs.vt.edu/research/research-areas/security.html
>
>
>
>
>
>
> _______________________________________________
> mailop mailing list
> mailop@mailop.org
> https://list.mailop.org/listinfo/mailop
>
>