On Thu 19/Aug/2021 20:36:08 +0200 Alessandro Vesely wrote:
Now it's about dinner time and I'm not willing to program a binary distribution for a decently high number of trials.


Today I took the time to compute it. Consider the values of k that are acceptable up to a rounding error. That is, for pct=20 and n = 10 we have k = 1, 2. Both values of k approximate the expected percentage, so we consider their sum as the probability that the result of applying the specified algorithm is acceptable. As n grows, so grow the acceptable k's. The results are as follows:

     n | low k | high k | probability that the result of applying
       |       |        | the specified algorithm is acceptable
-------+-------+--------+-------------------------------------
    10 |     1 |      2 | 0.570425
   100 |    15 |     24 | 0.788203
  1000 |   150 |    249 | 0.999913
 10000 |  1500 |   2499 | 1.000000

That proves that as the number of failed DMARC checks raises, the specified algorithm tends to deliver the exact result.

I attach a program to compute those figures in case anyone likes to review its correctness.


Best
Ale
--


// compute the sum of probabilities that get rounded to pct=20 for higher values of n.
// For example, for n = 100, and k = 15, 16, ..., 24, k/n ~ 0.2 when rounded to 1st digit.
// For n = 1000, consider k = 150, 151, ..., 249, as k/n ~ 0.2 when rounded to two digits.

#include <stdio.h>
#include <stdlib.h>
#include <math.h>


static inline double lbinomial(int n, int k)
{
    // compute log(C(n,k)) using gamma function
	// after C(n,k) = exp(lgamma(n+1) - lgamma(k+1) -lgamma(n-k+1))
	// see https://en.wikipedia.org/wiki/Binomial_coefficient#In_programming_languages
	return lgamma(n+1) - lgamma(k+1) -lgamma(n-k+1);
}

int main(int argc, char *argv[])
{
	int n;
	if (argc <= 1 || (n = atoi(argv[1])) <= 1)
		n = 100;

	double nn = (double)n;
	double p = 0.2, q = 1.0 - p;
	double lp = log(p), lq = log(q);
	double err = 0.5; // rounding error
	double low_k = floor(p * nn - err * nn / 10.0);
	double high_k = p * nn + err * nn / 10.0;

	double sum = 0.0;
	int k;
	for (k = low_k; k < high_k; ++k)
	{
		// add C(n,k)*p^k*(1-p)^(n-k)
		double l = lbinomial(n, k) + k*lp + (n-k)*lq;
		sum += exp(l);
	}

	printf("sum from %f to %d = %f\n", low_k, k - 1, sum);

	return 0;
}
_______________________________________________
dmarc mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/dmarc

Reply via email to