[
https://issues.apache.org/jira/browse/NUMBERS-29?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17626651#comment-17626651
]
Alex Herbert commented on NUMBERS-29:
-------------------------------------
I have spent some time looking at stirlingS2. I fixed one bug for S(n, k) when
k==2 and s>64. Fix and test pushed to master.
The main implementation to compute the S2 number has no test coverage of a
successful computation (i.e. all test input is expected to throw an exception).
The final part of that computation divides the sum by k!. This will be limited
to maximum k of 20. However there are some cases that should compute a result:
{noformat}
S(26, 3) = 423610750290L
S(26, 4) = 187226356946265L
S(26, 23) = 4126200
S(26, 24) = 47450{noformat}
These currently raise an overflow as the implementation sums terms of
alternating magnitude and checks for a negative sum at each step. It should
only check for a negative sum when an addition to the sum has been made.
I locally fixed the implementation as:
{code:java}
// definition formula: note that this may trigger some overflow
// S(n, k) = [ sum_{i=0}^k (-1)^i binom(k, i) (k - i)^n ] / k!
long sum = 0;
long sign = ((k & 0x1) == 0) ? 1 : -1;
// Compute terms in ascending magnitude: j == k - i
// Check the sum for overflow only after additions.
for (int j = 1; j <= k; ++j) {
sign = -sign;
// FIXED here to check sign on additions only
sum += sign * Math.multiplyExact(BinomialCoefficient.value(k, j),
ArithmeticUtils.pow((long) j, n));
if (sign == 1 && sum < 0) {
// there was an overflow somewhere
throw new
MathArithmeticException(LocalizedFormats.ARGUMENT_OUTSIDE_DOMAIN,
n, 0, stirlingS2.length - 1);
}
}
return sum / Factorial.value(k); {code}
I did not push this fix to math git master as the exceptions are sometimes a
java.lang.ArithmeticException and sometimes MathArithmeticException (does not
extend ArithmeticException). These should be consolidated to all be the same
type.
Note also that this computation divides by k!. So will throw an exception for
any k > 20. This could be checked before doing the loop. It is also possible to
compute the largest used binomial coefficient first. If this is OK then the
rest can be computed by downward recursion: binom(n , k-1) = binom(n, k) * k /
(n-k+1). Thus only one call to binomial coefficient is required. These can be
computed and stored in an array to exploit reuse via symmetry: binom(n, k) =
binom(n, n-k).
This method is limited to k<=20 due to the use of the factorial.
However there is an alternative implementation using forward recursion:
{code:java}
// S(n, k) = k * S(n - 1, k) + S(n - 1, k - 1)
return Math.addExact(
Math.multiplyExact(k, stirlingS2(n - 1, k)),
stirlingS2(n - 1, k - 1)
); {code}
This is not limited by k!.
A quick test using the following loop shows that when n is large then the
maximum supported value for k is close to n.
{code:java}
@Test
void test() {
int count = 0;
for (int k = 2;; k++) {
for (int n = k + 3;; n++) {
try {
CombinatoricsUtils.stirlingS2(n, k);
count++;
} catch (Exception ex) {
if (n == k + 3) {
// limit
return;
}
System.out.printf("S(%d, %d) %d%n", n - 1, k, count);
break;
}
}
}
} {code}
Output:
{noformat}
S(64, 2) 60
S(41, 3) 96
S(33, 4) 123
S(30, 5) 146
S(28, 6) 166
S(26, 7) 183
...
S(183, 179) 940
S(184, 180) 942
S(185, 181) 944
S(186, 182) 946
S(187, 183) 948
S(188, 184) 950
S(189, 185) 952{noformat}
When n >= 189, (n-k) <= 4. So it is possible to create some checks based on
input n and k to quickly error.
This method uses recursive method calls and is limited for very large n by a
stack overflow. I did not find this to occur in testing for k=n-3, but it did
for k=n-2. Each stirlingS2() will use 2 method calls and thus the method
recursion grows non-linearly as n increases. The recursion is terminated when
k=2 or k=n-1 and so the stack depth is limited from squared growth. At large n,
k is limited (by the value of the final result) to be close to n so the
computation recursion using S(n-1, k) will at most recurse a few method calls.
The main recursion will be in the S(n-1, k-1) call which could recurse
thousands of calls.
Note that the other implementation cannot compute these values at all.
A check with mathematica shows that large n can be computed and the cut-off is
below n=2800, k=n-3, e.g.
{noformat}
StirlingS2[2000, 1997] = 1326015650507998500
StirlingS2[2500, 2497] = 5063921680676560625
StirlingS2[2750, 2747] = 8974638660543443250
// Too large
StirlingS2[2800, 2797] = 10000001047387677900
StirlingS2[3000, 2997] = 15131891757955497750
{noformat}
Computation using the method recursion implementation for S(2750, 2747) does
compute the correct result. The S(2800, 2797) raises arithmetic exception. So
at the limit on my test machine a stack overflow does not occur for k=n-3.
For k=n-2:
{code:java}
// OK
CombinatoricsUtils.stirlingS2(4000, 3998) = 31962680665000
// Overflow
CombinatoricsUtils.stirlingS2(5000, 4998) = 78052105206250 ~ 0.001% of
2^63{code}
However this simple case of k=n-2 is possible to compute without recursive
calls as the recursion reduces to a summation of the binomial coefficient. Here
is the method without overflow checks. It computes S(5000, 4998) correctly:
{code:java}
if (k == n - 2) {
// Avoid method call recursion
long sum = 0;
for (int i = k; i > 0; i--) {
sum += i * BinomialCoefficient.value(i + 1, 2);
}
return sum;
} {code}
h2. Summary
A port of this method is possible. It can be updated to: fix the bug in the
implementation; consolidate on the raised exceptions; use a recursive algorithm
for k=n-3 and a loop for k=n-2 when n is large.
If the recursive method is used for the main computation then a warning will
have to be added to the javadoc that a stack overflow error may occur for large
n when k == n - 3.
I suggest adding to the combinatorics package as e.g.
{code:java}
long StirlingS2.value(int n, int k){code}
> Move combinatorics utilities from "Commons Math"
> ------------------------------------------------
>
> Key: NUMBERS-29
> URL: https://issues.apache.org/jira/browse/NUMBERS-29
> Project: Commons Numbers
> Issue Type: Task
> Reporter: Gilles Sadowski
> Priority: Minor
> Labels: module, move
> Fix For: 1.1
>
>
> Create a new {{commons-numbers-combinatorics}} module to contain the code in
> classes {{CombinatoricsUtils}} and {{Combinations}} (located in package
> {{o.a.c.math4.util}}).
--
This message was sent by Atlassian Jira
(v8.20.10#820010)