Hi Gilles, thanks for your response. I've run a small JMH benchmark of both old (direct summation) and new (default 16KB tabulation) approaches, and here are the results:
Benchmark Mode Samples Score Error Units o.a.c.m.s.FactorialLogBench.newFactorialLog100 thrpt 20 75066.697 ± 7122.980 ops/ms o.a.c.m.s.FactorialLogBench.newFactorialLog1000 thrpt 20 76905.243 ± 5965.782 ops/ms o.a.c.m.s.FactorialLogBench.newFactorialLog10K thrpt 20 5770.486 ± 285.069 ops/ms o.a.c.m.s.FactorialLogBench.newFactorialLog100K thrpt 20 6156.102 ± 278.842 ops/ms o.a.c.m.s.FactorialLogBench.oldFactorialLog100 thrpt 20 585.576 ± 26.802 ops/ms o.a.c.m.s.FactorialLogBench.oldFactorialLog1000 thrpt 20 29.883 ± 1.663 ops/ms o.a.c.m.s.FactorialLogBench.oldFactorialLog10K thrpt 20 2.744 ± 0.095 ops/ms o.a.c.m.s.FactorialLogBench.oldFactorialLog100K thrpt 20 0.264 ± 0.007 ops/ms Here *100 calculates the factorial of a random number in range [0..99], *1000 -- of a number in range [900..999], *10K -- of a number in range [9900..9999], and *100K -- of a number in range [99900..99999]. The default tabulated version covers the first two cases with a direct lookup, the third one with a use-the-closest-stored-value approach (about two logs, two multiplications and two additions on average), and the fourth one delegates to logGamma. As you can see, the tabulated (new) approach is about 130x faster on small numbers and the difference grows drastically from there on, up to about 23000x on fairly big numbers. Another interesting result is that logGamma doesn't seem to be slower than the assisted lookup, which possibly indicates that we can drop the assisted lookup altogether, reducing the memory footprint. Sincerely, Aleksei Dievskii. On Thu, Nov 19, 2015 at 11:18 AM, Gilles <gil...@harfang.homelinux.org> wrote: > Hello. > > On Thu, 19 Nov 2015 10:34:46 +0100, Alexey Dievsky wrote: > >> Hi, >> >> I recently submitted an issue to JIRA ( >> https://issues.apache.org/jira/browse/MATH-1293) and received an advice >> to >> discuss it here. >> >> In short, the current implementation of CombinatoricsUtils#factorialLog >> employs direct summation and thus has linear complexity (O(n) additions >> and >> O(n) logs). I suggest replacing it with a tabulated lookup. This solution >> will need to allocate some additional memory, but the speed-up for larger >> n >> would be tremendous. For the values that are too large to be looked up, >> equivalent Gamma#logGamma is still much faster than the existing >> algorithm. >> >> I attached a proposed patch to the issue. However, there are a lot of >> details that need to be worked out, first and foremost whether such >> optimization is at all worth it. >> > > According to your statement above (significantly improved performance), > there > is actually no doubt that the feature should be offered to CM users.[1][2] > The main question is whether a single speed vs memory (vs possibly > increased > initialization time) trade-off is always acceptable. > > Hence at first sight, I'd prefer to provide a factory to create an > object (that will compute the factorial) with a user-defined trade-off. > > Regards, > Gilles > > [1] It would also be nice if you could provide a benchmark. > [2] A real-life use-case would also be nice to enhance the CM userguide. > > Thus I ask for your opinion. Please let me >> know what think. >> >> Sincerely, >> Aleksei Dievskii. >> > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > For additional commands, e-mail: dev-h...@commons.apache.org > >