On Thursday, 31 January 2013 at 13:41:13 UTC, Andrei Alexandrescu wrote:
On 1/31/13 5:18 AM, Don wrote:
std.numeric is not superficially flawed, it's fundamentally
flawed. What is it for? What is its theme? The problem is,
std.numeric is one of the few good names which are left as a
possible package name, after C insulted the mathematical
community by creating a module called 'math'.
Guilty as charged. I've put stuff in std.numeric as I was
working on my thesis. I recall you added some stuff there too.
As I'm sure you remember, the state of D in 2007 was rather
different than that of today. Overall no need to get agitated
here, we're all in the same boat and aiming for the same shore.
Sorry if that came across as agitated; it wasn't intended to be.
As you noted, I have code in there as well.
It's just one of those old modules that needs to be cleaned up,
though it reveals a deeper issue - see below.
Let's see what we have there:
entropy
CustomFloat
kullbackLeiblerDivergence
Fft
gapWeightedSimilarityIncremental
gapWeightedSimilarity
gapWeightedSimilarityNormalized
FPTemporary
findRoot
euclideanDistance
dotProduct
cosineSimilarity
gcd
jensenShannonDivergence
normalize
secantMethod
The general theme is obvious - numeric algorithms and data
structures. Many are obvious and with obvious utility to one
interested in numerics: entropy, various distance and
similarity measures. I think you wrote findRoot.
Yes.
The basic problem is that there are hundreds of potential numeric
algorithms and data structures of equal importance to these.
In fact, the total number of mathematical algorithms is probably
a substantial fraction of all the algorithms in computer
science!
Even a module which contained only FFT could be quite large,
once it included all the important related transforms.
The gapWeightedSimilarity algorithms are string kernels. They
are somewhat niche but quite powerful to anyone interested in
string similarity (technically they are string edit distance on
steroids). They might belong in std.string but I figured they
have enough numeric algorithm flavor to put them in there.
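For readers who haven't used these, here is a minimal sketch of how the
string kernels are called. It assumes the std.numeric API as documented in
current Phobos (gapWeightedSimilarity(s, t, lambda) over random-access
ranges); the signatures in the 2013 release may have differed, and the
word lists are made-up illustration data.

```d
import std.numeric : gapWeightedSimilarity, gapWeightedSimilarityNormalized;
import std.stdio : writeln;

void main()
{
    // The kernels work on random-access ranges of comparable elements;
    // arrays of words are used here (illustrative data only).
    string[] s = ["Hello", "brave", "new", "world"];
    string[] t = ["Hello", "new", "world"];

    // lambda is the gap penalty. With lambda = 1 gaps cost nothing,
    // so the result is simply the number of matching subsequences.
    assert(gapWeightedSimilarity(s, s, 1) == 15);
    assert(gapWeightedSimilarity(t, t, 1) == 7);
    assert(gapWeightedSimilarity(s, t, 1) == 7);

    // A smaller lambda discounts matches that skip over elements
    // ("brave" sits in the gap between "Hello" and "new" in s).
    writeln(gapWeightedSimilarity(s, t, 0.5));

    // The normalized variant scales by the two self-similarities.
    writeln(gapWeightedSimilarityNormalized(s, t, 0.5));
}
```

The element type is arbitrary as long as the elements can be compared,
which is part of why these could sit in either std.string or std.numeric.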
So let's itemize the grievances and see how we can sort this
out.
I'm not sure that we can solve this without addressing the
high-level question: What is the scope of Phobos?
How big will it eventually get? Twice its current size? Ten
times? A hundred times?
Both SmallPhobos and LargePhobos are reasonable, but we do have
to pick one. Currently we have aspects of both approaches, but
they aren't compatible.
The current approach of putting everything directly into a single
level in std doesn't scale very far -- it will look very clumsy
once it grows to more than (say) three times its current size.
This argues for SmallPhobos.
But if it doesn't get to be at least ten times larger, some of
this niche stuff shouldn't be in there; those functions belong in
LargePhobos. If we go with SmallPhobos then we need to move the
niche stuff somewhere else.