On Thursday, 31 January 2013 at 15:38:04 UTC, Don wrote:
On Thursday, 31 January 2013 at 13:41:13 UTC, Andrei Alexandrescu wrote:
On 1/31/13 5:18 AM, Don wrote:
std.numeric is not superficially flawed, it's fundamentally
flawed. What is it for? What is its theme? The problem is,
std.numeric is one of the few good names which are left as a
possible package name, after C insulted the mathematical
community by creating a module called 'math'.
Guilty as charged. I've put stuff in std.numeric as I was
working on my thesis. I recall you added some stuff there too.
As I'm sure you remember, the state of D in 2007 was rather
different from that of today. Overall, no need to get agitated
here; we're all in the same boat and aiming for the same shore.
Sorry if that came across as agitated, it wasn't intended to be.
As you noted, I have code in there as well.
It's just one of those old modules that needs to be cleaned up,
though it reveals a deeper issue - see below.
Let's see what we have there:
entropy
CustomFloat
kullbackLeiblerDivergence
Fft
gapWeightedSimilarityIncremental
gapWeightedSimilarity
gapWeightedSimilarityNormalized
FPTemporary
findRoot
euclideanDistance
dotProduct
cosineSimilarity
gcd
jensenShannonDivergence
normalize
secantMethod
The general theme is obvious: numeric algorithms and data
structures. Many are self-explanatory and clearly useful to
anyone interested in numerics: entropy, various distance and
similarity measures. I think you wrote findRoot.
Yes.
The basic problem is that there are hundreds of potential
numeric algorithms and data structures of equal importance to
these. In fact, the total number of mathematical algorithms is
probably a substantial fraction of the total algorithms in
computer science!
Even a module which contained only FFT could be quite large
once it included all the important related transforms.
The gapWeightedSimilarity algorithms are string kernels. They
are somewhat niche but quite powerful for anyone interested in
string similarity (technically they are string edit distance
on steroids). They might belong in std.string, but I figured
they have enough numeric algorithm flavor to go in std.numeric.
So let's itemize the grievances and see how we can sort this
out.
I'm not sure that we can solve this without addressing the
high-level question: What is the scope of Phobos?
How big will it eventually get? Twice its current size? Ten
times? A hundred times?
Both SmallPhobos and LargePhobos are reasonable, but we do have
to pick one. Currently we have aspects of both approaches, but
they aren't compatible.
The current approach of putting everything directly into a
single level in std doesn't scale very far -- it will look very
clumsy once it gets more than (say) three times larger. This
argues for SmallPhobos.
But if it doesn't get to be at least ten times larger, some of
this niche stuff shouldn't be in there; these are functions that
belong in LargePhobos. If we go with SmallPhobos then we need to
move the niche stuff somewhere else.
I think having a large standard library inspires confidence in
developers. Rightly or wrongly, code in a standard library has an
appearance of permanence, as opposed to being someone's personal
project that may or may not disappear/cease to be maintained
tomorrow.