On Sunday, 21 June 2015 at 16:17:57 UTC, Russel Winder wrote:
Contributing to the "What D needs to get traction" debate
ongoing in various threads, a bit of feedback from the PyData
London 2015 day yesterday (I couldn't get there Friday or
today).
Data science folk use Python because of
NumPy/SciPy/Matplotlib/Pandas. And IPython (soon to be
Jupyter). Julia is on the radar, but…
NumPy is actually relatively easy to crack (it is just an
n-dimensional array type with algorithms), which means most of
SciPy is straightforward (it just adds stuff on NumPy).
Matplotlib cannot be competed against so D needs to ensure it
can very trivially interwork with Python and Matplotlib.
C linkage and CFFI attack much of this; PyD attacks much of the
rest. This leaves Pandas (which is about time series and
n-dimensional equivalents) and Jupyter (which is about creating
Markdown or LaTeX documents with embedded executable code
fragments).
If D had a library that attacked the capabilities offered by
Pandas and could be a language usable in Jupyter, there is an
angle for serious usage as long as D performs orders of
magnitude faster than NumPy and faster than Cython code.
At the heart of all this is a review of std.parallelism to make
sure we can get better performance than we currently do.
Thanks for the colour, Russel.
I agree about NumPy and Pandas - the foundations are not so hard
to replicate (but better!). John Colvin and Ilya seem to be
working on this now (and there is Vlad Levenfeld's stuff too).
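To make the "not so hard to replicate" point a bit more concrete,
here is a rough sketch of the core idea: a flat buffer plus a shape
and strides, with algorithms layered on top. It is in Python rather
than D purely for illustration, and it is not how NumPy or any of
the D libraries mentioned above is actually implemented. The class
NDArray and its methods are names made up for this post.

```python
from itertools import product


class NDArray:
    """Toy n-dimensional array: a flat buffer plus shape and strides.

    Hypothetical illustration only; not NumPy's real implementation.
    """

    def __init__(self, data, shape, strides=None, offset=0):
        self.data = data              # flat Python list, shared between views
        self.shape = tuple(shape)
        self.offset = offset
        if strides is None:
            # Row-major (C order) strides, counted in elements, not bytes.
            computed, step = [], 1
            for dim in reversed(self.shape):
                computed.append(step)
                step *= dim
            strides = reversed(computed)
        self.strides = tuple(strides)

    def __getitem__(self, index):
        # Map an n-d index tuple to a position in the flat buffer.
        pos = self.offset + sum(i * s for i, s in zip(index, self.strides))
        return self.data[pos]

    def transpose(self):
        # No copy: reverse shape and strides over the same buffer.
        return NDArray(self.data, self.shape[::-1], self.strides[::-1], self.offset)

    def sum(self):
        # One "algorithm" on top of the type: visit every index once.
        return sum(self[idx] for idx in product(*(range(n) for n in self.shape)))


a = NDArray(list(range(6)), (2, 3))   # rows: [0, 1, 2] and [3, 4, 5]
t = a.transpose()                     # shape (3, 2), same underlying buffer
```

Here a[1, 2] and t[2, 1] read the same element of the shared buffer,
which is the trick that makes views and transposes cheap. The hard
part is everything around this (performance, broadcasting, a good
API), not the basic type.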
I don't know about Matplotlib. It's pretty easy to chart with it
from D, but I didn't find it the friendliest library for what I
wanted to do. Bokeh is nice for interactivity: it is easy to talk
to via Python, and a D wrapper (something I made a start on)
wouldn't be hard to write, since it is only object representation
with no real hard work on the server side.
Is Matplotlib better than MathGL? (I don't have enough
experience of either to have a view.)
But D is a language usable in Jupyter - I have been playing with
it for a few days now. The main things missing for it to be very
usable are seeing the compiler output in a pretty manner (well,
just making it visible would be a start) and a nice way to use
dub with PyD/PyDmagic.
If you review std.parallelism, would it be worth adding
fork/process support there, since for some purposes processes
may be better than threads?
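For what I mean by fork/processes, here is a minimal sketch of the
pattern: split the work, fork one child per chunk, and collect the
partial results over pipes. It is Python, not D, and says nothing
about what a std.parallelism API should look like. It assumes a
POSIX system (os.fork), and sum_squares and fork_map_sum are names
invented for this example.

```python
import os
import struct


def sum_squares(chunk):
    """CPU-bound work that each worker does independently."""
    return sum(x * x for x in chunk)


def fork_map_sum(data, workers=4):
    """Sum sum_squares over data using one forked child per chunk.

    Assumes POSIX (os.fork) and that each partial result fits in a
    signed 64-bit integer.
    """
    chunks = [data[i::workers] for i in range(workers)]
    children = []
    for chunk in chunks:
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: compute a partial result, send it back, exit at once
            # without running the parent's cleanup handlers.
            os.close(read_fd)
            try:
                os.write(write_fd, struct.pack("!q", sum_squares(chunk)))
            finally:
                os._exit(0)
        # Parent: keep only the read end of this child's pipe.
        os.close(write_fd)
        children.append((pid, read_fd))

    total = 0
    for pid, read_fd in children:
        payload = os.read(read_fd, 8)   # one 8-byte integer per child
        os.close(read_fd)
        os.waitpid(pid, 0)              # reap the child
        total += struct.unpack("!q", payload)[0]
    return total


result = fork_map_sum(list(range(1000)), workers=4)
```

Each child gets its own address space, so there is no shared mutable
state to lock, at the cost of having to ship results back explicitly.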
Laeeth.