On Sunday, 21 June 2015 at 16:17:57 UTC, Russel Winder wrote:
Contributing to the "What D needs to get traction" debate ongoing in various threads, a bit of feedback from the PyData London 2015 day yesterday (I couldn't get there Friday or today).

Data science folk use Python because of NumPy/SciPy/Matplotlib/Pandas. And IPython (soon to be Jupyter). Julia is on the radar, but…

NumPy is actually relatively easy to crack (it is just an n-dimensional array type with algorithms), which means most of SciPy is straightforward (it just adds stuff on NumPy). Matplotlib cannot be competed against so D needs to ensure it can very trivially interwork with Python and Matplotlib. C-linkage and CFFI attack much of this, PyD attacks much of the rest. This leaves Pandas (which is about time series and n-dimensional equivalents) and Jupyter (which is about creating Markdown or LaTeX documents with embedded executable code fragments).

If D had a library that attacked the capabilities offered by Pandas and could be a language usable in Jupyter, there is an angle for serious usage as long as D performs orders of magnitude faster than NumPy and faster than Cython code.

At the heart of all this is a review of std.parallelism to make sure we can get better performance than we currently do.


Thanks for the colour, Russel.

I agree about NumPy and Pandas - the foundations are not so hard to replicate (but better!). John Colvin and Ilya seem to be working on this now (and there is Vlad Levenfeld's stuff too).
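
As a rough illustration of why the foundations are tractable: the core of an n-dimensional array type is little more than a flat buffer plus a shape and a set of strides. The sketch below is plain Python, chosen only because it is the common language of this thread; the class name `NDArray` and its methods are hypothetical and do not mirror the API of NumPy or of any of the D libraries mentioned. It assumes row-major layout and ignores everything that makes a real library fast (typed storage, views, vectorised kernels).

```python
from itertools import product


class NDArray:
    """Toy n-dimensional array: a flat list plus shape and row-major strides.

    Hypothetical illustration only; not NumPy's or any D library's API.
    """

    def __init__(self, shape, data=None):
        self.shape = tuple(shape)
        # Row-major strides, counted in elements: the last axis is contiguous.
        strides = []
        step = 1
        for extent in reversed(self.shape):
            strides.append(step)
            step *= extent
        self.strides = tuple(reversed(strides))
        self.size = step
        self.data = list(data) if data is not None else [0.0] * self.size
        if len(self.data) != self.size:
            raise ValueError("data length does not match shape")

    def _offset(self, index):
        # Map an n-dimensional index onto a position in the flat buffer.
        if not isinstance(index, tuple):
            index = (index,)
        if len(index) != len(self.shape):
            raise IndexError("wrong number of indices")
        for i, extent in zip(index, self.shape):
            if not 0 <= i < extent:
                raise IndexError("index out of range")
        return sum(i * s for i, s in zip(index, self.strides))

    def __getitem__(self, index):
        return self.data[self._offset(index)]

    def __setitem__(self, index, value):
        self.data[self._offset(index)] = value

    def sum(self, axis):
        """Reduce along one axis, returning an array with that axis removed."""
        out_shape = self.shape[:axis] + self.shape[axis + 1:]
        out = NDArray(out_shape)
        for index in product(*(range(n) for n in self.shape)):
            out_index = index[:axis] + index[axis + 1:]
            out[out_index] += self[index]
        return out
```

For example, `NDArray((2, 3), range(6))` has strides `(3, 1)`, so element `[1, 2]` lives at flat offset 5, and summing over axis 0 gives the three column totals. The hard part of a real library is not this bookkeeping but doing it quickly and generically, which is where a compiled language has room to do better.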

I don't know about matplotlib. It's pretty easy to use D to chart using it, but I didn't find it the friendliest library for what I wanted to do. And bokeh is nice for interactivity. It is easy to talk to via Python, but it wouldn't be hard to write a D wrapper for it (something I made a start on), since it is only object representation, with no real hard work on the server side.

Is matplotlib better than mathgl? (I don't have enough experience of either to have a view).

But D is a language usable in Jupyter - I have been playing with it for a few days now. The main things missing for it to be very usable are seeing the compiler output in a pretty manner (well, actually just making it visible would be a start) and a nice way to use dub with PyD/PyDmagic.

If you review std.parallelism, would it be worth adding fork/processes there, as it seems that for some purposes that may be better than threading?


Laeeth.
