+1 for scientific programming in Nim
As I see it, several layers of functionality are required, in increasing order
of complexity, judging by comparison with Python and R:
1. vector/matrix algebra
2. dataframes
3. groupby of dataframes (aggregated manipulation of dataframes)
4. easy graphing and display of data
5. memory-limit agnostic structures
As I see it, the development and provision of libraries is currently at stage 1
(which is also what you are primarily discussing, although Numpy has been
mentioned a number of times). There are already quite a few linear algebra
libraries available in nimble.
To provide the equivalent of Pandas (the dataframe layer built on top of Numpy)
requires, among other things, a dataframe mechanism, which is more than
hard-wiring your own seq[]. A dataframe needs to easily handle multiple fields
of different types (dates, strings, ints, floats, ...) and easily display the
data.
I have listed the third item (GroupBy) as distinct from a dataframe, because
the result of grouping a dataframe is a dataframe on steroids (a superset of a
dataframe). So I assume dataframes are made up of sequences, and GroupBy
thingies are made up of dataframes:
sequence -> dataframe -> groupby
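
To make that layering concrete, here is a rough sketch, written in Python only because that is the comparison point above and it keeps the example short. Everything in it (`make_frame`, `group_by`, `aggregate`, and the sample data) is made up for illustration and is not the API of Pandas or of any Nim library. A column is a plain sequence, a dataframe is a mapping from column name to column, and a groupby result is a mapping from group key to dataframe.

```python
from collections import defaultdict
from datetime import date


def make_frame(**columns):
    """Build a toy 'dataframe': column name -> list, all the same length."""
    lengths = {len(col) for col in columns.values()}
    if len(lengths) > 1:
        raise ValueError("all columns must have the same length")
    return {name: list(col) for name, col in columns.items()}


def group_by(frame, key):
    """Split a frame into one sub-frame per distinct value of column `key`."""
    groups = defaultdict(lambda: {name: [] for name in frame})
    for i, k in enumerate(frame[key]):
        for name, col in frame.items():
            groups[k][name].append(col[i])
    return dict(groups)


def aggregate(groups, column, func):
    """Apply `func` to one column of every sub-frame."""
    return {k: func(sub[column]) for k, sub in groups.items()}


# Illustrative data only: columns of different element types.
frame = make_frame(
    day=[date(2017, 1, 1), date(2017, 1, 1), date(2017, 1, 2)],
    city=["Oslo", "Lima", "Oslo"],
    temp=[-3.5, 24.0, -1.5],
)
by_city = group_by(frame, "city")  # key -> dataframe
mean_temp = aggregate(by_city, "temp", lambda xs: sum(xs) / len(xs))
```

In Nim the columns would presumably be `seq`s of different element types, which is where the "more than hard-wiring your own seq[]" problem comes in: the frame has to hold columns of differing types behind one interface.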
For point 4, it would be good if pyplot were easy to port to Nim (but it is
highly Python dependent, IIRC). Someone has already provided a library for
accessing gnuplot, so that might be an option (although its output is not quite
as nice).
Point 5 is for handling "large" datasets. Not many people will take a library
seriously if the program dies because it couldn't fit the data into memory.
It needs to be able to seamlessly page data to disk, as an option, when
analysing large datasets. The Spills library does this, but that functionality
would need to be included in the dataframe and groupby functionality.
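
As a rough illustration of what "memory-limit agnostic" could mean in practice, here is a small Python sketch (again only because Python is the comparison point). It is a simpler technique than paging and is not how Spills works: it streams a CSV file in fixed-size chunks and keeps only running totals, so memory use depends on the chunk size rather than the file size. The names `iter_chunks` and `chunked_mean`, the file layout, and the demo data are all assumptions made for the example.

```python
import csv
import os
import tempfile


def iter_chunks(path, chunk_rows):
    """Yield lists of at most `chunk_rows` row dicts, one chunk at a time."""
    with open(path, newline="") as handle:
        chunk = []
        for row in csv.DictReader(handle):
            chunk.append(row)
            if len(chunk) == chunk_rows:
                yield chunk
                chunk = []
        if chunk:  # final, possibly shorter, chunk
            yield chunk


def chunked_mean(path, column, chunk_rows=2):
    """Mean of a numeric column, holding only one chunk in memory at a time."""
    total, count = 0.0, 0
    for chunk in iter_chunks(path, chunk_rows):
        total += sum(float(row[column]) for row in chunk)
        count += len(chunk)
    return total / count if count else float("nan")


# Demo on a tiny throwaway file; a real dataset would be far larger.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.csv")
    with open(demo_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["city", "temp"])
        writer.writerows([["Oslo", -3.5], ["Lima", 24.0], ["Oslo", -1.5],
                          ["Lima", 25.0], ["Oslo", 1.0]])
    demo_chunk_sizes = [len(c) for c in iter_chunks(demo_path, 2)]
    demo_mean = chunked_mean(demo_path, "temp", chunk_rows=2)
```

The harder part, as noted above, is that a dataframe or groupby would have to do this kind of thing internally, so the user never has to care whether the data fits in memory.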