On Wednesday, 7 September 2016 at 20:37:50 UTC, jmh530 wrote:
On Wednesday, 7 September 2016 at 19:19:23 UTC, data pulverizer wrote:
For some time I have been considering a problem to do with
creating tables with unbounded types, one of the failed
attempts is here:
https://forum.dlang.org/thread/gdjaoxypicsxlfvzw...@forum.dlang.org?page=1
I then exchanged emails with Lucian, Sparrow's creator, and he
very quickly and simply outlined the solution to the problem.
Thereafter I read his PhD thesis - one of the most informative
texts in computer science I have read and very well written.
At the moment, lots of languages are attempting to solve the
dynamic-static loop: having the features inherent in dynamic
programming languages while keeping the safety and performance
that come with a statically compiled language, and doing so in
a language that doesn't cause your brain to bleed. The "One
language to rule them all" motif of Julia has hit the rocks;
one reason is that they now realize their language is being
held back because the compiler cannot infer certain types, for
example:
http://www.johnmyleswhite.com/notebook/2015/11/28/why-julias-dataframes-are-still-slow/
I don't see any reason why D can't implement pandas DataFrames
without needing to change the language at all:
http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html
It's just a lot of work.
The simplest I can think of is a struct containing a tuple that
contains slices of equal length and an array of strings
containing column names. You could have a specialization with a
two-dimensional array (or ndslice).
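A minimal sketch of that layout in D, using only Phobos. The names DataFrame, rows and column are made up for illustration and are not an existing library API; the two-dimensional (or ndslice) specialization is left out.

```d
import std.meta : allSatisfy;
import std.traits : isDynamicArray;
import std.typecons : Tuple;

/// Hypothetical table: a tuple of equal-length slices plus column names.
struct DataFrame(Columns...)
    if (Columns.length > 0 && allSatisfy!(isDynamicArray, Columns))
{
    Tuple!Columns columns; // e.g. Tuple!(int[], string[], double[])
    string[] names;        // one name per column

    this(string[] names, Columns cols)
    {
        assert(names.length == Columns.length, "one name per column");
        // The loop is unrolled at compile time, one check per column.
        foreach (col; cols)
            assert(col.length == cols[0].length,
                   "columns must have equal length");
        this.names = names;
        this.columns = Tuple!Columns(cols);
    }

    /// Number of rows, taken from the first column.
    @property size_t rows() { return columns[0].length; }

    /// Compile-time lookup by position, so the element type is preserved.
    auto column(size_t i)() { return columns[i]; }
}

unittest
{
    auto df = DataFrame!(int[], string[], double[])(
        ["id", "name", "score"],
        [1, 2, 3], ["a", "b", "c"], [0.5, 1.5, 2.5]);
    assert(df.rows == 3);
    assert(df.names[2] == "score");
    // Each column keeps its own static type.
    static assert(is(typeof(df.column!2()) == double[]));
    assert(df.column!0()[1] == 2);
}
```

The unittest block can be run with something like dmd -unittest -main -run on the file. Note that the set of column types is fixed when the struct is instantiated, so adding a column of a new type means a new DataFrame type.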
You're quite right that D doesn't need to change at all to
implement something like pandas or R's data frames, but I am
thinking of how to go further. Very often in data science
applications, types turn up that are required but that your
table is not currently configured for. The choice is either to
modify the code or, as Scala does, to give programmers the
ability to write their own interface to the type so that it
can be stored in their DataFrame. The best solution is for the
data table to cope with an arbitrary number of types, which
can be done in Sparrow.