From a quick look at the source code, it seems you are going for a 
NumPy+Pandas-style implementation (as evidenced by the DataFrame 
implementation, which closely follows the way Pandas does it). Just FYI, Wes 
McKinney, the original Pandas author and main contributor, has written an 
[article](https://wesmckinney.com/blog/apache-arrow-pandas-internals/) listing 
what he considered the biggest problems with Pandas (3 years ago already). 
I've also seen him describe elsewhere how the memory allocation scheme 
(keeping all ints together, all floats together, etc.) was a really bad 
decision, and that in practical dataframe use it is much better to manage each 
column individually (this is only briefly touched on in the blog post I linked to).
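
To make the difference concrete, here is a rough sketch in plain Python (standard library only). It is a toy model, not how Pandas is actually implemented: `BlockFrame` and `ColumnFrame` are hypothetical names I made up, and the "block" version simply mimics the idea of keeping all columns of one type in a single shared buffer, so that adding a column means rebuilding that buffer.

```python
from array import array


class BlockFrame:
    """Toy 'consolidated' layout: all float columns share one buffer (hypothetical)."""

    def __init__(self, nrows):
        self.nrows = nrows
        self.names = []
        self.block = array("d")  # columns stored back to back
        self.bytes_copied = 0    # bookkeeping: bytes moved to grow the block

    def add_column(self, name, values):
        if len(values) != self.nrows:
            raise ValueError("column length must match nrows")
        # Growing the shared block: build a new buffer and copy every
        # existing column into it before appending the new one.
        new_block = array("d", self.block)
        self.bytes_copied += len(self.block) * self.block.itemsize
        new_block.extend(values)
        self.block = new_block
        self.names.append(name)

    def column(self, name):
        i = self.names.index(name)
        return self.block[i * self.nrows:(i + 1) * self.nrows]


class ColumnFrame:
    """Toy per-column layout: every column owns its own buffer (hypothetical)."""

    def __init__(self, nrows):
        self.nrows = nrows
        self.columns = {}
        self.bytes_copied = 0  # existing columns are never moved

    def add_column(self, name, values):
        if len(values) != self.nrows:
            raise ValueError("column length must match nrows")
        # Only the new column is allocated; the others are untouched.
        self.columns[name] = array("d", values)

    def column(self, name):
        return self.columns[name]


if __name__ == "__main__":
    block, cols = BlockFrame(4), ColumnFrame(4)
    for i, name in enumerate("abc"):
        block.add_column(name, [float(i)] * 4)
        cols.add_column(name, [float(i)] * 4)
    print("block layout copied bytes:", block.bytes_copied)
    print("per-column layout copied bytes:", cols.bytes_copied)
```

In this toy model the shared-buffer version re-copies all existing data on every column addition, while the per-column version never touches the columns that are already there. A real implementation has more going on, so take this only as an illustration of why per-column management tends to be the friendlier default.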

Also, it's probably a good idea to take a look at [ggplotnim's 
dataframe](https://github.com/Vindaar/ggplotnim#data-frame), which is (a) 
already implemented, (b) built on top of arraymancer, and (c) integrated with, 
and derived from, an excellent plotting library, so that when you want to do 
some graphics, you don't have to reimplement plotnine or matplotlib.
