@krux: Thanks for the feedback! I was thinking about your `dataLanguage` idea 
as well, but I am not sure it would compose well in larger projects, where you 
want data frames to be values that can be passed around. What I can probably 
do in the end is write a macro which takes the iteration body:
    
    
    iterate(dataFrame):
      echo x
    

and the macro would analyze the processing pipeline and generate a single for 
loop internally. The good thing is that the user side of the API only exposes 
transformations and actions. So it doesn't really matter for now if I use 
closure iterators, and I can still switch the iteration logic internally later.
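
To illustrate the point about hiding the iteration logic behind transformations and actions, here is a minimal sketch using closure iterators. All names (`Stream`, `fromSeq`, `map`, `filter`, `collect`) are hypothetical and chosen for illustration only; this is not NimData's actual API or implementation.

```nim
# Hypothetical sketch, not NimData's API: a lazy pipeline built from
# closure iterators. Transformations only wrap iterators; nothing runs
# until an action pulls values through the chain.
type
  Stream[T] = iterator(): T {.closure.}

proc fromSeq[T](data: seq[T]): Stream[T] =
  # Source: yields the elements of a seq one by one.
  result = iterator(): T {.closure.} =
    for x in data:
      yield x

proc map[T, U](s: Stream[T], f: proc(x: T): U): Stream[U] =
  # Transformation: applies f to every element of the upstream iterator.
  result = iterator(): U {.closure.} =
    for x in s():
      yield f(x)

proc filter[T](s: Stream[T], pred: proc(x: T): bool): Stream[T] =
  # Transformation: keeps only the elements for which pred returns true.
  result = iterator(): T {.closure.} =
    for x in s():
      if pred(x):
        yield x

proc collect[T](s: Stream[T]): seq[T] =
  # Action: drives the whole pipeline and materializes the result.
  result = @[]
  for x in s():
    result.add(x)

let source = fromSeq(@[1, 2, 3, 4, 5, 6])
let evens = filter(source, proc(x: int): bool = x mod 2 == 0)
let scaled = map(evens, proc(x: int): int = x * 10)
let output = collect(scaled)
echo output
```

Since callers only ever see `map`, `filter`, and `collect`, the iterator-based internals could later be replaced (for example by a macro that fuses the pipeline into one loop) without changing user code.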

And the good news is that I already get exceptionally good performance with the 
closure iterator approach (I pushed some first [benchmark 
results](https://github.com/bluenote10/NimData#benchmarks)). I have optimized 
the CSV parsing macro a little bit, and CSV parsing is now a factor of 2 faster 
than Pandas, which is known to have a very fast parser. As expected for data 
which is still too small for Spark to shine, Nim is faster than Spark by a 
factor of 10 (even though Spark already runs on 4 cores).

I also made some good progress in terms of features and updated the 
documentation a lot, so this is reaching a state where it is actually pretty 
much usable.

@perturbation2: Yes, very good point, and thanks for the link. If Nim does not 
feature multiple dispatch, this might become a problem. For now I don't need 
multiple dispatch for the higher-order functions like `map` though (and maybe I 
never will), so I will simply stick to using a proc (which I hope will continue 
to work).

I also understand issue 3 now. For the record: the compiler is basically 
complaining that the base method has a provable lock level of 0, while one of 
the overriding methods calls a procvar, and this procvar could potentially lock. 
The locking system needs to know lock levels at compile time, though, and dynamic 
dispatch makes that impossible in this case. There are two ways out: 
convince the compiler that the procvar does not lock, or tell the compiler in 
the base method that there will be overrides of unknown lock level. The former 
would be nicer, but I can't get it to work for now. For the latter, I have 
opened a tiny PR to allow doing that in Nim.
