Scala Spark-like RDD for D?

2016-02-15 Thread data pulverizer via Digitalmars-d-learn
Are there any plans to create a Scala Spark-like RDD class for D (https://www.cs.berkeley.edu/~matei/papers/2012/nsdi_spark.pdf)? This is a powerful model that has taken the data science world by storm; it would be useful to have something like this in the D world. Most of the algorithms

Re: Speed of csvReader

2016-01-22 Thread data pulverizer via Digitalmars-d-learn
On Friday, 22 January 2016 at 02:16:14 UTC, H. S. Teoh wrote: On Thu, Jan 21, 2016 at 04:50:12PM -0800, H. S. Teoh via Digitalmars-d-learn wrote: [...] > > https://github.com/quickfur/fastcsv [...] Fixed some boundary condition crashes and reverted doubled quote handling in unquoted

Re: Speed of csvReader

2016-01-22 Thread data pulverizer via Digitalmars-d-learn
On Friday, 22 January 2016 at 21:41:46 UTC, data pulverizer wrote: On Friday, 22 January 2016 at 02:16:14 UTC, H. S. Teoh wrote: [...] Hi H. S. Teoh, I have used your fastcsv on my file: import std.file; import fastcsv; import std.stdio; import std.datetime; void main(){ StopWatch sw;

Re: Speed of csvReader

2016-01-21 Thread data pulverizer via Digitalmars-d-learn
On Thursday, 21 January 2016 at 10:40:39 UTC, data pulverizer wrote: On Thursday, 21 January 2016 at 10:20:12 UTC, Rikki Cattermole wrote: Okay without registering not gonna get that data. So usual things to think about, did you turn on release mode? What about inlining? Lastly how about

Re: Speed of csvReader

2016-01-21 Thread data pulverizer via Digitalmars-d-learn
On Thursday, 21 January 2016 at 11:08:18 UTC, Ali Çehreli wrote: On 01/21/2016 02:40 AM, data pulverizer wrote: dmd -release -inline code.d These two as well please: -O -boundscheck=off the ingest of files and speed of calculation is very important to me. We should understand why D is

Speed of csvReader

2016-01-21 Thread data pulverizer via Digitalmars-d-learn
I have been reading large text files with D's csv file reader and have found it slow compared to R's read.table function which is not known to be particularly fast. Here I am reading Fannie Mae mortgage acquisition data which can be found here

Re: Speed of csvReader

2016-01-21 Thread data pulverizer via Digitalmars-d-learn
On Thursday, 21 January 2016 at 10:20:12 UTC, Rikki Cattermole wrote: Okay without registering not gonna get that data. So usual things to think about, did you turn on release mode? What about inlining? Lastly how about disabling the GC? import core.memory : GC; GC.disable(); dmd -release

Re: Speed of csvReader

2016-01-21 Thread data pulverizer via Digitalmars-d-learn
On Thursday, 21 January 2016 at 14:56:13 UTC, Saurabh Das wrote: On Thursday, 21 January 2016 at 14:32:52 UTC, Saurabh Das wrote: On Thursday, 21 January 2016 at 13:42:11 UTC, Edwin van Leeuwen wrote: On Thursday, 21 January 2016 at 09:39:30 UTC, data pulverizer wrote: StopWatch sw;

Re: Speed of csvReader

2016-01-21 Thread data pulverizer via Digitalmars-d-learn
On Thursday, 21 January 2016 at 15:17:08 UTC, data pulverizer wrote: On Thursday, 21 January 2016 at 14:56:13 UTC, Saurabh Das wrote: On Thursday, 21 January 2016 at 14:32:52 UTC, Saurabh Das Actually since you're aiming for speed, this might be better: sw.start(); auto records =

Re: Speed of csvReader

2016-01-21 Thread data pulverizer via Digitalmars-d-learn
On Thursday, 21 January 2016 at 16:25:55 UTC, bachmeier wrote: On Thursday, 21 January 2016 at 10:48:15 UTC, data pulverizer wrote: Running Ubuntu 14.04 LTS In that case, have you looked at http://lancebachmeier.com/rdlang/ If this is a serious bottleneck you can solve it with two lines

Re: Speed of csvReader

2016-01-21 Thread data pulverizer via Digitalmars-d-learn
On Thursday, 21 January 2016 at 16:01:33 UTC, wobbles wrote: Interesting that reading a file is so slow. Your timings from R, is that including reading the file also? Yes, it's just insane, isn't it?

Re: Speed of csvReader

2016-01-21 Thread data pulverizer via Digitalmars-d-learn
On Thursday, 21 January 2016 at 17:17:52 UTC, Saurabh Das wrote: On Thursday, 21 January 2016 at 17:10:39 UTC, data pulverizer wrote: On Thursday, 21 January 2016 at 16:01:33 UTC, wobbles wrote: Interesting that reading a file is so slow. Your timings from R, is that including reading the

Re: Speed of csvReader

2016-01-21 Thread data pulverizer via Digitalmars-d-learn
On Thursday, 21 January 2016 at 18:31:17 UTC, data pulverizer wrote: Good news and bad news. I was going for something similar to what you have above and both slash the time a lot: Time (s): 1.024 But now the output is a little garbled. For some reason the splitter isn't splitting correctly -

Re: Speed of csvReader

2016-01-21 Thread data pulverizer via Digitalmars-d-learn
On Thursday, 21 January 2016 at 23:58:35 UTC, H. S. Teoh wrote: On Thu, Jan 21, 2016 at 11:29:49PM +, data pulverizer via Digitalmars-d-learn wrote: On Thursday, 21 January 2016 at 21:24:49 UTC, H. S. Teoh wrote: >On Thu, Jan 21, 2016 at 07:11:05PM +, Jesse Phillips via >This piq

Re: Speed of csvReader

2016-01-21 Thread data pulverizer via Digitalmars-d-learn
On Thursday, 21 January 2016 at 21:24:49 UTC, H. S. Teoh wrote: On Thu, Jan 21, 2016 at 07:11:05PM +, Jesse Phillips via This piqued my interest today, so I decided to take a shot at writing a fast CSV parser. First, I downloaded a sample large CSV file from: [...] Hi H. S. Teoh, I

Re: Speed of csvReader

2016-01-21 Thread data pulverizer via Digitalmars-d-learn
On Thursday, 21 January 2016 at 20:46:15 UTC, Gerald Jansen wrote: On Thursday, 21 January 2016 at 09:39:30 UTC, data pulverizer wrote: I have been reading large text files with D's csv file reader and have found it slow compared to R's read.table function This great blog post has an

Re: Speed of csvReader

2016-01-21 Thread data pulverizer via Digitalmars-d-learn
On Thursday, 21 January 2016 at 23:58:35 UTC, H. S. Teoh wrote: are there flags that I should be compiling with or some other thing that I am missing? Did you supply a main() function? If not, it won't run, because fastcsv.d is only a module. If you want to run the benchmark, you'll have to

Re: Speed of csvReader

2016-01-21 Thread data pulverizer via Digitalmars-d-learn
On Thursday, 21 January 2016 at 18:46:03 UTC, Justin Whear wrote: On Thu, 21 Jan 2016 18:37:08 +, data pulverizer wrote: It's interesting that the output first array is not the same as the input byLine reuses a buffer (for speed) and the subsequent split operation just returns slices

Re: Speed of csvReader

2016-01-21 Thread data pulverizer via Digitalmars-d-learn
On Thursday, 21 January 2016 at 19:08:38 UTC, data pulverizer wrote: On Thursday, 21 January 2016 at 18:46:03 UTC, Justin Whear wrote: On Thu, 21 Jan 2016 18:37:08 +, data pulverizer wrote: It's interesting that the output first array is not the same as the input byLine reuses a buffer

Re: Functions that return type

2016-01-17 Thread data pulverizer via Digitalmars-d-learn
On Sunday, 17 January 2016 at 02:08:06 UTC, Timon Gehr wrote: On 01/16/2016 11:50 PM, data pulverizer wrote: I guess the constraints are that of a static language. (This is not true.) Could you please explain?

Re: Functions that return type

2016-01-16 Thread data pulverizer via Digitalmars-d-learn
On Saturday, 16 January 2016 at 21:22:15 UTC, data pulverizer wrote: Is it possible to create a function that returns Type like typeof() does? Something such as: Type returnInt(){ return int; } More to the point what is the Type of a type such as int? Thanks p.s. I am aware I could do

Functions that return type

2016-01-16 Thread data pulverizer via Digitalmars-d-learn
Is it possible to create a function that returns Type like typeof() does? Something such as: Type returnInt(){ return int; } More to the point what is the Type of a type such as int? Thanks

Re: Functions that return type

2016-01-16 Thread data pulverizer via Digitalmars-d-learn
On Saturday, 16 January 2016 at 21:59:22 UTC, data pulverizer wrote: On Saturday, 16 January 2016 at 21:22:15 UTC, data pulverizer wrote: Is it possible to create a function that returns Type like typeof() does? Something such as: Type returnInt(){ return int; } More to the point what is

noob in c macro preprocessor hell converting gsl library header files

2016-01-06 Thread data pulverizer via Digitalmars-d-learn
I have been converting C numeric libraries and depositing them here: https://github.com/dataPulverizer. So far I have glpk and nlopt converted on a like-for-like C function basis. I am now stuck on the gsl library, primarily because of the C preprocessor code, which I am very new to. The

Re: noob in c macro preprocessor hell converting gsl library header files

2016-01-06 Thread data pulverizer via Digitalmars-d-learn
On Wednesday, 6 January 2016 at 13:59:44 UTC, John Colvin wrote: #define INLINE_FUN extern inline // used in gsl_pow_int.h: INLINE_FUN double gsl_pow_2(const double x) { return x*x; } Could I just ignore the INLINE_FUN and use alias for function pointer declaration? For example ... alias

Re: Repeated struct definitions for graph data structures and in/out naming conflict in C library

2016-01-03 Thread data pulverizer via Digitalmars-d-learn
Thanks, the library now compiles. On Sunday, 3 January 2016 at 13:45:13 UTC, anonymous wrote: On 03.01.2016 14:30, data pulverizer wrote: I am trying to access functionality in the glpk C library using extern(C). It has graph structs in its header file that are specified in an odd recurring

Repeated struct definitions for graph data structures and in/out naming conflict in C library

2016-01-03 Thread data pulverizer via Digitalmars-d-learn
Dear D Gurus, I am trying to access functionality in the glpk C library using extern(C). It has graph structs in its header file that are specified in an odd recurring manner that I cannot reproduce in D: typedef struct glp_graph glp_graph; typedef struct glp_vertex glp_vertex; typedef

Re: OT: why do people use python when it is slow?

2015-10-18 Thread data pulverizer via Digitalmars-d-learn
On Thursday, 15 October 2015 at 21:16:18 UTC, Laeeth Isharc wrote: On Wednesday, 14 October 2015 at 22:11:56 UTC, data pulverizer wrote: On Tuesday, 13 October 2015 at 23:26:14 UTC, Laeeth Isharc wrote: https://www.quora.com/Why-is-Python-so-popular-despite-being-so-slow Andrei suggested

Re: OT: why do people use python when it is slow?

2015-10-15 Thread data pulverizer via Digitalmars-d-learn
On Thursday, 15 October 2015 at 02:20:42 UTC, jmh530 wrote: On Wednesday, 14 October 2015 at 22:11:56 UTC, data pulverizer wrote: On Tuesday, 13 October 2015 at 23:26:14 UTC, Laeeth Isharc wrote: https://www.quora.com/Why-is-Python-so-popular-despite-being-so-slow Andrei suggested posting more

Re: OT: why do people use python when it is slow?

2015-10-15 Thread data pulverizer via Digitalmars-d-learn
On Thursday, 15 October 2015 at 07:57:51 UTC, Russel Winder wrote: On Thu, 2015-10-15 at 06:48 +, data pulverizer via Digitalmars-d-learn wrote: Just because D doesn't have this now doesn't mean it cannot. C doesn't have such capability but R and Python do even though R and CPython

Re: OT: why do people use python when it is slow?

2015-10-14 Thread data pulverizer via Digitalmars-d-learn
On Tuesday, 13 October 2015 at 23:26:14 UTC, Laeeth Isharc wrote: https://www.quora.com/Why-is-Python-so-popular-despite-being-so-slow Andrei suggested posting more widely. I am coming at D by way of R, C++, Python etc. so I speak as a statistician who is interested in data science

dynamic get from variantArray() data table

2015-10-13 Thread data pulverizer via Digitalmars-d-learn
Hi, I am trying to use variantArray() as a data table object to hold columns each of which is an array of a specific type. I need to be able to get values from data table but I am having problems ... import std.stdio; // i/o import std.variant; // type variations void main(){ // Columns

Re: dynamic get from variantArray() data table

2015-10-13 Thread data pulverizer via Digitalmars-d-learn
Thanks for the suggestion Alex, however I need the dynamic behaviour properties of variantArray(), writing a struct each time would be undesirable. Perhaps I could boil down the question to something like, is there a way of writing auto x = dt[0][0]; auto y = x.get!(x.type - or whatever);
