Re: problem with parallel foreach

2015-05-15 Thread Gerald Jansen via Digitalmars-d-learn
On Thursday, 14 May 2015 at 17:12:07 UTC, John Colvin wrote:
> Would it be OK if I showed some parts of this code as examples in my DConf talk in 2 weeks?

Sure!!!

Re: problem with parallel foreach

2015-05-14 Thread John Colvin via Digitalmars-d-learn
On Thursday, 14 May 2015 at 10:46:53 UTC, Gerald Jansen wrote: John Colvin's improvements to my D program seem to have resolved the problem. (http://forum.dlang.org/post/ydgmzhlspvvvrbeem...@forum.dlang.org and http://dpaste.dzfl.pl/114d5a6086b7). I have rerun my tests and now the picture is a

Re: problem with parallel foreach

2015-05-14 Thread Gerald Jansen via Digitalmars-d-learn
John Colvin's improvements to my D program seem to have resolved the problem. (http://forum.dlang.org/post/ydgmzhlspvvvrbeem...@forum.dlang.org and http://dpaste.dzfl.pl/114d5a6086b7). I have rerun my tests and now the picture is a bit different (see tables below). In the middle table I have

Re: problem with parallel foreach

2015-05-13 Thread John Colvin via Digitalmars-d-learn
On Wednesday, 13 May 2015 at 14:43:50 UTC, John Colvin wrote: On Wednesday, 13 May 2015 at 14:28:52 UTC, Gerald Jansen wrote: On Wednesday, 13 May 2015 at 13:40:33 UTC, John Colvin wrote: On Wednesday, 13 May 2015 at 11:33:55 UTC, John Colvin wrote: On Tuesday, 12 May 2015 at 18:14:56 UTC, Ger

Re: problem with parallel foreach

2015-05-13 Thread Gerald Jansen via Digitalmars-d-learn
On Wednesday, 13 May 2015 at 12:16:19 UTC, weaselcat wrote: On Wednesday, 13 May 2015 at 09:01:05 UTC, Gerald Jansen wrote: On Wednesday, 13 May 2015 at 03:19:17 UTC, thedeemon wrote: In case of Python's parallel.Pool() separate processes do the work without any synchronization issues. In case

Re: problem with parallel foreach

2015-05-13 Thread John Colvin via Digitalmars-d-learn
On Wednesday, 13 May 2015 at 14:28:52 UTC, Gerald Jansen wrote: On Wednesday, 13 May 2015 at 13:40:33 UTC, John Colvin wrote: On Wednesday, 13 May 2015 at 11:33:55 UTC, John Colvin wrote: On Tuesday, 12 May 2015 at 18:14:56 UTC, Gerald Jansen wrote: On Tuesday, 12 May 2015 at 16:35:23 UTC, Rikk

Re: problem with parallel foreach

2015-05-13 Thread Gerald Jansen via Digitalmars-d-learn
On Wednesday, 13 May 2015 at 13:40:33 UTC, John Colvin wrote: On Wednesday, 13 May 2015 at 11:33:55 UTC, John Colvin wrote: On Tuesday, 12 May 2015 at 18:14:56 UTC, Gerald Jansen wrote: On Tuesday, 12 May 2015 at 16:35:23 UTC, Rikki Cattermole wrote: On 13/05/2015 4:20 a.m., Gerald Jansen wrot

Re: problem with parallel foreach

2015-05-13 Thread Gerald Jansen via Digitalmars-d-learn
On Wednesday, 13 May 2015 at 14:11:25 UTC, Gerald Jansen wrote: On Wednesday, 13 May 2015 at 11:33:55 UTC, John Colvin wrote: On Tuesday, 12 May 2015 at 18:14:56 UTC, Gerald Jansen wrote: On Tuesday, 12 May 2015 at 16:35:23 UTC, Rikki Cattermole wrote: On 13/05/2015 4:20 a.m., Gerald Jansen wr

Re: problem with parallel foreach

2015-05-13 Thread Gerald Jansen via Digitalmars-d-learn
On Wednesday, 13 May 2015 at 11:33:55 UTC, John Colvin wrote: On Tuesday, 12 May 2015 at 18:14:56 UTC, Gerald Jansen wrote: On Tuesday, 12 May 2015 at 16:35:23 UTC, Rikki Cattermole wrote: On 13/05/2015 4:20 a.m., Gerald Jansen wrote: At the risk of great embarrassment ... here's my program: ht

Re: problem with parallel foreach

2015-05-13 Thread John Colvin via Digitalmars-d-learn
On Wednesday, 13 May 2015 at 11:33:55 UTC, John Colvin wrote: On Tuesday, 12 May 2015 at 18:14:56 UTC, Gerald Jansen wrote: On Tuesday, 12 May 2015 at 16:35:23 UTC, Rikki Cattermole wrote: On 13/05/2015 4:20 a.m., Gerald Jansen wrote: At the risk of great embarrassment ... here's my program: ht

Re: problem with parallel foreach

2015-05-13 Thread weaselcat via Digitalmars-d-learn
On Wednesday, 13 May 2015 at 09:01:05 UTC, Gerald Jansen wrote: On Wednesday, 13 May 2015 at 03:19:17 UTC, thedeemon wrote: In case of Python's parallel.Pool() separate processes do the work without any synchronization issues. In case of D's std.parallelism it's just threads inside one process

Re: problem with parallel foreach

2015-05-13 Thread John Colvin via Digitalmars-d-learn
On Tuesday, 12 May 2015 at 18:14:56 UTC, Gerald Jansen wrote: On Tuesday, 12 May 2015 at 16:35:23 UTC, Rikki Cattermole wrote: On 13/05/2015 4:20 a.m., Gerald Jansen wrote: At the risk of great embarrassment ... here's my program: http://dekoppel.eu/tmp/pedupg.d Would it be possible to give us

Re: problem with parallel foreach

2015-05-13 Thread Rikki Cattermole via Digitalmars-d-learn
On 13/05/2015 2:59 a.m., Gerald Jansen wrote: I am a data analyst trying to learn enough D to decide whether to use D for a new project rather than Python + Fortran. I have recoded a non-trivial Python program to do some simple parallel data processing (using the map function in Python's multipr

Re: problem with parallel foreach

2015-05-13 Thread Gerald Jansen via Digitalmars-d-learn
On Wednesday, 13 May 2015 at 03:19:17 UTC, thedeemon wrote:
> In case of Python's parallel.Pool() separate processes do the work without any synchronization issues. In case of D's std.parallelism it's just threads inside one process and they do fight for some locks, thus this result.

Okay, so t

Re: problem with parallel foreach

2015-05-13 Thread thedeemon via Digitalmars-d-learn
On Wednesday, 13 May 2015 at 06:59:02 UTC, Ali Çehreli wrote:
> > In case of Python's parallel.Pool() separate processes do the
> > work without any synchronization issues. In case of D's
> > std.parallelism it's just threads inside one process and they
> > do fight for some locks, thus this result.
>
> Rig

Re: problem with parallel foreach

2015-05-13 Thread Ali Çehreli via Digitalmars-d-learn
On 05/12/2015 08:19 PM, thedeemon wrote:
> In case of Python's parallel.Pool() separate processes do the
> work without any synchronization issues. In case of D's
> std.parallelism it's just threads inside one process and they
> do fight for some locks, thus this result.

Right. To do the same in
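
The contrast drawn in this message, worker processes versus threads inside one process, can be sketched in D with std.process. This is only one possible way to get process-level isolation and is not necessarily what the truncated reply goes on to suggest. The `--job` flag and the names `workerArgs` and `runInProcesses` are hypothetical, and the worker body is a placeholder for real per-job work.

```d
import std.conv : to;
import std.file : thisExePath;
import std.process : Pid, spawnProcess, wait;
import std.stdio : writeln;

// Command line for one worker; "--job" is a hypothetical flag for this sketch.
string[] workerArgs(string exe, size_t job)
{
    return [exe, "--job", job.to!string];
}

// Start one child process per job, then wait for all of them.
// Returns the number of children that exited with a non-zero status.
int runInProcesses(size_t nJobs)
{
    Pid[] pids;
    foreach (job; 0 .. nJobs)
        pids ~= spawnProcess(workerArgs(thisExePath(), job));

    int failures = 0;
    foreach (pid; pids)
        if (wait(pid) != 0)
            ++failures;
    return failures;
}

int main(string[] args)
{
    if (args.length == 3 && args[1] == "--job")
    {
        // Worker mode: placeholder for the real per-job work.
        writeln("job ", args[2], " finished");
        return 0;
    }
    // Parent mode: re-run this executable once per job.
    return runInProcesses(4);
}
```

Each child has its own runtime and GC heap, so the children do not contend for a shared GC lock. Unlike a fixed-size pool, this sketch starts all jobs at once; limiting the number of concurrent children is left out for brevity.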

Re: problem with parallel foreach

2015-05-12 Thread thedeemon via Digitalmars-d-learn
On Tuesday, 12 May 2015 at 20:50:45 UTC, Gerald Jansen wrote: Your advice is appreciated but quite disheartening. I was hoping for something (nearly) as easy to use as Python's parallel.Pool() map(), given that this is essentially an "embarrassingly parallel" problem. Avoidance of GC allocation

Re: problem with parallel foreach

2015-05-12 Thread Gerald Jansen via Digitalmars-d-learn
On Tuesday, 12 May 2015 at 20:58:16 UTC, Vladimir Panteleev wrote: On Tuesday, 12 May 2015 at 18:14:56 UTC, Gerald Jansen wrote: On Tuesday, 12 May 2015 at 16:35:23 UTC, Rikki Cattermole wrote: On 13/05/2015 4:20 a.m., Gerald Jansen wrote: At the risk of great embarrassment ... here's my progra

Re: problem with parallel foreach

2015-05-12 Thread Vladimir Panteleev via Digitalmars-d-learn
On Tuesday, 12 May 2015 at 18:14:56 UTC, Gerald Jansen wrote: On Tuesday, 12 May 2015 at 16:35:23 UTC, Rikki Cattermole wrote: On 13/05/2015 4:20 a.m., Gerald Jansen wrote: At the risk of great embarrassment ... here's my program: http://dekoppel.eu/tmp/pedupg.d Would it be possible to give us

Re: problem with parallel foreach

2015-05-12 Thread Gerald Jansen via Digitalmars-d-learn
On Tuesday, 12 May 2015 at 17:45:54 UTC, thedeemon wrote: On Tuesday, 12 May 2015 at 17:02:19 UTC, Gerald Jansen wrote: About 3.5 million lines read by main(), 0.5 to 2 million lines read and 3.5 million lines written by runTraits (aka runJob). Each GC allocation in D is a locking operation (

Re: problem with parallel foreach

2015-05-12 Thread Gerald Jansen via Digitalmars-d-learn
On Tuesday, 12 May 2015 at 19:14:23 UTC, Laeeth Isharc wrote:
> But if you disable the logging does that change things?

There is only a tiny bit of logging happening.

> And are you using optimization on gdc?

gdc -Ofast -march=native -frelease

> Also try byLineFast eg http://forum.dlang.org/th

Re: problem with parallel foreach

2015-05-12 Thread Laeeth Isharc via Digitalmars-d-learn
On Tuesday, 12 May 2015 at 19:10:13 UTC, Laeeth Isharc wrote: On Tuesday, 12 May 2015 at 18:14:56 UTC, Gerald Jansen wrote: On Tuesday, 12 May 2015 at 16:35:23 UTC, Rikki Cattermole wrote: On 13/05/2015 4:20 a.m., Gerald Jansen wrote: At the risk of great embarrassment ... here's my program: ht

Re: problem with parallel foreach

2015-05-12 Thread Laeeth Isharc via Digitalmars-d-learn
On Tuesday, 12 May 2015 at 18:14:56 UTC, Gerald Jansen wrote: On Tuesday, 12 May 2015 at 16:35:23 UTC, Rikki Cattermole wrote: On 13/05/2015 4:20 a.m., Gerald Jansen wrote: At the risk of great embarrassment ... here's my program: http://dekoppel.eu/tmp/pedupg.d Would it be possible to give us

Re: problem with parallel foreach

2015-05-12 Thread Gerald Jansen via Digitalmars-d-learn
On Tuesday, 12 May 2015 at 16:35:23 UTC, Rikki Cattermole wrote: On 13/05/2015 4:20 a.m., Gerald Jansen wrote: At the risk of great embarrassment ... here's my program: http://dekoppel.eu/tmp/pedupg.d Would it be possible to give us some example data? I might give it a go to try rewriting it to

Re: problem with parallel foreach

2015-05-12 Thread thedeemon via Digitalmars-d-learn
On Tuesday, 12 May 2015 at 17:02:19 UTC, Gerald Jansen wrote: About 3.5 million lines read by main(), 0.5 to 2 million lines read and 3.5 million lines written by runTraits (aka runJob). Each GC allocation in D is a locking operation (and disabling GC doesn't help here at all), probably each
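
The point made in this message, that each GC allocation takes a lock shared by all threads, suggests keeping allocations out of the per-line loop. A minimal sketch, assuming a hypothetical `runJob` that formats numbers rather than reading files (the names `runJob` and `runAll` and the workload are illustrative and not taken from pedupg.d): each job allocates one buffer up front and reuses it on every iteration.

```d
import std.array : appender;
import std.format : formattedWrite;
import std.parallelism : parallel;
import std.stdio : writeln;

// Hypothetical job: format the numbers 0 .. n into one reusable buffer
// and return the total number of characters produced.
size_t runJob(size_t n)
{
    auto buf = appender!(char[])();
    buf.reserve(32);                 // one up-front allocation per job
    size_t total = 0;
    foreach (i; 0 .. n)
    {
        buf.clear();                 // drop the contents, keep the capacity
        formattedWrite(buf, "%d", i);
        total += buf.data.length;
    }
    return total;
}

// Run the jobs with a parallel foreach; each iteration writes only its own slot.
size_t[] runAll(size_t[] sizes)
{
    auto results = new size_t[sizes.length];
    foreach (i, n; parallel(sizes))
        results[i] = runJob(n);
    return results;
}

void main()
{
    writeln(runAll([10, 100, 1000]));
}
```

The buffer reuse is intended to reduce GC traffic inside the loop. How much it helps in a real program depends on where the allocations actually occur, so this should be checked with a profiler rather than assumed.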

Re: problem with parallel foreach

2015-05-12 Thread Gerald Jansen via Digitalmars-d-learn
On Tuesday, 12 May 2015 at 16:46:42 UTC, thedeemon wrote:
> On Tuesday, 12 May 2015 at 14:59:38 UTC, Gerald Jansen wrote:
> > The output of /usr/bin/time is as follows:
> >
> > Lang  Jobs     User  System  Elapsed  %CPU
> > Py       2    79.24    2.16  0:48.90   166
> > D        2    19.41   10.14  0:17.96   164
> > Py      30

Re: problem with parallel foreach

2015-05-12 Thread thedeemon via Digitalmars-d-learn
On Tuesday, 12 May 2015 at 14:59:38 UTC, Gerald Jansen wrote:
> The output of /usr/bin/time is as follows:
>
> Lang  Jobs     User  System  Elapsed  %CPU
> Py       2    79.24    2.16  0:48.90   166
> D        2    19.41   10.14  0:17.96   164
> Py      30  1255.17   58.38  2:39.54   823  * Pool(12)
> D       30   421.61

Re: problem with parallel foreach

2015-05-12 Thread Rikki Cattermole via Digitalmars-d-learn
On 13/05/2015 4:20 a.m., Gerald Jansen wrote:
> At the risk of great embarrassment ... here's my program: http://dekoppel.eu/tmp/pedupg.d
>
> As per Rick's first suggestion (thanks) I added
>
>     import core.memory : GC;
>     main()
>         GC.disable;
>         GC.reserve(1024 * 1024 * 1024);
>
> ... to no avail.
>
> thanks

Re: problem with parallel foreach

2015-05-12 Thread Gerald Jansen via Digitalmars-d-learn
At the risk of great embarrassment ... here's my program: http://dekoppel.eu/tmp/pedupg.d

As per Rick's first suggestion (thanks) I added

    import core.memory : GC;
    main()
        GC.disable;
        GC.reserve(1024 * 1024 * 1024);

... to no avail.

Thanks for all the help so far.
Gerald

ps. I am using G

Re: problem with parallel foreach

2015-05-12 Thread Ali Çehreli via Digitalmars-d-learn
On 05/12/2015 08:35 AM, Gerald Jansen wrote:
> I could put it somewhere if that would help.

Please do so. We all want to learn to avoid such issues.

Thank you,
Ali

Re: problem with parallel foreach

2015-05-12 Thread Rikki Cattermole via Digitalmars-d-learn
On 13/05/2015 2:59 a.m., Gerald Jansen wrote: I am a data analyst trying to learn enough D to decide whether to use D for a new project rather than Python + Fortran. I have recoded a non-trivial Python program to do some simple parallel data processing (using the map function in Python's multipr

Re: problem with parallel foreach

2015-05-12 Thread Gerald Jansen via Digitalmars-d-learn
Thanks Ali. I have tried putting GC.disable() in both main and runJob, but the timing behaviour did not change. The Python version works in a similar fashion and also has automatic GC. I tend to think that is not the (biggest) problem. The program is long and newbie-ugly ... but I could put it

Re: problem with parallel foreach

2015-05-12 Thread Ali Çehreli via Digitalmars-d-learn
On 05/12/2015 07:59 AM, Gerald Jansen wrote:
> the performance of my D version deteriorates
> rapidly beyond a handful of jobs whereas the time for the Python version
> increases linearly with the number of jobs per cpu core.

It may be related to GC collections. If it hasn't been changed recentl

Re: problem with parallel foreach

2015-05-12 Thread John Colvin via Digitalmars-d-learn
On Tuesday, 12 May 2015 at 15:11:01 UTC, John Colvin wrote: On Tuesday, 12 May 2015 at 14:59:38 UTC, Gerald Jansen wrote: I am a data analyst trying to learn enough D to decide whether to use D for a new project rather than Python + Fortran. I have recoded a non-trivial Python program to do so

Re: problem with parallel foreach

2015-05-12 Thread John Colvin via Digitalmars-d-learn
On Tuesday, 12 May 2015 at 14:59:38 UTC, Gerald Jansen wrote: I am a data analyst trying to learn enough D to decide whether to use D for a new project rather than Python + Fortran. I have recoded a non-trivial Python program to do some simple parallel data processing (using the map function i