Re: [R] speed of makeCluster (package parallel)

2013-10-29 Thread Arnaud Mosnier
Thanks Brian, I thought that forking clusters were better ... but as you
mentioned, they are not available on Windows.
Unfortunately, you do not always get to choose the OS used by your company!

Arnaud




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] speed of makeCluster (package parallel)

2013-10-28 Thread Arnaud Mosnier
Hi all,

I am quite new to the world of parallelization and I wonder if there is a
way to speed up the creation of a parallel socket cluster. The time spent
adding threads increases exponentially with the number of threads
considered. I use a computer with two 8-core CPUs, thus showing a total of
32 threads in Windows 7.
Currently, I use the default parameters (type = "PSOCK"), but are there any
fine-tuning parameters that I can use to take advantage of this system?

Thanks in advance for your help!

Arnaud

R version 3.0.1 (2013-05-16)
Platform: x86_64-w64-mingw32/x64 (64-bit)
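
One way to check how startup time actually scales is to time makeCluster()
for a few worker counts. The following is only a minimal sketch: the helper
name time_cluster_startup and the worker counts are illustrative assumptions,
not something taken from the thread.

```r
library(parallel)

# Hypothetical helper: time the creation of a PSOCK cluster with n workers,
# then shut the cluster down again.
time_cluster_startup <- function(n) {
  elapsed <- system.time({
    cl <- makeCluster(n, type = "PSOCK")
  })[["elapsed"]]
  stopCluster(cl)
  elapsed
}

# Small counts keep the sketch quick; extend, e.g. c(1, 2, 4, 8, 16),
# on a larger machine.
worker_counts <- c(1, 2)
startup <- data.frame(
  workers = worker_counts,
  seconds = vapply(worker_counts, time_cluster_startup, numeric(1))
)
print(startup)
```

Plotting seconds against workers should show whether the growth on a given
machine looks linear or worse.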



Re: [R] speed of makeCluster (package parallel)

2013-10-28 Thread Simon Zehnder
See library(help = "parallel")





Re: [R] speed of makeCluster (package parallel)

2013-10-28 Thread Arnaud Mosnier
Thanks Simon,

I have already read the parallel vignette but did not find what I wanted.
Maybe you can be more specific about which part of the document provides
hints!

Arnaud





Re: [R] speed of makeCluster (package parallel)

2013-10-28 Thread Simon Zehnder
First,

use only the number of physical cores as the number of threads, i.e. I would
not use hyper-threading etc. Each core has its own caches, and it always helps
if a process has enough cache memory; hyper-threads all share the cache of the
core they are running on. For example, detectCores() gives me 4, but I know I
have 2 physical cores. I would therefore call makeCluster() with 2 nodes.
mcaffinity() lets you perform a technique called process pinning (see process
affinity), which is only possible if the OS supports it. It sometimes makes
sense to assign certain processes to certain CPUs so that each process has
enough cache memory (e.g. on a 16-core machine, running 8 processes on CPUs 1,
3, 5, 7, 9, 11, 13 and 15, so that each process has the cache of two CPUs).
Many of these functions, though, are not available on Windows.
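
The core-count advice above could be sketched as follows. This is an
illustration under assumptions: the helper pick_worker_count and its
max_workers cap are hypothetical names, and detectCores(logical = FALSE) is
documented as honoured only on some platforms, so on others it may return the
logical count or NA.

```r
library(parallel)

# Hypothetical helper: prefer the physical core count, fall back to 1 if it
# cannot be determined, and optionally cap the result.
pick_worker_count <- function(max_workers = Inf) {
  physical <- detectCores(logical = FALSE)
  if (is.na(physical)) physical <- 1L
  as.integer(max(1, min(physical, max_workers)))
}

cat("logical CPUs:  ", detectCores(logical = TRUE), "\n")
cat("physical cores:", detectCores(logical = FALSE), "\n")

# The cap of 2 only keeps this demonstration small.
n_workers <- pick_worker_count(max_workers = 2)
cl <- makeCluster(n_workers)
res <- parSapply(cl, 1:4, function(i) i^2)
stopCluster(cl)
print(res)
```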

The problem you want to solve always comes first; then you look at how much
memory a process will use and how much you have (more often the memory
bandwidth is the bottleneck, not the computing power). Look at the
architecture of your chips (how much L1 cache, how much L2 cache). Then you
decide how many cores to use and whether it makes sense to pin processes to
certain cores.
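
As a rough sketch of pinning cluster workers to particular CPUs with
mcaffinity(): the helper names pin_self and pin_workers are hypothetical, the
code assumes a Unix-alike where mcaffinity() is supported (it returns NULL
where it is not), and which CPU numbers share a physical core depends on the
hardware, so the choice of CPUs here is purely illustrative.

```r
library(parallel)

# Hypothetical helper run on a worker: pin that worker to one CPU and report
# the resulting affinity (NULL where pinning is unavailable).
pin_self <- function(cpu) {
  if (.Platform$OS.type != "unix") return(NULL)
  parallel::mcaffinity(cpu)
  parallel::mcaffinity()
}

# Hypothetical helper: pin worker i of cluster `cl` to CPU cpus[i].
pin_workers <- function(cl, cpus) {
  stopifnot(length(cpus) == length(cl))
  clusterApply(cl, cpus, pin_self)
}

cl <- makeCluster(2)
# CPUs the master process may use; NULL if affinity is not supported here.
avail <- if (.Platform$OS.type == "unix") mcaffinity() else NULL
pinned <- if (!is.null(avail)) {
  # Reuse only CPUs known to be allowed; on a 16-CPU machine one might
  # instead pass seq(1, 15, by = 2) to a cluster of 8 workers.
  pin_workers(cl, rep_len(avail, length(cl)))
} else {
  NULL
}
stopCluster(cl)
print(pinned)
```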

There are no general recipes for parallel computing: each problem is
different. Some problems are not even scalable.

Simon





Re: [R] speed of makeCluster (package parallel)

2013-10-28 Thread Prof Brian Ripley

On 28/10/2013 16:19, Arnaud Mosnier wrote:

 Hi all,

 I am quite new to the world of parallelization and I wonder if there is a
 way to speed up the creation of a parallel socket cluster. The time spent
 adding threads increases exponentially with the number of

It increases linearly in my tests (on a decent OS).  But really, if
parallel computing is worthwhile you will be doing minutes of work on
each worker process and the startup time will not be significant.

 threads considered. I use a computer with two 8-core CPUs, thus
 showing a total of 32 threads in Windows 7.

The first way to speed things up: use a decent OS.  Forking clusters are
much faster.

 Currently, I use the default parameters (type = "PSOCK"), but are there any
 fine-tuning parameters that I can use to take advantage of this system?

 Thanks in advance for your help!

 Arnaud

 R version 3.0.1 (2013-05-16)
 Platform: x86_64-w64-mingw32/x64 (64-bit)



--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK  Fax:  +44 1865 272595
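
The difference between socket and forking clusters can be measured directly
where both are available. The sketch below is an assumption-laden
illustration rather than a benchmark: it uses 2 workers, a single run per
type, and only tries type = "FORK" on Unix-alikes, since forking is not
available on Windows.

```r
library(parallel)

# Cluster types to compare; FORK is only offered on Unix-alikes.
types <- if (.Platform$OS.type == "unix") c("PSOCK", "FORK") else "PSOCK"

# Time the creation of a 2-worker cluster of each type, in seconds.
startup <- vapply(types, function(type) {
  elapsed <- system.time(cl <- makeCluster(2, type = type))[["elapsed"]]
  stopCluster(cl)
  elapsed
}, numeric(1))

print(startup)
```

Whatever the numbers turn out to be, creating the cluster once and reusing it
for all subsequent calls keeps the startup cost a one-off.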



Re: [R] speed of makeCluster (package parallel)

2013-10-28 Thread Arnaud Mosnier
Thanks a lot Simon, that's useful.
I will take a look at this process-pinning technique.

Arnaud


