Re: [R] (performance) time in Windows vs Linux

2009-06-30 Thread Zeljko Vrba
On Tue, Jun 30, 2009 at 10:20:17AM +0900, Raymond Wan wrote:
 
 I. Soumpasis wrote:
 2009/6/29 Cézar Freitas cafanselm...@yahoo.com.br
 This is true. So I tried the same computer with windows XP and ubuntu 8.10
 64bit dual core @3GHz and 4MB RAM
 Windows 32bit results:
   user  system elapsed
  21.66    0.02   21.69
 Linux 64bit Results
   user  system elapsed
 27.242   0.004  27.275
 
 This difference is small and it is truly explained by what the advanced
 users have said.
 
 
 One minor comment which I forgot to mention is that a difference of 6 
 seconds for system that ran 30 seconds is worth noting, but may not be 
 statistically significant.  Especially when we're now talking about two 
 completely different OS' and, thus, two different ways of timing a 
 program.  If whatever data file you are using is also on a different 
 file system, then one could be fragmented, etc.
 

Using the wall-clock time metric is not "two different ways of timing".  He could
just as well have measured the time with a stopwatch, and the results would be
equally valid (given that his reaction time is ~0.5 seconds :-)).  Also, user
time is just the time spent executing instructions in user mode, which does
*not* account for waiting on disk I/O.  System time is just the time spent in the
OS kernel, and that number would also be higher if there were intensive I/O
going on.  If I/O pauses were to blame (fragmentation, as you suggest, does
increase the latency of serving disk requests), there would be a lot of idle time,
which would show up as a large difference between user+system and elapsed
(the latter being wall-clock time).
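
As a rough illustration of that distinction, here is a minimal sketch using
system.time().  The loop and the one-second sleep are made-up stand-ins (not
the benchmark from this thread); the sleep only imitates a process that is
waiting rather than computing.

```r
# CPU-bound loop: user + system should be close to elapsed.
cpu <- system.time({
  x <- 0
  for (i in 1:2e6) x <- x + sqrt(i)
})

# Sleeping stands in for waiting on slow I/O: wall-clock time passes,
# but almost no CPU time is charged to the process.
idle <- system.time(Sys.sleep(1))

cpu_busy  <- cpu[["user.self"]] + cpu[["sys.self"]]
idle_busy <- idle[["user.self"]] + idle[["sys.self"]]

cat(sprintf("cpu-bound: busy = %.2f s, elapsed = %.2f s\n", cpu_busy, cpu[["elapsed"]]))
cat(sprintf("idle:      busy = %.2f s, elapsed = %.2f s\n", idle_busy, idle[["elapsed"]]))
```

In the timings quoted above, user+system and elapsed agree to within a few
hundredths of a second on both systems, which is the CPU-bound pattern.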

The numbers displayed above suggest that the Windows version performs a *lot* less
work[*] than the Linux version (6 seconds of CPU time is a *lot* of work given
today's CPU frequencies).

[*] All programs actually run equally fast on a given machine, but some finish
the same task sooner because they are doing less work...
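
To put a rough number on that, the gap in user time can be converted into clock
cycles.  This is only back-of-the-envelope arithmetic: the timings are the ones
quoted above, and the 3 GHz figure is taken from the machine description,
assuming the CPU ran at that nominal frequency for the whole run.

```r
# Timings quoted in this thread (seconds of user CPU time).
user_windows <- 21.66
user_linux   <- 27.242

# Assumption: nominal 3 GHz clock, one core busy for the whole run.
clock_hz <- 3e9

extra_seconds <- user_linux - user_windows   # extra CPU time on Linux
extra_cycles  <- extra_seconds * clock_hz    # the same gap expressed in cycles
ratio         <- user_linux / user_windows   # relative slowdown

cat(sprintf("extra CPU time: %.3f s\n", extra_seconds))
cat(sprintf("extra cycles:   %.2e\n", extra_cycles))
cat(sprintf("Linux/Windows:  %.2f\n", ratio))
```

That is about 5.6 seconds, or on the order of 10^10 cycles, which is roughly a
quarter more CPU time on the Linux side.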

The answer can actually be obtained by compiling R with profiling enabled; this
will slow down execution considerably, but one will be able to pinpoint major
differences in performance.  Wild speculations about sources of this difference:
1) whatever compiler produced the Windows binary has better optimizations,
2) R heavily uses some standard library routines (within system-provided DLLs)
   that are better optimized under Windows than under Linux.
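
Rebuilding R with compiler-level profiling is the heavyweight route.  A lighter
first step, and a different technique from the one suggested above, is R's own
sampling profiler (Rprof), which attributes time to R functions rather than to
compiled code.  The sketch below assumes an R build with R-level profiling
enabled (the usual default); work() is a made-up workload, not the benchmark
from this thread.

```r
# Hypothetical workload, only here to give the profiler something to sample.
work <- function(n) {
  x <- numeric(n)
  for (i in seq_len(n)) x[i] <- sum(sqrt(seq_len(200)))
  x
}

prof_file <- tempfile(fileext = ".out")

Rprof(prof_file)                       # start sampling the call stack
t0 <- proc.time()[["elapsed"]]
while (proc.time()[["elapsed"]] - t0 < 1) invisible(work(2000))
Rprof(NULL)                            # stop sampling

prof <- summaryRprof(prof_file)
print(head(prof$by.self))              # functions ranked by time spent in them
```

Running the same script on both systems and comparing the by.self tables would
show whether the extra time is concentrated in a few functions or spread evenly.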

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] (performance) time in Windows vs Linux

2009-06-30 Thread Raymond Wan


Hi Zeljko,


Zeljko Vrba wrote:

Windows 32bit results:
  user  system elapsed
 21.66    0.02   21.69
Linux 64bit Results
  user  system elapsed
27.242   0.004  27.275

Using wall-clock time metric is not two different ways of timing.  He could
have just as well measured the time using stop-watch, and the results would be
equally valid (given that his reaction time is ~0.5 seconds :-)) Also, user



True, but I'd be surprised if anyone using a stop watch can give 3 digit 
accuracy.  :-)




time is time just spent in executing instructions in user-mode, which does
*not* account for waiting on disk I/O.  system time is time spent just in the
OS kernel, and that number would be also higher if there were intensive I/O
going on.  If I/O pauses were to blame (fragmentation, as you suggest, does
increase latency of serving disk requests), there would be a lot of idle time,
which would be manifested in a large difference between user+system and elapsed
(the latter being wall-clock time).



Yes, you're correct.  Fragmentation was just an example (albeit a poor 
one).  My point is that too many factors differ between the 
two systems to draw a conclusion such as "the 32-bit Windows version is 
faster than the 64-bit Linux version".  I wouldn't be surprised if 
another machine with different specs (say, cache size, etc.) gave 
the opposite results.


I suppose to be fair, one would have to use the same compiler, etc. and 
make sure the only difference is the OS.  One compiler might have 
options turned on that would make one operation faster than the 
other...and that operation might be heavily used in the calculations 
performed.


I guess my point is that there is a big jump from "21.69 seconds vs 
27.275 seconds" to "system A is faster than system B"...


Of the two systems, one thing that usually gets me is that a typical 
Windows machine loads a lot of things into memory.  And because of these 
differences, timing results that differ by 6 seconds should be taken 
with caution -- at least until various data sizes are considered or 
multiple runs with the same data executed and averaged.




The numbers displayed above suggest that Windows version performs a *lot* less
work[*] than Linux version (6 seconds of CPU time is a  *lot* of work given
today's CPU frequencies).



:-)  Well, I think it's obvious that I don't agree with this last 
statement.  I'd be more inclined to believe this if it was repeated 10 
times each and averaged and we still see a 6 second difference, 
though...  (I don't suggest the OP do this, of course; and in fact, I 
wouldn't be surprised if it were true that this difference continues to 
exist.)
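
For anyone who does want to repeat the measurement, a minimal sketch of "run it
10 times and look at the spread" could be the following.  workload() is a
hypothetical placeholder for the code being timed, and 10 runs is simply the
number mentioned above.

```r
# Hypothetical stand-in for the benchmark discussed in this thread.
workload <- function() {
  x <- 0
  for (i in 1:2e5) x <- x + sqrt(i)
  x
}

n_runs <- 10

# One elapsed time per run; gc() first so that a pending garbage
# collection is less likely to land inside a timed run.
elapsed <- vapply(seq_len(n_runs), function(i) {
  gc()
  system.time(workload())[["elapsed"]]
}, numeric(1))

cat(sprintf("mean = %.3f s, sd = %.3f s over %d runs\n",
            mean(elapsed), sd(elapsed), n_runs))
```

If the standard deviation on each system is small compared with the gap between
the two means, the difference is unlikely to be timing noise.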



Ray



Re: [R] (performance) time in Windows vs Linux

2009-06-30 Thread Cézar Freitas
The test is only an example. The data is an example too. The difference itself
is not the problem, I think, because we can make the data larger and the
difference will grow.
On my system, the original test shows Windows having the best time, and
shows a difference larger than 10% between the generic Linux binaries and
Linux compiled from source. I'll try to follow up on the advanced users'
suggestions (to investigate further) and report back here on the list.

Thanks,
C.

On Mon, Jun 29, 2009 at 10:20 PM, Raymond Wan r@aist.go.jp wrote:


 Hi,


 I. Soumpasis wrote:

 2009/6/29 Cézar Freitas cafanselm...@yahoo.com.br
 This is true. So I tried the same computer with windows XP and ubuntu 8.10
 64bit dual core @3GHz and 4MB RAM
 Windows 32bit results:
   user  system elapsed
  21.66    0.02   21.69
 Linux 64bit Results
   user  system elapsed
  27.242   0.004  27.275

 This difference is small and it is truly explained by what the advanced
 users have said.



 One minor comment which I forgot to mention is that a difference of 6
 seconds for system that ran 30 seconds is worth noting, but may not be
 statistically significant.  Especially when we're now talking about two
 completely different OS' and, thus, two different ways of timing a program.
  If whatever data file you are using is also on a different file system,
 then one could be fragmented, etc.

 My point is that your test might be true (and others have given you reasons
 for it), but also don't worry too much about it...  :-)

 Ray






Re: [R] (performance) time in Windows vs Linux

2009-06-29 Thread milton ruser
Really?

In fact I have a quad-core. But how can I know if Linux is really using only
one core, and how can I set it up to use the 4 cores?

Thanks a lot,

milton

On Mon, Jun 29, 2009 at 1:46 AM, Patrick Connolly 
p_conno...@slingshot.co.nz wrote:

 On Fri, 26-Jun-2009 at 04:37PM -0400, milton ruser wrote:

 | Hi there,
 |

 | I have both systems on a DELL 64bit machine.

 | I compiled R 2.9.0 on both systems, to get 64-bit capability.
 | Surprisingly, on Linux (Ubuntu, which I installed 3 months ago) I spent
 | 41s to run the same test you did, and less time (35s) under
 | Vista. In fact I had noticed that I have not gained time when
 | running under Linux (I had done jobs that run for several
 | days). But I do gain something with memory management, because for
 | some programs or steps, Windows says that memory is full, while
 | Linux runs up to the end of the job.

 I think there's a simple explanation for both of those observations.
 Your Windows installation is more than likely using both cores (or 4
 if you have a quad core) while your Linux is using only 1 of them.

 best

 --
 ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.
   ___Patrick Connolly
  {~._.~}   Great minds discuss ideas
  _( Y )_ Average minds discuss events
 (:_~*~_:)  Small minds discuss people
  (_)-(_)  . Eleanor Roosevelt

 ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.




Re: [R] (performance) time in Windows vs Linux

2009-06-29 Thread Patrick Connolly
On Mon, 29-Jun-2009 at 02:13AM -0400, milton ruser wrote:

| Really?
| 
| In fact I have a quadcore. But how can I know if Linux are really
| using only one core, and how can I setup it to use the 4cores?

I use GKrellM (install with aptitude install gkrellm if you don't
have it already).  It shows a trace of the % activity for each CPU
where it's very clear if only one is being used.  I've never had
occasion to change such a setting, but someone more skilled in such
things could say how to make it use more than one.

HTH



| 
| Thanks a lot,
| 
| milton
| 
| On Mon, Jun 29, 2009 at 1:46 AM, Patrick Connolly 
| p_conno...@slingshot.co.nz wrote:
| 
|  On Fri, 26-Jun-2009 at 04:37PM -0400, milton ruser wrote:
| 
|  | Hi there,
|  |
| 
|  | I have both systems on a DELL 64bit machine.
| 
|  | I compiled R 2.9.0 on both systems, to get 64bits capability.
|  | Surpriselly, on Linux (Ubuntu with I installed 3 month ago) I spent
|  | 41s to run the same test you did, and less time (35s) under
|  | Vista. In fact I had noticed that I not have gained time when
|  | running under linux (I had done jobs that run for several
|  | days). But somethings I gain with memory managment, because for
|  | some programs or steps, windows say that memory is full, while
|  | Linux run up to the end of the job.
| 
|  I think there's a simple explanation for both of those observations.
|  Your Windows installation is more than likely using both cores (or 4
|  if you have a quad core) while your Linux is using only 1 of them.
| 
|  best
| 
|  --
|  ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.
|___Patrick Connolly
|   {~._.~}   Great minds discuss ideas
|   _( Y )_ Average minds discuss events
|  (:_~*~_:)  Small minds discuss people
|   (_)-(_)  . Eleanor Roosevelt
| 
|  ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.
| 

-- 
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.   
   ___Patrick Connolly   
 {~._.~}   Great minds discuss ideas
 _( Y )_ Average minds discuss events 
(:_~*~_:)  Small minds discuss people  
 (_)-(_)  . Eleanor Roosevelt
  
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.



Re: [R] (performance) time in Windows vs Linux

2009-06-29 Thread Zeljko Vrba
On Mon, Jun 29, 2009 at 06:56:55PM +1200, Patrick Connolly wrote:
 On Mon, 29-Jun-2009 at 02:13AM -0400, milton ruser wrote:
 
 | Really?
 | 
 | In fact I have a quadcore. But how can I know if Linux are really
 | using only one core, and how can I setup it to use the 4cores?
 
 I use GKrellM (install with aptitude install gkrellm if you don't
 have it already).  It shows a trace of the % activity for each CPU
 where it's very clear if only one is being used.  I've never had
 occasion to change such a setting, but someone more skilled in such
 things could say how to make it use more than one.
 
How do you know that it is *R* that uses all 4 CPUs, and not other applications
on the system?  What does running top in the terminal say?  If R uses more
than one CPU, its CPU usage will be > 100%.
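
The same check can be approximated from inside R, without top: if the CPU time
charged to the process (user + system) exceeds the elapsed time, more than one
core must have been busy on its behalf.  A minimal sketch, with a made-up
workload() standing in for the real code:

```r
# Hypothetical workload; replace with the code under test.
workload <- function() {
  x <- 0
  for (i in 1:3e6) x <- x + sqrt(i)
  x
}

t <- system.time(workload())

# CPU seconds consumed by this R process (user + system) ...
cpu_seconds <- t[["user.self"]] + t[["sys.self"]]
# ... divided by wall-clock seconds gives the average number of busy cores.
# The max() guards against a zero elapsed time on a very fast run.
busy_cores <- cpu_seconds / max(t[["elapsed"]], 1e-6)

cat(sprintf("average busy cores: %.2f\n", busy_cores))
```

A plain interpreted loop like this one should report a value near 1 or below;
values clearly above 1 would correspond to top showing more than 100%.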



Re: [R] (performance) time in Windows vs Linux

2009-06-29 Thread Raymond Wan


Hi,


milton ruser wrote:

In fact I have a quadcore. But how can I know if Linux are really using only
one core, and how can I setup it to use the 4cores?



I don't know the answer in the context of R -- I wasn't aware that R could 
use multiple cores by default.  But in general, I use htop, whose man 
page describes it as:  "This program is a free (GPL) ncurses-based 
process viewer."


It is a colored version of top, essentially.  At the top of the 
screen, you will see your 4 cores represented as percentages.  Under 
Setup, add Processor to the list of options and then CPU will appear 
as a column; if you have 4 cores, its values will vary from 1 to 4.


If you want to check if R is running on more than one core, then 
obviously R should appear more than once and with two different values 
under CPU.


Ray



Re: [R] (performance) time in Windows vs Linux

2009-06-29 Thread Patrick Connolly
On Mon, 29-Jun-2009 at 09:05AM +0200, Zeljko Vrba wrote:

| On Mon, Jun 29, 2009 at 06:56:55PM +1200, Patrick Connolly wrote:
|  On Mon, 29-Jun-2009 at 02:13AM -0400, milton ruser wrote:
|  
|  | Really?
|  | 
|  | In fact I have a quadcore. But how can I know if Linux are really
|  | using only one core, and how can I setup it to use the 4cores?
|  
|  I use GKrellM (install with aptitude install gkrellm if you don't
|  have it already).  It shows a trace of the % activity for each CPU
|  where it's very clear if only one is being used.  I've never had
|  occasion to change such a setting, but someone more skilled in such
|  things could say how to make it use more than one.
|  
| How do you know that it is *R* that uses all 4 CPUs, and not other
| applications on the system?  What does running top in the
| terminal say?  If R uses more than one CPU, it's CPU usage will be
|  100%.

There are undoubtedly more scientific ways, but if the machine is idling
with both krells showing a very low number like 2% BEFORE running the
R script, and then we see only one suddenly become 99 or 100%, it's a
fairly safe bet that it was R that made the difference -- particularly
if it drops back once the R code finishes.

I think top adds the two usages together, so values over 100% would be
possible.  The numbers shown by GKrellM could be thought of as more
sensible where you have the choice of composite, real or both.

HTH

-- 
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.   
   ___Patrick Connolly   
 {~._.~}   Great minds discuss ideas
 _( Y )_ Average minds discuss events 
(:_~*~_:)  Small minds discuss people  
 (_)-(_)  . Eleanor Roosevelt
  
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.



Re: [R] (performance) time in Windows vs Linux

2009-06-29 Thread Prof Brian Ripley

On Mon, 29 Jun 2009, Raymond Wan wrote:


milton ruser wrote:


In fact I have a quadcore. But how can I know if Linux are really 
using only one core, and how can I setup it to use the 4cores?


I don't know the answer in the context of R -- I didn't know that R can use 
multiple cores by default?


It cannot, and much of this thread is pure speculation.  So let's try 
to set the record straight (as we have already done in the manuals).


The only way that a single R process will be using more than one CPU 
is if you have added a multithreaded BLAS (and I've never heard of one 
being used successfully with R for Windows) or other add-on such as 
Luke Tierney's pmath[0] packages.  Packages such as snow and multicore 
run multiple R processes.
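

To illustrate the "multiple R processes" model: the sketch below uses the
parallel package, which has shipped with R since 2.14.0 and covers much of what
snow and multicore offered.  It is an assumption that two local worker
processes can be started on the machine; the squaring task is a made-up
example.

```r
library(parallel)

# Start two separate R worker processes on this machine.
cl <- makeCluster(2)

# Each worker reports its own process ID; they differ from the master's.
worker_pids <- unlist(clusterCall(cl, Sys.getpid))
master_pid  <- Sys.getpid()

# Spread a simple computation across the workers.
squares <- unlist(parLapply(cl, 1:8, function(i) i^2))

stopCluster(cl)

print(worker_pids)
print(squares)
```

Each worker is an ordinary single-threaded R process, so a process viewer shows
several R entries at up to 100% each rather than one entry above 100%.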


I do run a multithreaded BLAS on my 8-core Linux box and do often see 
'top' well over 100% -- I just tested and saw 798.9%.


It is exceptional to see R under Windows running faster than a 
well-tuned R under Linux on the same hardware (and my only Windows 
machine is a multiboot that normally runs Linux, so I do have 
extensive experience).  There are a number of reasons:


- R for Windows always uses a shared library, whereas under Linux by 
default it does not, for speed -- see the R-admin manual.


- MinGW until recently had only an older compiler, 4.2.1. (gcc 4.4.0 
for mingw is just out, but I have not tried it).  gcc 4.3.x has both 
better general optimizations and better support for the Core 2 Duo my 
machine has.


- You can tune the Linux version better by compiling yourself 
(although some tuning is possible on Windows).


- Linux uses interrupts for things that Windows polls (or for some 
instances R does on those platforms).  That includes the overhead on 
Windows of running Rgui (if you are using that rather than Rterm) and 
polling the Windows message system.


- 32-bit Linux allows access to more address space than 32-bit 
Windows, so there may be less frequent garbage collections on large 
tasks.  In any case, the Linux memory manager is more efficient.


Against that, a 64-bit build will in general be slower than a 32-bit 
one -- see the R-admin manual.  If you run 32-bit R for Windows on 
64-bit Windows you are running under a WOW subsystem and that has a 
small overhead: but in our tests the REvolution 64-bit build of R was 
slightly slower.


But we are only talking about small differences, say up to 20% and 
usually more like 5-10%.


It is usually possible to find some task that a particular compiler 
optimizes badly, so there will be rare exceptions.


But in general, I use htop, whose man pages 
describes it as:  This program is a free (GPL) ncurses-based process 
viewer.


It is a colored version of top, essentially.  At the top of the screen, you 
will see your 4 cores represented as percentages.  Under Setup, add 
Processor to the list of options and then CPU will appear as a column, 
which if you have 4 cores, the values will vary from 1 to 4.


If you want to check if R is running on more than one core, then obviously R 
should appear more than once and with two different values under CPU.


Not so: that will happen if multiple copies of R are running, not if a 
single copy of R is running multiple threads.


--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] (performance) time in Windows vs Linux

2009-06-29 Thread Raymond Wan


Hi Brian,

Thank you for the clarification -- the first part does set the record 
straight about what I thought about R.  I would expect a program to run 
on a single core by default unless something specific (and somewhat 
non-trivial) was done to it.



Prof Brian Ripley wrote:
If you want to check if R is running on more than one core, then 
obviously R should appear more than once and with two different values 
under CPU.


Not so: that will happen if multiple copies of R are running, not if a 
single copy of R is running multiple threads.



Ah, yes, you are right -- there is a big difference.  The former works 
for both multiple copies of R or a single program [purposely not saying 
R here to keep it general] running on multiple cores [distributed 
memory parallelization].  It doesn't work for multiple threads, though.


(As an aside for anyone else interested, my Debian distribution has the 
package r-cran-rmpi, which is described as the GNU R package 
interfacing MPI libraries...so that seems to be another way to use 
multiple cores, though I have never used it...)


Ray



Re: [R] (performance) time in Windows vs Linux

2009-06-29 Thread Gavin Simpson
On Mon, 2009-06-29 at 19:21 +1200, Patrick Connolly wrote:
 On Mon, 29-Jun-2009 at 09:05AM +0200, Zeljko Vrba wrote:
 
 | On Mon, Jun 29, 2009 at 06:56:55PM +1200, Patrick Connolly wrote:
 |  On Mon, 29-Jun-2009 at 02:13AM -0400, milton ruser wrote:
 |  
 |  | Really?
 |  | 
 |  | In fact I have a quadcore. But how can I know if Linux are really
 |  | using only one core, and how can I setup it to use the 4cores?
 |  
 |  I use GKrellM (install with aptitude install gkrellm if you don't
 |  have it already).  It shows a trace of the % activity for each CPU
 |  where it's very clear if only one is being used.  I've never had
 |  occasion to change such a setting, but someone more skilled in such
 |  things could say how to make it use more than one.
 |  
 | How do you know that it is *R* that uses all 4 CPUs, and not other
 | applications on the system?  What does running top in the
 | terminal say?  If R uses more than one CPU, it's CPU usage will be
 |  100%.
 
 There are undoubtedly more scientific ways, but if the machine is idling
 with both krells showing a very low number like 2% BEFORE running the
 R script, and then we see only one suddenly become 99 or 100%, it's a
 fairly safe bet that it was R that made the difference -- particularly
 if it drops back once the R code finishes.
 
 I think top adds the two usages together, so values over 100% would be
 possible.  The numbers shown by GKrellM could be thought of as more
 sensible where you have the choice of composite, real or both.

Hmm, R is a single threaded application - you might be able to call
functions that use multi-threaded compiled code that will use the extra
cores, but R itself won't, whether it is running on Linux or Windows.

On my 4 core workstation, top reports load averages up to (and a bit
exceeding sometimes) 4 when I'm utilising all 4 cores for processing
jobs - but that is only when I initiate 4 separate R processes; each of
the 4 processes only ever has 100% maximum usage.

I think you are being misled by GKrellM; on my 4 core workstation the
cpu throttling application you can stick in the panel (cpufreq or
cpuspeed) reports that pairs of cores hit full speed when required, say
when running a single R process. But that R process is only using 100%
of one core - if you have top or system monitor running, you'll see
this. I think this is because the 4 cores are on two chips and if you
need to run one core up to full speed, the other core on that chip also
gets sped up, but it isn't crunching anything.

G

 
 HTH
 
-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%



Re: [R] (performance) time in Windows vs Linux

2009-06-29 Thread Prof Brian Ripley

I meant to write "not so for 'top'" in the final para.

On Mon, 29 Jun 2009, Prof Brian Ripley wrote:


On Mon, 29 Jun 2009, Raymond Wan wrote:


milton ruser wrote:


In fact I have a quadcore. But how can I know if Linux are really using 
only one core, and how can I setup it to use the 4cores?


I don't know the answer in the context of R -- I didn't know that R can use 
multiple cores by default?


It cannot, and much of this thread is pure speculation.  So let's try to set 
the record straight (as we have already done in the manuals).


The only way that a single R process will be using more than one CPU is if 
you have added a multithreaded BLAS (and I've never heard of one being used 
successfully with R for Windows) or other add-on such as Luke Tierney's 
pmath[0] packages.  Packages such as snow and multicore run multiple R 
processes.


I do run a multithreaded BLAS on my 8-core Linux box and do often see 'top' 
well over 100% -- I just tested and saw 798.9%.


It is exceptional to see R under Windows running faster than a well-tuned R 
under Linux on the same hardware (and my only Windows machine is a multiboot 
that normally runs Linux, so I do have extensive experience).  There are a 
number of reasons


- R for Windows always uses a shared library, whereas under Linux by default 
it does not, for speed -- see the R-admin manual.


- MinGW until recently had only an older compiler, 4.2.1. (gcc 4.4.0 for 
mingw is just out, but I have not tried it).  gcc 4.3.x has both better 
general optimizations and better support for the Core 2 Duo my machine has.


- You can tune the Linux version better by compiling yourself (although some 
tuning is possible on Windows).


- Linux uses interrupts for things that Windows polls (or for some instances 
R does on those platforms).  That includes the overhead on Windows of running 
Rgui (if you are using that rather than Rterm) and polling the Windows 
message system.


- 32-bit Linux allows access to more address space than 32-bit Windows, so 
there may be less frequent garbage collections on large tasks.  In any case, 
the Linux memory manager is more efficient.


Against that, a 64-bit build will in general be slower than a 32-bit one -- 
see the R-admin manual.  If you run 32-bit R for Windows on 64-bit Windows 
you are running under a WOW subsystem and that has a small overhead: but in 
our tests the REvolution 64-bit build of R was slightly slower.


But we are only talking about small differences, say up to 20% and usually 
more like 5-10%.


It is usually possible to find some task that a particular compiler optimizes 
badly, so there will be rare exceptions.


But in general, I use htop, whose man pages describes it as:  This 
program is a free (GPL) ncurses-based process viewer.


It is a colored version of top, essentially.  At the top of the screen, 
you will see your 4 cores represented as percentages.  Under Setup, add 
Processor to the list of options and then CPU will appear as a column, 
which if you have 4 cores, the values will vary from 1 to 4.


If you want to check if R is running on more than one core, then obviously 
R should appear more than once and with two different values under CPU.


Not so: that will happen if multiple copies of R are running, not if a single 
copy of R is running multiple threads.


--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] (performance) time in Windows vs Linux

2009-06-29 Thread Raymond Wan


Prof Brian Ripley wrote:

I meant to write not so for 'top' in the final para.


Ah, I'm not certain enough to know that htop works for threads as 
well...so I was quick to jump to agreeing with you.  :-)  I only know it 
works for multi-cores...


Ray



Re: [R] (performance) time in Windows vs Linux

2009-06-29 Thread Cézar Freitas
Hi, Ilias. I think it is not OK to compare performance on different 
platforms on different machines. To compare times, you need to execute 
the code on the two platforms (Linux and Windows) on the same machine.
But the problem here is a different one. The advanced users explained it well. 
 
So I ask them:
Is it possible to build R (under Linux) and force it to compile as 
32-bit (on a 64-bit machine)?
 
Thanks to all,
Cezar Freitas

--- On Mon, 29/6/09, I. Soumpasis nono@gmail.com wrote:


From: I. Soumpasis nono@gmail.com
Subject: Re: [R] (performance) time in Windows vs Linux
To: Cézar Freitas cafanselm...@yahoo.com.br
Date: Monday, 29 June 2009, 5:53


Hi Cezar,

I tried your code on a core duo laptop (@2.5GHz) with ubuntu x86_64 with 4GB of 
RAM. Both R and libraries are compiled from source. These are the results.
   user  system elapsed 
 23.861   0.812  25.065

It seems better than the Windows result. Is there a possibility that on linux you have 
a big workspace, or something is consuming your memory and thus R is forced to 
use swap memory? Just a speculation.

Ilias


PS. BTW It uses only one core.



2009/6/26 Cézar Freitas cafanselm...@yahoo.com.br

  
 #windows time
 #   user  system elapsed
 #  27.81    0.00   27.82
 
 #linux usual compilation time
 #   user  system elapsed
 # 52.635   0.016  52.748
 
 #linux (my compilation) time
 #   user  system elapsed
 # 52.567   0.016  52.588
 #==END OF CODE





     



Re: [R] (performance) time in Windows vs Linux

2009-06-29 Thread Cézar Freitas


--- Em seg, 29/6/09, I. Soumpasis nono@gmail.com escreveu:


De: I. Soumpasis nono@gmail.com
Assunto: Re: [R] (performance) time in Windows vs Linux
Para: Cézar Freitas cafanselm...@yahoo.com.br
Data: Segunda-feira, 29 de Junho de 2009, 5:53


Hi Cezar,

I tried your code on a Core Duo laptop (2.5 GHz) with Ubuntu x86_64 and 4 GB of
RAM. Both R and the libraries are compiled from source. These are the results:
   user  system elapsed 
 23.861   0.812  25.065

That is better than the Windows result. Is there a possibility that on Linux you
have a big workspace, or that something is consuming your memory and thus forcing
R to use swap? Just speculation.
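
The workspace hypothesis above can be checked directly from within R. A
minimal sketch, assuming only base R; the helper name workspace_sizes is made
up for illustration and is not part of any package:

```r
# List the objects in an environment with their sizes in bytes, largest first.
# 'workspace_sizes' is a hypothetical helper name used only for this sketch.
workspace_sizes <- function(env = globalenv()) {
  nms <- ls(env)
  sizes <- vapply(
    nms,
    function(nm) as.numeric(object.size(get(nm, envir = env))),
    numeric(1)
  )
  sort(sizes, decreasing = TRUE)
}

# Show the five largest objects currently in the global workspace.
print(head(workspace_sizes(), 5))

# gc() also reports how much memory R itself is currently using.
print(gc())
```

If the largest objects add up to a sizeable fraction of physical RAM, swapping
becomes a plausible suspect; if the workspace is nearly empty, it probably is
not.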

Ilias


PS. BTW It uses only one core.



2009/6/26 Cézar Freitas cafanselm...@yahoo.com.br

  
 #windows time
 #   user  system elapsed
 #  27.81    0.00   27.82
 
 #linux usual compilation time
 #   user  system elapsed
 # 52.635   0.016  52.748
 
 #linux (my compilation) time
 #   user  system elapsed
 # 52.567   0.016  52.588
 #==END OF CODE





     



Re: [R] (performance) time in Windows vs Linux

2009-06-29 Thread I. Soumpasis
2009/6/29 Cézar Freitas cafanselm...@yahoo.com.br

 Hi, Ilias. I don't think it is fair to compare performance across different
 platforms on different machines. To compare times, you need to run the code
 on both platforms (Linux and Windows) on the same machine.
 But the problem here is a different one. The advanced users explained it well.


This is true. So I tried both on the same computer (dual core, 3 GHz, 4 GB RAM),
with Windows XP (32-bit) and Ubuntu 8.10 (64-bit).
Windows 32-bit results:
   user  system elapsed
  21.66    0.02   21.69
Linux 64-bit results:
   user  system elapsed
 27.242   0.004  27.275

This difference is small and it is truly explained by what the advanced
users have said.

Regards,
Ilias



Re: [R] (performance) time in Windows vs Linux

2009-06-29 Thread Raymond Wan


Hi,


I. Soumpasis wrote:

2009/6/29 Cézar Freitas cafanselm...@yahoo.com.br
This is true. So I tried the same computer with windows XP and ubuntu 8.10
64bit dual core @3Gz and 4MB RAM
Windows 32bit results:
   user  system elapsed
  21.66    0.02   21.69
Linux 64bit Results
   user  system elapsed
 27.242   0.004  27.275

This difference is small and it is truly explained by what the advanced
users have said.



One minor comment which I forgot to mention is that a difference of 6 
seconds for a run that took about 30 seconds is worth noting, but may not be 
statistically significant.  Especially when we're now talking about two 
completely different OSes and, thus, two different ways of timing a 
program.  If whatever data file you are using is also on a different 
file system, then one could be fragmented, etc.


My point is that your result may well be real (and others have given you 
reasons for it), but also don't worry too much about it...  :-)
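
Whether a difference like this is larger than the run-to-run noise can be
judged by repeating the measurement instead of timing once. A minimal sketch,
assuming base R only; time_repeated and the toy workload work are hypothetical
names, and the workload is a stand-in for the real test, not the code from
this thread:

```r
# Run a function several times and collect the elapsed (wall-clock) time
# of each run. 'time_repeated' is a hypothetical helper for this sketch.
time_repeated <- function(f, reps = 5) {
  vapply(
    seq_len(reps),
    function(i) system.time(f())[["elapsed"]],
    numeric(1)
  )
}

# Toy workload standing in for the real benchmark.
work <- function() sum(sqrt(seq_len(1e6)))

elapsed <- time_repeated(work, reps = 5)

# The spread of the repeated runs gives a feel for the measurement noise.
print(summary(elapsed))
print(sd(elapsed))
```

Running the same thing on both systems and comparing the two sets of timings
(rather than two single numbers) shows whether the gap is well outside the
spread.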


Ray



Re: [R] (performance) time in Windows vs Linux

2009-06-28 Thread Patrick Connolly
On Fri, 26-Jun-2009 at 04:37PM -0400, milton ruser wrote:

| Hi there,
| 

| I have both systems on a DELL 64bit machine.

| I compiled R 2.9.0 on both systems, to get 64bits capability.
| Surprisingly, on Linux (Ubuntu, which I installed 3 months ago) it took
| 41s to run the same test you did, and less time (35s) under
| Vista. In fact I had noticed that I gained no time when
| running under Linux (I had done jobs that run for several
| days). But I do gain something in memory management, because for
| some programs or steps, Windows says that memory is full, while
| Linux runs to the end of the job.

I think there's a simple explanation for both of those observations.
Your Windows installation is more than likely using both cores (or 4
if you have a quad core) while your Linux is using only 1 of them.
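
One way to check the multi-core hypothesis against timings such as those
posted in this thread: CPU time (user + system) is normally summed over all
threads of the process, so a run that kept two cores busy would typically
report CPU time well above the elapsed time. A minimal sketch; the helper name
cpu_utilisation is made up for illustration, and the numbers are the ones
quoted in the original post:

```r
# Ratio of CPU time to wall-clock time.
# Roughly 1 suggests one busy core; roughly 2 suggests two busy cores.
# 'cpu_utilisation' is a hypothetical helper for this sketch.
cpu_utilisation <- function(user, system, elapsed) {
  (user + system) / elapsed
}

# Timings from the original post in this thread.
windows_ratio <- cpu_utilisation(27.81, 0.00, 27.82)
linux_ratio   <- cpu_utilisation(52.635, 0.016, 52.748)

print(c(windows = windows_ratio, linux = linux_ratio))
```

For those timings both ratios come out very close to 1, which is what a
single busy core looks like on either system.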

best

-- 
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.   
   ___Patrick Connolly   
 {~._.~}   Great minds discuss ideas
 _( Y )_ Average minds discuss events 
(:_~*~_:)  Small minds discuss people  
 (_)-(_)  . Eleanor Roosevelt
  
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.



Re: [R] (performance) time in Windows vs Linux

2009-06-26 Thread Zeljko Vrba
On Fri, Jun 26, 2009 at 12:23:35PM -0700, Cézar Freitas wrote:
 
 I supposed R on Linux should be faster (32 and 64 bit) than the Windows version. 
 Is this difference because the 64-bit R version is slower than the 32-bit one? I 
 started the machine in both situations and checked free memory.
 
I suspect that the compiler is to blame.  Download Intel's C and C++ compiler
for linux (it is free for personal use), try to compile R with it, and see
what results you get (and report them here!).  Of course, if you have the
time and are willing to tinker :)
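
Before rebuilding R with another compiler, it may help to record exactly
which build each timing came from, so that results reported back to the list
are comparable. A sketch using only base R; the list name build_info is
arbitrary:

```r
# Collect basic facts about the running R build.
# .Machine$sizeof.pointer is 4 on a 32-bit build and 8 on a 64-bit build.
build_info <- list(
  version      = R.version.string,
  platform     = R.version$platform,
  arch         = R.version$arch,
  pointer_bits = 8 * .Machine$sizeof.pointer
)

str(build_info)
```

Posting this output alongside each system.time() result makes it clear
whether a 32-bit or a 64-bit binary produced it.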



Re: [R] (performance) time in Windows vs Linux

2009-06-26 Thread milton ruser
Hi there,

I have both systems on a DELL 64bit machine.
I compiled R 2.9.0 on both systems, to get 64bits capability.
Surprisingly, on Linux (Ubuntu, which I installed 3 months ago) it took 41s to
run the same test you did, and less time (35s) under Vista. In fact I had
noticed that I gained no time when running under Linux (I had done
jobs that run for several days). But I do gain something in memory
management, because for some programs or steps, Windows says that memory is
full, while Linux runs to the end of the job.

good luck

milton

On Fri, Jun 26, 2009 at 4:01 PM, Zeljko Vrba zv...@ifi.uio.no wrote:

 On Fri, Jun 26, 2009 at 12:23:35PM -0700, Cézar Freitas wrote:
 
  I supposed R on Linux should be faster (32 and 64 bit) than the Windows
 version. Is this difference because the 64-bit R version is slower than the
 32-bit one? I started the machine in both situations and checked free memory.
 
 I suspect that the compiler is to blame.  Download Intel's C and C++ compiler
 for Linux (it is free for personal use), try to compile R with it, and see
 what results you get (and report them here!).  Of course, if you have the
 time and are willing to tinker :)



Re: [R] (performance) time in Windows vs Linux

2009-06-26 Thread Uwe Ligges
Yes, under 64-bit it is sometimes slower and it highly depends on the 
problem and the compiler you have. Note also that nobody managed to get 
a 64-bit Windows R binary compiled with gcc so far.
Remember, 10 years ago there was the SUN Ultra Sparc III and above 
architecture, and gcc was known to produce extremely inefficient 64-bit 
binaries for that platform. Things got somewhat better in the meantime.


With the tests I used, 32-bit R compiled with gcc was roughly 10% slower 
on Windows than under Linux - but as I said, it depends on the problem. 
Loops versus matrix operations, for example, are two very different 
problems.
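
To illustrate how different those two kinds of problem are, here is a small
sketch that computes the same quantity once with an interpreted loop and once
with a vectorised expression. It is only an illustration of the distinction,
not the benchmark used above; the function names are made up:

```r
x <- runif(1e6)

# Interpreted loop: one R-level iteration per element.
loop_sum_sq <- function(x) {
  total <- 0
  for (v in x) total <- total + v * v
  total
}

# Vectorised version: the work is done in compiled code.
vec_sum_sq <- function(x) sum(x * x)

print(system.time(loop_result <- loop_sum_sq(x)))
print(system.time(vec_result  <- vec_sum_sq(x)))

# Both approaches agree up to floating-point rounding.
stopifnot(isTRUE(all.equal(loop_result, vec_result)))
```

The loop mostly exercises the R interpreter, while the vectorised form mostly
exercises compiled arithmetic, so a compiler or platform difference can show
up very differently in the two.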


Uwe Ligges





Cézar Freitas wrote:

Hi, all.

I began migrating my R code from Windows to Linux and was surprised by
an old question. I simplified the problem and made a little test to compare
times on the same computer, and the Linux time is worse (and not by a little)
than the Windows time: 28 seconds on Windows vs 53 seconds on Linux.

I made an example (below) so that everyone can see the difference.
I also built R from source (my first time) to compare with the distributed
(precompiled) R version. The times are similar to those of the other Linux
version.

I supposed R on Linux should be faster (32 and 64 bit) than the Windows version.
Is this difference because the 64-bit R version is slower than the 32-bit one? I
started the machine in both situations and checked free memory.

Technical details:
Machine: Intel Core 2 Duo, 4 GB DDR2 RAM
Windows version: XP Professional - 32 bits
R version: 2.9* binaries
Linux version: Ubuntu 8* (Hardy) - 64 bits
R version: 2.9* binaries and 2.9* compiled from source

Thanks to all,
Cezar Freitas

 #code
 N = 5  # value as archived, probably truncated: sample(N,n) below needs N >= n
 n = 15000
 
 set.seed(790) #to be equal to everyone (set before the data are generated)
 
 #makes data
 dad = as.data.frame(cbind(sample(N,N,replace=FALSE), rpois(N,30)))
 names(dad) = c("id","age")
 
 aux = as.data.frame(cbind(sample(N,n,replace=FALSE), round(runif(n),4)))
 names(aux) = c("id","score")
 
 #calculates time
 system.time({
   dad$score = 0
   subdad = subset(dad, id %in% aux$id)
   for(k in 1:(dim(subdad)[1])){
     temp = aux$score[aux$id==subdad$id[k]]
     if(length(temp)) subdad$score[k] = temp
   }
 })
 
 #windows time
 #   user  system elapsed
 #  27.81    0.00   27.82
 
 #linux usual compilation time
 #   user  system elapsed
 # 52.635   0.016  52.748
 
 #linux (my compilation) time
 #   user  system elapsed
 # 52.567   0.016  52.588
 #==END OF CODE
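
Independently of the platform question, the lookup in the loop above can be
written without a loop. A sketch under stated assumptions: the sizes N and n
below are small made-up values chosen so that the example runs quickly (the
archived value of N is not usable as posted), and match() is swapped in as an
alternative technique, not something proposed in the thread. It relies on the
ids in aux being unique, which holds here because they are sampled without
replacement.

```r
set.seed(790)
N <- 5000   # assumed size for illustration only
n <- 1500   # assumed size for illustration only

dad <- data.frame(id = sample(N, N), age = rpois(N, 30))
aux <- data.frame(id = sample(N, n), score = round(runif(n), 4))

dad$score <- 0
subdad <- subset(dad, id %in% aux$id)

# Loop version, as in the posted code: one scan of aux$id per row.
for (k in seq_len(nrow(subdad))) {
  temp <- aux$score[aux$id == subdad$id[k]]
  if (length(temp)) subdad$score[k] <- temp
}

# Vectorised version: match() finds the position of every id in one call.
match_score <- aux$score[match(subdad$id, aux$id)]

# Both approaches give the same scores.
stopifnot(identical(subdad$score, match_score))
```

With the loop removed, the run time is dominated by compiled code rather than
by the interpreter, so the Windows/Linux gap may look quite different for this
version.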




  
