Re: [R] (performance) time in Windows vs Linux
On Tue, Jun 30, 2009 at 10:20:17AM +0900, Raymond Wan wrote:
> I. Soumpasis wrote:
>> 2009/6/29 Cézar Freitas cafanselm...@yahoo.com.br
>>
>> This is true. So I tried the same computer with windows XP and ubuntu 8.10 64bit dual core @3Gz and 4MB RAM
>>
>> Windows 32bit results:
>>    user  system elapsed
>>   21.66    0.02   21.69
>>
>> Linux 64bit Results
>>    user  system elapsed
>>  27.242   0.004  27.275
>>
>> This difference is small and it is truly explained by what the advanced users have said.
>
> One minor comment which I forgot to mention is that a difference of 6 seconds for system that ran 30 seconds is worth noting, but may not be statistically significant. Especially when we're now talking about two completely different OS' and, thus, two different ways of timing a program. If whatever data file you are using is also on a different file system, then one could be fragmented, etc.

Using wall-clock time metric is not two different ways of timing. He could have just as well measured the time using stop-watch, and the results would be equally valid (given that his reaction time is ~0.5 seconds :-))

Also, user time is time just spent in executing instructions in user-mode, which does *not* account for waiting on disk I/O. system time is time spent just in the OS kernel, and that number would be also higher if there were intensive I/O going on. If I/O pauses were to blame (fragmentation, as you suggest, does increase latency of serving disk requests), there would be a lot of idle time, which would be manifested in a large difference between user+system and elapsed (the latter being wall-clock time).

The numbers displayed above suggest that Windows version performs a *lot* less work[*] than Linux version (6 seconds of CPU time is a *lot* of work given today's CPU frequencies).

[*] All programs actually run equally fast on a given machine, but some finish the same task sooner because they are doing less work...
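The distinction between CPU time (user + system) and wall-clock time (elapsed) described above can be seen in a small R sketch. This is not code from the thread: the helper name `timing_summary` and both test expressions are assumptions made for illustration, with `Sys.sleep()` standing in for a process that is stalled waiting on I/O.

```r
# Hypothetical helper (not from the thread): split a system.time() result
# into CPU time (user + system) and idle time (elapsed minus CPU time).
timing_summary <- function(expr) {
  tm <- system.time(expr)
  cpu <- unname(tm["user.self"] + tm["sys.self"])
  elapsed <- unname(tm["elapsed"])
  c(cpu = cpu, elapsed = elapsed, idle = elapsed - cpu)
}

# Sleeping stands in for waiting on I/O: it consumes wall-clock time
# but almost no CPU time, so 'idle' is large.
waiting <- timing_summary(Sys.sleep(0.5))

# A pure computation keeps the CPU busy, so 'idle' stays close to zero.
busy <- timing_summary({ x <- 0; for (i in 1:2e6) x <- x + sqrt(i) })

print(round(rbind(waiting, busy), 3))
```

In the timings quoted in this message, user + system is almost equal to elapsed on both systems, which is why idle time (and hence I/O stalls) looks like an unlikely explanation for the gap.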
The answer can actually be obtained by compiling R with profiling enabled; this will slow down execution considerably, but one will be able to pinpoint major differences in performance.

Wild speculations about sources of this difference:
1) whatever compiler produced the Windows binary has better optimizations,
2) R heavily uses some standard library routines (within system-provided DLLs) that are better optimized under Windows than under Linux.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] (performance) time in Windows vs Linux
Hi Zeljko,

Zeljko Vrba wrote:
>>> Windows 32bit results:
>>>    user  system elapsed
>>>   21.66    0.02   21.69
>>>
>>> Linux 64bit Results
>>>    user  system elapsed
>>>  27.242   0.004  27.275
>
> Using wall-clock time metric is not two different ways of timing. He could have just as well measured the time using stop-watch, and the results would be equally valid (given that his reaction time is ~0.5 seconds :-)) Also, user

True, but I'd be surprised if anyone using a stop watch can give 3 digit accuracy. :-)

> time is time just spent in executing instructions in user-mode, which does *not* account for waiting on disk I/O. system time is time spent just in the OS kernel, and that number would be also higher if there were intensive I/O going on. If I/O pauses were to blame (fragmentation, as you suggest, does increase latency of serving disk requests), there would be a lot of idle time, which would be manifested in a large difference between user+system and elapsed (the latter being wall-clock time).

Yes, you're correct. Fragmentation was just an example (albeit a poor one). My point is that too many factors differ between the two systems to draw a conclusion such as "the 32-bit Windows version is faster than the 64-bit Linux version". I wouldn't be surprised if another machine with different specs (say, cache size, etc.) gave the opposite results.

I suppose to be fair, one would have to use the same compiler, etc. and make sure the only difference is the OS. One compiler might have options turned on that would make one operation faster than the other...and that operation might be heavily used in the calculations performed. I guess my point is that there is a big jump from "21.69 seconds vs 27.275 seconds" to "system A is faster than system B"...

Of the two systems, one thing that usually gets me is that a typical Windows machine loads a lot of things into memory.
And because of these differences, timing results that differ by 6 seconds should be taken with caution -- at least until various data sizes are considered or multiple runs with the same data executed and averaged.

> The numbers displayed above suggest that Windows version performs a *lot* less work[*] than Linux version (6 seconds of CPU time is a *lot* of work given today's CPU frequencies).

:-) Well, I think it's obvious that I don't agree with this last statement. I'd be more inclined to believe this if it was repeated 10 times each and averaged and we still see a 6 second difference, though... (I don't suggest the OP do this, of course; and in fact, I wouldn't be surprised if it were true that this difference continues to exist.)

Ray
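The "repeat it 10 times and average" idea can be sketched in R as follows. This is only an illustration under stated assumptions: `workload` is a made-up stand-in for the test script discussed in the thread (which is not shown here), and `repeat_timing` is a hypothetical helper name.

```r
# Hypothetical workload standing in for the test script from the thread.
workload <- function(n = 2e5) {
  x <- numeric(n)
  for (i in seq_len(n)) x[i] <- sqrt(i)
  sum(x)
}

# Time the same call several times and keep only the elapsed seconds.
repeat_timing <- function(f, runs = 10) {
  vapply(seq_len(runs),
         function(i) unname(system.time(f())["elapsed"]),
         numeric(1))
}

elapsed <- repeat_timing(workload, runs = 10)

# Mean and standard deviation give a feel for run-to-run noise.
cat(sprintf("mean = %.3f s, sd = %.3f s over %d runs\n",
            mean(elapsed), sd(elapsed), length(elapsed)))
```

Collecting one such vector per system would make it possible to judge whether a 6 second gap is large compared with the spread between runs on the same system.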
Re: [R] (performance) time in Windows vs Linux
The test is only an example. The data is an example too. The difference is not the problem, I think, because we can make the data larger and the difference will grow. On my system, the original test points to Windows having the best time, and points to a difference larger than 10% between the Linux generic binaries and Linux compiled from source.

I'll try to follow up on the advanced users' suggestions (to investigate further) and report back here on the list.

Thanks,
C.

On Mon, Jun 29, 2009 at 10:20 PM, Raymond Wan r@aist.go.jp wrote:
> Hi,
>
> I. Soumpasis wrote:
>> 2009/6/29 Cézar Freitas cafanselm...@yahoo.com.br
>>
>> This is true. So I tried the same computer with windows XP and ubuntu 8.10 64bit dual core @3Gz and 4MB RAM
>>
>> Windows 32bit results:
>>    user  system elapsed
>>   21.66    0.02   21.69
>>
>> Linux 64bit Results
>>    user  system elapsed
>>  27.242   0.004  27.275
>>
>> This difference is small and it is truly explained by what the advanced users have said.
>
> One minor comment which I forgot to mention is that a difference of 6 seconds for system that ran 30 seconds is worth noting, but may not be statistically significant. Especially when we're now talking about two completely different OS' and, thus, two different ways of timing a program. If whatever data file you are using is also on a different file system, then one could be fragmented, etc.
>
> My point is that your test might be true (and others have given you reasons for it), but also don't worry too much about it... :-)
>
> Ray
Re: [R] (performance) time in Windows vs Linux
Really?

In fact I have a quad core. But how can I tell whether Linux is really using only one core, and how can I set it up to use all 4 cores?

Thanks a lot,

milton

On Mon, Jun 29, 2009 at 1:46 AM, Patrick Connolly p_conno...@slingshot.co.nz wrote:
> On Fri, 26-Jun-2009 at 04:37PM -0400, milton ruser wrote:
>
> | Hi there,
> |
> | I have both systems on a DELL 64bit machine.
> | I compiled R 2.9.0 on both systems, to get 64bits capability.
> | Surpriselly, on Linux (Ubuntu with I installed 3 month ago) I spent
> | 41s to run the same test you did, and less time (35s) under
> | Vista. In fact I had noticed that I not have gained time when
> | running under linux (I had done jobs that run for several
> | days). But somethings I gain with memory managment, because for
> | some programs or steps, windows say that memory is full, while
> | Linux run up to the end of the job.
>
> I think there's a simple explanation for both of those observations. Your Windows installation is more than likely using both cores (or 4 if you have a quad core) while your Linux is using only 1 of them.
>
> best
>
> --
> Patrick Connolly
Re: [R] (performance) time in Windows vs Linux
On Mon, 29-Jun-2009 at 02:13AM -0400, milton ruser wrote:

| Really?
|
| In fact I have a quadcore. But how can I know if Linux are really
| using only one core, and how can I setup it to use the 4cores?

I use GKrellM (install with aptitude install gkrellm if you don't have it already). It shows a trace of the % activity for each CPU where it's very clear if only one is being used.

I've never had occasion to change such a setting, but someone more skilled in such things could say how to make it use more than one.

HTH

--
Patrick Connolly
Re: [R] (performance) time in Windows vs Linux
On Mon, Jun 29, 2009 at 06:56:55PM +1200, Patrick Connolly wrote:
> On Mon, 29-Jun-2009 at 02:13AM -0400, milton ruser wrote:
>
> | Really?
> |
> | In fact I have a quadcore. But how can I know if Linux are really
> | using only one core, and how can I setup it to use the 4cores?
>
> I use GKrellM (install with aptitude install gkrellm if you don't have it already). It shows a trace of the % activity for each CPU where it's very clear if only one is being used.
>
> I've never had occasion to change such a setting, but someone more skilled in such things could say how to make it use more than one.

How do you know that it is *R* that uses all 4 CPUs, and not other applications on the system? What does running top in the terminal say? If R uses more than one CPU, its CPU usage will be > 100%.
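As a rough complement to watching top, the same question can be asked from inside R by comparing the CPU time charged to the R process with the elapsed time. This is a sketch, not a method proposed in the thread: the helper name `cpu_ratio` is hypothetical, and it assumes that the user and system times reported by `system.time()` cover all threads of the process, so that a ratio clearly above 1 would hint at more than one busy core.

```r
# Hypothetical helper: ratio of CPU time to wall-clock time for an expression.
cpu_ratio <- function(expr) {
  tm <- system.time(expr)
  cpu <- sum(tm[c("user.self", "sys.self")])
  unname(cpu / tm["elapsed"])
}

# A plain R loop runs on one core, so the ratio should stay near (or below) 1.
r <- cpu_ratio({ x <- 0; for (i in 1:3e6) x <- x + sqrt(i) })
cat(sprintf("CPU/elapsed ratio: %.2f\n", r))
```

Unlike a per-CPU activity monitor, this only counts work done by the R process itself, so other applications on the machine do not inflate the number.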
Re: [R] (performance) time in Windows vs Linux
Hi,

milton ruser wrote:
> In fact I have a quadcore. But how can I know if Linux are really using only one core, and how can I setup it to use the 4cores?

I don't know the answer in the context of R -- I didn't know that R can use multiple cores by default?

But in general, I use htop, whose man page describes it as: "This program is a free (GPL) ncurses-based process viewer." It is a colored version of top, essentially. At the top of the screen, you will see your 4 cores represented as percentages. Under Setup, add Processor to the list of options and then a CPU column will appear; if you have 4 cores, its values will vary from 1 to 4. If you want to check if R is running on more than one core, then obviously R should appear more than once and with two different values under CPU.

Ray
Re: [R] (performance) time in Windows vs Linux
On Mon, 29-Jun-2009 at 09:05AM +0200, Zeljko Vrba wrote:

| On Mon, Jun 29, 2009 at 06:56:55PM +1200, Patrick Connolly wrote:
| > On Mon, 29-Jun-2009 at 02:13AM -0400, milton ruser wrote:
| >
| > | Really?
| > |
| > | In fact I have a quadcore. But how can I know if Linux are really
| > | using only one core, and how can I setup it to use the 4cores?
| >
| > I use GKrellM (install with aptitude install gkrellm if you don't
| > have it already). It shows a trace of the % activity for each CPU
| > where it's very clear if only one is being used. I've never had
| > occasion to change such a setting, but someone more skilled in such
| > things could say how to make it use more than one.
|
| How do you know that it is *R* that uses all 4 CPUs, and not other
| applications on the system? What does running top in the terminal
| say? If R uses more than one CPU, it's CPU usage will be > 100%.

There are undoubtedly more scientific ways, but if the machine is idling with both krells showing a very low number like 2% BEFORE running the R script, and then we see only one suddenly become 99 or 100%, it's a fairly safe bet that it was R that made the difference -- particularly if it drops back once the R code finishes.

I think top adds the two usages together, so values over 100% would be possible. The numbers shown by GKrellM could be thought of as more sensible where you have the choice of composite, real or both.

HTH

--
Patrick Connolly
Re: [R] (performance) time in Windows vs Linux
On Mon, 29 Jun 2009, Raymond Wan wrote:
> milton ruser wrote:
>> In fact I have a quadcore. But how can I know if Linux are really using only one core, and how can I setup it to use the 4cores?
>
> I don't know the answer in the context of R -- I didn't know that R can use multiple cores by default?

It cannot, and much of this thread is pure speculation. So let's try to set the record straight (as we have already done in the manuals).

The only way that a single R process will be using more than one CPU is if you have added a multithreaded BLAS (and I've never heard of one being used successfully with R for Windows) or other add-on such as Luke Tierney's pnmath[0] packages. Packages such as snow and multicore run multiple R processes. I do run a multithreaded BLAS on my 8-core Linux box and do often see 'top' well over 100% -- I just tested and saw 798.9%.

It is exceptional to see R under Windows running faster than a well-tuned R under Linux on the same hardware (and my only Windows machine is a multiboot that normally runs Linux, so I do have extensive experience). There are a number of reasons:

- R for Windows always uses a shared library, whereas under Linux by default it does not, for speed -- see the R-admin manual.

- MinGW until recently had only an older compiler, 4.2.1. (gcc 4.4.0 for mingw is just out, but I have not tried it). gcc 4.3.x has both better general optimizations and better support for the Core 2 Duo my machine has.

- You can tune the Linux version better by compiling yourself (although some tuning is possible on Windows).

- Linux uses interrupts for things that Windows polls (or for some instances R does on those platforms). That includes the overhead on Windows of running Rgui (if you are using that rather than Rterm) and polling the Windows message system.

- 32-bit Linux allows access to more address space than 32-bit Windows, so there may be less frequent garbage collections on large tasks. In any case, the Linux memory manager is more efficient.
Against that, a 64-bit build will in general be slower than a 32-bit one -- see the R-admin manual. If you run 32-bit R for Windows on 64-bit Windows you are running under a WOW subsystem and that has a small overhead: but in our tests the REvolution 64-bit build of R was slightly slower.

But we are only talking about small differences, say up to 20% and usually more like 5-10%. It is usually possible to find some task that a particular compiler optimizes badly, so there will be rare exceptions.

> But in general, I use htop, whose man pages describes it as: This program is a free (GPL) ncurses-based process viewer. It is a colored version of top, essentially. At the top of the screen, you will see your 4 cores represented as percentages. Under Setup, add Processor to the list of options and then CPU will appear as a column, which if you have 4 cores, the values will vary from 1 to 4. If you want to check if R is running on more than one core, then obviously R should appear more than once and with two different values under CPU.

Not so: that will happen if multiple copies of R are running, not if a single copy of R is running multiple threads.

--
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
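The point that packages such as snow and multicore work by running multiple R processes can be illustrated with a minimal sketch. It uses the `parallel` package, which is bundled with later R releases (2.14.0 onwards) and was not available when this thread was written; it is used here as a stand-in for the snow-style approach, and the worker count and the toy task are arbitrary assumptions.

```r
library(parallel)  # bundled with R since 2.14.0; postdates this thread

# Start two worker processes (the count is an arbitrary choice for the sketch).
cl <- makeCluster(2)

# Ask each task which process ran it and compute something trivial.
res <- parLapply(cl, 1:4,
                 function(i) c(task = i, pid = Sys.getpid(), value = i^2))
stopCluster(cl)

res <- do.call(rbind, res)
print(res)

# The worker PIDs differ from the PID of the controlling R session.
worker_pids <- unique(res[, "pid"])
cat("master pid:", Sys.getpid(), " worker pids:", worker_pids, "\n")
```

Each worker shows up as its own R process in top or htop, each limited to about 100% CPU, which matches the distinction drawn above between multiple copies of R and a single multithreaded copy.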
Re: [R] (performance) time in Windows vs Linux
Hi Brian,

Thank you for the clarification -- the first part does set the record straight about what I thought about R. I would expect a program to run on a single core by default unless something specifically (and somewhat non-trivial) was done to it.

Prof Brian Ripley wrote:
>> If you want to check if R is running on more than one core, then obviously R should appear more than once and with two different values under CPU.
>
> Not so: that will happen if multiple copies of R are running, not if a single copy of R is running multiple threads.

Ah, yes, you are right -- there is a big difference. The former works for both multiple copies of R or a single program [purposely not saying R here to keep it general] running on multiple cores [distributed memory parallelization]. It doesn't work for multiple threads, though.

(As an aside for anyone else interested, my Debian distribution has the package r-cran-rmpi, which is described as the GNU R package interfacing MPI libraries...so that seems to be another way to use multiple cores, though I have never used it...)

Ray
Re: [R] (performance) time in Windows vs Linux
On Mon, 2009-06-29 at 19:21 +1200, Patrick Connolly wrote:
> On Mon, 29-Jun-2009 at 09:05AM +0200, Zeljko Vrba wrote:
>
> | On Mon, Jun 29, 2009 at 06:56:55PM +1200, Patrick Connolly wrote:
> | > On Mon, 29-Jun-2009 at 02:13AM -0400, milton ruser wrote:
> | >
> | > | Really?
> | > |
> | > | In fact I have a quadcore. But how can I know if Linux are really
> | > | using only one core, and how can I setup it to use the 4cores?
> | >
> | > I use GKrellM (install with aptitude install gkrellm if you don't
> | > have it already). It shows a trace of the % activity for each CPU
> | > where it's very clear if only one is being used. I've never had
> | > occasion to change such a setting, but someone more skilled in such
> | > things could say how to make it use more than one.
> |
> | How do you know that it is *R* that uses all 4 CPUs, and not other
> | applications on the system? What does running top in the terminal
> | say? If R uses more than one CPU, it's CPU usage will be > 100%.
>
> There are undoubtedly more scientific ways, but if the machine is idling with both krells showing a very low number like 2% BEFORE running the R script, and then we see only one suddenly become 99 or 100%, it's a fairly safe bet that it was R that made the difference -- particularly if it drops back once the R code finishes.
>
> I think top adds the two usages together, so values over 100% would be possible. The numbers shown by GKrellM could be thought of as more sensible where you have the choice of composite, real or both.

Hmm, R is a single threaded application - you might be able to call functions that use multi-threaded compiled code that will use the extra cores, but R itself won't, whether it is running on Linux or Windows.

On my 4 core workstation, top reports load averages up to (and a bit exceeding sometimes) 4 when I'm utilising all 4 cores for processing jobs - but that is only when I initiate 4 separate R processes; each of the 4 processes only ever has 100% maximum usage.

I think you are being misled by krellms; on my 4 core workstation the cpu throttling application you can stick in the panel (cpufreq or cpuspeed) reports that pairs of cores hit full speed when required, say when running a single R process. But that R process is only using 100% of one core - if you have top or system monitor running, you'll see this. I think this is because the 4 cores are on two chips and if you need to run one core up to full speed, the other core on that chip also gets sped up, but it isn't crunching anything.

G

> HTH

--
Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
Re: [R] (performance) time in Windows vs Linux
I meant to write not so for 'top' in the final para.

On Mon, 29 Jun 2009, Prof Brian Ripley wrote:
> [...]
>> But in general, I use htop, whose man pages describes it as: This program is a free (GPL) ncurses-based process viewer. It is a colored version of top, essentially. At the top of the screen, you will see your 4 cores represented as percentages. Under Setup, add Processor to the list of options and then CPU will appear as a column, which if you have 4 cores, the values will vary from 1 to 4. If you want to check if R is running on more than one core, then obviously R should appear more than once and with two different values under CPU.
>
> Not so: that will happen if multiple copies of R are running, not if a single copy of R is running multiple threads.

--
Brian D. Ripley, rip...@stats.ox.ac.uk
Re: [R] (performance) time in Windows vs Linux
Prof Brian Ripley wrote:
> I meant to write not so for 'top' in the final para.

Ah, I'm not certain enough to know that htop works for threads as well...so I was quick to jump to agreeing with you. :-) I only know it works for multi-cores...

Ray
Re: [R] (performance) time in Windows vs Linux
Hi, Ilias.

I don't think it is OK to compare performance across different platforms on different machines. To compare times, you need to run the code on both platforms (Linux and Windows) on the same machine. But the problem here is a different one. The advanced users explained it well. So I ask them: is it possible to build R under Linux and force a 32-bit compilation on a 64-bit machine?

Thanks to all,
Cezar Freitas

--- On Mon, 29/6/09, I. Soumpasis nono@gmail.com wrote:

From: I. Soumpasis nono@gmail.com
Subject: Re: [R] (performance) time in Windows vs Linux
To: Cézar Freitas cafanselm...@yahoo.com.br
Date: Monday, 29 June 2009, 5:53

> Hi Cezar,
>
> I tried your code in a core duo laptop (@2.5Gz) with ubuntu x86_64 with 4GB of RAM. Both R and libraries are compiled from source. These are the results.
>
>    user  system elapsed
>  23.861   0.812  25.065
>
> It seems better than the windows. Is there a posibility that on linux you have a big workspace, or something is consuming your memory and thus R is forced to use swap memory? Just a speculation.
>
> Ilias
>
> PS. BTW It uses only one core.
>
> 2009/6/26 Cézar Freitas cafanselm...@yahoo.com.br
>> #windows time
>> #   user  system elapsed
>> #  27.81    0.00   27.82
>> #linux usual compilation time
>> #   user  system elapsed
>> # 52.635   0.016  52.748
>> #linux (my compilation) time
>> #   user  system elapsed
>> # 52.567   0.016  52.588
>> #==END OF CODE
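Two of the points raised in this message can at least be inspected from inside a running R session: whether the build in use is 32-bit or 64-bit, and how much memory the workspace objects occupy (the "big workspace" speculation). The sketch below does not answer how to force a 32-bit build; it only shows the checks. The helper name `workspace_sizes` and the demo objects are assumptions made for illustration.

```r
# Pointer size tells whether this R build is 32-bit (4 bytes) or 64-bit (8 bytes).
bits <- 8 * .Machine$sizeof.pointer
cat("This R build is", bits, "bit on", R.version$arch, "\n")

# Hypothetical helper: size in bytes of every object in an environment,
# largest first.
workspace_sizes <- function(env = globalenv()) {
  objs <- ls(env)
  sizes <- vapply(objs,
                  function(nm) as.numeric(object.size(get(nm, envir = env))),
                  numeric(1))
  sort(sizes, decreasing = TRUE)
}

# Demo on a throwaway environment rather than the real workspace.
demo_env <- new.env()
assign("big", numeric(1e6), envir = demo_env)
assign("small", 1:10, envir = demo_env)
sizes <- workspace_sizes(demo_env)
print(sizes)
```

Calling `workspace_sizes()` with no argument lists the objects in the global workspace, which is a quick way to see whether a saved workspace is unexpectedly large before timing anything.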
Re: [R] (performance) time in Windows vs Linux
2009/6/29 Cézar Freitas cafanselm...@yahoo.com.br
> Hi, Ilias.
> I think that is not ok to compare performance in different plataforms of different machines. To compare times, is necessary you execute the code at the two plataforms (Linux and Windows) in the same machine. But the problem here is other. The advanced users explained it well.

This is true. So I tried the same computer with windows XP and ubuntu 8.10 64bit dual core @3Gz and 4MB RAM

Windows 32bit results:
   user  system elapsed
  21.66    0.02   21.69

Linux 64bit Results
   user  system elapsed
 27.242   0.004  27.275

This difference is small and it is truly explained by what the advanced users have said.

Regards,
Ilias
Re: [R] (performance) time in Windows vs Linux
Hi,

I. Soumpasis wrote:
> 2009/6/29 Cézar Freitas cafanselm...@yahoo.com.br
> This is true. So I tried it on the same computer, a dual core @3 GHz with 4 GB RAM, under Windows XP and Ubuntu 8.10 64-bit.
>
> Windows 32-bit results:
>    user  system elapsed
>   21.66    0.02   21.69
>
> Linux 64-bit results:
>    user  system elapsed
>  27.242   0.004  27.275
>
> This difference is small, and it is well explained by what the advanced users have said.

One minor comment which I forgot to mention: a difference of 6 seconds on a run of about 30 seconds is worth noting, but may not be statistically significant. That is especially so now that we are talking about two completely different OSes and, thus, two different ways of timing a program. If whatever data file you are using is also on a different file system, then one of them could be fragmented, etc.

My point is that your test might be true (and others have given you reasons for it), but also don't worry too much about it... :-)

Ray
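Whether a gap of this size is larger than run-to-run noise can be checked by repeating the measurement several times on each system and looking at the spread. A minimal sketch, assuming a hypothetical helper `time_runs()` and a placeholder workload (neither is from the thread); the benchmark from the original post would be substituted for `workload`:

```r
# Hypothetical helper (not from the thread): call f() several times and
# collect the elapsed wall-clock time of each call.
time_runs <- function(f, reps = 5) {
  vapply(seq_len(reps), function(i) {
    system.time(f())[["elapsed"]]
  }, numeric(1))
}

# Placeholder workload; replace with the code being benchmarked.
workload <- function() {
  x <- numeric(2000)
  for (k in seq_along(x)) x[k] <- sqrt(k)
  invisible(x)
}

elapsed <- time_runs(workload, reps = 5)

# A difference between two systems is only convincing if it is large
# relative to the spread of repeated runs on the same system.
print(c(mean = mean(elapsed), sd = sd(elapsed)))
```

Running the same script under both operating systems gives two sets of timings that can be compared directly, for example with `t.test()`.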
Re: [R] (performance) time in Windows vs Linux
On Fri, 26-Jun-2009 at 04:37PM -0400, milton ruser wrote:

| Hi there,
|
| I have both systems on a DELL 64bit machine.
| I compiled R 2.9.0 on both systems, to get 64-bit capability.
| Surprisingly, on Linux (Ubuntu, which I installed 3 months ago) I spent
| 41s to run the same test you did, and less time (35s) under
| Vista. In fact I had noticed that I have not gained time when
| running under Linux (I have run jobs that last several
| days). But sometimes I gain with memory management, because for
| some programs or steps, Windows says that memory is full, while
| Linux runs to the end of the job.

I think there's a simple explanation for both of those observations. Your Windows installation is more than likely using both cores (or 4 if you have a quad core) while your Linux is using only 1 of them.

best

--
Patrick Connolly
Great minds discuss ideas
Average minds discuss events
Small minds discuss people
   Eleanor Roosevelt
Re: [R] (performance) time in Windows vs Linux
On Fri, Jun 26, 2009 at 12:23:35PM -0700, Cézar Freitas wrote:
> I supposed R on Linux should be faster (32 and 64 bit) than the Windows version. Is this difference because the 64-bit R version is slower than the 32-bit one? I started the machine in both situations and checked free memory.

I suspect that the compiler is to blame. Download Intel's C and C++ compiler for Linux (it is free for personal use), try to compile R with it, and see what results you get (and report them here!). Of course, only if you have the time and are willing to tinker :)
Re: [R] (performance) time in Windows vs Linux
Hi there,

I have both systems on a DELL 64bit machine. I compiled R 2.9.0 on both systems, to get 64-bit capability. Surprisingly, on Linux (Ubuntu, which I installed 3 months ago) I spent 41s to run the same test you did, and less time (35s) under Vista. In fact I had noticed that I have not gained time when running under Linux (I have run jobs that last several days). But sometimes I gain with memory management, because for some programs or steps, Windows says that memory is full, while Linux runs to the end of the job.

good luck

milton

On Fri, Jun 26, 2009 at 4:01 PM, Zeljko Vrba zv...@ifi.uio.no wrote:
> On Fri, Jun 26, 2009 at 12:23:35PM -0700, Cézar Freitas wrote:
> > I supposed R on Linux should be faster (32 and 64 bit) than the Windows version. Is this difference because the 64-bit R version is slower than the 32-bit one? I started the machine in both situations and checked free memory.
>
> I suspect that the compiler is to blame. Download Intel's C and C++ compiler for Linux (it is free for personal use), try to compile R with it, and see what results you get (and report them here!). Of course, only if you have the time and are willing to tinker :)
Re: [R] (performance) time in Windows vs Linux
Yes, under 64-bit it is sometimes slower, and it depends heavily on the problem and on the compiler you have. Note also that nobody has managed to get a 64-bit Windows R binary compiled with gcc so far.

Remember, 10 years ago there was the SUN UltraSPARC III (and later) architecture, and gcc was known to produce extremely inefficient 64-bit binaries for that platform. Things have improved somewhat in the meantime.

With the tests I used, 32-bit R compiled with gcc was roughly 10% slower on Windows than under Linux - but as I said, it depends on the problem. Loops versus matrix operations, for example, are two very different problems.

Uwe Ligges

Cézar Freitas wrote:

Hi, all.

I began to migrate my R code from Windows to Linux and was surprised by an old question. I simplified the problem and made a little test to compare times on the same computer, and the Linux time is worse (and not by a little) than the Windows time: 28 vs 53 seconds. I made an example (below) so that everyone can see the difference. I also built a version of R from source (it is my first time) to compare with the distributed (compiled) R version. Its times are similar to those of the other Linux version.

I supposed R on Linux should be faster (32 and 64 bit) than the Windows version. Is this difference because the 64-bit R version is slower than the 32-bit one? I started the machine in both situations and checked free memory.
Technical details:
Machine: Intel Core 2 Duo, DDR2, 4 GB RAM
Windows version: XP Professional, 32 bits
R version: 2.9* binaries
Linux version: Ubuntu 8* (Hardy), 64 bits
R version: 2.9* binaries and 2.9* compiled from source

Thanks to all,
Cezar Freitas

#code
N = 5      #value as archived; sample(N, n, replace=FALSE) below requires N >= n
n = 15000

#makes data
dad = as.data.frame(cbind(sample(N,N,replace=FALSE), rpois(N,30)))
names(dad) = c("id","age")
aux = as.data.frame(cbind(sample(N,n,replace=FALSE), round(runif(n),4)))
names(aux) = c("id","score")

#calculates time
set.seed(790) #to be equal to everyone
system.time({
  dad$score = 0
  subdad = subset(dad, id%in%aux$id)
  #for each selected row, looks up its score in aux by id
  for(k in 1:(dim(subdad)[1])){
    temp = aux$score[aux$id==subdad$id[k]]
    if(length(temp)) subdad$score[k] = temp
  }
})

#windows time
#    user  system elapsed
#   27.81    0.00   27.82
#linux usual compilation time
#    user  system elapsed
#  52.635   0.016  52.748
#linux (my compilation) time
#    user  system elapsed
#  52.567   0.016  52.588
#==END OF CODE
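The remark in this message about loops versus matrix operations can be illustrated with the benchmark itself: the per-row lookup inside the loop can also be written as a single `match()` call. The sketch below is not from the thread. It assumes much smaller sizes (`N = 2000`, `n = 500`) so that it runs quickly, and it sets the seed before the data are generated so that results are repeatable.

```r
# Assumed sizes, far smaller than in the post, so the comparison runs quickly.
N <- 2000
n <- 500

set.seed(790)
dad <- data.frame(id = sample(N, N, replace = FALSE), age = rpois(N, 30))
aux <- data.frame(id = sample(N, n, replace = FALSE), score = round(runif(n), 4))

subdad <- subset(dad, id %in% aux$id)

# Loop version, following the post: one lookup in aux per row of subdad.
loop_score <- numeric(nrow(subdad))
t_loop <- system.time(
  for (k in seq_len(nrow(subdad))) {
    temp <- aux$score[aux$id == subdad$id[k]]
    if (length(temp)) loop_score[k] <- temp
  }
)

# Vectorised version: match() finds the position of every id in one call.
t_vec <- system.time(
  vec_score <- aux$score[match(subdad$id, aux$id)]
)

# Both approaches must give the same scores.
stopifnot(isTRUE(all.equal(loop_score, vec_score)))

print(c(loop = t_loop[["elapsed"]], vectorised = t_vec[["elapsed"]]))
```

How the two versions compare across operating systems is exactly the kind of problem dependence described above, so both would need to be timed on each platform before drawing conclusions.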