Re: x64 speed
Martin v. Löwis wrote:
>> Is it the x64 working faster at its design sizes
>
> Another guess (still from the darkness of not having received the slightest clue what the test actually does): if it creates integers in range(2**32, 2**64), then they fit into a Python int on AMD64-Linux, but require a Python long on 32-bit Windows; long operations are much slower than int operations.
..

I don't think we're doing a lot of bignum arithmetic, some masking operations etc etc.
--
Robin Becker
--
http://mail.python.org/mailman/listinfo/python-list
Re: x64 speed
...
>> --
>> Ran 193 tests in 27.841s
>>
>> OK
>>
>> real    0m28.150s
>> user    0m26.606s
>> sys     0m0.917s
>> [rpt...@localhost tests]$
>>
>> magical how the total python time is less than the real time.
>
> time(1) also measures the Python startup and shutdown time, so I don't quite see the magic :-(

yes stupid me :(

> FWIW: VMware VMs need the VMware tools installed to make their clocks work more or less. With Linux, you need some extra tweaks as well, otherwise the clocks are just completely unreliable.

I do have the tools installed and from what I can see the clock isn't so far off. At least when I run the two tests side by side the vm run always finishes first. Of course that could be because vmware is stealing cpu somehow.

> See these notes:
>
> http://kb.vmware.com/selfservice/viewContent.do?language=en_USexternalId=1420
> http://communities.vmware.com/message/782173
--
Robin Becker
Re: x64 speed
Martin v. Löwis wrote:
>>> I follow David's guess that Linux does better IO than Windows (not knowing anything about the benchmark, of course)
>>
>> I originally thought it must be the vmware host stuff offloading IO to the second core, but watching with sysinternals didn't show a lot of extra stuff going on with the vm compared to just running on the host.
>
> I'm not talking about vmware. I'm suggesting that Linux ext3, and the Linux buffer handling, is just more efficient than NTFS, and the Windows buffer handling.
>
> If you split the total runtime into system time and user time, how do the 30s split up?
...

so here is one for the "vm clock is bad" theorists :)

[rpt...@localhost tests]$ time python25 runAll.py
.
.
--
Ran 193 tests in 27.841s

OK

real    0m28.150s
user    0m26.606s
sys     0m0.917s
[rpt...@localhost tests]$

magical how the total python time is less than the real time.
--
Robin Becker
Re: x64 speed
On Feb 4, 10:14 am, Robin Becker ro...@reportlab.com wrote:
> [rpt...@localhost tests]$ time python25 runAll.py
> .
> .
> --
> Ran 193 tests in 27.841s
>
> OK
>
> real    0m28.150s
> user    0m26.606s
> sys     0m0.917s
> [rpt...@localhost tests]$
>
> magical how the total python time is less than the real time.

Not really. Python was still running at the time that it printed the time of the tests. So it's only natural that the wall time Python prints for just the tests is going to be smaller than the wall time that time(1) prints for the entire python process. The same goes for startup: some work is done in Python before it starts its timer.

Regards
Floris
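A minimal sketch of the point above, written for a current Python 3 rather than the python2.5 used in the thread. The child program and the name compare_timings are made up for illustration; the child times only its own work, the way a test runner does, while the parent times the whole child process the way time(1) would.

```python
import subprocess
import sys
import time

# Hypothetical child program: it times only its own "test" work,
# much as a test runner reports "Ran N tests in X s".
CHILD = (
    "import time\n"
    "start = time.monotonic()\n"
    "sum(range(200000))\n"
    "print(time.monotonic() - start)\n"
)


def compare_timings():
    """Return (inner, outer) wall-clock seconds for one child run."""
    outer_start = time.monotonic()
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        capture_output=True, text=True, check=True,
    )
    # outer covers interpreter startup, the timed work, and shutdown
    outer = time.monotonic() - outer_start
    # inner is what the child itself measured and printed
    inner = float(result.stdout.strip())
    return inner, outer


if __name__ == "__main__":
    inner, outer = compare_timings()
    print("inner (work only):     %.3fs" % inner)
    print("outer (whole process): %.3fs" % outer)
```

The outer figure should never come out smaller than the inner one, because the inner interval sits entirely inside the outer one.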
Re: x64 speed
> Is it the x64 working faster at its design sizes

Another guess (still from the darkness of not having received the slightest clue what the test actually does): if it creates integers in range(2**32, 2**64), then they fit into a Python int on AMD64-Linux, but require a Python long on 32-bit Windows; long operations are much slower than int operations.

Regards,
Martin
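To make the int/long boundary concrete, here is a small sketch. It assumes (as in Python 2) that a plain int is stored in a signed C long, which is 32 bits wide on 32-bit Windows and 64 bits wide on AMD64 Linux; the helper name fits_in_signed_word is invented for this example. Note that a signed 64-bit word tops out at 2**63 - 1, so strictly only the lower part of range(2**32, 2**64) stays a plain int even on AMD64. Current Python 3 has a single int type, so the code only models the word sizes.

```python
def fits_in_signed_word(value, bits):
    """True if value fits in a signed two's-complement word of `bits` bits."""
    low = -(1 << (bits - 1))
    high = (1 << (bits - 1)) - 1
    return low <= value <= high


if __name__ == "__main__":
    sample = 2 ** 40  # an arbitrary value from range(2**32, 2**64)
    for bits in (32, 64):
        # On a `bits`-wide C long, a Python 2 int holds the value only
        # when this is True; otherwise it would be promoted to a long.
        print(bits, fits_in_signed_word(sample, bits))
```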
Re: x64 speed
On 2009-02-04 11:14, Robin Becker wrote:
> Martin v. Löwis wrote:
>>>> I follow David's guess that Linux does better IO than Windows (not knowing anything about the benchmark, of course)
>>>
>>> I originally thought it must be the vmware host stuff offloading IO to the second core, but watching with sysinternals didn't show a lot of extra stuff going on with the vm compared to just running on the host.
>>
>> I'm not talking about vmware. I'm suggesting that Linux ext3, and the Linux buffer handling, is just more efficient than NTFS, and the Windows buffer handling.
>>
>> If you split the total runtime into system time and user time, how do the 30s split up?
> ...
>
> so here is one for the vm clock is bad theorists :)
>
> [rpt...@localhost tests]$ time python25 runAll.py
> .
> .
> --
> Ran 193 tests in 27.841s
>
> OK
>
> real    0m28.150s
> user    0m26.606s
> sys     0m0.917s
> [rpt...@localhost tests]$
>
> magical how the total python time is less than the real time.

time(1) also measures the Python startup and shutdown time, so I don't quite see the magic :-(

FWIW: VMware VMs need the VMware tools installed to make their clocks work more or less. With Linux, you need some extra tweaks as well, otherwise the clocks are just completely unreliable.

See these notes:

http://kb.vmware.com/selfservice/viewContent.do?language=en_USexternalId=1420
http://communities.vmware.com/message/782173

--
Marc-Andre Lemburg
eGenix.com
Re: x64 speed
Robin Becker wrote:
> Whilst doing some portability testing with reportlab I noticed a strange speedup for our unittest suite with python2.5
>
> host win32 xp3 unittest time=42.2 seconds
> vmware RHEL x64 unittest time=30.9 seconds
>
> so it looks like the vmware emulated system is much faster. Is it the x64 working faster at its design sizes or perhaps the compiler or could it be the vmware system caching all writes etc etc?
>
> For the red hat x64 build the only special configuration was to use ucs2
>
> I know that the VT bit stuff has made virtualization much better, but this seems a bit weird.

Which vmware product?

--
Tim Daneliuk tun...@tundraware.com
PGP Key: http://www.tundraware.com/PGP/
Re: x64 speed
On Wed, Feb 4, 2009 at 2:36 AM, Robin Becker ro...@reportlab.com wrote:
> Whilst doing some portability testing with reportlab I noticed a strange speedup for our unittest suite with python2.5
>
> host win32 xp3 unittest time=42.2 seconds
> vmware RHEL x64 unittest time=30.9 seconds
>
> so it looks like the vmware emulated system is much faster. Is it the x64 working faster at its design sizes or perhaps the compiler or could it be the vmware system caching all writes etc etc?
>
> For the red hat x64 build the only special configuration was to use ucs2
>
> I know that the VT bit stuff has made virtualization much better, but this seems a bit weird.

It can be many things of course, depending on your configuration and what you are doing in your unit tests, but I don't find it weird at all. I often see faster results when IO is involved in my Ubuntu in vmware fusion than on mac os X. Again, it can be that the vm is less loaded (fewer python packages, faster import times), or that Linux has better IO handling (I don't know whether this is true or not, but I could at least imagine linux filesystems generally being faster than windows or mac os X ones, which could influence IO).

It can also be compiler differences, 32 vs 64 bits as you say, etc...

If you want to be sure, you should try a windows VM :)

David
Re: x64 speed
Robin Becker wrote:
> Whilst doing some portability testing with reportlab I noticed a strange speedup for our unittest suite with python2.5
>
> host win32 xp3 unittest time=42.2 seconds
> vmware RHEL x64 unittest time=30.9 seconds
>
> so it looks like the vmware emulated system is much faster. Is it the x64 working faster at its design sizes or perhaps the compiler or could it be the vmware system caching all writes etc etc?

I follow David's guess that Linux does better IO than Windows (not knowing anything about the benchmark, of course)

Regards,
Martin
Re: x64 speed
Robin Becker wrote:
> Whilst doing some portability testing with reportlab I noticed a strange speedup for our unittest suite with python2.5
>
> host win32 xp3 unittest time=42.2 seconds
> vmware RHEL x64 unittest time=30.9 seconds
>
> so it looks like the vmware emulated system is much faster. Is it the x64 working faster at its design sizes or perhaps the compiler or could it be the vmware system caching all writes etc etc?
>
> For the red hat x64 build the only special configuration was to use ucs2
>
> I know that the VT bit stuff has made virtualization much better, but this seems a bit weird.

AFAIK some VMs have difficulties with timers. For example, my virtualized KDE has that jumping icon when starting a program - and that's *much* faster jumping inside VBox :)

So - are you sure it *is* faster?

Diez
Re: x64 speed
Robin Becker ro...@reportlab.com writes:
> so it looks like the vmware emulated system is much faster. Is it the x64 working faster at its design sizes or perhaps the compiler or could it be the vmware system caching all writes etc etc?
>
> For the red hat x64 build the only special configuration was to use ucs2

You have to control all these variables separately in order to know. But, 64 bit code is in general faster than 32 bit code when properly compiled: more cpu registers, wider moves when copying large blocks of data, floating point registers instead of the legacy stack-oriented FPU, etc.
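One way to start separating those variables is to record what each interpreter build actually is before comparing timings. The sketch below uses only standard-library calls; the function name build_facts and the choice of fields are assumptions made for this example. On the Python 2 builds discussed in the thread, sys.maxunicode distinguishes a ucs2 build (0xFFFF) from a ucs4 one; on Python 3.3 and later it is always 0x10FFFF.

```python
import platform
import struct
import sys


def build_facts():
    """Collect basic facts about the running interpreter build."""
    return {
        "system": platform.system(),               # e.g. "Linux" or "Windows"
        "machine": platform.machine(),             # e.g. "x86_64"
        "pointer_bits": struct.calcsize("P") * 8,  # 32 or 64
        "compiler": platform.python_compiler(),    # compiler used for the build
        "python": platform.python_version(),
        "maxunicode": sys.maxunicode,              # narrow vs wide build on Python 2
    }


if __name__ == "__main__":
    for key, value in sorted(build_facts().items()):
        print("%-13s %s" % (key, value))
```

Running this on both the host and the vm gives a side-by-side list of the differences (word size, compiler, unicode width) that would then need to be varied one at a time.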
Re: x64 speed
>> I follow David's guess that Linux does better IO than Windows (not knowing anything about the benchmark, of course)
>
> I originally thought it must be the vmware host stuff offloading IO to the second core, but watching with sysinternals didn't show a lot of extra stuff going on with the vm compared to just running on the host.

I'm not talking about vmware. I'm suggesting that Linux ext3, and the Linux buffer handling, is just more efficient than NTFS, and the Windows buffer handling.

If you split the total runtime into system time and user time, how do the 30s split up?

Regards,
Martin
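The user/system split asked for here can also be taken from inside Python with os.times(). The following is a rough sketch, not the method used in the thread (Robin used time(1)); cpu_split and busy_work are names invented for the example, and busy_work merely stands in for the test suite. For reference, the figures Robin posted elsewhere in the thread (user 26.606s and sys 0.917s out of 28.150s real) put almost all of the run in user-mode CPU time.

```python
import os
import time


def cpu_split(func):
    """Run func() and return (real, user, system) seconds for this process."""
    before = os.times()            # index 0 is user time, index 1 is system time
    wall_start = time.monotonic()
    func()
    real = time.monotonic() - wall_start
    after = os.times()
    return real, after[0] - before[0], after[1] - before[1]


def busy_work():
    # Hypothetical stand-in for the test suite: pure CPU work, no IO.
    return sum(i * i for i in range(300000))


if __name__ == "__main__":
    real, user, system = cpu_split(busy_work)
    print("real %.3fs  user %.3fs  sys %.3fs" % (real, user, system))
```

A run dominated by file IO would be expected to show a larger system share, or a real time well above user plus system while the process waits on the disk.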
Re: x64 speed
Diez B. Roggisch wrote:
> Robin Becker wrote:
>> Whilst doing some portability testing with reportlab I noticed a strange speedup for our unittest suite with python2.5
>>
>> host win32 xp3 unittest time=42.2 seconds
>> vmware RHEL x64 unittest time=30.9 seconds
>>
>> so it looks like the vmware emulated system is much faster. Is it the x64 working faster at its design sizes or perhaps the compiler or could it be the vmware system caching all writes etc etc?
>>
>> For the red hat x64 build the only special configuration was to use ucs2
>>
>> I know that the VT bit stuff has made virtualization much better, but this seems a bit weird.
>
> AFAIK some VMs have difficulties with timers. For example, my virtualized KDE has that jumping icon when starting a program - and that's *much* faster jumping inside VBox :)
..
> Diez

I started both in terminals and the host never wins :)
--
Robin Becker
Re: x64 speed
Tim Daneliuk wrote:
..
> Which vmware product?

vmware server
--
Robin Becker
Re: x64 speed
Martin v. Löwis wrote:
.
> I follow David's guess that Linux does better IO than Windows (not knowing anything about the benchmark, of course)
>
> Regards,
> Martin

I originally thought it must be the vmware host stuff offloading IO to the second core, but watching with sysinternals didn't show a lot of extra stuff going on with the vm compared to just running on the host.
--
Robin Becker
Re: x64 speed
Paul Rubin wrote:
> Robin Becker ro...@reportlab.com writes:
>> so it looks like the vmware emulated system is much faster. Is it the x64 working faster at its design sizes or perhaps the compiler or could it be the vmware system caching all writes etc etc?
>>
>> For the red hat x64 build the only special configuration was to use ucs2
>
> You have to control all these variables separately in order to know. But, 64 bit code is in general faster than 32 bit code when properly compiled: more cpu registers, wider moves when copying large blocks of data, floating point registers instead of the legacy stack-oriented FPU, etc.

I tried looking at the cpu usage whilst running these and by eye it seemed that the host system was running more parallel stuff than the vmware vm.
--
Robin Becker