Re: x64 speed

2009-02-05 Thread Robin Becker

Martin v. Löwis wrote:

Is it the
x64 working faster at its design sizes


Another guess (still from the darkness of not having received the
slightest clue what the test actually does): if it creates integers
in range(2**32, 2**64), then they fit into a Python int on AMD64-Linux,
but require a Python long on 32-bit Windows; long operations are much
slower than int operations.


..
I don't think we're doing a lot of bignum arithmetic; there are some masking
operations etc.


--
Robin Becker

--
http://mail.python.org/mailman/listinfo/python-list


Re: x64 speed

2009-02-05 Thread Robin Becker

...

--
Ran 193 tests in 27.841s

OK

real    0m28.150s
user    0m26.606s
sys     0m0.917s
[rpt...@localhost tests]$

magical how the total python time is less than the real time.


time(1) also measures the Python startup and shutdown time, so
I don't quite see the magic :-(



yes stupid me :(


FWIW: VMware VMs need the VMware tools installed to make their
clocks work more or less. With Linux, you need some extra tweaks
as well, otherwise the clocks are just completely unreliable.



I do have the tools installed and from what I can see the clock isn't so far 
off. At least when I run the two tests side by side the vm run always finishes 
first. Of course that could be because vmware is stealing cpu somehow.



See these notes:

http://kb.vmware.com/selfservice/viewContent.do?language=en_US&externalId=1420
http://communities.vmware.com/message/782173




--
Robin Becker



Re: x64 speed

2009-02-04 Thread Robin Becker

Martin v. Löwis wrote:

I follow David's guess that Linux does better IO than Windows (not
knowing anything about the benchmark, of course)


I originally thought it must be the vmware host stuff offloading IO to
the second core, but watching with sysinternals didn't show a lot of
extra stuff going on with the vm compared to just running on the host.


I'm not talking about vmware. I'm suggesting that Linux ext3, and the
Linux buffer handling, is just more efficient than NTFS, and the Windows
buffer handling.

If you split the total runtime into system time and user time, how do
the 30s split up?

...
so here is one for the "vm clock is bad" theorists :)



[rpt...@localhost tests]$ time python25 runAll.py
.


.

--
Ran 193 tests in 27.841s

OK

real    0m28.150s
user    0m26.606s
sys     0m0.917s
[rpt...@localhost tests]$


magical how the total python time is less than the real time.
--
Robin Becker


Re: x64 speed

2009-02-04 Thread Floris Bruynooghe
On Feb 4, 10:14 am, Robin Becker ro...@reportlab.com wrote:
  [rpt...@localhost tests]$ time python25 runAll.py
  .

 .

  --
  Ran 193 tests in 27.841s

  OK

  real    0m28.150s
  user    0m26.606s
  sys     0m0.917s
  [rpt...@localhost tests]$

 magical how the total python time is less than the real time.

Not really.  Python is still running at the time it prints the
time of the tests.  So it's only natural that the wall time Python
prints for just the tests is going to be smaller than the wall time
that time(1) prints for the entire Python process.  The same goes for
startup: some stuff is done in Python before it starts its timer.
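
The gap between the two timers can be reproduced with a small sketch. This is an illustration only, not the reportlab test runner: the child program and the name compare_timers are made up for the example. The child times only its own work, while the parent times the whole child process, including interpreter startup and shutdown.

```python
import subprocess
import sys
import time

# Hypothetical child program: it times only its own work, much as a test
# runner times only the tests, and prints that figure.
CHILD = """
import time
start = time.perf_counter()
sum(i * i for i in range(200000))
print(time.perf_counter() - start)
"""


def compare_timers():
    """Return (inner, outer) in seconds.

    inner is what the child measured for its own work; outer is the wall
    time of the whole child process as seen from the parent.
    """
    outer_start = time.perf_counter()
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        capture_output=True, text=True, check=True,
    )
    outer = time.perf_counter() - outer_start
    inner = float(result.stdout.strip())
    return inner, outer


if __name__ == "__main__":
    inner, outer = compare_timers()
    print("timed inside the process:   %.3fs" % inner)
    print("whole process from outside: %.3fs" % outer)
```

The outer figure should always come out larger, which matches the 27.841s versus 28.150s pattern in the quoted run.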

Regards
Floris



Re: x64 speed

2009-02-04 Thread Martin v. Löwis
 Is it the
 x64 working faster at its design sizes

Another guess (still from the darkness of not having received the
slightest clue what the test actually does): if it creates integers
in range(2**32, 2**64), then they fit into a Python int on AMD64-Linux,
but require a Python long on 32-bit Windows; long operations are much
slower than int operations.
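
A rough model of that boundary, as a sketch only. It assumes the Python 2 rule that an int is a signed C long (32 bits on 32-bit Windows, 64 bits on AMD64 Linux); the function name python2_type_for is made up for the example. Python 3 has a single int type, so the code only models the old cut-off rather than measuring it.

```python
def python2_type_for(value, c_long_bits):
    """Model which Python 2 type would hold value on a platform whose
    C long has c_long_bits bits (int is a signed C long in Python 2)."""
    max_int = 2 ** (c_long_bits - 1) - 1  # what Python 2 called sys.maxint
    min_int = -max_int - 1
    return "int" if min_int <= value <= max_int else "long"


if __name__ == "__main__":
    print("value, 32-bit C long, 64-bit C long")
    for value in (2**31 - 1, 2**32, 2**62, 2**63):
        print(value, python2_type_for(value, 32), python2_type_for(value, 64))
```

Under this model the 64-bit int is signed, so only values below 2**63 stay in the int range there; from 2**63 upwards both platforms would need a long.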

Regards,
Martin


Re: x64 speed

2009-02-04 Thread M.-A. Lemburg
On 2009-02-04 11:14, Robin Becker wrote:
 Martin v. Löwis wrote:
 I follow David's guess that Linux does better IO than Windows (not
 knowing anything about the benchmark, of course)

 I originally thought it must be the vmware host stuff offloading IO to
 the second core, but watching with sysinternals didn't show a lot of
 extra stuff going on with the vm compared to just running on the host.

 I'm not talking about vmware. I'm suggesting that Linux ext3, and the
 Linux buffer handling, is just more efficient than NTFS, and the Windows
 buffer handling.

 If you split the total runtime into system time and user time, how do
 the 30s split up?
 ...
 so here is one for the "vm clock is bad" theorists :)
 
 
 [rpt...@localhost tests]$ time python25 runAll.py
 .
 
 .
 --
 Ran 193 tests in 27.841s

 OK

 real    0m28.150s
 user    0m26.606s
 sys     0m0.917s
 [rpt...@localhost tests]$
 
 magical how the total python time is less than the real time.

time(1) also measures the Python startup and shutdown time, so
I don't quite see the magic :-(
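
The figures in the quoted run can be worked through with a short sketch. The input is copied from the output above; the helper name parse_time_output is made up for the example, and the regular expression assumes the bash "0m28.150s" layout shown in the thread.

```python
import re

# Figures copied from the quoted run.
TIME_OUTPUT = """\
real    0m28.150s
user    0m26.606s
sys     0m0.917s
"""
UNITTEST_SECONDS = 27.841  # from "Ran 193 tests in 27.841s"


def parse_time_output(text):
    """Turn lines such as 'real 0m28.150s' into {'real': 28.15, ...}."""
    result = {}
    pattern = r"(real|user|sys)\s+(\d+)m([\d.]+)s"
    for name, minutes, seconds in re.findall(pattern, text):
        result[name] = int(minutes) * 60 + float(seconds)
    return result


times = parse_time_output(TIME_OUTPUT)
cpu = times["user"] + times["sys"]

if __name__ == "__main__":
    print("CPU time (user + sys):      %.3fs" % cpu)
    print("wall time not on the CPU:   %.3fs" % (times["real"] - cpu))
    print("outside the unittest timer: %.3fs" % (times["real"] - UNITTEST_SECONDS))
```

With these numbers, user plus sys is 27.523s, about 0.627s of wall time was spent off the CPU, and about 0.309s of the real time falls outside the interval unittest reports.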

FWIW: VMware VMs need the VMware tools installed to make their
clocks work more or less. With Linux, you need some extra tweaks
as well, otherwise the clocks are just completely unreliable.

See these notes:

http://kb.vmware.com/selfservice/viewContent.do?language=en_US&externalId=1420
http://communities.vmware.com/message/782173

-- 
Marc-Andre Lemburg
eGenix.com



Re: x64 speed

2009-02-03 Thread Tim Daneliuk
Robin Becker wrote:
 Whilst doing some portability testing with reportlab I noticed a strange
 speedup for our unittest suite with python2.5
 
 host  win32 xp3 unittest time=42.2 seconds
 vmware RHEL x64 unittest time=30.9 seconds
 
 so it looks like the vmware emulated system is much faster. Is it the
 x64 working faster at its design sizes or perhaps the compiler or could
 it be the vmware system caching all writes etc etc? For the red hat x64
 build the only special configuration was to use ucs2
 
 I know that the VT bit stuff has made virtualization much better, but
 this seems a bit weird.

Which vmware product?

-- 

Tim Daneliuk tun...@tundraware.com
PGP Key: http://www.tundraware.com/PGP/


Re: x64 speed

2009-02-03 Thread David Cournapeau
On Wed, Feb 4, 2009 at 2:36 AM, Robin Becker ro...@reportlab.com wrote:
 Whilst doing some portability testing with reportlab I noticed a strange
 speedup for our unittest suite with python2.5

 host  win32 xp3 unittest time=42.2 seconds
 vmware RHEL x64 unittest time=30.9 seconds

 so it looks like the vmware emulated system is much faster. Is it the x64
 working faster at its design sizes or perhaps the compiler or could it be
 the vmware system caching all writes etc etc? For the red hat x64 build the
 only special configuration was to use ucs2

 I know that the VT bit stuff has made virtualization much better, but this
 seems a bit weird.

It can be many things of course, depending on your configuration and
what you are doing in your unit tests, but I don't find it weird at
all. I often see faster results when IO is involved in my Ubuntu in
VMware Fusion than on Mac OS X. Again, it can be that the VM is not
loaded (fewer Python packages, faster import times), or that Linux
handles IO better (I don't know whether this is true or not, but I
could at least imagine Linux filesystems generally being faster than
the Windows or Mac OS X ones, and this could influence IO).

It can also be compiler differences, 32 vs 64 bits as you say, etc...
If you want to be sure, you should try a Windows VM :)

David


Re: x64 speed

2009-02-03 Thread Martin v. Löwis
Robin Becker wrote:
 Whilst doing some portability testing with reportlab I noticed a strange
 speedup for our unittest suite with python2.5
 
 host  win32 xp3 unittest time=42.2 seconds
 vmware RHEL x64 unittest time=30.9 seconds
 
 so it looks like the vmware emulated system is much faster. Is it the
 x64 working faster at its design sizes or perhaps the compiler or could
 it be the vmware system caching all writes etc etc? 

I follow David's guess that Linux does better IO than Windows (not
knowing anything about the benchmark, of course)

Regards,
Martin


Re: x64 speed

2009-02-03 Thread Diez B. Roggisch

Robin Becker schrieb:
Whilst doing some portability testing with reportlab I noticed a strange 
speedup for our unittest suite with python2.5


host  win32 xp3 unittest time=42.2 seconds
vmware RHEL x64 unittest time=30.9 seconds

so it looks like the vmware emulated system is much faster. Is it the 
x64 working faster at its design sizes or perhaps the compiler or could 
it be the vmware system caching all writes etc etc? For the red hat x64 
build the only special configuration was to use ucs2


I know that the VT bit stuff has made virtualization much better, but 
this seems a bit weird.



AFAIK some VMs have difficulties with timers. For example, my 
virtualized KDE has that jumping icon when starting a program - and 
that's *much* faster jumping inside VBox :)


So - are you sure it *is* faster?

Diez


Re: x64 speed

2009-02-03 Thread Paul Rubin
Robin Becker ro...@reportlab.com writes:
 so it looks like the vmware emulated system is much faster. Is it the
 x64 working faster at its design sizes or perhaps the compiler or
 could it be the vmware system caching all writes etc etc? For the red
 hat x64 build the only special configuration was to use ucs2

You have to control all these variables separately in order to know.
But, 64 bit code is in general faster than 32 bit code when properly
compiled: more cpu registers, wider moves when copying large blocks of
data, floating point registers instead of the legacy stack-oriented
FPU, etc.
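
One way to start separating those variables is to record the build facts next to each timing, so that two runs can be compared on more than the headline number. This is a sketch only; the function name build_facts and the choice of fields are assumptions, not something used in the thread.

```python
import platform
import struct
import sys


def build_facts():
    """Collect interpreter and platform facts worth logging with a timing."""
    return {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "compiler": platform.python_compiler(),
        "os": platform.system(),
        "machine": platform.machine(),
        # 32 or 64: the width of a pointer in this interpreter build.
        "pointer_bits": struct.calcsize("P") * 8,
        # Width of a C long, which is what a Python 2 int was built on.
        "c_long_bits": struct.calcsize("l") * 8,
        # 65535 on a ucs2 (narrow) build of Python 2; always 1114111 on
        # Python 3.3 and later.
        "maxunicode": sys.maxunicode,
    }


if __name__ == "__main__":
    for key, value in sorted(build_facts().items()):
        print("%-15s %s" % (key, value))
```

Running it on both the host and the VM would at least show which of word size, compiler and Unicode width actually differ between the two builds.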


Re: x64 speed

2009-02-03 Thread Martin v. Löwis
 I follow David's guess that Linux does better IO than Windows (not
 knowing anything about the benchmark, of course)

 I originally thought it must be the vmware host stuff offloading IO to
 the second core, but watching with sysinternals didn't show a lot of
 extra stuff going on with the vm compared to just running on the host.

I'm not talking about vmware. I'm suggesting that Linux ext3, and the
Linux buffer handling, is just more efficient than NTFS, and the Windows
buffer handling.

If you split the total runtime into system time and user time, how do
the 30s split up?
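
Besides time(1), the split can also be taken from inside the process. A minimal sketch using os.times(), which reports user and system CPU time on both Windows and Linux; the names split_times and sample_workload are made up, and the workload is only a stand-in for the real test suite.

```python
import os
import time


def split_times(func):
    """Run func() and return (user, system, wall) seconds for this process.

    CPU time used by child processes is not included.
    """
    cpu_before = os.times()
    wall_before = time.perf_counter()
    func()
    wall_after = time.perf_counter()
    cpu_after = os.times()
    return (
        cpu_after.user - cpu_before.user,
        cpu_after.system - cpu_before.system,
        wall_after - wall_before,
    )


def sample_workload():
    """Hypothetical stand-in for the test suite: computation plus writes."""
    total = sum(i * i for i in range(300000))
    with open(os.devnull, "w") as sink:
        for _ in range(2000):
            sink.write("x" * 1024)
    return total


if __name__ == "__main__":
    user, system, wall = split_times(sample_workload)
    print("user %.3fs  system %.3fs  wall %.3fs" % (user, system, wall))
```

A large system share would point towards IO and kernel work, while a run dominated by user time would point towards the interpreter and the compiler instead.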

Regards,
Martin


Re: x64 speed

2009-02-03 Thread Robin Becker

Diez B. Roggisch wrote:

Robin Becker schrieb:
Whilst doing some portability testing with reportlab I noticed a 
strange speedup for our unittest suite with python2.5


host  win32 xp3 unittest time=42.2 seconds
vmware RHEL x64 unittest time=30.9 seconds

so it looks like the vmware emulated system is much faster. Is it the 
x64 working faster at its design sizes or perhaps the compiler or 
could it be the vmware system caching all writes etc etc? For the red 
hat x64 build the only special configuration was to use ucs2


I know that the VT bit stuff has made virtualization much better, but 
this seems a bit weird.



AFAIK some VMs have difficulties with timers. For example, my 
virtualized KDE has that jumping icon when starting a program - and 
that's *much* faster jumping inside VBox :)



..

Diez

I started both in terminals and the host never wins :)

--
Robin Becker


Re: x64 speed

2009-02-03 Thread Robin Becker

Tim Daneliuk wrote:
..


Which vmware product?



vmware server
--
Robin Becker


Re: x64 speed

2009-02-03 Thread Robin Becker

Martin v. Löwis wrote:
.

I follow David's guess that Linux does better IO than Windows (not
knowing anything about the benchmark, of course)

Regards,
Martin


I originally thought it must be the vmware host stuff offloading IO to 
the second core, but watching with sysinternals didn't show a lot of 
extra stuff going on with the vm compared to just running on the host.

--
Robin Becker


Re: x64 speed

2009-02-03 Thread Robin Becker

Paul Rubin wrote:

Robin Becker ro...@reportlab.com writes:

so it looks like the vmware emulated system is much faster. Is it the
x64 working faster at its design sizes or perhaps the compiler or
could it be the vmware system caching all writes etc etc? For the red
hat x64 build the only special configuration was to use ucs2


You have to control all these variables separately in order to know.
But, 64 bit code is in general faster than 32 bit code when properly
compiled: more cpu registers, wider moves when copying large blocks of
data, floating point registers instead of the legacy stack-oriented
FPU, etc.


I tried looking at the cpu usage whilst running these and by eye it 
seemed that the host system was running more parallel stuff than the 
vmware vm.

--
Robin Becker