Re: [OT] IIS7/isapi/tomcat performance

2011-03-03 Thread Tony Anecito
Hi Chuck,

You did not see my earlier response, where I came to the same conclusion about
the types after looking at some other sites, including a wiki. Yes, there was
some confusion, but now I am clear that it is compiler dependent, as I said earlier.
 
Thanks,
-Tony


  

-
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org



Re: [OT] IIS7/isapi/tomcat performance

2011-03-02 Thread Christopher Schultz

Tony,

On 3/1/2011 6:27 PM, Tony Anecito wrote:
 I believe the effect of compression is relative. In other words for a big
 program with lots of 64-bit pointers and 64-bit longs it helps but for small
 programs it does not.

A long in Java is always 64 bits. Those /will/ be faster on a 64-bit
architecture. The only reason any of this is a problem is because
pointers (somewhat) unexpectedly double in size when moving from a
32-bit to a 64-bit platform. If you were running fine in a 128MiB heap
on a 32-bit machine, you may well have to increase your heap size on a
64-bit machine just to store the exact same set of objects.
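
To make the heap-size point concrete, here is a rough back-of-the-envelope sketch in Java. The object count, the number of reference fields per object, and the 4-byte versus 8-byte reference widths are illustrative assumptions, and `HeapEstimate` is a hypothetical helper rather than any JVM API. Real object layouts also include headers and padding, which this ignores.

```java
// Rough estimate of how much heap is spent on references alone.
// Assumption: references take 4 bytes on a 32-bit JVM and 8 bytes on a
// 64-bit JVM without compressed pointers; headers and padding are ignored.
final class HeapEstimate {
    static long referenceBytes(long objects, int refsPerObject, int bytesPerRef) {
        return objects * refsPerObject * bytesPerRef;
    }

    public static void main(String[] args) {
        long objects = 1000000L;  // hypothetical object count
        int refsPerObject = 4;    // hypothetical reference fields per object
        long on32 = referenceBytes(objects, refsPerObject, 4);
        long on64 = referenceBytes(objects, refsPerObject, 8);
        System.out.println("32-bit JVM: " + on32 + " bytes of references");
        System.out.println("64-bit JVM: " + on64 + " bytes of references");
    }
}
```

The same set of objects needs twice the space for its references, which is why a heap that was comfortable at 128MiB may need to grow.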

 I would hope the full 64-bit data bus would be used. So you think 32-pins on
 the processor are not used when running a 32-bit process?

It depends upon exactly what the processor is doing. Those chips with
bundled x86 cores will use the x86 core (which is /only/ 32-bit, so
there's no option for 64-bit operations). Those chips which have only
x86-64 cores will either use 64 bits to manipulate 32-bit data (and
effectively waste the 32 most significant bits) or wave their hands
wildly and achieve some sort of miracle where 32-bit processes run twice
as fast because of a wider word size.

-chris



Re: [OT] IIS7/isapi/tomcat performance

2011-03-02 Thread Christopher Schultz

Chuck,

On 3/1/2011 6:09 PM, Caldarale, Charles R wrote:
 From: Christopher Schultz [mailto:ch...@christopherschultz.net] 
 Subject: Re: [OT] IIS7/isapi/tomcat performance
 
 I don't understand why communicating a 64-bit value over a 
 64-bit bus would take longer than communicating a 32-bit 
 value over a 64-bit bus:
 
 Because you get *two* 32-bit values for one transfer, not just one. 

If, as you say, Intel can move 64 /bytes/ across a data path (if you
prefer that phrase over the bus) then the word size really does make a
difference, here. They should be getting 16 32-bit words across such a
data path or 8 64-bit words.

If the pointers are doubling in size, this makes 64-bit mode go slower
because you get half the throughput when using word-sized values. Since
pointers in general are word-sized, they always suffer while other
(usually smaller) data does not.

The key is that the data path(s) are actually much wider than the word
size, which I didn't realize.
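
The arithmetic above (16 32-bit words or 8 64-bit words per 64-byte transfer) can be checked with a few lines of Java. This is only a sketch: the 64-byte width is the figure quoted in this thread, and the class and method names are made up for illustration.

```java
// How many values of a given width fit in one transfer across a data path.
// The 64-byte path width is the figure quoted in this thread; the names
// here are hypothetical.
final class DataPathMath {
    static int valuesPerTransfer(int pathBytes, int valueBits) {
        return (pathBytes * 8) / valueBits;
    }

    public static void main(String[] args) {
        System.out.println(valuesPerTransfer(64, 32)); // 32-bit words per transfer
        System.out.println(valuesPerTransfer(64, 64)); // 64-bit words per transfer
    }
}
```

Halving the value width doubles the number of values moved per transfer, which is the throughput difference being discussed for word-sized pointers.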

 I also get that some processors (like Itanium) have an x84
 processor core on the die
 
 (Presumably, you meant x86.)  Sorry, Itanium was notoriously bad at running 
 32-bit apps.

I did mean x86. Lots of typing yesterday. The new Itaniums are supposed
to be actually worth it, though.

 getting the data from point A to point B shouldn't matter
 
 Sure it does, if you can batch multiple operand accesses together (which 
 current Intel cores do).
 
 I suppose if the CPU knew it was in a 32-bit mode, it could 
 adjust the number of clock ticks it had to wait around for 
 32-bit data to go through an adder, but that seems overly 
 complicated for a straightforward CPU task.
 
 Simple adders have only used one cycle for decades, regardless of the width.

If the clock tick is long enough :)

-chris



Re: [OT] IIS7/isapi/tomcat performance

2011-03-02 Thread Tony Anecito
Actually, according to the IBM porting guide, longs are different byte lengths
depending upon what frame of reference they are speaking to.
On page 4 of the following port guide:

http://public.dhe.ibm.com/software/dw/jdk/64bitporting/64BitJavaPortingGuide.pdf

It states: For Windows, on 32-bit systems, integers, longs and pointers are all
32-bits. On 64-bit systems, integers and longs remain 32-bits, but pointers
become 64-bits and long longs are 64-bits. For AIX and Linux, on 32-bit systems,
integers, longs and pointers are all 32-bits. On 64-bit systems, integers remain
32-bits and longs and pointers become 64-bits.

I could have interpreted this wrong, but from an OS standpoint, for native code,
this is what they said. Now, the byte code is translated to native code (which it
must be to run), so this would explain why Windows might seem to run faster than
Linux for 64-bit.

Regarding bus usage, I agree with Chuck's explanation of how the processors use
it, and I said so in a previous message.

Regards,
-Tony



Re: [OT] IIS7/isapi/tomcat performance

2011-03-02 Thread Tony Anecito
On the wiki, a Java long is 64 bits; I am not sure what a Long is. So IBM is
thinking of C/C++, where a long is 32 bits, which is what the paper meant.

So I was wrong.

Regards,
-Tony






RE: [OT] IIS7/isapi/tomcat performance

2011-03-02 Thread Caldarale, Charles R
 From: Tony Anecito [mailto:adanec...@yahoo.com] 
 Subject: Re: [OT] IIS7/isapi/tomcat performance

 On page 4 of the following port guide:
 http://public.dhe.ibm.com/software/dw/jdk/64bitporting/64BitJavaPortingGuide.pdf

 It states:For Windows, on 32-bit systems, integers, longs and 
 pointers are all 32-bits. On 64-bit systems, integers and longs
 remain 32-bits, but pointers become 64-bits and long longs are 
 64-bits.

That ancient porting guide is misleading in several respects, one in particular
being its implication that the OS determines the size of language-specific types.
That is incorrect; it's the *compiler* being used that makes that determination,
not the platform or the OS.

 This would explain why Windows might seem to run faster than 
 Linux for 64-bit.

Sorry, that's completely false.  As Chris pointed out, Java non-reference type 
sizes are fixed, and are completely independent of the platform the Java 
program is running on.  The only thing that changes between a 32-bit JVM and a 
64-bit one is the size of a reference (pointer).  Even in the C and C++ code 
that makes up the core of the JVM, the programmers studiously avoid use of 
ambiguous C types such as int and long anywhere that it might make a 
difference, and instead use explicitly sized types.
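
As a small illustration of the fixed-size point, the following sketch prints the widths of a few Java primitives using the standard `SIZE` constants on the wrapper classes. The class name `PrimitiveSizes` is made up for this example. The `os.arch` system property is standard, but the format of its value varies by platform, so treat that line as informational only.

```java
// Java fixes the width of every primitive type in the language itself, so
// these constants are the same on any JVM, 32-bit or 64-bit.
final class PrimitiveSizes {
    static int shortBits() { return Short.SIZE; }
    static int intBits()   { return Integer.SIZE; }
    static int longBits()  { return Long.SIZE; }

    public static void main(String[] args) {
        System.out.println("short: " + shortBits() + " bits");
        System.out.println("int:   " + intBits() + " bits");
        System.out.println("long:  " + longBits() + " bits");
        // The architecture the JVM runs on is a separate question from
        // primitive widths; only reference size depends on it.
        System.out.println("os.arch: " + System.getProperty("os.arch"));
    }
}
```

Running this on a 32-bit and a 64-bit JVM should print the same three widths; only the reference size, which Java code cannot observe directly this way, differs.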

 - Chuck


THIS COMMUNICATION MAY CONTAIN CONFIDENTIAL AND/OR OTHERWISE PROPRIETARY 
MATERIAL and is thus for use only by the intended recipient. If you received 
this in error, please contact the sender and delete the e-mail and its 
attachments from all computers.





Re: [OT] IIS7/isapi/tomcat performance

2011-03-01 Thread Christopher Schultz

Tony,

On 2/28/2011 2:57 PM, Tony Anecito wrote:
 Since the memory pointers are larger you may need to increase your heap size
 but you can compress the address pointers.

+1

 Also, if you use JNI and it is 32-bit then you will have unexpected issues;
 same thing with any native libs you try to use.

+1

 Generally it will be up to 20% slower due to the pointers.

Can you explain that claim? Unless the OP is using compressed pointers
(which will require a decode in order to dereference), why would the
performance drop when using 64-bit pointers instead of 32-bit pointers?
Presumably, the CPU has 64-bit (or bigger) registers and can handle
64-bit numbers just as fast as 32-bit numbers. Or do modern CPUs run in
a 32-bit mode where the hardware doesn't bother to add-out to the 33+ bits?

-chris



Re: [OT] IIS7/isapi/tomcat performance

2011-03-01 Thread Tony Anecito
Hi Chris,

The performance degradation for 64-bit versus 32-bit has been the subject of
much concern in the Java community.

Here is the number I mentioned straight from Oracle itself:
http://www.oracle.com/technetwork/java/hotspotfaq-138619.html

What are the performance characteristics of 64-bit versus 32-bit VMs?
Generally, the benefits of being able to address larger amounts of memory come
with a small performance loss in 64-bit VMs versus running the same application
on a 32-bit VM.  This is due to the fact that every native pointer in the system
takes up 8 bytes instead of 4.  The loading of this extra data has an impact on
memory usage which translates to slightly slower execution depending on how many
pointers get loaded during the execution of your Java program.  The good news is
that with AMD64 and EM64T platforms running in 64-bit mode, the Java VM gets
some additional registers which it can use to generate more efficient native
instruction sequences.  These extra registers increase performance to the point
where there is often no performance loss at all when comparing 32 to 64-bit
execution speed.

The performance difference comparing an application running on a 64-bit platform
versus a 32-bit platform on SPARC is on the order of 10-20% degradation when you
move to a 64-bit VM.  On AMD64 and EM64T platforms this difference ranges from
0-15% depending on the amount of pointer accessing your application performs.




If you google using the keywords: java 64-bit vs 32-bit performance
you will find a lot of discussion about this.

Regards,
-Tony





Re: [OT] IIS7/isapi/tomcat performance

2011-03-01 Thread Tony Anecito
Also, I have not programmed in assembly language or in hexadecimal  for some 
time but I would hope that for a 32-bit java process running on a 64-bit 
processor I would fetch a 32-bit pointer and maybe a 32-bit long on a 64-bit 
data bus. Remember we are talking about pointers in code coming into the 
processor via the data bus.

Interestingly enough, for AIX and Linux a long is 64-bit for 64-bit Java versus
32-bit for 64-bit Windows. So it looks like for Linux it would be slower than
Windows.
See: 
http://public.dhe.ibm.com/software/dw/jdk/64bitporting/64BitJavaPortingGuide.pdf

The bottom line is that how much worse things get depends upon how many pointers
and longs are used in 64-bit Java.

Regards,
-Tony






Re: [OT] IIS7/isapi/tomcat performance

2011-03-01 Thread Christopher Schultz

Tony,

On 3/1/2011 3:24 PM, Tony Anecito wrote:
 Also, I have not programmed in assembly language or in hexadecimal  for some 
 time but I would hope that for a 32-bit java process running on a 64-bit 
 processor I would fetch a 32-bit pointer and maybe a 32-bit long on a 64-bit 
 data bus. Remember we are talking about pointers in code coming into the 
 processor via the data bus.

The bus on a 64-bit architecture had better be at least 64-bits wide,
otherwise nothing works right. They used to run 64-bit OSs on 32-bit
hardware and everything took twice as long because the bus was only
32-bit and so every piece of (64-bit) data took double the time to
transmit. Booting 64-bit WinNT would take a looong time.

I don't understand why communicating a 64-bit value over a 64-bit bus
would take longer than communicating a 32-bit value over a 64-bit bus:
the clock speed of the bus is the same... the only difference between
the two scenarios is that the user doesn't care about the upper 32-bits
of data.

The only thing that makes sense to me intuitively at this point (I'm
still reading) is that using compressed object pointers slows things down.

 Interesting enough for AIX and Linux a long is 64bit for 64-bit java versus 
 32-bit for 64-bit windows. So it looks like for Linux it would be slower than 
 windows.
 See: 
 http://public.dhe.ibm.com/software/dw/jdk/64bitporting/64BitJavaPortingGuide.pdf

That's interesting, though it doesn't specify what compiler is being
used. The only thing that makes a long value 32-bit or 64-bit is the
compiler compiling the code where the word long is present. Java fixes
the size of all native data types, so a Java long is always 64-bits
regardless of the architecture. ISO C declares that long is at least
32-bit, short is at least 16-bit, and plain-old int is somewhere in
between whatever short and long turn out to be.

That document seems to imply that the OS decides what the type widths
are, and that only matters when interfacing with OS calls: if you call
brk() and it expects a 64-bit value, if you provide a 32-bit one, bad
things will happen.

 Bottom line on how much worse things get is based upon how many pointers and 
 longs are used for 64-bit java that are used.

I still don't get why moving 64-bit values around is slower than moving
32-bit values around: the bus is 64-bits no matter what mode you're in.
I *do* get that compressed pointers slow things down. I *do* get that
the heap will grow somewhere approaching twice the size as in a 32-bit
JVM. I also get that some processors (like Itanium) have an x84
processor core on the die, so that processor can avoid (uselessly)
performing 64-bit operations on 32-bit data, but getting the data from
point A to point B shouldn't matter. Also, performing 64-bit operations
on 32-bit data should take just as long as performing 64-bit operations
on 64-bit data: the ALU goes as fast as it's designed to go.

I suppose if the CPU knew it was in a 32-bit mode, it could adjust the
number of clock ticks it had to wait around for 32-bit data to go
through an adder, but that seems overly complicated for a
straightforward CPU task.

-chris



Re: [OT] IIS7/isapi/tomcat performance

2011-03-01 Thread Tony Anecito
Hi Chris,
 
I guess you have not read my last email yet. I think of it as putting two
32-bit pieces of info on a 64-bit data bus, whereas for two 64-bit pieces of
information it takes two fetches, or twice as long, on the same hardware.
Depending upon the number of bytes for each data type for 32-bit versus 64-bit,
a 20% performance reduction makes sense.

As for compressing the pointers, all I read is that it improves response time, so
that maybe running on 64-bit Java the program is only 1% slower. I am assuming the
pointers are compressed after the first pass or even before the byte code is
run.
 
Regards,
-Tony



  




Re: [OT] IIS7/isapi/tomcat performance

2011-03-01 Thread Christopher Schultz

Tony,

On 3/1/2011 4:19 PM, Tony Anecito wrote:
 I guess you have not read my last email yet. I think of it as putting two  
 32-bit pieces of info on a 64-bit data bus whereas for two 64-bit pieces of  
 information it takes two fetches or twice as long on the same hardware.

Are you saying that a 32-bit JVM running on a 64-bit machine somehow
utilizes the 64-bit bus? Malarkey. Perhaps the CPU as part of its
instruction re-ordering can do this, but I seriously doubt that a 32-bit
process on a 64-bit CPU gains a performance boost over that same 32-bit
process running on a 32-bit CPU (which is what the above would imply).

 As for compressing the pointers all I read is it improves response time

I can't believe that for a second. It actually slows things down. The
only reason to compress pointers is so that your heap size doesn't
roughly double when switching to 64-bit. The problem is that while the
transition from 32-bit to 64-bit architecture now allows many orders of
magnitude more memory to be accessed by each process (this is especially
important for Java heaps), the amount of memory installed in servers has
not really changed. 5 years ago, it wasn't uncommon for a 32-bit server
to have 32GiB of memory. These days, a similar 64-bit server might still
only have 32GiB of memory.

 so that 
 maybe running on 64-bit java the program is only 1% slower. I am assuming the 
 pointers are compressed after the first pass or even before the byte code is 
 run.

The pointers are compressed as the objects (really the references to
them) are created. The problem is that they must be uncompressed for
every dereference. It has nothing to do with the bytecode.
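
The compress-on-store, decompress-on-dereference cycle described above can be sketched roughly as follows. This illustrates the general idea only and is not HotSpot's actual implementation: the heap base, the 8-byte object alignment (a 3-bit shift), and every name in the code are assumptions made for the example.

```java
// Sketch of a compressed-pointer scheme: a 64-bit address is stored as a
// 32-bit offset from a heap base, scaled by the object alignment.
// HEAP_BASE and the 8-byte alignment are assumptions for illustration.
final class CompressedPointer {
    static final long HEAP_BASE = 0x800000000L; // hypothetical heap start
    static final int SHIFT = 3;                 // 8-byte aligned: low 3 bits are zero

    // Done when a reference is stored.
    static int compress(long address) {
        return (int) ((address - HEAP_BASE) >>> SHIFT);
    }

    // Done on every dereference: this shift-and-add is the decode cost.
    static long decompress(int compressed) {
        return HEAP_BASE + ((compressed & 0xFFFFFFFFL) << SHIFT);
    }

    // Largest heap reachable with 32 bits of offset at this alignment.
    static long addressableBytes() {
        return (1L << 32) << SHIFT;
    }

    public static void main(String[] args) {
        long address = HEAP_BASE + 4096; // hypothetical, 8-byte aligned
        int narrow = compress(address);
        System.out.println(decompress(narrow) == address);
        System.out.println(addressableBytes() / (1L << 30) + " GiB addressable");
    }
}
```

Under these assumptions each reference occupies 4 bytes in the heap instead of 8, at the price of the extra arithmetic in `decompress` on each dereference. In HotSpot the feature is controlled by the `-XX:+UseCompressedOops` option; how its decode is actually implemented is beyond what this sketch claims.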

-chris



RE: [OT] IIS7/isapi/tomcat performance

2011-03-01 Thread Caldarale, Charles R
 From: Christopher Schultz [mailto:ch...@christopherschultz.net] 
 Subject: Re: [OT] IIS7/isapi/tomcat performance

 Are you saying that a 32-bit JVM running on a 64-bit machine 
 somehow utilizes the 64-bit bus?  Malarkey.

I wouldn't bet on that.  Intel goes to great pains to ensure all of the buses 
are fully utilized.  On a 64-bit machine, all of the data paths from RAM up to 
the L1 operand cache will be able to move twice the number of items per cycle 
when the items are only 32 bits wide.  Between the L1 cache and the superscalar 
execution core, there may be less of a gain, but since the core contains three 
ALUs and separate load and store sections to service them, memory operations 
are combined wherever possible to get data in and out as fast as possible.

 - Chuck





RE: [OT] IIS7/isapi/tomcat performance

2011-03-01 Thread Caldarale, Charles R
 From: Christopher Schultz [mailto:ch...@christopherschultz.net] 
 Subject: Re: [OT] IIS7/isapi/tomcat performance

 I don't understand why communicating a 64-bit value over a 
 64-bit bus would take longer than communicating a 32-bit 
 value over a 64-bit bus:

Because you get *two* 32-bit values for one transfer, not just one. 
 
BTW, it's somewhat pointless to use the unqualified term bus when referring 
to modern CPU architecture.  Now that Intel has finally figured out how to make 
multi-processor systems run at a reasonable speed by using techniques we 
implemented back in the 1960s, along with the advent of multiple memory cache 
levels, there's no longer a single bus to be concerned with.  Most of them are 
wider than 64 bits in order to move as much data as possible; even ten years 
ago, Intel was moving 64 _bytes_ at a time on most of the data paths.

 I also get that some processors (like Itanium) have an x84
 processor core on the die

(Presumably, you meant x86.)  Sorry, Itanium was notoriously bad at running 
32-bit apps.

 getting the data from point A to point B shouldn't matter

Sure it does, if you can batch multiple operand accesses together (which 
current Intel cores do).

 I suppose if the CPU knew it was in a 32-bit mode, it could 
 adjust the number of clock ticks it had to wait around for 
 32-bit data to go through an adder, but that seems overly 
 complicated for a straightforward CPU task.

Simple adders have only used one cycle for decades, regardless of the width.

 - Chuck





Re: [OT] IIS7/isapi/tomcat performance

2011-03-01 Thread Christopher Schultz

Chuck,

On 3/1/2011 5:42 PM, Caldarale, Charles R wrote:
 From: Christopher Schultz [mailto:ch...@christopherschultz.net] 
 Subject: Re: [OT] IIS7/isapi/tomcat performance
 
 Are you saying that a 32-bit JVM running on a 64-bit machine 
 somehow utilizes the 64-bit bus?  Malarkey.
 
 I wouldn't bet on that.  Intel goes to great pains to ensure all of
 the buses are fully utilized.  On a 64-bit machine, all of the data
 paths from RAM up to the L1 operand cache will be able to move twice
 the number of items per cycle when the items are only 32 bits wide.

The question I have is how does the bus controller know that there are
multiple 32-bit values coming down the line, and that it can send them
simultaneously down the bus? There's more data to be sent over the bus
than just pointers to other pieces of data. You have to move the
instruction itself, etc. so there's lots of opportunities for other data
to get in the way of this DRR-style data transfer across the bus.

 Between the L1 cache and the superscalar execution core, there may be
 less of a gain, but since the core contains three ALUs and separate
 load and store sections to service them, memory operations are
 combined wherever possible to get data in and out as fast as
 possible.

I buy this argument, but that would only affect the processing of, say,
a 64-bit pointer within the core... not the speed of passing that
pointer around the rest of the machine. As you say, probably less of a gain.

I'd love to see some real documentation and/or testing on this type of
stuff. I certainly am somewhat naïve when it comes to details this low,
but my intuition tells me that the CPU and bus aren't magic :)

- -chris
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.10 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk1tfa4ACgkQ9CaO5/Lv0PBxlQCgjvY/NcigAvD/jXIWfckKUbju
tUgAn2bfMa3iEuQeUe0j2ZqmgVxGn+dx
=Vubd
-END PGP SIGNATURE-




Re: [OT] IIS7/isapi/tomcat performance

2011-03-01 Thread Tony Anecito
I believe the effect of compression is relative. In other words, for a big 
program with lots of 64-bit pointers and 64-bit longs it helps, but for small 
programs it does not.

I would hope the full 64-bit data bus would be used. So you think 32 pins on 
the processor are not used when running a 32-bit process? I am not saying you 
are not correct, but I am curious, so I will check into it and let you know 
what I find.

I have never mentioned byte code as pointers; all my references are to 
executable code, or what the processor actually runs.

Regards,
-Tony


 


- Original Message 
From: Christopher Schultz ch...@christopherschultz.net
To: Tomcat Users List users@tomcat.apache.org
Sent: Tue, March 1, 2011 3:00:53 PM
Subject: Re: [OT] IIS7/isapi/tomcat performance

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Tony,

On 3/1/2011 4:19 PM, Tony Anecito wrote:
 I guess you have not read my last email yet. I think of it as putting two  
 32-bit pieces of info on a 64-bit data bus whereas for two 64-bit pieces of  
 information it takes two fetches or twice as long on the same hardware.

Are you saying that a 32-bit JVM running on a 64-bit machine somehow
utilizes the 64-bit bus? Malarkey. Perhaps the CPU as part of its
instruction re-ordering can do this, but I seriously doubt that a 32-bit
process on a 64-bit CPU gains a performance boost over that same 32-bit
process running on a 32-bit CPU (which is what the above would imply).

 As for compressing the pointers all I read is it improves response time

I can't believe that for a second. It actually slows things down. The
only reason to compress pointers is so that your heap size doesn't
roughly double when switching to 64-bit. The problem is that while the
transition from 32-bit to 64-bit architecture now allows many orders of
magnitude more memory to be accessed by each process (this is especially
important for Java heaps), the amount of memory installed in servers has
not really changed. 5 years ago, it wasn't uncommon for a 32-bit server
to have 32GiB of memory. These days, a similar 64-bit server might still
only have 32GiB of memory.

 so that 
 maybe running on 64-bit java the program is only 1% slower. I am assuming the 
 pointers are compressed after the first pass or even before the byte code is 
 run.

The pointers are compressed as the objects (really the references to
them) are created. The problem is that they must be uncompressed for
every dereference. It has nothing to do with the bytecode.
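
The compress-on-store, uncompress-on-dereference cycle can be sketched as follows. This is only an illustration of the general idea (a 32-bit offset from a heap base, scaled by the object alignment), not the JVM's actual implementation; the 8-byte alignment, the heap base value, and all names here are assumptions chosen for the example.

```java
public class CompressedRef {
    // Assumption: objects are 8-byte aligned, so the low 3 bits of any
    // object address are zero and need not be stored.
    static final int SHIFT = 3;

    // Hypothetical heap base address, picked only for illustration.
    static final long HEAP_BASE = 0x0000_0008_0000_0000L;

    // Compress a 64-bit address into a 32-bit reference (done when a reference is stored).
    static int compress(long address) {
        return (int) ((address - HEAP_BASE) >>> SHIFT);
    }

    // Rebuild the 64-bit address (needed before every dereference).
    static long decompress(int ref) {
        // Mask so the 32-bit value is treated as unsigned before shifting.
        return HEAP_BASE + ((ref & 0xFFFF_FFFFL) << SHIFT);
    }

    public static void main(String[] args) {
        long address = HEAP_BASE + 0x1_2345_6788L; // 8-byte aligned, more than 4 GiB into the heap
        int ref = compress(address);
        System.out.println("round trip ok: " + (decompress(ref) == address));
    }
}
```

With these assumptions a 32-bit reference can address 2^32 * 8 bytes, i.e. 32 GiB of heap, at the cost of a shift and an add on each dereference.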

- -chris
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.10 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk1tbJUACgkQ9CaO5/Lv0PBZ3ACgrVFqcPNcIe+P3U1HW3QzRXpS
L3oAnj82GTkXoQcOwxYskRLXWwsrFTcn
=w2cy
-END PGP SIGNATURE-




Re: [OT] IIS7/isapi/tomcat performance

2011-03-01 Thread Tony Anecito
Thanks Chuck, I agree.
I used to design hardware from the '80s through the mid-'90s, so I understand 
what you are saying, but I have not kept up with actual designs since then. I 
jumped over to software after that.
I know I simplify some things but hope I am still correct. Feel free to correct 
me; I will try not to get emotional about it, but I do miss my 8080 and Z80.

-Tony



- Original Message 
From: Caldarale, Charles R chuck.caldar...@unisys.com
To: Tomcat Users List users@tomcat.apache.org
Sent: Tue, March 1, 2011 4:09:10 PM
Subject: RE: [OT] IIS7/isapi/tomcat performance

 From: Christopher Schultz [mailto:ch...@christopherschultz.net] 
 Subject: Re: [OT] IIS7/isapi/tomcat performance

 I don't understand why communicating a 64-bit value over a 
 64-bit bus would take longer than communicating a 32-bit 
 value over a 64-bit bus:

Because you get *two* 32-bit values for one transfer, not just one. 

BTW, it's somewhat pointless to use the unqualified term "bus" when referring 
to modern CPU architecture.  Now that Intel has finally figured out how to make 
multi-processor systems run at a reasonable speed by using techniques we 
implemented back in the 1960s, along with the advent of multiple memory cache 
levels, there's no longer a single bus to be concerned with.  Most of them are 
wider than 64 bits in order to move as much data as possible; even ten years 
ago, Intel was moving 64 _bytes_ at a time on most of the data paths.

 I also get that some processors (like Itanium) have an x84
 processor core on the die

(Presumably, you meant x86.)  Sorry, Itanium was notoriously bad at running 
32-bit apps.

 getting the data from point A to point B shouldn't matter

Sure it does, if you can batch multiple operand accesses together (which 
current Intel cores do).

 I suppose if the CPU knew it was in a 32-bit mode, it could 
 adjust the number of clock ticks it had to wait around for 
 32-bit data to go through an adder, but that seems overly 
 complicated for a straightforward CPU task.

Simple adders have only used one cycle for decades, regardless of the width.

- Chuck









RE: [OT] IIS7/isapi/tomcat performance

2011-03-01 Thread Caldarale, Charles R
 From: Christopher Schultz [mailto:ch...@christopherschultz.net] 
 Subject: Re: [OT] IIS7/isapi/tomcat performance

 The question I have is how does the bus controller know 
 that there are multiple 32-bit values coming down the line,
 and that it can send them simultaneously down the bus?

A traditional bus controller hasn't been used in quite some time, and buses 
themselves are rapidly being replaced by point-to-point connections (finally), 
at least in terms of CPUs accessing memory.  The interface between the L1 
operand cache and the multiple ALUs is under control of a scheduler that's 
aware of the possible 72 simultaneous loads and stores going on, so it can 
combine accesses as it sees fit.  Accesses between lower-level caches and 
actual RAM have always been wider than the data path within a core.

 There's more data to be sent over the bus than just pointers 
 to other pieces of data.

Of course - except there is no "the bus".

 You have to move the instruction itself

Not these days.  The instruction will be loaded from memory once, broken (and 
combined) into micro-ops, and those are stored in the instruction cache.  If 
you're getting i-cache misses much beyond single digit percentages, your performance 
will be horrible.

 so there's lots of opportunities for other data
 to get in the way of this DRR-style data transfer
 across the bus.

Your continued use of the phrase "the bus" is rather quaint...

 that would only affect the processing of, say, a 64-bit pointer 
 within the core...

No, it affects all data, not just pointers.

 I'd love to see some real documentation and/or testing on this 
 type of stuff.

http://www.intel.com/products/processor/manuals/

Start with this one:
http://www.intel.com/Assets/PDF/manual/253665.pdf

 my intuition tells me that the CPU and bus aren't magic :)

Compared to just a few years ago, they are.

 - Chuck

