Re: A performance patch for PDFInfo class
Kevin O'Neill kevin@rocketred.com.au
To: FOP Developers [EMAIL PROTECTED]
cc: (bcc: Thomas Seremet/GCR-NAmerica/GRN)
Subject: Re: A performance patch for PDFInfo class
11/13/2002 05:41 AM
Please respond to fop-dev

snip/

Some more insight or confusion. The byte code may be similar in the sense that String uses .concat() and StringBuffer uses new StringBuffer().append() to do their individual concatenations, but the way they are treated by the JVM is not the same. Of course not all JVMs are created equal, but Strings are stored as constants, thus you see the ldc opcode when creating Strings. Even though a String holds on to an internal character array, as does a StringBuffer, a String creates a new String when it is concatenating another String to itself, NOT modifying its internal character array. On the other hand, a StringBuffer actually modifies its internal character array to represent the new String, though increasing the array size is accomplished by creating a new char array. This is only part of the story, since there is a difference between Strings stored in the so-called constant pool and new Strings created during runtime, which do not go into the constant pool automatically. Very good info concerning this on the web.

Okay, firstly, please look at the pcode generated earlier in this thread. It was generated by the modern compiler on jdk 1.4.0; the string addition is indeed concatenated using StringBuffer.append().

    public String testStringBufferChained() {
        return (new StringBuffer().append("this ")
            .append(makeString("is "))
            .append("a ")
            .append(makeString("test"))).toString();
    }

    public String testStringAdd() {
        return "this " + makeString("is ") + "a " + makeString("test");
    }

becomes ...
Method java.lang.String testStringBufferChained()
   0 new #2 Class java.lang.StringBuffer
   3 dup
   4 invokespecial #3 Method java.lang.StringBuffer()
   7 ldc #4 String this
   9 invokevirtual #5 Method java.lang.StringBuffer append(java.lang.String)
  12 aload_0
  13 ldc #6 String is
  15 invokespecial #7 Method java.lang.String makeString(java.lang.String)
  18 invokevirtual #5 Method java.lang.StringBuffer append(java.lang.String)
  21 ldc #8 String a
  23 invokevirtual #5 Method java.lang.StringBuffer append(java.lang.String)
  26 aload_0
  27 ldc #9 String test
  29 invokespecial #7 Method java.lang.String makeString(java.lang.String)
  32 invokevirtual #5 Method java.lang.StringBuffer append(java.lang.String)
  35 invokevirtual #10 Method java.lang.String toString()
  38 areturn

Method java.lang.String testStringAdd()
   0 new #2 Class java.lang.StringBuffer
   3 dup
   4 invokespecial #3 Method java.lang.StringBuffer()
   7 ldc #4 String this
   9 invokevirtual #5 Method java.lang.StringBuffer append(java.lang.String)
  12 aload_0
  13 ldc #6 String is
  15 invokespecial #7 Method java.lang.String makeString(java.lang.String)
  18 invokevirtual #5 Method java.lang.StringBuffer append(java.lang.String)
  21 ldc #8 String a
  23 invokevirtual #5 Method java.lang.StringBuffer append(java.lang.String)
  26 aload_0
  27 ldc #9 String test
  29 invokespecial #7 Method java.lang.String makeString(java.lang.String)
  32 invokevirtual #5 Method java.lang.StringBuffer append(java.lang.String)
  35 invokevirtual #10 Method java.lang.String toString()
  38 areturn

So once again I say these are identical; the underlying VM has no idea of any difference. I would though contend that the String addition is easier to read.

I know I also came in late into this thread. I would like to say I am very excited
Re: A performance patch for PDFInfo class
Tom,

I know I also came in late into this thread. I would like to say I am very excited about the possibilities of FOP in the future and the gains FOP already delivers. My overall grasp of FOP at this moment is still limited, but I have been delving into the code and using it for a current project for the past 2 months. In no way am I trying to be insulting to anyone. I realize everyone here is very well versed in various areas of programming. Performance has always been an interest of mine in general, so I enjoy these types of topics. I don't mind being proven wrong.

Yes. You are correct in your assumptions based on your code. Maybe I should have been clearer and read the beginning of this thread, but I would like to clear two points up, my apologies. I was trying to refer to two separate issues that were being discussed, more so to what happens during runtime than at compile time.

The first point was what the String and StringBuffer Java code looks like in the pcode. Does String do concatenation with the concat method and StringBuffer with append? It does, except under certain conditions.

So, what are the conditions under the + operator?
- If there is a chain of anonymous Strings being initialized to a variable, a LOAD happens.
- If there is 1 reference to a String at the head or end of a chain of anonymous Strings, a String.concat happens.
- If there are only 2 references to a String(s) being concatenated using the + operator, a String.concat happens.
- If the chain contains 2 or more anonymous Strings separated by at least 1 reference to a String, StringBuffer.append happens.
- If the chain contains 3 or more references to a String(s), StringBuffer.append happens.
- If the chain contains at least 2 references to a String(s) and at least 1 anonymous String, StringBuffer.append happens.

What are the conditions for the += operator?
- They are the same as above, except for the additional concatenation that happens implicitly when the variable is added to itself.
So the statement that StringBuffer.append happens in String addition is only partly true, given that String.concat happens in a subset of the cases.

Great stuff. I hadn't done tests with two Strings and I think that the scenarios you provide us with are great.

snip /

When I looked at the generated pcode I couldn't understand why there were so many (any) valueOf operations, so I regenerated the code using 1.4.1, and guess what ... it's very different. Sun have made some considerable leaps forward in String handling.

Method void concatTests()
   0 ldc #12 String foo
   2 astore_1
   3 new #2 Class java.lang.StringBuffer
   6 dup
   7 invokespecial #3 Method java.lang.StringBuffer()
  10 aload_1
  11 invokevirtual #5 Method java.lang.StringBuffer append(java.lang.String)
  14 ldc #13 String bar
  16 invokevirtual #5 Method java.lang.StringBuffer append(java.lang.String)
  19 invokevirtual #10 Method java.lang.String toString()
  22 astore_1
  23 ldc #14 String sofarsogood
  25 astore_2
  26 new #2 Class java.lang.StringBuffer
  29 dup
  30 invokespecial #3 Method java.lang.StringBuffer()
  33 ldc #15 String boofar
  35 invokevirtual #5 Method java.lang.StringBuffer append(java.lang.String)
  38 aload_1
  39 invokevirtual #5 Method java.lang.StringBuffer append(java.lang.String)
  42 invokevirtual #10 Method java.lang.String toString()
  45 astore_3
  46 new #2 Class java.lang.StringBuffer
  49 dup
  50 invokespecial #3 Method java.lang.StringBuffer()
  53 aload_2
  54 invokevirtual #5 Method java.lang.StringBuffer append(java.lang.String)
  57 aload_3
  58 invokevirtual #5 Method java.lang.StringBuffer append(java.lang.String)
  61 invokevirtual #10 Method java.lang.String toString()
  64 astore 4
  66 new #2 Class java.lang.StringBuffer
  69 dup
  70 invokespecial #3 Method java.lang.StringBuffer()
  73 aload_1
  74 invokevirtual #5 Method java.lang.StringBuffer append(java.lang.String)
  77 aload_2
  78 invokevirtual #5 Method java.lang.StringBuffer append(java.lang.String)
  81 aload_3
  82 invokevirtual #5 Method java.lang.StringBuffer append(java.lang.String)
  85 invokevirtual #10 Method java.lang.String toString()
  88 astore 5
  90 new #2 Class java.lang.StringBuffer
  93 dup
  94 invokespecial #3 Method java.lang.StringBuffer()
  97 aload_0
  98 ldc #16 String goo
 100 invokespecial #7 Method java.lang.String makeString(java.lang.String)
 103 invokevirtual #5 Method java.lang.StringBuffer append(java.lang.String)
 106 aload_0
 107 ldc #13 String bar
 109 invokespecial #7 Method java.lang.String makeString(java.lang.String)
 112 invokevirtual #5 Method java.lang.StringBuffer append(java.lang.String)
 115 invokevirtual #10 Method java.lang.String toString()
 118 astore 6
 120 new #2 Class java.lang.StringBuffer
 123 dup
 124 invokespecial #3 Method java.lang.StringBuffer()
 127 ldc #17 String boo
 129 invokevirtual #5 Method java.lang.StringBuffer append(java.lang.String)
 132 aload_1
 133 invokevirtual
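The first condition listed for the + operator (a chain of anonymous Strings ends up as a single load) can be observed without a disassembler, because the Java language treats a chain of String literals as a compile-time constant expression. The following is only a sketch: the class and method names are made up for illustration and are not part of FOP.

```java
// Hypothetical demo class; names are illustrative only.
class ConstantFoldingDemo {

    // A chain of literals is a compile-time constant expression.
    // The compiler folds it into the single constant "foobar",
    // which is interned like any other String literal.
    static String literalChain() {
        return "foo" + "bar";
    }

    // One operand is a parameter, so this is not a constant expression.
    // The concatenation happens at runtime and creates a new String.
    static String runtimeChain(String tail) {
        return "foo" + tail;
    }

    public static void main(String[] args) {
        // Reference comparison on purpose: it shows object identity.
        System.out.println(literalChain() == "foobar");
        System.out.println(runtimeChain("bar") == "foobar");
        System.out.println(runtimeChain("bar").equals("foobar"));
    }
}
```

The reference comparison is used here only to make the difference visible; it is not a recommendation for comparing Strings in real code.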
Re: A performance patch for PDFInfo class
Don't be! I think this discussion is very healthy. Thanks to all who are participating.

On 15 Nov 2002 08:50:19 +1100 Kevin O'Neill wrote:
ps: Sorry if this is boring anyone :(.

Jeremias Maerki

- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, email: [EMAIL PROTECTED]
Re: A performance patch for PDFInfo class
snip/

Some more insight or confusion. The byte code may be similar in the sense that String uses .concat() and StringBuffer uses new StringBuffer().append() to do their individual concatenations, but the way they are treated by the JVM is not the same. Of course not all JVMs are created equal, but Strings are stored as constants, thus you see the ldc opcode when creating Strings. Even though a String holds on to an internal character array, as does a StringBuffer, a String creates a new String when it is concatenating another String to itself, NOT modifying its internal character array. On the other hand, a StringBuffer actually modifies its internal character array to represent the new String, though increasing the array size is accomplished by creating a new char array. This is only part of the story, since there is a difference between Strings stored in the so-called constant pool and new Strings created during runtime, which do not go into the constant pool automatically. Very good info concerning this on the web.

Okay, firstly, please look at the pcode generated earlier in this thread. It was generated by the modern compiler on jdk 1.4.0; the string addition is indeed concatenated using StringBuffer.append().

    public String testStringBufferChained() {
        return (new StringBuffer().append("this ")
            .append(makeString("is "))
            .append("a ")
            .append(makeString("test"))).toString();
    }

    public String testStringAdd() {
        return "this " + makeString("is ") + "a " + makeString("test");
    }

becomes ...
Method java.lang.String testStringBufferChained()
   0 new #2 Class java.lang.StringBuffer
   3 dup
   4 invokespecial #3 Method java.lang.StringBuffer()
   7 ldc #4 String this
   9 invokevirtual #5 Method java.lang.StringBuffer append(java.lang.String)
  12 aload_0
  13 ldc #6 String is
  15 invokespecial #7 Method java.lang.String makeString(java.lang.String)
  18 invokevirtual #5 Method java.lang.StringBuffer append(java.lang.String)
  21 ldc #8 String a
  23 invokevirtual #5 Method java.lang.StringBuffer append(java.lang.String)
  26 aload_0
  27 ldc #9 String test
  29 invokespecial #7 Method java.lang.String makeString(java.lang.String)
  32 invokevirtual #5 Method java.lang.StringBuffer append(java.lang.String)
  35 invokevirtual #10 Method java.lang.String toString()
  38 areturn

Method java.lang.String testStringAdd()
   0 new #2 Class java.lang.StringBuffer
   3 dup
   4 invokespecial #3 Method java.lang.StringBuffer()
   7 ldc #4 String this
   9 invokevirtual #5 Method java.lang.StringBuffer append(java.lang.String)
  12 aload_0
  13 ldc #6 String is
  15 invokespecial #7 Method java.lang.String makeString(java.lang.String)
  18 invokevirtual #5 Method java.lang.StringBuffer append(java.lang.String)
  21 ldc #8 String a
  23 invokevirtual #5 Method java.lang.StringBuffer append(java.lang.String)
  26 aload_0
  27 ldc #9 String test
  29 invokespecial #7 Method java.lang.String makeString(java.lang.String)
  32 invokevirtual #5 Method java.lang.StringBuffer append(java.lang.String)
  35 invokevirtual #10 Method java.lang.String toString()
  38 areturn

So once again I say these are identical; the underlying VM has no idea of any difference. I would though contend that the String addition is easier to read.

-k.

--
If you don't test then your code is only a collection of bugs which apparently behave like a working program.
Website: http://www.rocketred.com.au/blogs/kevin/
Re: A performance patch for PDFInfo class
Hi Kevin.
I'm sorry I got into the discussion at the wrong time...

There is NO difference in the garbage created by String a = foo + bar; and StringBuffer xxx.append(foo).append(bar); They compile to the same thing. Have a look at the pcode.

You are totally right. When just adding strings together, as in:

    public String testStringBufferChained() {
        return (new StringBuffer().append("this ")
            .append(makeString("is "))
            .append("a ")
            .append(makeString("test"))).toString();
    }

there is nothing to gain. But there are too many places where strings are added to each other over and over, and that was what I was talking about. But never mind, I have to read the mails better next time.

/Henrik

Kevin O'Neill [EMAIL PROTECTED]
2002-11-12 20:26
Please respond to fop-dev
To: FOP Developers [EMAIL PROTECTED]
cc:
Subject: Re: A performance patch for PDFInfo class

On Wed, 2002-11-13 at 02:25, Henrik Olsson wrote:

StringBuffer xxx.append(foo).append(bar); understanding what the compiler does is the secret to optimizing Strings.

Hi Kevin.
It's not an issue of which code is fastest here, it's about creation and destruction of objects.

There is NO difference in the garbage created by String a = foo + bar; and StringBuffer xxx.append(foo).append(bar); They compile to the same thing. Have a look at the pcode.

I have done several measurements on FOP to find the bottlenecks, and one of them is String objects. I think we agree that gc is slow, and one way to avoid gc is to use StringBuffers instead of Strings while we are putting them together. I have run some profiling on fop-0.20.4 and my tuned one (patch 14013) with the same fo-files. There are 680010 Strings created in fop-0.20.4 compared to 170395 Strings created in the tuned one. This gives us a hint that the gc doesn't have to run that often and we can do some real work instead; speed increased by 20-30%.

There are a lot of += operations; these should be removed as part of the move towards 1.0.

I'm also working on another performance problem with properties and makers, but it takes a step away from the fo-spec and needs to know some things about the layout (pre-parse the xsl before adding the xml data to it). It will increase the speed more the bigger a layout gets... And it can compete with commercial products such as StreamServe. See the xsl chart at http://nohardcore.tripod.com/fop/FOP-test.zip

If you are interested in the code just say so.

Always.

/Henrik
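The case Henrik describes, where strings are added to each other over and over, is the one where the two styles really differ. Here is a minimal sketch of that situation; the class and method names are hypothetical and not taken from FOP or from patch 14013. It assumes the compiler behaviour shown in the pcode in this thread, where every += becomes a new StringBuffer plus a toString() call.

```java
// Hypothetical demo class; names are illustrative only.
class LoopConcatDemo {

    // With the compilers discussed in this thread, each += allocates a
    // temporary StringBuffer and a new String, so n iterations leave
    // roughly 2n short-lived objects for the garbage collector.
    static String joinWithPlusEquals(String[] parts) {
        String result = "";
        for (int i = 0; i < parts.length; i++) {
            result += parts[i];
        }
        return result;
    }

    // One buffer is reused for every append; only the final toString()
    // creates a String.
    static String joinWithBuffer(String[] parts) {
        StringBuffer sb = new StringBuffer();
        for (int i = 0; i < parts.length; i++) {
            sb.append(parts[i]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] parts = {"a", "b", "c"};
        System.out.println(joinWithPlusEquals(parts));
        System.out.println(joinWithBuffer(parts));
    }
}
```

Both methods return the same text. The difference is only in the number of temporary objects, which is what the String counts quoted above are measuring.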
Re: A performance patch for PDFInfo class
ok, sorry for the disturbance.

Laszlo Hornyak

ps:
StringBuffering code: time for test: 45479
String += code: time for test: 52011
difference: 14.36%

java version 1.4.1_01
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.1_01-b01)
Java HotSpot(TM) Client VM (build 1.4.1_01-b01, mixed mode)

On Tue, Nov 12, 2002 at 09:13:36AM +0100, Keiron Liddle wrote:
On Mon, 2002-11-11 at 23:44, Kevin O'Neill wrote:

Like anybody else there are times when I optimize as I go, but I really try and keep in mind, is this the simplest thing I could do? Fighting the urge to apply optimizations as you go is hard sometimes, but in my experience it leads to a better code base.

This is the most important statement IMO. Especially with Fop, if it is not simple then we are going to dig ourselves into a hole. Things are complicated enough without making convoluted and confusing optimisations. Keep parts simple and have a well defined contract between parts.

Keiron.
Re: A performance patch for PDFInfo class
On Tue, 2002-11-12 at 20:22, Laszlo Hornyak wrote:

ok, sorry for the disturbance.

Laszlo Hornyak

ps:
StringBuffering code: time for test: 45479
String += code: time for test: 52011

+= is slow; + is faster.

difference: 14.36%

java version 1.4.1_01
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.1_01-b01)
Java HotSpot(TM) Client VM (build 1.4.1_01-b01, mixed mode)

On Tue, Nov 12, 2002 at 09:13:36AM +0100, Keiron Liddle wrote:
On Mon, 2002-11-11 at 23:44, Kevin O'Neill wrote:

Like anybody else there are times when I optimize as I go, but I really try and keep in mind, is this the simplest thing I could do? Fighting the urge to apply optimizations as you go is hard sometimes, but in my experience it leads to a better code base.

This is the most important statement IMO. Especially with Fop, if it is not simple then we are going to dig ourselves into a hole. Things are complicated enough without making convoluted and confusing optimisations. Keep parts simple and have a well defined contract between parts.

Keiron.

--
If you don't test then your code is only a collection of bugs which apparently behave like a working program.
Website: http://www.rocketred.com.au/blogs/kevin/
Re: A performance patch for PDFInfo class
On Tue, 2002-11-12 at 22:24, Henrik Olsson wrote:
On Tue, 2002-11-12 at 20:22, Laszlo Hornyak wrote:

ok, sorry for the disturbance.

Laszlo Hornyak

ps:
StringBuffering code: time for test: 45479
String += code: time for test: 52011

+= is slow; + is faster.

The important thing here is not really the time it takes to do += or + on strings; the biggest cost in handling strings is the creation and gc. Creation and destruction of string objects costs a lot in FOP since there is huge misuse of them. See patch 14013, which runs 20-30% faster; the biggest gains come from optimization of string handling. So I think StringBuffers shall be used instead of adding strings together.

The add and store operation is slow, e.g.

    String a = foo;
    a += bar;

    String a = foo + bar;

compiles to approximately

    StringBuffer xxx.append(foo).append(bar);

which is faster than

    StringBuffer xxx.append(foo);
    xxx.append(bar);

It's even slightly faster than

    StringBuffer xxx.append(foo).append(bar);

Understanding what the compiler does is the secret to optimizing Strings. For details of why, see my earlier post.

/Henrik

-k.

--
If you don't test then your code is only a collection of bugs which apparently behave like a working program.
Website: http://www.rocketred.com.au/blogs/kevin/
Re: A performance patch for PDFInfo class
On Tue, 2002-11-12 at 20:22, Laszlo Hornyak wrote:

ok, sorry for the disturbance.

Laszlo Hornyak

ps:
StringBuffering code: time for test: 45479
String += code: time for test: 52011
difference: 14.36%

java version 1.4.1_01
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.1_01-b01)
Java HotSpot(TM) Client VM (build 1.4.1_01-b01, mixed mode)

I think everyone involved with FOP (from occasional contributors like me, to the more stable and prolific committers) appreciates the effort you made. I wanted to ensure that your efforts were well directed and that you had the knowledge at hand to increase their effectiveness. Please never be sorry for the disturbance; rock the boat, make me (and others) explain ourselves. I apologize if you took offense at my remarks, none was intended.

I'm not doubting that the += is slower (that is plainly obvious from the pcode). I do think that there are places for using StringBuffers, just know when to use them.

On Tue, Nov 12, 2002 at 09:13:36AM +0100, Keiron Liddle wrote:
On Mon, 2002-11-11 at 23:44, Kevin O'Neill wrote:

Like anybody else there are times when I optimize as I go, but I really try and keep in mind, is this the simplest thing I could do? Fighting the urge to apply optimizations as you go is hard sometimes, but in my experience it leads to a better code base.

This is the most important statement IMO. Especially with Fop, if it is not simple then we are going to dig ourselves into a hole. Things are complicated enough without making convoluted and confusing optimisations. Keep parts simple and have a well defined contract between parts.

Keiron.

--
If you don't test then your code is only a collection of bugs which apparently behave like a working program.
Website: http://www.rocketred.com.au/blogs/kevin/
Re: A performance patch for PDFInfo class
StringBuffer xxx.append(foo).append(bar); understanding what the compiler does is the secret to optimizing Strings.

Hi Kevin.
It's not an issue of which code is fastest here, it's about creation and destruction of objects.

I have done several measurements on FOP to find the bottlenecks, and one of them is String objects. I think we agree that gc is slow, and one way to avoid gc is to use StringBuffers instead of Strings while we are putting them together. I have run some profiling on fop-0.20.4 and my tuned one (patch 14013) with the same fo-files. There are 680010 Strings created in fop-0.20.4 compared to 170395 Strings created in the tuned one. This gives us a hint that the gc doesn't have to run that often and we can do some real work instead; speed increased by 20-30%.

I'm also working on another performance problem with properties and makers, but it takes a step away from the fo-spec and needs to know some things about the layout (pre-parse the xsl before adding the xml data to it). It will increase the speed more the bigger a layout gets... And it can compete with commercial products such as StreamServe. See the xsl chart at http://nohardcore.tripod.com/fop/FOP-test.zip

If you are interested in the code just say so.

/Henrik
Re: A performance patch for PDFInfo class
On Tue, 2002-11-12 at 16:25, Henrik Olsson wrote:

StringBuffer xxx.append(foo).append(bar); understanding what the compiler does is the secret to optimizing Strings.

Hi Kevin.
It's not an issue of which code is fastest here, it's about creation and destruction of objects.

Surely functionality and design count before optimisations.

I have done several measurements on FOP to find the bottlenecks, and one of them is String objects. I think we agree that gc is slow, and one way to avoid gc is to use StringBuffers instead of Strings while we are putting them together. I have run some profiling on fop-0.20.4 and my tuned one (patch 14013) with the same fo-files. There are 680010 Strings created in fop-0.20.4 compared to 170395 Strings created in the tuned one. This gives us a hint that the gc doesn't have to run that often and we can do some real work instead; speed increased by 20-30%.

I'm also working on another performance problem with properties and makers, but it takes a step away from the fo-spec and needs to know some things about the layout (pre-parse the xsl before adding the xml data to it). It will increase the speed more the bigger a layout gets... And it can compete with commercial products such as StreamServe. See the xsl chart at http://nohardcore.tripod.com/fop/FOP-test.zip

If you are interested in the code just say so.

/Henrik
Re: A performance patch for PDFInfo class
Keiron Liddle wrote:

Surely functionality and design count before optimisations.

Sad to hear that. I think that it should be considered while you design and create functionality, since FOP is really doing the job for me, apart from some performance issues.

/Henrik
Re: A performance patch for PDFInfo class
Henrik Olsson wrote:

Sad to hear that. I think that it should be considered while you design and create functionality.

Real optimizations should take place only after careful measurements, which are possible only once your stuff works; you cannot design, prototype or implement with optimization in your mind. Of course, design should consider and allow optimizations to some degree, but above all it should be clear and robust.

since FOP is really doing the job for me, apart from some performance issues.

I believe there is a misunderstanding here. Keiron is talking about FOP 1.0, which is under redesign nowadays, and you are probably talking about the maintenance branch, which is actually finished, frozen and therefore ready for optimizations, although the community decided not to scatter resources and to concentrate on the trunk.

--
Oleg Tkachenko
eXperanto team
Multiconn Technologies, Israel
Re: A performance patch for PDFInfo class
On Wed, 2002-11-13 at 02:25, Henrik Olsson wrote:

StringBuffer xxx.append(foo).append(bar); understanding what the compiler does is the secret to optimizing Strings.

Hi Kevin.
It's not an issue of which code is fastest here, it's about creation and destruction of objects.

There is NO difference in the garbage created by String a = foo + bar; and StringBuffer xxx.append(foo).append(bar); They compile to the same thing. Have a look at the pcode.

I have done several measurements on FOP to find the bottlenecks, and one of them is String objects. I think we agree that gc is slow, and one way to avoid gc is to use StringBuffers instead of Strings while we are putting them together. I have run some profiling on fop-0.20.4 and my tuned one (patch 14013) with the same fo-files. There are 680010 Strings created in fop-0.20.4 compared to 170395 Strings created in the tuned one. This gives us a hint that the gc doesn't have to run that often and we can do some real work instead; speed increased by 20-30%.

There are a lot of += operations; these should be removed as part of the move towards 1.0.

I'm also working on another performance problem with properties and makers, but it takes a step away from the fo-spec and needs to know some things about the layout (pre-parse the xsl before adding the xml data to it). It will increase the speed more the bigger a layout gets... And it can compete with commercial products such as StreamServe. See the xsl chart at http://nohardcore.tripod.com/fop/FOP-test.zip

If you are interested in the code just say so.

Always.

/Henrik

--
If you don't test then your code is only a collection of bugs which apparently behave like a working program.
Website: http://www.rocketred.com.au/blogs/kevin/
Re: A performance patch for PDFInfo class
Kevin O'Neill kevin@rocketred.com.au
To: FOP Developers [EMAIL PROTECTED]
cc: (bcc: Thomas Seremet/GCR-NAmerica/GRN)
Subject: Re: A performance patch for PDFInfo class
11/12/2002 02:26 PM
Please respond to fop-dev

On Wed, 2002-11-13 at 02:25, Henrik Olsson wrote:

StringBuffer xxx.append(foo).append(bar); understanding what the compiler does is the secret to optimizing Strings.

Hi Kevin.
It's not an issue of which code is fastest here, it's about creation and destruction of objects.

There is NO difference in the garbage created by String a = foo + bar; and StringBuffer xxx.append(foo).append(bar); They compile to the same thing. Have a look at the pcode.

Some more insight or confusion. The byte code may be similar in the sense that String uses .concat() and StringBuffer uses new StringBuffer().append() to do their individual concatenations, but the way they are treated by the JVM is not the same. Of course not all JVMs are created equal, but Strings are stored as constants, thus you see the ldc opcode when creating Strings. Even though a String holds on to an internal character array, as does a StringBuffer, a String creates a new String when it is concatenating another String to itself, NOT modifying its internal character array. On the other hand, a StringBuffer actually modifies its internal character array to represent the new String, though increasing the array size is accomplished by creating a new char array. This is only part of the story, since there is a difference between Strings stored in the so-called constant pool and new Strings created during runtime, which do not go into the constant pool automatically. Very good info concerning this on the web.

My point is that the amount of garbage (unreferenced Objects) would, in my view, be substantially greater with String's way of concatenation than with StringBuffer's, if the StringBuffer is well managed in terms of capacity and usage. A quick skim with a decompiler/editor and some tests will show this.

I'm sure most everyone here understands Java Strings and StringBuffers, so I am sorry for the spiel in advance.

Tom

I have done several measurements on FOP to find the bottlenecks, and one of them is String objects. I think we agree that gc is slow, and one way to avoid gc is to use StringBuffers instead of Strings while we are putting them together. I have run some profiling on fop-0.20.4 and my tuned one (patch 14013) with the same fo-files. There are 680010 Strings created in fop-0.20.4 compared to 170395 Strings created in the tuned one. This gives us a hint that the gc doesn't have to run that often and we can do some real work instead; speed increased by 20-30%.

There are a lot of += operations; these should be removed as part of the move towards 1.0.

I'm also working on another performance problem with properties and makers, but it takes a step away from the fo-spec and needs to know some things about the layout (pre-parse the xsl before adding the xml data to it). It will increase the speed more the bigger a layout gets... And it can compete with commercial products such as StreamServe. See the xsl chart at http://nohardcore.tripod.com/fop/FOP-test.zip

If you are interested in the code just say so.

Always.

/Henrik
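Tom's remark about a StringBuffer that is "well managed in terms of capacity" can be made concrete with the public capacity() method. The sketch below uses a hypothetical class name and hypothetical method names. It relies only on documented StringBuffer behaviour: the no-argument constructor starts with a capacity of 16 characters, and the buffer replaces its internal array with a larger one when an append would overflow it.

```java
// Hypothetical demo class; names are illustrative only.
class BufferCapacityDemo {

    // Appends `piece` `count` times to a default-sized buffer and returns
    // the capacity the buffer ended up with after growing on demand.
    // Every growth step discards the old internal array.
    static int capacityAfterDefault(String piece, int count) {
        StringBuffer sb = new StringBuffer(); // initial capacity: 16 chars
        for (int i = 0; i < count; i++) {
            sb.append(piece);
        }
        return sb.capacity();
    }

    // Same appends, but the buffer is sized up front to the exact final
    // length, so the internal array never has to be replaced.
    static int capacityAfterPresized(String piece, int count) {
        StringBuffer sb = new StringBuffer(piece.length() * count);
        for (int i = 0; i < count; i++) {
            sb.append(piece);
        }
        return sb.capacity();
    }

    public static void main(String[] args) {
        System.out.println("default:  " + capacityAfterDefault("abcd", 10));
        System.out.println("presized: " + capacityAfterPresized("abcd", 10));
    }
}
```

Whether presizing is worth the extra line depends on whether the final length can be estimated cheaply; that is a judgment call and not something this sketch measures.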
Re: A performance patch for PDFInfo class
Sorry if this seems hard, but this is the sort of performance enhancement I was talking about yesterday. If people are going to do these sorts of enhancements then they should be aware of the effects. It's always easier to work with examples.

    public class StringTest {

        // String Buffer
        public String testStringBufferStraightCall() {
            StringBuffer sb = new StringBuffer();
            sb.append("this ");
            sb.append(makeString("is "));
            sb.append("a ");
            sb.append(makeString("test"));
            return sb.toString();
        }

        public String testStringBufferChained() {
            StringBuffer sb = new StringBuffer();
            sb.append("this ")
                .append(makeString("is "))
                .append("a ")
                .append(makeString("test"));
            return sb.toString();
        }

        public String testStringAdd() {
            return "this " + makeString("is ") + "a " + makeString("test");
        }

        public String testIncrement() {
            String result = "this ";
            result += makeString("is ");
            result += "a ";
            result += makeString("test.");
            return result;
        }

        private String makeString(String testString) {
            return testString;
        }
    }

Which of the above is faster? A simple timer test will show that testStringAdd() is the fastest, followed closely by testStringBufferChained(). For the reason why, let's look at the byte-code.
Method java.lang.String testStringBufferStraightCall()
   0 new #2 Class java.lang.StringBuffer
   3 dup
   4 invokespecial #3 Method java.lang.StringBuffer()
   7 astore_1
   8 aload_1
   9 ldc #4 String this
  11 invokevirtual #5 Method java.lang.StringBuffer append(java.lang.String)
  14 pop
  15 aload_1
  16 aload_0
  17 ldc #6 String is
  19 invokespecial #7 Method java.lang.String makeString(java.lang.String)
  22 invokevirtual #5 Method java.lang.StringBuffer append(java.lang.String)
  25 pop
  26 aload_1
  27 ldc #8 String a
  29 invokevirtual #5 Method java.lang.StringBuffer append(java.lang.String)
  32 pop
  33 aload_1
  34 aload_0
  35 ldc #9 String test
  37 invokespecial #7 Method java.lang.String makeString(java.lang.String)
  40 invokevirtual #5 Method java.lang.StringBuffer append(java.lang.String)
  43 pop
  44 aload_1
  45 invokevirtual #10 Method java.lang.String toString()
  48 areturn

Method java.lang.String testStringBufferChained()
   0 new #2 Class java.lang.StringBuffer
   3 dup
   4 invokespecial #3 Method java.lang.StringBuffer()
   7 astore_1
   8 aload_1
   9 ldc #4 String this
  11 invokevirtual #5 Method java.lang.StringBuffer append(java.lang.String)
  14 aload_0
  15 ldc #6 String is
  17 invokespecial #7 Method java.lang.String makeString(java.lang.String)
  20 invokevirtual #5 Method java.lang.StringBuffer append(java.lang.String)
  23 ldc #8 String a
  25 invokevirtual #5 Method java.lang.StringBuffer append(java.lang.String)
  28 aload_0
  29 ldc #9 String test
  31 invokespecial #7 Method java.lang.String makeString(java.lang.String)
  34 invokevirtual #5 Method java.lang.StringBuffer append(java.lang.String)
  37 pop
  38 aload_1
  39 invokevirtual #10 Method java.lang.String toString()
  42 areturn

Method java.lang.String testStringAdd()
   0 new #2 Class java.lang.StringBuffer
   3 dup
   4 invokespecial #3 Method java.lang.StringBuffer()
   7 ldc #4 String this
   9 invokevirtual #5 Method java.lang.StringBuffer append(java.lang.String)
  12 aload_0
  13 ldc #6 String is
  15 invokespecial #7 Method java.lang.String makeString(java.lang.String)
  18 invokevirtual #5 Method java.lang.StringBuffer append(java.lang.String)
  21 ldc #8 String a
  23 invokevirtual #5 Method java.lang.StringBuffer append(java.lang.String)
  26 aload_0
  27 ldc #9 String test
  29 invokespecial #7 Method java.lang.String makeString(java.lang.String)
  32 invokevirtual #5 Method java.lang.StringBuffer append(java.lang.String)
  35 invokevirtual #10 Method java.lang.String toString()
  38 areturn

Method java.lang.String testIncrement()
   0 ldc #4 String this
   2 astore_1
   3 new #2 Class java.lang.StringBuffer
   6 dup
   7 invokespecial #3 Method java.lang.StringBuffer()
  10 aload_1
  11 invokevirtual #5 Method java.lang.StringBuffer append(java.lang.String)
  14 aload_0
  15 ldc #6 String is
  17 invokespecial #7 Method java.lang.String makeString(java.lang.String)
  20 invokevirtual #5 Method java.lang.StringBuffer append(java.lang.String)
  23 invokevirtual #10 Method java.lang.String toString()
  26 astore_1
  27 new #2 Class java.lang.StringBuffer
  30 dup
  31 invokespecial #3 Method java.lang.StringBuffer()
  34 aload_1
  35 invokevirtual #5 Method java.lang.StringBuffer append(java.lang.String)
  38 ldc #8 String a
  40 invokevirtual #5 Method java.lang.StringBuffer append(java.lang.String)
  43 invokevirtual #10 Method java.lang.String toString()
  46 astore_1
  47 new #2 Class java.lang.StringBuffer
  50 dup
  51 invokespecial
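To make the testIncrement() listing easier to follow, here is a hand-written sketch of what each += appears to expand to on this compiler: a fresh StringBuffer, two appends and a toString() per statement, where the single-expression form builds only one buffer. The class and method names below are made up for illustration, and other compilers may well emit something different.

```java
// Hypothetical illustration; not part of the original StringTest class.
class IncrementExpansion {

    // What the source says: three separate += statements.
    static String increment(String a, String b, String c, String d) {
        String result = a;
        result += b;
        result += c;
        result += d;
        return result;
    }

    // Roughly what the listing above does for each += statement:
    // allocate a new StringBuffer, append twice, convert back to a String.
    static String incrementExpanded(String a, String b, String c, String d) {
        String result = a;
        result = new StringBuffer().append(result).append(b).toString();
        result = new StringBuffer().append(result).append(c).toString();
        result = new StringBuffer().append(result).append(d).toString();
        return result;
    }

    // The single-expression form needs only one buffer in the listing above.
    static String singleExpression(String a, String b, String c, String d) {
        return a + b + c + d;
    }

    public static void main(String[] args) {
        System.out.println(increment("this ", "is ", "a ", "test."));
        System.out.println(incrementExpanded("this ", "is ", "a ", "test."));
        System.out.println(singleExpression("this ", "is ", "a ", "test."));
    }
}
```

All three methods return the same text; the difference is only in how many intermediate buffers and strings are created along the way.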
RE: A performance patch for PDFInfo class
-----Original Message-----
From: Kevin O'Neill [mailto:kevin;rocketred.com.au]
Sent: November 11, 2002 5:47 PM
To: FOP Developers
Subject: Re: A performance patch for PDFInfo class

[ SNIP ]

> String buffers are used by the compiler to implement the binary string concatenation operator +. For example, the code:
>
>     x = "a" + 4 + "c"
>
> is compiled to the equivalent of:
>
>     x = new StringBuffer().append("a").append(4).append("c").toString()
>
> So the first recommendation is to use String + for this type of method, it's easier to read and runs faster.

[ SNIP ]

This kind of thing is discussed by Jack Shirazi at length, also. The thing is, there has long been a blanket instruction: "don't use String concatenation". Programmers learn it by fiat, and never think it through. In fact, it should be obvious to any programmer (if they are encouraged to think, that is) that concatenation of literal Strings is not something to avoid, assuming a decent compiler.

Regards,
AHS

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, email: [EMAIL PROTECTED]
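A small sketch of the two points above, with made-up class and method names. With non-constant operands, + and the hand-written StringBuffer chain produce the same result; the bytecode earlier in the thread suggests they also compile to the same instructions on JDK 1.4, though newer compilers use other strategies. With literals only, the expression is a compile-time constant, so there is nothing to optimize at run time.

```java
// Hypothetical illustration of the equivalence quoted from the StringBuffer docs.
class ConcatEquivalence {

    // All-literal concatenation is a constant expression: the compiler
    // folds it into the single string "a4c" in the class file.
    static final String LITERAL = "a" + 4 + "c";

    // Concatenation with the + operator on non-constant operands.
    static String viaPlus(String a, int n, String c) {
        return a + n + c;
    }

    // The explicit form the quoted documentation says + is equivalent to.
    static String viaBuffer(String a, int n, String c) {
        return new StringBuffer().append(a).append(n).append(c).toString();
    }

    public static void main(String[] args) {
        System.out.println(viaPlus("a", 4, "c"));
        System.out.println(viaBuffer("a", 4, "c"));
        // Reference comparison on purpose: both sides are the same
        // compile-time constant, so they refer to one interned String.
        System.out.println(LITERAL == "a4c");
    }
}
```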
RE: A performance patch for PDFInfo class
<snip/>

>> So the first recommendation is to use String + for this type of method, it's easier to read and runs faster.
>
> [ SNIP ]
>
> This kind of thing is discussed by Jack Shirazi at length, also. The thing is, there has long been a blanket instruction: "don't use String concatenation". Programmers learn it by fiat, and never think it through. In fact, it should be obvious to any programmer (if they are encouraged to think, that is) that concatenation of literal Strings is not something to avoid, assuming a decent compiler.

You've hit the nail on the head. Optimizations are just that: optimizations. They are not blanket application things.

Like anybody else there are times when I optimize as I go, but I really try to keep in mind: is this the simplest thing I could do? Fighting the urge to apply optimizations as you go is hard sometimes, but in my experience it leads to a better code base.

When you do apply an optimization, prove its worth. Create a small set of tests that show the difference and try to run them on a number of VMs. You'd be surprised at the things I've found. On one embedded VM

    x = (y == null) ? a : b;

was 50% slower than

    if (y == null) x = a; else x = b;

Go figure.

-k.

-- 
If you don't test then your code is only a collection of bugs which apparently behave like a working program.
Website: http://www.rocketred.com.au/blogs/kevin/
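As a rough sketch of the kind of small timing test described above, the following compares the two forms from the message. Everything here (class name, method names, iteration counts) is an assumption made for illustration, System.nanoTime() postdates the JDKs discussed in this thread, and a loop this simple is easily distorted by JIT compilation, so treat the numbers as a hint to investigate further and rerun on every VM you care about.

```java
// Hypothetical micro-benchmark sketch; results vary widely between VMs.
class TernaryVsIfElse {

    static Object pickTernary(Object y, Object a, Object b) {
        return (y == null) ? a : b;
    }

    static Object pickIfElse(Object y, Object a, Object b) {
        Object x;
        if (y == null) x = a; else x = b;
        return x;
    }

    // Runs one variant `iterations` times and returns elapsed nanoseconds.
    // The sink counter keeps the result of each call in use.
    static long time(boolean useTernary, int iterations) {
        Object a = "a";
        Object b = "b";
        int sink = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            Object y = (i % 2 == 0) ? null : a;   // alternate null / non-null
            Object x = useTernary ? pickTernary(y, a, b) : pickIfElse(y, a, b);
            if (x == a) sink++;
        }
        long elapsed = System.nanoTime() - start;
        // Every even i yields a, so exactly half (rounded up) must match.
        if (sink != (iterations + 1) / 2) {
            throw new IllegalStateException("unexpected result count: " + sink);
        }
        return elapsed;
    }

    public static void main(String[] args) {
        int iterations = 5000000;
        // Warm-up passes so both variants have a chance to be compiled.
        time(true, iterations);
        time(false, iterations);
        System.out.println("ternary: " + time(true, iterations) + " ns");
        System.out.println("if/else: " + time(false, iterations) + " ns");
    }
}
```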
Re: A performance patch for PDFInfo class
Kevin O'Neill wrote:

>>> <snip/>
>>>
>>> So the first recommendation is to use String + for this type of method, it's easier to read and runs faster.

[This is from Arved.]

>> This kind of thing is discussed by Jack Shirazi at length, also. The thing is, there has long been a blanket instruction: "don't use String concatenation". Programmers learn it by fiat, and never think it through. In fact, it should be obvious to any programmer (if they are encouraged to think, that is) that concatenation of literal Strings is not something to avoid, assuming a decent compiler.
>
> You've hit the nail on the head. Optimizations are just that: optimizations. They are not blanket application things.

Which of course applies to methodological, or style, optimisations as well. Yes, programmers learn by fiat, and yes, they rarely think it through. For example: always use the interface, not the implementation. Never do early optimisation.

> Like anybody else there are times when I optimize as I go, but I really try and keep in mind, is this the simplest thing I could do? Fighting the urge to apply optimizations as you go is hard sometimes but in my experience leads to a better code base.

I almost always surrender to this urge immediately. As with most urges to which I surrender, I often regret it afterwards, although nowhere near as frequently as with some of the other urges to which I have been known to surrender.

Despite my occasional regrets, and my occasional unwinding of optimisations which did not work, I know with complete certainty that I expend hugely less energy on my optimisation regrets (which after all represent a minority of cases) than I would if I agonised over every temptation. At the end of the day, I am not ashamed of the code I have written over the years, even though, naturally, I would do much of it differently now.

> When you do apply an optimization, prove its worth. Create a small set of tests that show the difference and try and run them on a number of vms. You'd be surprised at the things I've found, on one embedded vm
>
>     x = (y == null) ? a : b;
>
> was 50% slower than
>
>     if (y == null) x = a; else x = b;
>
> go figure.

Indeed. I would never dream of writing such a test.

    x = (y == null) ? a : b;

is quicker to write than the other, and I like it. Such a difference in performance is highly dependent on the particular compiler implementation, and is subject to radical unannounced variation. I would just write it and get on.

"I like it" is the acid test: for me, of all code, but in particular for open source. Open source may be driven by many things, but money is not one of them. Pleasure is, and is high on the list. In spite of that, OS generates vast amounts of high-quality software. Go figure.

Peter
-- 
Peter B. West [EMAIL PROTECTED] http://www.powerup.com.au/~pbwest/
"Lord, to whom shall we go?"
Re: A performance patch for PDFInfo class
On Tue, 2002-11-12 at 11:21, Peter B. West wrote:

> Kevin O'Neill wrote:
>
>>>> <snip/>
>>>>
>>>> So the first recommendation is to use String + for this type of method, it's easier to read and runs faster.
>
> [This is from Arved.]
>
>>> This kind of thing is discussed by Jack Shirazi at length, also. The thing is, there has long been a blanket instruction: "don't use String concatenation". Programmers learn it by fiat, and never think it through. In fact, it should be obvious to any programmer (if they are encouraged to think, that is) that concatenation of literal Strings is not something to avoid, assuming a decent compiler.
>>
>> You've hit the nail on the head. Optimizations are just that: optimizations. They are not blanket application things.
>
> Which of course applies to methodological, or style, optimisations as well. Yes, programmers learn by fiat, and yes, they rarely think it through. For example: always use the interface, not the implementation. Never do early optimisation.

No rule is concrete (I don't think I ever say never :)), but having guidelines that say things like:

    For published methods it's preferable to return an interface instance as opposed to a concrete class, as it allows you to change the internals of the class without breaking external contracts. In short, it improves encapsulation.

helps a developer make choices.

>> Like anybody else there are times when I optimize as I go, but I really try and keep in mind, is this the simplest thing I could do? Fighting the urge to apply optimizations as you go is hard sometimes but in my experience leads to a better code base.
>
> I almost always surrender to this urge immediately. As with most urges to which I surrender, I often regret it afterwards, although nowhere near as frequently as with some of the other urges to which I have been known to surrender.

:)

> Despite my occasional regrets, and my occasional unwinding of optimisations which did not work, I know with complete certainty that I expend hugely less energy on my optimisation regrets (which after all represent a minority of cases) than I would if I agonised over every temptation. At the end of the day, I am not ashamed of the code I have written over the years, even though, naturally, I would do much of it differently now.
>
>> When you do apply an optimization, prove its worth. Create a small set of tests that show the difference and try and run them on a number of vms. You'd be surprised at the things I've found, on one embedded vm
>>
>>     x = (y == null) ? a : b;
>>
>> was 50% slower than
>>
>>     if (y == null) x = a; else x = b;
>>
>> go figure.
>
> Indeed. I would never dream of writing such a test.
>
>     x = (y == null) ? a : b;
>
> is quicker to write than the other, and I like it. Such a difference in performance is highly dependent on the particular compiler implementation, and is subject to radical unannounced variation. I would just write it and get on.

Neither did I until a series of code blocks containing the operator came up in the performance tests. The code was compiled with the JDK 1.3.1 for 1.1 compatibility. I'm in no way suggesting not using the operator. It was an example of where you can sometimes find performance differences, and is a classic example of something that is likely to change with each release of the JVM/compiler (JDK 1.4.1 does produce slightly different opcodes) and something you should NOT optimize out except in extreme circumstances. I love a = b 1 ? -1 : 1. (Oh, we did optimize this out; it was in the main paint loop and gave us a 2% overall speed improvement.)

> "I like it" is the acid test: for me, of all code, but in particular for open source. Open source may be driven by many things, but money is not one of them. Pleasure is, and is high on the list. In spite of that, OS generates vast amounts of high-quality software. Go figure.

I'll disagree, open source can be driven by money, but let's agree to disagree :). We both agree that OS rocks though.

> Peter
> -- 
> Peter B. West [EMAIL PROTECTED] http://www.powerup.com.au/~pbwest/
> "Lord, to whom shall we go?"

-- 
If you don't test then your code is only a collection of bugs which apparently behave like a working program.
Website: http://www.rocketred.com.au/blogs/kevin/
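The guideline about published methods returning an interface can be sketched as follows. FontRegistry and its methods are hypothetical names invented for this example, not FOP classes, and the code uses generics, which postdate this thread. The point is only that callers see List, so the backing collection can change later without touching them.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical class used only to illustrate the guideline.
class FontRegistry {

    // Internal detail: could become a LinkedList or any other List
    // implementation without changing the published method below.
    private final List<String> names = new ArrayList<String>();

    void register(String name) {
        names.add(name);
    }

    // Published method: returns the List interface, not ArrayList,
    // and a read-only view so callers cannot modify internal state.
    List<String> getNames() {
        return Collections.unmodifiableList(names);
    }

    public static void main(String[] args) {
        FontRegistry registry = new FontRegistry();
        registry.register("Helvetica");
        registry.register("Times");
        System.out.println(registry.getNames());
    }
}
```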
Re: A performance patch for PDFInfo class
Kevin O'Neill wrote:

> On Tue, 2002-11-12 at 11:21, Peter B. West wrote:
>
>> "I like it" is the acid test: for me, of all code, but in particular for open source. Open source may be driven by many things, but money is not one of them. Pleasure is, and is high on the list. In spite of that, OS generates vast amounts of high-quality software. Go figure.
>
> I'll disagree, open source can be driven by money, but let's agree to disagree :). We both agree that OS rocks though.

Yes, I take your point, and I am happy that companies are beginning to invest significant resources in OS. I just hope the primary impulses of OS are not suppressed in the process.

How unusual it is to have an (almost) shared time zone.

Peter
-- 
Peter B. West [EMAIL PROTECTED] http://www.powerup.com.au/~pbwest/
"Lord, to whom shall we go?"