Re: Micro-optimizations

2019-05-08 Thread Christopher Schultz
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Martin,

On 5/8/19 13:58, Martin Grigorov wrote:
> Hi,
> 
> On Wed, May 8, 2019 at 7:16 PM Christopher Schultz < 
> ch...@christopherschultz.net> wrote:
> 
> All,
> 
> While working on the sort-file-listing enhancement over the
> weekend, I looked in the Tomcat code to see if we already had a
> method to split a string by a single character. The closest method
> I found was
> 
> 
>> Since few years java.lang.String#split(String) is optimized to
>> not use Regex when the following condition passes:
> 
>> if (((regex.value.length == 1 && ".$|()[{^?*+\\".indexOf(ch =
>> regex.charAt(0)) == -1) || (regex.length() == 2 && 
>> regex.charAt(0) == '\\' && (((ch =
>> regex.charAt(1))-'0')|('9'-ch)) < 0 && ((ch-'a')|('z'-ch)) < 0
>> && ((ch-'A')|('Z'-ch)) < 0)) && (ch <
>> Character.MIN_HIGH_SURROGATE || ch >
>> Character.MAX_LOW_SURROGATE)) {
> 
>> i.e. in many cases it is already fast enough.

This is a great point. My method removes "empty" items and trims
stuff, but it does not really have to ; it just does because it was
easy enough to add at the time.

I'm happy to remove my custom split() method and use String.split in
my case.

I'll check to see if JspUtil.split() can be used in this way as well
and we can remove it.

- -chris
-BEGIN PGP SIGNATURE-
Comment: Using GnuPG with Thunderbird - https://www.enigmail.net/

iQIzBAEBCAAdFiEEMmKgYcQvxMe7tcJcHPApP6U8pFgFAlzTKfQACgkQHPApP6U8
pFh+9g//SXxOvwH5IAtfBUdq8nDOnkjB06j0HY8cefEnGS4W1KBVJBihtHpTI0IV
vN5Z5yqAslV7kbcc+/0ezV5g8JdToOhD2K+xn2ulT9o5gjX8MiI267luVvoaYbGM
hG/1tWPDft/IMLV5XkO6Cpr/1Bot0dBcEX9ASMb3tDKnhorOCmb9u+Oq2Fu4lZZ8
VSaPZItWc/9C7wQ5Uw3gj5WIeSThcG6i0o5HFEB3NqJiMi2LOwScz7Dk2YZkdIch
BuhsSdN97VuNO40TNLy71YLEPygHF3X3u1PtuKxjXLR4VhWBounGVzLlDvm/VJcz
C4cfzaNLFyBddDq/Kg4aBLXCqM7JXRjjb2QJ4VoXilpdM8fV9fDduozfbEUgdi1U
tSnCehHqwoD/UcqEWRBL0Scodm4KAXDTIFn4DVU+nzAAUIkGyJh60PadvCwJgA9H
bKTQOuv+yGaiZ9QjegGb+r2xcqDTmOXWBFgyUXMegQJTBUb9czgZ6z81pgt+ruNq
HpJChdsK1JCC8DIfkiDHMNohU3hH323DFtuC4PFbvF225l5ywu6pyg4qjaBEfw6X
W8Mf+KwfHcQUvjXhNPqKex4PU7HnolNL8h2Jigd3C97PtUMUiD4A+zekQ7oRiibe
3HFqnDA8oiNRCB5pkG3eTyXjii61bUMEglEsYxUoCZVR74Ek+0A=
=xZct
-END PGP SIGNATURE-

-
To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org
For additional commands, e-mail: dev-h...@tomcat.apache.org



Re: Micro-optimizations

2019-05-08 Thread Christopher Schultz
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Mark,

On 5/8/19 12:35, Mark Thomas wrote:
> On 08/05/2019 17:30, Christopher Schultz wrote:
>> 
>> Back to JspUtil.split. It's private and used exactly one place.
>> AND it's only ever called with a single-character String value.
>> So actually the use-case DID meet my needs for string-splitting
>> for the sorting. Perhaps I'll re-factor that method a little and
>> then use it across these two use-cases.
> 
> Doesn't that mean that DefaultServlet would depend on Jasper? That
> isn't allowed as the Jasper is meant to be an optional component.

That's what the re-factoring would be for: move the JspUtil.split
method somewhere else that is shared and use that from both places.

- -chris
-BEGIN PGP SIGNATURE-
Comment: Using GnuPG with Thunderbird - https://www.enigmail.net/

iQIyBAEBCAAdFiEEMmKgYcQvxMe7tcJcHPApP6U8pFgFAlzTKO0ACgkQHPApP6U8
pFj+7w/4gEfDhab5/dGDLNB1GODQPI963rD+GwyZFbrPaWxSLXnxkSUO70Z8xUfj
5t+L9xW7WqPI7yJzS0mIcCjskiYfsHoc2nSlaDtitduGMnSDMsPd0EJQaw9JXIqk
3gInxFQ7nKHjyVZpGQI6epKQQLk1RHgvGWGQxrTVZEqUgPO31sSiIZCbQZrLD1j+
msjRhYrtYYz2m1p/vp8waPdb9LF5QuczR/+cVQO8XdtkTthjfsv62z/pYTGC0Zza
3ZSsJDNdtjlEBuqdFAZX6YaK0oisA690+dChyVZ7Cc/OfQ++Al9t4Am2dpMADiCa
KyimX8uUzg2nW2+qxVaMvlu58jIcpDyo+alfBnhQ1/lon+qxbHE15eea8VmKFfH0
BIB+Aq1PUkWSqrXZGjs4vMjQeFZ3CfHSoe5coWjELDJwSx5gcN+5cEPbDEbxHHUa
yCprx1GipEYoa5rRspP9xOEXg7WjsVzify7ZHKrgkerUtpWRy4Ui3jlImetoXTji
G2tmQUiDDgq7iOKvmDxKWf0O+ImbPbXrZ5aXtYDS7iYKWGd9jwFSS6xYQJ7/FVib
t180MGHeH7DNO8rtFKBM47+byHGOtW9GY6LrH4gCqcxqUHwVAy21MO67Nm3WTgXt
ACMhRn6fHzYjXjjYbtr91XscBFjXnvd1Tc+5Iq3ulra7GlRyeA==
=nbT1
-END PGP SIGNATURE-

-
To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org
For additional commands, e-mail: dev-h...@tomcat.apache.org



Re: Micro-optimizations

2019-05-08 Thread Martin Grigorov
Hi,

On Wed, May 8, 2019 at 7:16 PM Christopher Schultz <
ch...@christopherschultz.net> wrote:

> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA256
>
> All,
>
> While working on the sort-file-listing enhancement over the weekend, I
> looked in the Tomcat code to see if we already had a method to split a
> string by a single character. The closest method I found was
>

Since few years java.lang.String#split(String) is optimized to not use
Regex when the following condition passes:

if (((regex.value.length == 1 &&
 ".$|()[{^?*+\\".indexOf(ch = regex.charAt(0)) == -1) ||
 (regex.length() == 2 &&
  regex.charAt(0) == '\\' &&
  (((ch = regex.charAt(1))-'0')|('9'-ch)) < 0 &&
  ((ch-'a')|('z'-ch)) < 0 &&
  ((ch-'A')|('Z'-ch)) < 0)) &&
(ch < Character.MIN_HIGH_SURROGATE ||
 ch > Character.MAX_LOW_SURROGATE))
{

i.e. in many cases it is already fast enough.


> org.pache.jasper.compiler.JspUtil which splits on a string, so I
> rejected it as being a bit too general-purpose for my needs.
>
> But it got me looking into the implementation of the method, and it
> looks a little sketchy.
>
> First, it uses a java.util.Vector. Yep, a Vector. Remember those?
> Synchronized everything? The Vector is not used outside of the method
> at all -- it's just local and temporary. So, the obvious conclusion is
> that it should be changed to ArrayList, right?
>
> But, modern JVMs are fairly good at (a) processing contention-less
> locks more quickly and (b) identifying objects which cannot possibly
> have lock-contention and simply skipping the locking.
>
> What's the right move in these cases? Should we change Vector ->
> ArrayList and "help-out" the compiler and runtime, or should we leave
> the code alone because "it's fine"?
>
> Similar arguments can be made for using StringBuilder over
> StringBuffer, etc.
>
> WDYT?
>
> - -chris
> -BEGIN PGP SIGNATURE-
> Comment: Using GnuPG with Thunderbird - https://www.enigmail.net/
>
> iQIzBAEBCAAdFiEEMmKgYcQvxMe7tcJcHPApP6U8pFgFAlzTAMcACgkQHPApP6U8
> pFjEOQ//SpbOVCErJu8SCue+RagRp0Z0PsAwbquf2z4NuMsPZVST66fcjE7QY7aI
> 9whVdUBauxNiUCwMUOkgV8g8IGBB8KoWyhyIkTWCZuh+cwan6+5aKesE5UcUdYW5
> Xg93cqB4oGUWsR1YaEWEdycBObCTcXx/oHTq26/rXZvrGDHGU7rrGGOgTqO4TETU
> 6xvME86xTaImrBYIrnSUTo2S3QyPbqsoOOzEWinyW75cvR6xFw4iNNRfgzHIlb8f
> EIUjCqHasm4qAcxQ+t0DIoeVbWvcLpX49UMNyXP5Wt2frOjeTzpx4ErFrNXsAlQd
> JYoNgo50nAYNwxAsnBbZCrqSkaFO89I+nj7Tq9WDovOTF17CyMMaYtJiBe/ocTNv
> 25JmldlsM/wFY+ol9PQz9UmfzOHCcKGWEV4DcKaro6aTBzhELGYHpRh3z+ZNNuHD
> zkbJPRZFDZ4XC9dwRSfm0DUbIsYP2peBp3z7/1hWm+lnK6NIAZQQh0TzJ/cIliKw
> 6uWvCesERD1GyQosqoL0h9MHSckGpTSfbk+iI6MZtEwQc3AH4C27fm6MpfAc8YQU
> sr0jf2EdgzrC990jazPZcRbpwT8Eifc5kF6QgkbiockkqC/Zr28awL7vY2m5Szg9
> n2scZoWUAnkGM8k0x3sFM4DZpthjUGVe78ZsBnJkKX4Q5LAnAYQ=
> =U4Jd
> -END PGP SIGNATURE-
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org
> For additional commands, e-mail: dev-h...@tomcat.apache.org
>
>


Re: Micro-optimizations

2019-05-08 Thread Mark Thomas
On 08/05/2019 17:30, Christopher Schultz wrote:
> Mark,
> 
> On 5/8/19 12:20, Mark Thomas wrote:
>> On 08/05/2019 17:16, Christopher Schultz wrote:
>>> All,
>>>
>>> While working on the sort-file-listing enhancement over the
>>> weekend, I looked in the Tomcat code to see if we already had a
>>> method to split a string by a single character. The closest
>>> method I found was org.pache.jasper.compiler.JspUtil which splits
>>> on a string, so I rejected it as being a bit too general-purpose
>>> for my needs.
>>>
>>> But it got me looking into the implementation of the method, and
>>> it looks a little sketchy.
>>>
>>> First, it uses a java.util.Vector. Yep, a Vector. Remember
>>> those? Synchronized everything? The Vector is not used outside of
>>> the method at all -- it's just local and temporary. So, the
>>> obvious conclusion is that it should be changed to ArrayList,
>>> right?
>>>
>>> But, modern JVMs are fairly good at (a) processing
>>> contention-less locks more quickly and (b) identifying objects
>>> which cannot possibly have lock-contention and simply skipping
>>> the locking.
>>>
>>> What's the right move in these cases? Should we change Vector -> 
>>> ArrayList and "help-out" the compiler and runtime, or should we
>>> leave the code alone because "it's fine"?
>>>
>>> Similar arguments can be made for using StringBuilder over 
>>> StringBuffer, etc.
>>>
>>> WDYT?
> 
>> I think the code is cleaner and easier to understand when the code 
>> matches the intention rather than when the code plus some expected 
>> compiler optimisation match the intention. So I'd say +1 to making
>> these sorts of changes.
> 
> This is kind of my thought as well.
> 
> I also feel like code coming from Apache ought to be an example of the
> "right way of doing things" wherever possible. Obviously, that's not a
> goal of many projects both within and outside the ASF, but I kind of
> feel like we should be able to point to Tomcat code and say "yes, this
> is how it should be done".
> 
> If it doesn't look nice, then it should be changed. I'm guessing
> that's one of the reasons you went through all the code and changed
> try/finally to try-with-resources and Foo baz = new Foo to
> Foo baz = new Foo<>() everywhere. I think this is kind of the same.

try-with-resources was partly about reducing code verbosity and partly
about ensuring resources were actually closed.

Foo baz = new Foo<>() was to reduce/eliminate IDE warnings. My
preference is to keep the code clean of IDE warnings so when I make a
stupid mistake and my IDE tries to point it out it isn't lost in the noise.

> Back to JspUtil.split. It's private and used exactly one place. AND
> it's only ever called with a single-character String value. So
> actually the use-case DID meet my needs for string-splitting for the
> sorting. Perhaps I'll re-factor that method a little and then use it
> across these two use-cases.

Doesn't that mean that DefaultServlet would depend on Jasper? That isn't
allowed as the Jasper is meant to be an optional component.

Mark

-
To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org
For additional commands, e-mail: dev-h...@tomcat.apache.org



Re: Micro-optimizations

2019-05-08 Thread Christopher Schultz
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Mark,

On 5/8/19 12:20, Mark Thomas wrote:
> On 08/05/2019 17:16, Christopher Schultz wrote:
>> All,
>> 
>> While working on the sort-file-listing enhancement over the
>> weekend, I looked in the Tomcat code to see if we already had a
>> method to split a string by a single character. The closest
>> method I found was org.pache.jasper.compiler.JspUtil which splits
>> on a string, so I rejected it as being a bit too general-purpose
>> for my needs.
>> 
>> But it got me looking into the implementation of the method, and
>> it looks a little sketchy.
>> 
>> First, it uses a java.util.Vector. Yep, a Vector. Remember
>> those? Synchronized everything? The Vector is not used outside of
>> the method at all -- it's just local and temporary. So, the
>> obvious conclusion is that it should be changed to ArrayList,
>> right?
>> 
>> But, modern JVMs are fairly good at (a) processing
>> contention-less locks more quickly and (b) identifying objects
>> which cannot possibly have lock-contention and simply skipping
>> the locking.
>> 
>> What's the right move in these cases? Should we change Vector -> 
>> ArrayList and "help-out" the compiler and runtime, or should we
>> leave the code alone because "it's fine"?
>> 
>> Similar arguments can be made for using StringBuilder over 
>> StringBuffer, etc.
>> 
>> WDYT?
> 
> I think the code is cleaner and easier to understand when the code 
> matches the intention rather than when the code plus some expected 
> compiler optimisation match the intention. So I'd say +1 to making
> these sorts of changes.

This is kind of my thought as well.

I also feel like code coming from Apache ought to be an example of the
"right way of doing things" wherever possible. Obviously, that's not a
goal of many projects both within and outside the ASF, but I kind of
feel like we should be able to point to Tomcat code and say "yes, this
is how it should be done".

If it doesn't look nice, then it should be changed. I'm guessing
that's one of the reasons you went through all the code and changed
try/finally to try-with-resources and Foo baz = new Foo to
Foo baz = new Foo<>() everywhere. I think this is kind of the same.

Back to JspUtil.split. It's private and used exactly one place. AND
it's only ever called with a single-character String value. So
actually the use-case DID meet my needs for string-splitting for the
sorting. Perhaps I'll re-factor that method a little and then use it
across these two use-cases.

- -chris
-BEGIN PGP SIGNATURE-
Comment: Using GnuPG with Thunderbird - https://www.enigmail.net/

iQIzBAEBCAAdFiEEMmKgYcQvxMe7tcJcHPApP6U8pFgFAlzTBCsACgkQHPApP6U8
pFjXaxAAmStuVV4R5xLXOuI7+svzqZ9qPBiDMa/KjBk2lvFBoZ7H1pg8OZcLH2qO
z+iGmf7IvdJ13SBwQfaGGZ556iJGjFkx3IX6+sC8jeBJt0se2BKU8UrsKlesYnnw
ZWaz9+gfg0Ab/N0l7Oh+BtHAdk60l1Lhg+3Pb4owHkJZj40VeIe59c7iF2OshBh9
PZOtek1TJkE1Mi2XmQKAqA6MJ8LDdUIIuFj+p4pWy+OOOzzywTvYNSlPe8Ozo8u9
9EpXis8wi66DMGMQXLtCbC9TW1z2DAV0r5lHfCuX1tdhvyDb9PvBKnL9SdIEB1Tf
4HUjnQ+XCUlXpsP1GjKD1W9v3z+dxuWbFoA/tZgtagc3+nsizkEwX7EFMAa9nfZq
7dYwbMIi6ypZZbDfzYTw3VD73KUZa8IRQd6zQBU8jYp/gT1sb4uSLzPhV73IyHDQ
tzdjtrwUzk0TPFJR7LgaOJBuGfIwh3gN19c+mNOgft6l7OBsu8rxXH+B7haoPkVD
u3KBWpvXhKTY6QWoUPQt3wgYysWUGjpigPJyFx5ndk5zVl5MSB0AXSZBEkBRuAG+
yEfumxHKtVy2OFLFgw65Wc2P6CZcYLIdGu5qccKFiyD/f8zh4vdeWcfWtsG8Siye
sCXJRBg4CwysWzVusTTus6ajb/MkMtOsj7NkU7f24o3KQ+EyIAY=
=/PKS
-END PGP SIGNATURE-

-
To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org
For additional commands, e-mail: dev-h...@tomcat.apache.org



Re: Micro-optimizations

2019-05-08 Thread Mark Thomas
On 08/05/2019 17:16, Christopher Schultz wrote:
> All,
> 
> While working on the sort-file-listing enhancement over the weekend, I
> looked in the Tomcat code to see if we already had a method to split a
> string by a single character. The closest method I found was
> org.pache.jasper.compiler.JspUtil which splits on a string, so I
> rejected it as being a bit too general-purpose for my needs.
> 
> But it got me looking into the implementation of the method, and it
> looks a little sketchy.
> 
> First, it uses a java.util.Vector. Yep, a Vector. Remember those?
> Synchronized everything? The Vector is not used outside of the method
> at all -- it's just local and temporary. So, the obvious conclusion is
> that it should be changed to ArrayList, right?
> 
> But, modern JVMs are fairly good at (a) processing contention-less
> locks more quickly and (b) identifying objects which cannot possibly
> have lock-contention and simply skipping the locking.
> 
> What's the right move in these cases? Should we change Vector ->
> ArrayList and "help-out" the compiler and runtime, or should we leave
> the code alone because "it's fine"?
> 
> Similar arguments can be made for using StringBuilder over
> StringBuffer, etc.
> 
> WDYT?

I think the code is cleaner and easier to understand when the code
matches the intention rather than when the code plus some expected
compiler optimisation match the intention. So I'd say +1 to making these
sorts of changes.

Mark

-
To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org
For additional commands, e-mail: dev-h...@tomcat.apache.org



Micro-optimizations

2019-05-08 Thread Christopher Schultz
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

All,

While working on the sort-file-listing enhancement over the weekend, I
looked in the Tomcat code to see if we already had a method to split a
string by a single character. The closest method I found was
org.pache.jasper.compiler.JspUtil which splits on a string, so I
rejected it as being a bit too general-purpose for my needs.

But it got me looking into the implementation of the method, and it
looks a little sketchy.

First, it uses a java.util.Vector. Yep, a Vector. Remember those?
Synchronized everything? The Vector is not used outside of the method
at all -- it's just local and temporary. So, the obvious conclusion is
that it should be changed to ArrayList, right?

But, modern JVMs are fairly good at (a) processing contention-less
locks more quickly and (b) identifying objects which cannot possibly
have lock-contention and simply skipping the locking.

What's the right move in these cases? Should we change Vector ->
ArrayList and "help-out" the compiler and runtime, or should we leave
the code alone because "it's fine"?

Similar arguments can be made for using StringBuilder over
StringBuffer, etc.

WDYT?

- -chris
-BEGIN PGP SIGNATURE-
Comment: Using GnuPG with Thunderbird - https://www.enigmail.net/

iQIzBAEBCAAdFiEEMmKgYcQvxMe7tcJcHPApP6U8pFgFAlzTAMcACgkQHPApP6U8
pFjEOQ//SpbOVCErJu8SCue+RagRp0Z0PsAwbquf2z4NuMsPZVST66fcjE7QY7aI
9whVdUBauxNiUCwMUOkgV8g8IGBB8KoWyhyIkTWCZuh+cwan6+5aKesE5UcUdYW5
Xg93cqB4oGUWsR1YaEWEdycBObCTcXx/oHTq26/rXZvrGDHGU7rrGGOgTqO4TETU
6xvME86xTaImrBYIrnSUTo2S3QyPbqsoOOzEWinyW75cvR6xFw4iNNRfgzHIlb8f
EIUjCqHasm4qAcxQ+t0DIoeVbWvcLpX49UMNyXP5Wt2frOjeTzpx4ErFrNXsAlQd
JYoNgo50nAYNwxAsnBbZCrqSkaFO89I+nj7Tq9WDovOTF17CyMMaYtJiBe/ocTNv
25JmldlsM/wFY+ol9PQz9UmfzOHCcKGWEV4DcKaro6aTBzhELGYHpRh3z+ZNNuHD
zkbJPRZFDZ4XC9dwRSfm0DUbIsYP2peBp3z7/1hWm+lnK6NIAZQQh0TzJ/cIliKw
6uWvCesERD1GyQosqoL0h9MHSckGpTSfbk+iI6MZtEwQc3AH4C27fm6MpfAc8YQU
sr0jf2EdgzrC990jazPZcRbpwT8Eifc5kF6QgkbiockkqC/Zr28awL7vY2m5Szg9
n2scZoWUAnkGM8k0x3sFM4DZpthjUGVe78ZsBnJkKX4Q5LAnAYQ=
=U4Jd
-END PGP SIGNATURE-

-
To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org
For additional commands, e-mail: dev-h...@tomcat.apache.org