Re: 8034856/8034857: More gcc warnings
On 2014-02-23 01:19, Alan Bateman wrote: On 19/02/2014 18:22, Mikael Vidstedt wrote: : The documented grammar in the comment only mentions "SPACE" and the code below doesn't make any references to \t. As a matter of fact, it only checks for one single, mandatory SPACE after the colon (enforced at line 535-536) and doesn't care to remove any space characters at the end of the value. The while loop only deals with continuations. If additional spaces do exist they will as far as I can tell be part of the value. Are they trimmed later? I'm assuming it would be nice to have both parsers (parse_manifest & JarFacade) behave the same way? Here's what it would look like to only check for space, but still eat any additional spaces which doesn't match what parse_manifest/parse_nv_pair does: http://cr.openjdk.java.net/~mikael/webrevs/isspace/webrev.01/webrev/ Sorry for the delay getting back to you on this (I've been busy with other things). I checked the JAR File Specification, which is turn references RFC 822 as the "inspiration" for the name-value pairs. The SPACE token is just ASCII SP. So I agree it's just ASCII SP that needs to be handled here, not LWSP-char which includes ASCII HT. Looking at JDK-6274276 then the trimming was done to avoid hard-to-diagnose problems with leading/trailing spaces. It's possible that this is inconsistent with other areas where JAR file attributes are used. I would suggest leaving it as is for now as this is potentially changing behavior in several areas. That sounds reasonable. I'll keep the webrev.01 approach - only check for and trim ASCII SP. Thanks for the review! Cheers, Mikael
Re: 8034856/8034857: More gcc warnings
On 19/02/2014 18:22, Mikael Vidstedt wrote: : The documented grammar in the comment only mentions "SPACE" and the code below doesn't make any references to \t. As a matter of fact, it only checks for one single, mandatory SPACE after the colon (enforced at line 535-536) and doesn't care to remove any space characters at the end of the value. The while loop only deals with continuations. If additional spaces do exist they will as far as I can tell be part of the value. Are they trimmed later? I'm assuming it would be nice to have both parsers (parse_manifest & JarFacade) behave the same way? Here's what it would look like to only check for space, but still eat any additional spaces which doesn't match what parse_manifest/parse_nv_pair does: http://cr.openjdk.java.net/~mikael/webrevs/isspace/webrev.01/webrev/ Sorry for the delay getting back to you on this (I've been busy with other things). I checked the JAR File Specification, which is turn references RFC 822 as the "inspiration" for the name-value pairs. The SPACE token is just ASCII SP. So I agree it's just ASCII SP that needs to be handled here, not LWSP-char which includes ASCII HT. Looking at JDK-6274276 then the trimming was done to avoid hard-to-diagnose problems with leading/trailing spaces. It's possible that this is inconsistent with other areas where JAR file attributes are used. I would suggest leaving it as is for now as this is potentially changing behavior in several areas. -Alan
Re: 8034856/8034857: More gcc warnings
On 2014-02-19 09:07, Alan Bateman wrote: On 18/02/2014 19:45, Mikael Vidstedt wrote: That makes sense, and in fact parse_manifest.c does not even appear to allow for \t, so I'm more and more starting to think that a reasonable implementation in this context would be: static int isNormalSpace(int c) { return c == ' '; } In which case it probably shouldn't even be a separate function to start with. I would like to get a second opinion on the implications of only checking for ' ' (0x20) though. If we want to allow both ' ' and \t we should probably call the function isblankAscii. Thanks again for taking this. On \t then if it's nor handled by the parsing code then isNormalSpace should be fine. Since I'm not exactly an expert on the code in question I would certainly appreciate it if somebody could verify me on that. I'm looking at parse_nv_pair (lines 430-542) in: http://hg.openjdk.java.net/jdk9/dev/jdk/file/c766ec3e4877/src/share/bin/parse_manifest.c The documented grammar in the comment only mentions "SPACE" and the code below doesn't make any references to \t. As a matter of fact, it only checks for one single, mandatory SPACE after the colon (enforced at line 535-536) and doesn't care to remove any space characters at the end of the value. The while loop only deals with continuations. If additional spaces do exist they will as far as I can tell be part of the value. Are they trimmed later? I'm assuming it would be nice to have both parsers (parse_manifest & JarFacade) behave the same way? Here's what it would look like to only check for space, but still eat any additional spaces which doesn't match what parse_manifest/parse_nv_pair does: http://cr.openjdk.java.net/~mikael/webrevs/isspace/webrev.01/webrev/ Cheers, Mikael
Re: 8034856/8034857: More gcc warnings
On 18/02/2014 19:45, Mikael Vidstedt wrote: That makes sense, and in fact parse_manifest.c does not even appear to allow for \t, so I'm more and more starting to think that a reasonable implementation in this context would be: static int isNormalSpace(int c) { return c == ' '; } In which case it probably shouldn't even be a separate function to start with. I would like to get a second opinion on the implications of only checking for ' ' (0x20) though. If we want to allow both ' ' and \t we should probably call the function isblankAscii. Thanks again for taking this. On \t then if it's nor handled by the parsing code then isNormalSpace should be fine. -Alan.
Re: 8034856/8034857: More gcc warnings
On 2014-02-18 00:33, Alan Bateman wrote: On 18/02/2014 03:59, Mikael Vidstedt wrote: How about: http://cr.openjdk.java.net/~mikael/webrevs/isspace/webrev.00/webrev/ Cheers, Mikael I checked the java.lang.instrument spec and for the Boot-Class-Path attribute then it doesn't say any more than "space". It might be worth checking the manifest parsing code (parse_manfiest.c) to see how continuations are handled as I suspect \r and \n can't appear in the attribute value (in which case the check might really only need to be for space and \t. That makes sense, and in fact parse_manifest.c does not even appear to allow for \t, so I'm more and more starting to think that a reasonable implementation in this context would be: static int isNormalSpace(int c) { return c == ' '; } In which case it probably shouldn't even be a separate function to start with. I would like to get a second opinion on the implications of only checking for ' ' (0x20) though. If we want to allow both ' ' and \t we should probably call the function isblankAscii. Otherwise replacing isspace is good and your isspaceAscii is likely to match the libc isspace (at runtime). This code isn't performance sensitive but maybe check space first would be a bit better. Also the library native code using 4 space indent rather than hotspot's 2. Will fix indentation. I seriously doubt that the performance difference warrants the more complicated code. I created JDK-8035054 a few days ago to track this. Thanks for taking it as I am busy with a number of other things at the moment. Always for you, sir! ;) /Mikael
Re: 8034856/8034857: More gcc warnings
Mikael, > http://cr.openjdk.java.net/~mikael/webrevs/isspace/webrev.00/webrev/ 1. I agree that ctypes isspace usually cause more problems than solve and it's good to have our own version. 2. one of possible implementation is #define isspaceASCII(c) (strchr(SPACE_CHARS,c) != NULL) -Dmitry On 2014-02-18 07:59, Mikael Vidstedt wrote: > > On 2014-02-17 07:08, Alan Bateman wrote: >> On 17/02/2014 05:51, Mikael Vidstedt wrote: >>> >>> I'm inclined to agree with this. Since the code depends on a specific >>> behavior of isspace which does not match what the system provided >>> function does I too think it would be more robust to implement our >>> own version of it. >> I completely agree that changing this code to use its own isspace is >> the right thing, it just seems a bit much for a drive-by fixed to gcc >> warnings. Do either of you want to take it? > > How about: > > http://cr.openjdk.java.net/~mikael/webrevs/isspace/webrev.00/webrev/ > > Cheers, > Mikael > -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, Russia * I would love to change the world, but they won't give me the sources.
Re: 8034856/8034857: More gcc warnings
On 18/02/2014 03:59, Mikael Vidstedt wrote: How about: http://cr.openjdk.java.net/~mikael/webrevs/isspace/webrev.00/webrev/ Cheers, Mikael I checked the java.lang.instrument spec and for the Boot-Class-Path attribute then it doesn't say any more than "space". It might be worth checking the manifest parsing code (parse_manfiest.c) to see how continuations are handled as I suspect \r and \n can't appear in the attribute value (in which case the check might really only need to be for space and \t. Otherwise replacing isspace is good and your isspaceAscii is likely to match the libc isspace (at runtime). This code isn't performance sensitive but maybe check space first would be a bit better. Also the library native code using 4 space indent rather than hotspot's 2. I created JDK-8035054 a few days ago to track this. Thanks for taking it as I am busy with a number of other things at the moment. -Alan
Re: 8034856/8034857: More gcc warnings
On 2014-02-17 07:08, Alan Bateman wrote: On 17/02/2014 05:51, Mikael Vidstedt wrote: I'm inclined to agree with this. Since the code depends on a specific behavior of isspace which does not match what the system provided function does I too think it would be more robust to implement our own version of it. I completely agree that changing this code to use its own isspace is the right thing, it just seems a bit much for a drive-by fixed to gcc warnings. Do either of you want to take it? How about: http://cr.openjdk.java.net/~mikael/webrevs/isspace/webrev.00/webrev/ Cheers, Mikael
Re: 8034856/8034857: More gcc warnings
On 17/02/2014 05:51, Mikael Vidstedt wrote: I'm inclined to agree with this. Since the code depends on a specific behavior of isspace which does not match what the system provided function does I too think it would be more robust to implement our own version of it. I completely agree that changing this code to use its own isspace is the right thing, it just seems a bit much for a drive-by fixed to gcc warnings. Do either of you want to take it? -Alan
Re: 8034856/8034857: More gcc warnings
I'm inclined to agree with this. Since the code depends on a specific behavior of isspace which does not match what the system provided function does I too think it would be more robust to implement our own version of it. Cheers, Mikael > On Feb 16, 2014, at 14:20, Martin Buchholz wrote: > > Those locale-dependent APIs - more trouble than they're worth. More often > than not, you want a locale-independent version. > > So just define your own is_ASCII_space etc. like everybody else has done and > move on. > > >> On Sun, Feb 16, 2014 at 9:20 AM, Alan Bateman >> wrote: >>> On 13/02/2014 21:14, Mikael Vidstedt wrote: >>> : >>> >>> The change in question appears to come from >>> https://bugs.openjdk.java.net/browse/JDK-6679866, but I'm not sure the bug >>> gives enough additional information. My speculation (and it's really just a >>> speculation) is that it's not related to isspace per-se, but to something >>> else which gets defined/redefined/undefined by including ctype.h. I guess >>> it would be good to know if we have tests which cover the thing the comment >>> is alluding to (non-ascii in Premain-Class). >> Thanks for pointing this out. I looked at it again and the issue is that >> isspace is a macro and depends on the locale. By not including ctype.h then >> it means we get linked to the libc function instead. One approach is to >> include ctype.h and then #undef isspace, another is to define function >> prototype ourselves. I think the latter is a little bit better because it >> would avoid accidental usage of other local sensitive char classifiers. >> Attached is the patch that I propose. I have deliberate moved to to after >> other includes so we get a chance to #undef in the event that it gets >> included by something else. >> >> On tests then PremainClassTest.java is good enough to find this on Solaris. >> >> -Alan >> >> >> diff --git a/src/share/instrument/JarFacade.c >> b/src/share/instrument/JarFacade.c >> --- a/src/share/instrument/JarFacade.c >> +++ b/src/share/instrument/JarFacade.c >> @@ -23,17 +23,20 @@ >> * questions. >> */ >> >> -#ifdef _WIN32 >> -/* >> - * Win* needs this include. However, Linux and Solaris do not. >> - * Having this include on Solaris SPARC breaks having non US-ASCII >> - * characters in the value of the Premain-Class attribute. >> - */ >> -#include >> -#endif /* _WIN32 */ >> #include >> #include >> -#include >> + >> +/** >> + * ctype.h is required on Windows. For other platforms we use a function >> + * prototype to ensure that we use the libc isspace function rather than >> + * the isspace macro (due to isspace being locale sensitive) >> + */ >> +#ifdef _WIN32 >> + #include >> +#else >> + #undef isspace >> + extern int isspace(int c); >> +#endif /* _WIN32 */ >> >> #include "jni.h" >> #include "manifest_info.h" >
Re: 8034856/8034857: More gcc warnings
On 13/02/2014 21:14, Mikael Vidstedt wrote: : The change in question appears to come from https://bugs.openjdk.java.net/browse/JDK-6679866, but I'm not sure the bug gives enough additional information. My speculation (and it's really just a speculation) is that it's not related to isspace per-se, but to something else which gets defined/redefined/undefined by including ctype.h. I guess it would be good to know if we have tests which cover the thing the comment is alluding to (non-ascii in Premain-Class). Thanks for pointing this out. I looked at it again and the issue is that isspace is a macro and depends on the locale. By not including ctype.h then it means we get linked to the libc function instead. One approach is to include ctype.h and then #undef isspace, another is to define function prototype ourselves. I think the latter is a little bit better because it would avoid accidental usage of other local sensitive char classifiers. Attached is the patch that I propose. I have deliberate moved to to after other includes so we get a chance to #undef in the event that it gets included by something else. On tests then PremainClassTest.java is good enough to find this on Solaris. -Alan diff --git a/src/share/instrument/JarFacade.c b/src/share/instrument/JarFacade.c --- a/src/share/instrument/JarFacade.c +++ b/src/share/instrument/JarFacade.c @@ -23,17 +23,20 @@ * questions. */ -#ifdef _WIN32 -/* - * Win* needs this include. However, Linux and Solaris do not. - * Having this include on Solaris SPARC breaks having non US-ASCII - * characters in the value of the Premain-Class attribute. - */ -#include -#endif /* _WIN32 */ #include #include -#include + +/** + * ctype.h is required on Windows. For other platforms we use a function + * prototype to ensure that we use the libc isspace function rather than + * the isspace macro (due to isspace being locale sensitive) + */ +#ifdef _WIN32 + #include +#else + #undef isspace + extern int isspace(int c); +#endif /* _WIN32 */ #include "jni.h" #include "manifest_info.h"
Re: 8034856/8034857: More gcc warnings
On 2014-02-13 10:23, Alan Bateman wrote: On 13/02/2014 17:56, Mikael Vidstedt wrote: Alan, I made the change to JarFacade.c myself last week, only to then see the comment a few lines above where you added the new include. It seems to indicate that including ctype.h on Solaris/SPARC is a bad idea. I have no idea if the comment is still relevant, but that may be worth understanding first. Do you have cycles to look into it? As the code is using isspace already then it's not clear (unless there are different versions). Before pushing the changes then I ran the tests on all platforms (including Solaris) and the j.l.i tests include a number of tests exercise these manifest attributes with a non-US characters. The change in question appears to come from https://bugs.openjdk.java.net/browse/JDK-6679866, but I'm not sure the bug gives enough additional information. My speculation (and it's really just a speculation) is that it's not related to isspace per-se, but to something else which gets defined/redefined/undefined by including ctype.h. I guess it would be good to know if we have tests which cover the thing the comment is alluding to (non-ascii in Premain-Class). As an aside, the native code warnings coming from the jdk repository are really annoying so this is the reason for the drive-by fixes when I get a few minutes. I think others are doing the same. Absolutely support this work! As a matter of fact I have a couple of change in a sandbox I should send out for review. Cheers, Mikael
Re: 8034856/8034857: More gcc warnings
On 13/02/2014 17:56, Mikael Vidstedt wrote: Alan, I made the change to JarFacade.c myself last week, only to then see the comment a few lines above where you added the new include. It seems to indicate that including ctype.h on Solaris/SPARC is a bad idea. I have no idea if the comment is still relevant, but that may be worth understanding first. Do you have cycles to look into it? As the code is using isspace already then it's not clear (unless there are different versions). Before pushing the changes then I ran the tests on all platforms (including Solaris) and the j.l.i tests include a number of tests exercise these manifest attributes with a non-US characters. As an aside, the native code warnings coming from the jdk repository are really annoying so this is the reason for the drive-by fixes when I get a few minutes. I think others are doing the same. -Alan.
Re: 8034856/8034857: More gcc warnings
Alan, I made the change to JarFacade.c myself last week, only to then see the comment a few lines above where you added the new include. It seems to indicate that including ctype.h on Solaris/SPARC is a bad idea. I have no idea if the comment is still relevant, but that may be worth understanding first. Cheers, Mikael > On Feb 13, 2014, at 5:18, Alan Bateman wrote: > > > The number of native code warnings in the build is annoying so this is > another drive-by fix that eliminates a few of them in the serviceability and > security areas. The webrev with the changes is here: > > http://cr.openjdk.java.net/~alanb/8034856+8034857/webrev/ > > In the pkcs11 code the issue is the function prototypes for the throwXXX > functions aren't included. This is fixed by including pkcs11wrapper.h but > that exposes another issue with the header file includes that needed to be > fixed. > > In JarFacade the issue is that it uses isspace but doesn't include the ctype.h > > For LinuxOperatingSystem.c then there are 12 warnings related to fscanf > usages where the format specifier is %lld and the code wants to read into a > uint64_t. I've changed the format specifier to"%"SCNd64 so that it matches > uint64_t and should be okay on both 32 and 64-bit. > > Thanks, > Alan.
Re: 8034856/8034857: More gcc warnings
Looks fine to me. --Sean On 02/13/2014 08:18 AM, Alan Bateman wrote: The number of native code warnings in the build is annoying so this is another drive-by fix that eliminates a few of them in the serviceability and security areas. The webrev with the changes is here: http://cr.openjdk.java.net/~alanb/8034856+8034857/webrev/ In the pkcs11 code the issue is the function prototypes for the throwXXX functions aren't included. This is fixed by including pkcs11wrapper.h but that exposes another issue with the header file includes that needed to be fixed. In JarFacade the issue is that it uses isspace but doesn't include the ctype.h For LinuxOperatingSystem.c then there are 12 warnings related to fscanf usages where the format specifier is %lld and the code wants to read into a uint64_t. I've changed the format specifier to"%"SCNd64 so that it matches uint64_t and should be okay on both 32 and 64-bit. Thanks, Alan.
Re: 8034856/8034857: More gcc warnings
Changes look good. /Staffan On 13 feb 2014, at 14:18, Alan Bateman wrote: > > The number of native code warnings in the build is annoying so this is > another drive-by fix that eliminates a few of them in the serviceability and > security areas. The webrev with the changes is here: > > http://cr.openjdk.java.net/~alanb/8034856+8034857/webrev/ > > In the pkcs11 code the issue is the function prototypes for the throwXXX > functions aren't included. This is fixed by including pkcs11wrapper.h but > that exposes another issue with the header file includes that needed to be > fixed. > > In JarFacade the issue is that it uses isspace but doesn't include the ctype.h > > For LinuxOperatingSystem.c then there are 12 warnings related to fscanf > usages where the format specifier is %lld and the code wants to read into a > uint64_t. I've changed the format specifier to"%"SCNd64 so that it matches > uint64_t and should be okay on both 32 and 64-bit. > > Thanks, > Alan.
8034856/8034857: More gcc warnings
The number of native code warnings in the build is annoying so this is another drive-by fix that eliminates a few of them in the serviceability and security areas. The webrev with the changes is here: http://cr.openjdk.java.net/~alanb/8034856+8034857/webrev/ In the pkcs11 code the issue is the function prototypes for the throwXXX functions aren't included. This is fixed by including pkcs11wrapper.h but that exposes another issue with the header file includes that needed to be fixed. In JarFacade the issue is that it uses isspace but doesn't include the ctype.h For LinuxOperatingSystem.c then there are 12 warnings related to fscanf usages where the format specifier is %lld and the code wants to read into a uint64_t. I've changed the format specifier to"%"SCNd64 so that it matches uint64_t and should be okay on both 32 and 64-bit. Thanks, Alan.