[Bug fortran/61847] bug in gfortran runtime: digits cut off when reading floating point number
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61847

--- Comment #24 from Janne Blomqvist jb at gcc dot gnu.org ---
Author: jb
Date: Mon Nov 10 00:17:16 2014
New Revision: 217273

URL: https://gcc.gnu.org/viewcvs?rev=217273&root=gcc&view=rev
Log:
PR 47007 and 61847 Locale failures in libgfortran.

2014-11-10  Janne Blomqvist  j...@gcc.gnu.org

    PR libfortran/47007
    PR libfortran/61847
    * config.h.in: Regenerated.
    * configure: Regenerated.
    * configure.ac (AC_CHECK_HEADERS_ONCE): Check for xlocale.h.
    (AC_CHECK_FUNCS_ONCE): Check for newlocale, freelocale, uselocale,
    strerror_l.
    * io/io.h (locale.h): Include.
    (xlocale.h): Include if present.
    (c_locale): New variable.
    (old_locale): New variable.
    (old_locale_ctr): New variable.
    (old_locale_lock): New variable.
    (st_parameter_dt): Add old_locale member.
    * io/transfer.c (data_transfer_init): Set locale to C if doing
    formatted transfer.
    (finalize_transfer): Reset locale to previous.
    * io/unit.c (c_locale): New variable.
    (old_locale): New variable.
    (old_locale_ctr): New variable.
    (old_locale_lock): New variable.
    (init_units): Init c_locale, init old_locale_lock.
    (close_units): Free c_locale.
    * runtime/error.c (locale.h): Include.
    (xlocale.h): Include if present.
    (gf_strerror): Use strerror_l if available. Reset locale to
    LC_GLOBAL_LOCALE for strerror_r branch.

2014-11-10  Janne Blomqvist  j...@gcc.gnu.org

    PR libfortran/47007
    PR libfortran/61847
    * gfortran.texi: Add note about locale issues to thread-safety
    section.

Modified:
    trunk/gcc/fortran/ChangeLog
    trunk/gcc/fortran/gfortran.texi
    trunk/libgfortran/ChangeLog
    trunk/libgfortran/config.h.in
    trunk/libgfortran/configure
    trunk/libgfortran/configure.ac
    trunk/libgfortran/io/io.h
    trunk/libgfortran/io/transfer.c
    trunk/libgfortran/io/unit.c
    trunk/libgfortran/runtime/error.c
[Bug fortran/61847] bug in gfortran runtime: digits cut off when reading floating point number
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61847

Janne Blomqvist jb at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #25 from Janne Blomqvist jb at gcc dot gnu.org ---
Fixed on trunk, closing.
[Bug fortran/61847] bug in gfortran runtime: digits cut off when reading floating point number
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61847

Janne Blomqvist jb at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                URL|                            |https://gcc.gnu.org/ml/gcc-patches/2014-11/msg00277.html

--- Comment #23 from Janne Blomqvist jb at gcc dot gnu.org ---
Patch using POSIX 2008 functionality at
https://gcc.gnu.org/ml/gcc-patches/2014-11/msg00277.html
[Bug fortran/61847] bug in gfortran runtime: digits cut off when reading floating point number
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61847

--- Comment #22 from Janne Blomqvist jb at gcc dot gnu.org ---
(In reply to Tobias Burnus from comment #21)
> (In reply to Jerry DeLisle from comment #20)
> > Created attachment 33858 [details]
> > Proposed patch
>
>   /* If the current locale is expecting a comma rather than a decimal
>      point, convert it now.  */
>   if (dtp->u.p.current_unit->decimal_locale == ',')
>     *strchr (buffer, '.') = ',';
>
> In principle, there are more options than just , and .; for instance, in
> Britain, one often uses a centered dot (·) but that's not in the locale.
> Looking at the output of all my locales, I found:
>   ps_AF.utf8: 0٫40
> as the only locale which doesn't use either a '.' or a ','.

Interesting.. still, Jerry's patch looks like an improvement over the status
quo and should cover the vast majority of cases.

> Still, using code like the following looks more robust.
>
>   /* During _gfortran_st_read/write. */
>   const char *curr_locale = setlocale(LC_ALL, NULL);
>   setlocale(LC_ALL, "C");
>   ...
>   /* During _gfortran_st_read_done/write_done. */
>   setlocale(LC_ALL, curr_locale);

I really don't think we should mess with setlocale(). It changes the
process-wide locale, and if some other thread does something locale-dependent
between our setlocale() calls there will be a bug in the user program which
will be very hard to track down.

As an aside, Jerry's patch suffers from similar issues, as the locale might be
changed between checking the decimal separator (on OPEN) and using some
locale-dependent functions.

The robust solution really is to use strtod_l etc. as previously mentioned.
[Bug fortran/61847] bug in gfortran runtime: digits cut off when reading floating point number
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61847

Tobias Burnus burnus at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |burnus at gcc dot gnu.org

--- Comment #21 from Tobias Burnus burnus at gcc dot gnu.org ---
(In reply to Jerry DeLisle from comment #20)
> Created attachment 33858 [details]
> Proposed patch

  /* If the current locale is expecting a comma rather than a decimal
     point, convert it now.  */
  if (dtp->u.p.current_unit->decimal_locale == ',')
    *strchr (buffer, '.') = ',';

In principle, there are more options than just , and .; for instance, in
Britain, one often uses a centered dot (·) but that's not in the locale.
Looking at the output of all my locales, I found:
  ps_AF.utf8: 0٫40
as the only locale which doesn't use either a '.' or a ','.

Still, using code like the following looks more robust.

  /* During _gfortran_st_read/write. */
  const char *curr_locale = setlocale(LC_ALL, NULL);
  setlocale(LC_ALL, "C");
  ...
  /* During _gfortran_st_read_done/write_done. */
  setlocale(LC_ALL, curr_locale);

* * *

Side remarks:

* Per PR36857 comment 8, it has to be "C" and not "POSIX" for MinGW.

* The fix for PR 36857 also assumes that there is only , and .; thus, when
  going the setlocale route, it should be fixed as well.

* See also PR 47007; see also the variant using __strtold_l/strtold_l and
  newlocale for READ (cf. PR 47007 comment 20 to 22).
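
One detail of the setlocale route in comment #21 is easy to miss: the string returned by setlocale(LC_ALL, NULL) may be overwritten by the next setlocale call, so it has to be copied before switching. A rough sketch under that assumption follows. The helper names enter_c_locale and leave_c_locale are hypothetical, and the sketch still changes the locale process-wide, so it does not address the threading objection raised in comment #22.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: switch the whole process to the "C" locale and
   return a heap copy of the previous locale name, or NULL on failure.  */
char *
enter_c_locale (void)
{
  const char *cur = setlocale (LC_ALL, NULL);
  if (cur == NULL)
    return NULL;

  /* Copy first: a later setlocale call may overwrite this string.  */
  size_t len = strlen (cur) + 1;
  char *saved = malloc (len);
  if (saved == NULL)
    return NULL;
  memcpy (saved, cur, len);

  if (setlocale (LC_ALL, "C") == NULL)
    {
      free (saved);
      return NULL;
    }
  return saved;
}

/* Hypothetical helper: restore the locale saved by enter_c_locale and
   release the copy.  */
void
leave_c_locale (char *saved)
{
  assert (saved != NULL);
  setlocale (LC_ALL, saved);
  free (saved);
}
```

A caller would bracket the locale-sensitive conversion with the two helpers, for example calling strtod between enter_c_locale() and leave_c_locale().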
[Bug fortran/61847] bug in gfortran runtime: digits cut off when reading floating point number
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61847

Jerry DeLisle jvdelisle at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #33227|0                           |1
        is obsolete|                            |

--- Comment #20 from Jerry DeLisle jvdelisle at gcc dot gnu.org ---
Created attachment 33858
  -- https://gcc.gnu.org/bugzilla/attachment.cgi?id=33858&action=edit
Proposed patch

The attached patch enhances gfortran to check the locale, get the currently
active decimal character at the time a unit is connected, and save it. During
formatted input the decimal type is checked and, if necessary, changed
internally so that the calls to the system-provided string-to-decimal
functions convert the value properly.

Some configuration checks for locale.h are needed to make sure getting the
decimal character will work. If not, the code reverts to the current behavior.

This needs testing on various platforms by anyone who can do so. I tested on
Linux x86-64.
[Bug fortran/61847] bug in gfortran runtime: digits cut off when reading floating point number
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61847

Janne Blomqvist jb at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jb at gcc dot gnu.org

--- Comment #19 from Janne Blomqvist jb at gcc dot gnu.org ---
(In reply to Jerry DeLisle from comment #17)
> I have a patch in the works. The idea is to query the locale at the time the
> Unit is connected and save the LC_NUMERIC character in the unit structure.
>
> Then, if the decimal character matches the DECIMAL_STATUS (decimal=point or
> decimal=comma) active at the time of reading, change the decimal character
> internally to the current locale character previously saved.
>
> This way, only one call to locale is needed per unit connection, preserving
> efficiency. The real string will then be converted correctly, regardless of
> locale.

While clever, I'm not sure this approach works. A program can change the
locale between opening the file and reading from it (potentially in another
thread, since the locale is a process-wide property).

What can be done instead is to use the POSIX 2008 extended locale
functionality (newlocale) to create a locale object in the default "C" locale
and then use functions like strtod_l (for some reason not in POSIX 2008,
though at least glibc and BSD/OSX have it, IIRC) that take such a locale
object as argument. This is fairly new though and not available everywhere,
but ought to be robust. See also PR 47007.

As an aside, AFAIK the "C" and "POSIX" locales are the same, just two names
for the same thing. "C" might be more portable, as that should work everywhere
there is a C implementation.
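
The newlocale plus strtod_l idea from comment #19 could be sketched roughly as below. This is an illustration only: parse_with_c_locale_object is a hypothetical name, strtod_l is a non-POSIX extension, and the sketch assumes glibc (where _GNU_SOURCE exposes it) or a BSD/macOS system (where it is declared in xlocale.h). A library would more likely create the locale object once and reuse it instead of building it per call.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* glibc declares strtod_l only with this.  */
#endif
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#if defined(__APPLE__) || defined(__FreeBSD__)
#include <xlocale.h>            /* Assumed home of strtod_l on these systems.  */
#endif

/* Hypothetical helper: convert TEXT using an explicit "C" locale object,
   leaving both the process and the thread locale untouched.  Returns 0 on
   success, -1 if the locale object cannot be created.  */
int
parse_with_c_locale_object (const char *text, double *out)
{
  assert (text != NULL && out != NULL);

  locale_t c_loc = newlocale (LC_ALL_MASK, "C", (locale_t) 0);
  if (c_loc == (locale_t) 0)
    return -1;

  /* The locale is an argument, so no global or per-thread state changes.  */
  *out = strtod_l (text, NULL, c_loc);

  freelocale (c_loc);
  return 0;
}
```

Since no shared state is modified, the race between opening a unit and reading from it that comment #19 describes does not arise with this approach.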
[Bug fortran/61847] bug in gfortran runtime: digits cut off when reading floating point number
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61847

--- Comment #18 from Jerry DeLisle jvdelisle at gcc dot gnu.org ---
Created attachment 33227
  -- https://gcc.gnu.org/bugzilla/attachment.cgi?id=33227&action=edit
A preliminary patch

This patch is a preliminary proof of principle. If the idea is acceptable,
then configuration magic is required to ensure that locale.h is available
and, if not, to default to '.'.
[Bug fortran/61847] bug in gfortran runtime: digits cut off when reading floating point number
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61847

Jerry DeLisle jvdelisle at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org |jvdelisle at gcc dot gnu.org
           Severity|normal                      |enhancement

--- Comment #17 from Jerry DeLisle jvdelisle at gcc dot gnu.org ---
I have a patch in the works. The idea is to query the locale at the time the
Unit is connected and save the LC_NUMERIC character in the unit structure.

Then, if the decimal character matches the DECIMAL_STATUS (decimal=point or
decimal=comma) active at the time of reading, change the decimal character
internally to the current locale character previously saved.

This way, only one call to locale is needed per unit connection, preserving
efficiency. The real string will then be converted correctly, regardless of
locale.
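
As a rough illustration of the idea in comment #17 (not the attached patch), the separator can be queried with localeconv and substituted into the numeric string before conversion. Both function names below are hypothetical. The sketch refuses separators longer than one byte and does nothing when the string has no '.', two cases a plain `*strchr (buffer, '.') = ','` would not handle.

```c
#include <assert.h>
#include <locale.h>
#include <string.h>

/* Hypothetical helper: the decimal separator of the current locale, as it
   might be saved when a unit is connected.  */
const char *
saved_decimal_point (void)
{
  return localeconv ()->decimal_point;
}

/* Hypothetical helper: replace the first '.' in BUF with the separator SEP
   so a locale-aware strtod accepts it.  Returns 1 if a character was
   replaced, 0 if nothing needed changing, -1 if SEP is not exactly one
   byte long (for example a multi-byte UTF-8 separator).  */
int
localize_decimal_point (char *buf, const char *sep)
{
  assert (buf != NULL && sep != NULL);

  if (strlen (sep) != 1)
    return -1;
  if (sep[0] == '.')
    return 0;

  char *dot = strchr (buf, '.');
  if (dot == NULL)              /* No decimal point: nothing to rewrite.  */
    return 0;

  *dot = sep[0];
  return 1;
}
```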
[Bug fortran/61847] bug in gfortran runtime: digits cut off when reading floating point number
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61847

Dominique d'Humieres dominiq at lps dot ens.fr changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|bug in gfortran runtime on  |bug in gfortran runtime:
                   |OSX: digits cut off when    |digits cut off when reading
                   |reading floating point      |floating point number
                   |number                      |

--- Comment #10 from Dominique d'Humieres dominiq at lps dot ens.fr ---
I can reproduce this PR on a linux box with gcc version 4.6.3 20120306 (Red
Hat 4.6.3-2), so the bug is not darwin specific.

I have noticed that the file generated by running the test is

1.2345

and does not change if I put the line

  setlocale(LC_ALL, "de_DE.UTF-8");

before the line

  f = fopen("bug.dat", "w");

Now if I change the content of bug.dat to

1,2345

suppress the file generation in bug.c and use

  open(unit=1,file='bug.dat', decimal='comma')

in bugf.f90, running the executable does not give any output (success).

So apparently strtod uses the locale to read 1.2345, giving 1.0 as a result
for de_DE.UTF-8 (or fr_FR.UTF-8), using en_US.ISO8859-1 gives 1.2345.

The only solution I see is to save the current locale, set it to "C" before
using strtod, and restore it upon completion.
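
The strtod behaviour described in comment #10 can be observed directly with a small helper like the one below. This is only a sketch: parse_under_locale is a hypothetical name, the helper reports failure when the requested locale is not installed, and it resets LC_NUMERIC to "C" afterwards instead of restoring the previous setting.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>

/* Hypothetical helper: run strtod on TEXT with LC_NUMERIC set to NAME.
   Stores the converted value and the number of characters consumed.
   Returns 0 on success, -1 if the locale NAME is not available.  */
int
parse_under_locale (const char *name, const char *text,
                    double *value, size_t *consumed)
{
  assert (name != NULL && text != NULL);
  assert (value != NULL && consumed != NULL);

  if (setlocale (LC_NUMERIC, name) == NULL)
    return -1;                  /* Locale not installed on this system.  */

  char *end;
  *value = strtod (text, &end);
  *consumed = (size_t) (end - text);

  setlocale (LC_NUMERIC, "C");  /* Demo only: reset, not a true restore.  */
  return 0;
}
```

Calling it with "C" and then with "de_DE.UTF-8" on the text 1.2345 shows how many characters each locale accepts; with a comma locale the conversion is expected to stop at the '.', which matches the 1.0 result reported above.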
[Bug fortran/61847] bug in gfortran runtime: digits cut off when reading floating point number
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61847

--- Comment #11 from Jerry DeLisle jvdelisle at gcc dot gnu.org ---
After all that has been said here, I am almost afraid to add any more.

This is not a bug. Fortran and GFortran are not locale aware. The ',' or '.'
are read from the file or device literally as is. From this read, a numeric
string is constructed. If the unit was opened with decimal='comma' and a comma
was seen, the comma is converted to '.'. If decimal='point' and a comma is
read, an error occurs.

After the above described numeric string is constructed it is passed to the
glibc library strtod (string to double). The glibc library is locale aware and
if the locale has defined the decimal token to be a ',' (comma), it sees the
decimal 'point' and interprets it as the end of the string to convert.

We do not want to take a performance hit by checking the locale setting on
every I/O operation, so the only logical place to do that is in main.c.
However, in the example, the original poster is only compiling a gfortran
subroutine. There is no gfortran program, so there is no gfortran main.c.

So the responsibility for addressing the locale has to fall on the C program
side or within the user's subroutine, using maybe system calls that are
extensions and not Fortran standard code. This user code would query the
current runtime environment for the current decimal setting and then do the
I/O with the appropriate decimal= specifier.

To avoid confusion, remember that gfortran is reading the characters in the
file literally. So if there is a 1,2345 it sees the comma. If there is a 1.234
it sees the point. The conversion to internal floating point representation
occurs after the character data is read.

The easiest solution is to do what I said in Comment #2 on the C side. The
equivalent can be done on the Fortran side as well, just not as easily.

One possible enhancement we could consider is providing some set and get
locale intrinsics. This would be helpful for some folks. But that's a lot
more work.
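
The two decimal= rules stated in comment #11 can be written down as a small normalisation step that runs before the string reaches strtod. The sketch below is not taken from libgfortran; the enum and the function name are hypothetical, and only the two rules given in the comment are implemented.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the unit's DECIMAL= setting.  */
enum decimal_mode { DECIMAL_POINT, DECIMAL_COMMA };

/* Hypothetical helper: rewrite BUF in place so it uses '.' as the decimal
   character.  With decimal='comma' every ',' becomes '.'; with
   decimal='point' a ',' is an error.  Returns 0 on success, -1 on error.  */
int
normalize_decimal (char *buf, enum decimal_mode mode)
{
  assert (buf != NULL);

  for (char *p = buf; *p != '\0'; p++)
    {
      if (*p != ',')
        continue;
      if (mode == DECIMAL_POINT)
        return -1;              /* Comma not allowed with decimal='point'.  */
      *p = '.';                 /* decimal='comma': hand strtod a point.  */
    }
  return 0;
}
```

The normalised string always contains '.', which is why a strtod that expects ',' under the active locale stops converting at that character.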
[Bug fortran/61847] bug in gfortran runtime: digits cut off when reading floating point number
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61847

--- Comment #12 from e2cd58e1 at opayq dot com ---
Sorry but I still have a problem with this, maybe I didn't get what you are
saying or I wasn't clear enough.

Suppose I cannot change the C-wrapper and the locale might be set to whatever.
The file bug.dat already exists and uses point decimals. So I want a fortran
routine that always reads in a file in the usual point separated format.

If in the fortran routine I call

  open(unit=1,file='bug.dat', decimal="point")

I expect the keyword to be more important than the locale setting: I
explicitly specify to read point separated values, but in the example below,
it still returns 1.0 instead of 1.2345. Is that really expected behavior?

- bug.c --
#include <stdlib.h>     /* strtod */
#include <locale.h>
#include <stdio.h>

int badcall_();

int main()
{
  setlocale(LC_ALL, "de_DE.UTF-8");
  badcall_();
  return 0;
}

- bug.dat --
1.2345

- bugf.f90 --
subroutine badcall()
  implicit none
  double precision :: res
  open(unit=1,file='bug.dat',decimal="point")
  read(1,*) res
  write(*,*) 'res =', res
end subroutine badcall
[Bug fortran/61847] bug in gfortran runtime: digits cut off when reading floating point number
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61847

--- Comment #13 from Steve Kargl sgk at troutmask dot apl.washington.edu ---
On Tue, Jul 22, 2014 at 01:39:30AM +, jvdelisle at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61847
>
> --- Comment #11 from Jerry DeLisle jvdelisle at gcc dot gnu.org ---
> After all that has been said here, I am almost afraid to add any more.
>
> This is not a bug. Fortran and GFortran are not locale aware. The ',' or '.'
> are read from the file or device literally as is. From this read, a numeric
> string is constructed. If the unit was opened with decimal='comma' and a
> comma was seen, the comma is converted to '.'. If decimal='point' and a
> comma is read, an error occurs.

I never claimed it to be a bug. You've simply restated what I was trying to
convey in a much more coherent manner in a single post.

> After the above described numeric string is constructed it is passed to the
> glibc library strtod (string to double). The glibc library is locale aware
> and if the locale has defined the decimal token to be a ',' (comma), it sees
> the decimal 'point' and interprets it as the end of the string to convert.

I do however note that the OP is using MacOS and I use FreeBSD. Neither uses
glibc. strtod is a C99/POSIX specified function, so a correctly implemented
strtod function should give the same results (up to whether C99/POSIX
requires adherence to locale).
[Bug fortran/61847] bug in gfortran runtime: digits cut off when reading floating point number
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61847

--- Comment #14 from Jerry DeLisle jvdelisle at gcc dot gnu.org ---
(In reply to e2cd58e1 from comment #12)
--- snip ---
> Suppose I cannot change the C-wrapper and the locale might be set to
> whatever. The file bug.dat already exists and uses point decimals. So I want
> a fortran routine that always reads in a file in the usual point separated
> format.
>
> If in the fortran routine I call
>   open(unit=1,file='bug.dat', decimal="point")
> I expect the keyword to be more important than the locale setting: I
> explicitly specify to read point separated values, but in the example below,
> it still returns 1.0 instead of 1.2345. Is that really expected behavior?

My first bad assumption was that for some reason you wanted the current
locale, whatever it is, to remain active. My second bad assumption was that
you could easily change your C-wrapper. :)

In reading up on the locale business, setting the locale to "POSIX" is
supposed to be fully portable. So, we could easily force the locale to POSIX
in the open statement. I need to think about whether this will mess up
something else.
[Bug fortran/61847] bug in gfortran runtime: digits cut off when reading floating point number
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61847

--- Comment #15 from Jerry DeLisle jvdelisle at gcc dot gnu.org ---
Maybe something like this:

Index: open.c
===================================================================
--- open.c      (revision 212498)
+++ open.c      (working copy)
@@ -26,6 +26,7 @@ see the files COPYING3 and COPYING.RUNTIME respect
 #include "io.h"
 #include "fbuf.h"
 #include "unix.h"
+#include <locale.h>
 #ifdef HAVE_UNISTD_H
 #include <unistd.h>
@@ -725,6 +726,10 @@ st_open (st_parameter_open *opp)
   library_start (&opp->common);
+  /* For portability, set locale to POSIX.  */
+
+  setlocale(LC_ALL, "POSIX");
+
   /* Decode options.  */
   flags.access = !(cf & IOPARM_OPEN_HAS_ACCESS) ? ACCESS_UNSPECIFIED :
[Bug fortran/61847] bug in gfortran runtime: digits cut off when reading floating point number
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61847

--- Comment #16 from Steve Kargl sgk at troutmask dot apl.washington.edu ---
On Tue, Jul 22, 2014 at 04:27:58AM +, jvdelisle at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61847
>
> --- Comment #15 from Jerry DeLisle jvdelisle at gcc dot gnu.org ---
> Maybe something like this:

I think that you'll need to use configure to check for locale.h.

> Index: open.c
> ===================================================================
> --- open.c      (revision 212498)
> +++ open.c      (working copy)
> @@ -26,6 +26,7 @@ see the files COPYING3 and COPYING.RUNTIME respect
>  #include "io.h"
>  #include "fbuf.h"
>  #include "unix.h"
> +#include <locale.h>

#ifdef HAVE_LOCALE_H
#include <locale.h>
#endif

>  #ifdef HAVE_UNISTD_H
>  #include <unistd.h>
> @@ -725,6 +726,10 @@ st_open (st_parameter_open *opp)
>    library_start (&opp->common);

#ifdef HAVE_LOCALE_H

> +  /* For portability, set locale to POSIX.  */
> +
> +  setlocale(LC_ALL, "POSIX");
> +

#endif