[ 
https://issues.apache.org/jira/browse/STDCXX-499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12557859#action_12557859
 ] 

Martin Sebor commented on STDCXX-499:
-------------------------------------

The question is whether this is our problem or a problem with the locale
definition (such as the Bulgarian locale on Linux in the test case above).
I.e., is a locale that specifies a grouping but no thousands_sep valid?

Among our own locales only one fits this description, which suggests it
might be a bug in the locale definition:

$ (cd ~/stdcxx && for f in `grep -l "^grouping  *[1-9]" etc/nls/src/*`; do \
      grep -l "thousands_sep  *\"\"" $f; done)
etc/nls/src/bg_BG

The latest glibc bg_BG definition is the same:
http://sources.redhat.com/cgi-bin/cvsweb.cgi/libc/localedata/locales/bg_BG?rev=1.7.2.2&content-type=text/x-cvsweb-markup&cvsroot=glibc

I opened a glibc issue to see if they agree it's a bug:
 http://sources.redhat.com/bugzilla/show_bug.cgi?id=5599

If we should decide to work around it I see two possible ways of handling it in 
punct.cpp, after retrieving the grouping and thousands_sep for the locale using 
localeconv():

When grouping is non-empty and valid and thousands_sep is NUL, either
a) set grouping to "", or
b) set thousands_sep to some non-NUL value.

Solution a) seems safer because it doesn't involve inventing a thousands_sep 
that's valid for the locale, but the downside is that it loses potentially 
useful information.

Solution b) leaves open the question of which thousands_sep is appropriate for 
the locale.
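
For illustration only, here is a minimal sketch of what solution a) might
look like. It is not the actual punct.cpp code: the PunctData struct and the
helpers grouping_is_active(), sanitize_punct(), and punct_from_localeconv()
are hypothetical names, and treating "valid" as "the first grouping element
is positive and not CHAR_MAX" is an assumption.

```cpp
#include <climits>
#include <clocale>
#include <string>

// Hypothetical holder for the two values retrieved via localeconv().
struct PunctData {
    std::string grouping;       // as in lconv::grouping
    char        thousands_sep;  // first byte of lconv::thousands_sep, or NUL
};

// Assumption: grouping "takes effect" when its first element is a positive
// count other than CHAR_MAX (CHAR_MAX means "no further grouping").
inline bool grouping_is_active(const std::string &grouping)
{
    return !grouping.empty() && grouping[0] > 0 && grouping[0] != CHAR_MAX;
}

// Solution a): when grouping is in effect but thousands_sep is NUL,
// disable grouping rather than invent a separator for the locale.
inline PunctData sanitize_punct(PunctData data)
{
    if (grouping_is_active(data.grouping) && data.thousands_sep == '\0')
        data.grouping.clear();
    return data;
}

// Reads the values for the current C locale and applies the workaround.
inline PunctData punct_from_localeconv()
{
    const std::lconv *lc = std::localeconv();

    PunctData data;
    data.grouping      = lc->grouping ? lc->grouping : "";
    data.thousands_sep = lc->thousands_sep ? lc->thousands_sep[0] : '\0';

    return sanitize_punct(data);
}
```

Solution b) would replace the clear() call with an assignment to
data.thousands_sep, which is exactly where the question of which separator
to choose comes up.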

> std::num_put inserts NUL thousand separator
> -------------------------------------------
>
>                 Key: STDCXX-499
>                 URL: https://issues.apache.org/jira/browse/STDCXX-499
>             Project: C++ Standard Library
>          Issue Type: Bug
>          Components: 22. Localization
>    Affects Versions: 4.1.2, 4.1.3, 4.1.4
>            Reporter: Martin Sebor
>            Assignee: Martin Sebor
>             Fix For: 4.2.1
>
>
> Moved from Rogue Wave Bugzilla: 
> http://bugzilla.cvo.roguewave.com/show_bug.cgi?id=1913
> -------- Original Message --------
> Subject: num_put and null-character thousand separator
> Date: Tue, 11 Jan 2005 16:10:23 -0500
> From: Boris Gubenko <[EMAIL PROTECTED]>
> Reply-To: Boris Gubenko <[EMAIL PROTECTED]>
> Organization: Hewlett-Packard Co.
> To: Martin Sebor <[EMAIL PROTECTED]>
>   Another locale-related issue that we fixed in rw stdlib v3.0 (and in
>   v2.0 also) is making sure, that num_put does not insert null thousand
>   separator character into the stream. Here is the fix in _num_put.cc
>   in v3.0 :
> template <class _CharT, class _OutputIter /* = ostreambuf_iterator<_CharT>
> */>
> _TYPENAME num_put<_CharT, _OutputIter>::iter_type
> num_put<_CharT, _OutputIter>::
> _C_put (iter_type __it, ios_base &__flags, char_type __fill, int __type,
>         const void *__pval) const
> {
>     const numpunct<char_type> &__np =
>         _V3_USE_FACET (numpunct<char_type>, __flags.getloc ());
>     // FIXME: adjust buffer dynamically as necessary
>     char __buf [_RWSTD_DBL_MAX_10_EXP];
>     char *__pbuf = __buf;
>     const string __grouping = __np.grouping ();
>     const char *__grp       = __grouping.c_str ();
>     const int __prec        = __flags.precision ();
> #if defined(__VMS) && defined(__DECCXX) && !defined(__DECFIXCXXL1730)
>     const char __nogrouping = _RWSTD_CHAR_MAX;
>     if (!__np.thousands_sep())
>         __grp = &__nogrouping;
> #endif
>   Here is the test:
> cosf.zko.dec.com> setenv LANG fr_FR.ISO8859-1
> cosf.zko.dec.com> locale -k thousands_sep
> thousands_sep=""
> cosf.zko.dec.com> cxx x.cxx && a.out
> null character thousand_sep was not inserted
> cosf.zko.dec.com> cxx x.cxx -D_RWSTD_USE_CONFIG -D_RWSTDDEBUG \
>    -I/usr/cxx1/boris/CXXL_1886-2/stdlib-4.0/stdlib/include/ \
>    -nocxxstd -L/usr/cxx1/boris/CXXL_1886-2/result/lib -lstd11s \
>    && a.out
> null character thousand_sep was inserted
> cosf.zko.dec.com>
> x.cxx
> -----
> #ifndef __USE_STD_IOSTREAM
> #define __USE_STD_IOSTREAM
> #endif
> #include <iostream>
> #include <sstream>
> #include <string>
> #include <locale>
> #include <locale.h>
> #ifdef __linux
> #define FRENCH_LOCALE "fr_FR"
> #else
> #define FRENCH_LOCALE "fr_FR.ISO8859-1"
> #endif
> using namespace std;
> int main()
> {
>   ostringstream os;
>   if (setlocale(LC_ALL,FRENCH_LOCALE))
>   {
>     setlocale(LC_ALL,"C");
>     os.imbue(locale(FRENCH_LOCALE));
>     os << (double) 10000.1 << endl;
>     if ( (os.str())[2] == '\0' )
>       cout << "null character thousand_sep was inserted" << endl;
>     else
>       cout << "null character thousand_sep was not inserted" << endl;
>   }
>   return 0;
> }
> ------- Additional Comments From [EMAIL PROTECTED] 2005-01-11 14:50:44 ----
> -------- Original Message --------
> Subject: Re: num_put and null-character thousand separator
> Date: Tue, 11 Jan 2005 15:50:06 -0700
> From: Martin Sebor <[EMAIL PROTECTED]>
> To: Boris Gubenko <[EMAIL PROTECTED]>
> References: <[EMAIL PROTECTED]>
> Boris Gubenko wrote:
> >   Another locale-related issue that we fixed in rw stdlib v3.0 (and in
> >   v2.0 also) is making sure, that num_put does not insert null thousand
> >   separator character into the stream. Here is the fix in _num_put.cc
> >   in v3.0 :
> I don't think this fix would be quite correct in general. NUL is
> a valid character that the locale library was specifically designed
> to be able to insert and extract just like any other. In addition,
> in the code below, operator==() need not be defined for the character
> type.
> > 
> ...
> >   Here is the test:
> Thanks for the helpful test case.
> My feeling is that this case points out a fundamental design
> disconnect between the C and C++ locales. In C, NUL is not
> an ordinary character -- it's a special character that terminates
> strings. In addition, C formatted I/O is done in multibyte
> characters. In contrast, in C++, NUL is a character like any other
> and formatted I/O is always done in single chars (or wchar_t when
> char is not wide enough), but never in multibyte characters.
> In C, the thousand separator is a multibyte string so even if
> grouping is non-empty, inserting an empty string will be as good
> as inserting none at all. In C++ the separator is assumed to be
> a single character so there's no way to achieve the same effect.
> Instead, whether a thousand separator gets inserted or not is
> controlled by the grouping string.
> One way to fix this would be to set grouping to "" if thousands_sep
> is NUL, although that wouldn't be quite correct either, because numpunct
> can be used directly by user programs. I'll have to think about how
> to deal with this. In the meantime, I filed bug 1913 for this problem
> so that you can track it.
> Martin

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
