Re: Support for LC_TIME
On 2014-05-08 13:20, Marc Espie wrote:
> As for portability issues: programs stay with the C locale *in any case*
> unless they do setlocale() right at the start, in which case they
> explicitly say yes, I want to be localized. So, from that point of view
> the portability issues are minimal (yes, I'm aware of the can of worms
> that threads+locale may open).

xlocale addresses that.

>> That said, I don't have a general problem with adding other locale
>> categories. I believe LC_TIME would provide a useful testbed for
>> eventually switching all our locales to the localedef format
>> (including LC_CTYPE). Alas, the proposed diff does something else,
>> and unfortunately I don't have enough time for a detailed rabbit hole
>> discussion and review with a lot of back-and-forth that we had when
>> discussing similar diffs in the past.
>
> THAT on the other hand is the issue at hand... chronic time shortage
> to be certain that what we do for locales isn't dangerous...

I would like a little clarification about something else that Stefan
is talking about.

I have been looking for ways to implement all of xlocale and full LC_*
support, which is also part of libc, especially to support Spanish well.
The tables I have been using (collation data, formats for numeric and
monetary quantities, and time formats for different countries) come from
FreeBSD, since they are BSD licensed, cover a lot of countries and are
reasonably well developed. So when I have sent more than is immediately
needed, it was to ease a future implementation of xlocale or of the
other LC_* categories, or to use the FreeBSD tables (or what tables do
you recommend me to use? are there tables in localedef format and
BSD-license compatible available somewhere?)

I have not added anything that weakens security. I understand that any
change can lead to security problems, and I thank and appreciate the
time any developer spends auditing.

I have implemented all the suggestions that Stefan has made in the past
(everything I have sent, and the suggestions I have received from Stefan
and other developers, is included with credit in adJ:
https://github.com/pasosdeJesus/adJ/tree/OBSD_CURRENT/arboldes/usr/src ).

The LC_TIME implementation is, IMHO, shorter than the other LC_* ones
and therefore faster to audit, so if you want I can try to send very
short diffs in separate emails.
Re: Support for LC_TIME
Answering my own badly put question (sorry):

On 2014-05-12 05:55, vtamara wrote:
> (or what tables do you recommend me to use? are there tables in
> localedef format and BSD-license compatible available somewhere?)

http://unicode.org/cldr/trac/browser/tags/release-1-9/posix/

However, for the moment I don't have the time to implement a parser for
localedef and use the CLDR tables. I hope to find time for that in the
future (hopefully for 5.7).
Re: Support for LC_TIME
On Mon, May 12, 2014 at 05:55:55AM -0400, vtamara wrote:
> I would like a little clarification about something else that Stefan
> is talking about.

In my dream world, I would like a locale implementation that follows
the POSIX standard, supports multibyte characters throughout, avoids
file formats FreeBSD made up in 1995 (POSIX specifies a file format for
localedef that we could use), and uses data provided by unicode.org as
much as possible (e.g. it might be possible to derive all our LC_CTYPE
data from there in some automated fashion to ease maintenance).

> I have implemented all the suggestions that Stefan has made in the
> past (everything I have sent, and the suggestions I have received from
> Stefan and other developers, is included with credit in adJ:
> https://github.com/pasosdeJesus/adJ/tree/OBSD_CURRENT/arboldes/usr/src ).

It looks like, at this point in time, and maybe forever, there aren't
any developers who are interested in spending their time on driving
OpenBSD's locales in the direction you want to go.

I spent a lot of time on your LC_COLLATE diffs, and unfortunately it
was just you and me without anyone else participating (i.e. there was
a general lack of interest, so we were working in a vacuum, which is
bad). When I realised your proposed LC_COLLATE changes (ported from
FreeBSD's 1995 implementation) couldn't support multibyte outside the
latin1 range, I gave up because that's too far away from my dream world.

At the moment I have several other things I'd like to work on, so I'm
not going to dive into reviewing your diffs because doing so would take
time away from these other things.

Note that there is a GSoC student over at FreeBSD who is sinking their
teeth into LC_COLLATE. Perhaps you would have more fun and better
results trying to help out there?
https://www.google-melange.com/gsoc/project/details/google/gsoc2014/ghostmansd/570483752256?PageSpeed=noscript
Re: Support for LC_TIME
On Wed, May 07, 2014 at 07:44:51PM +0200, Ingo Schwarze wrote:
> While LC_CTYPE and LC_COLLATE make some sense, LC_MONETARY,
> LC_NUMERIC, and LC_TIME are badly overengineered, pointless bloat,
> causing nothing but surprising, erratic behaviour and portability
> problems when trying to parse output from programs.
>
> I think this should be rejected outright and you should stop wasting
> your time on it.

They make sense for systems that try to provide full i18n.
Of course, we don't try to provide i18n, at least not for the base
system which is English only. So they don't really make sense
*for OpenBSD*.

That said, I don't have a general problem with adding other locale
categories. I believe LC_TIME would provide a useful testbed for
eventually switching all our locales to the localedef format
(including LC_CTYPE). Alas, the proposed diff does something else, and
unfortunately I don't have enough time for a detailed rabbit hole
discussion and review with a lot of back-and-forth that we had when
discussing similar diffs in the past.
Re: Support for LC_TIME
On Thu, May 08, 2014 at 12:07:30PM +0200, Stefan Sperling wrote:
> On Wed, May 07, 2014 at 07:44:51PM +0200, Ingo Schwarze wrote:
>> While LC_CTYPE and LC_COLLATE make some sense, LC_MONETARY,
>> LC_NUMERIC, and LC_TIME are badly overengineered, pointless bloat,
>> causing nothing but surprising, erratic behaviour and portability
>> problems when trying to parse output from programs.
>>
>> I think this should be rejected outright and you should stop wasting
>> your time on it.
>
> They make sense for systems that try to provide full i18n.
> Of course, we don't try to provide i18n, at least not for the base
> system which is English only. So they don't really make sense
> *for OpenBSD*.

???

Basic support for that stuff makes sense, as part of a *full* libc.

Not surprisingly, Antoine is for providing LC_* support. So am I.
This has little to do with base OpenBSD, everything to do with enough
stuff to be able to compile reasonably portable software on OpenBSD
without needing to patch left and right.

As for portability issues: programs stay with the C locale *in any case*
unless they do setlocale() right at the start, in which case they
explicitly say yes, I want to be localized. So, from that point of view
the portability issues are minimal (yes, I'm aware of the can of worms
that threads+locale may open).

> That said, I don't have a general problem with adding other locale
> categories. I believe LC_TIME would provide a useful testbed for
> eventually switching all our locales to the localedef format
> (including LC_CTYPE). Alas, the proposed diff does something else,
> and unfortunately I don't have enough time for a detailed rabbit hole
> discussion and review with a lot of back-and-forth that we had when
> discussing similar diffs in the past.

THAT on the other hand is the issue at hand... chronic time shortage
to be certain that what we do for locales isn't dangerous...
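
The opt-in behaviour Marc describes (a program stays in the C locale
until it calls setlocale() itself) can be sketched as follows. This is
only an illustration, not code from any diff in this thread. Python is
used for brevity: its locale and time modules wrap the C library's
setlocale(3) and strftime(3). The helper names are made up for this
example. CPython initializes LC_CTYPE from the environment at startup
but leaves the other categories, including LC_TIME, at "C".

```python
import locale
import time


def current_time_locale() -> str:
    """Query the process-wide LC_TIME setting without changing it."""
    return locale.setlocale(locale.LC_TIME)


def month_abbrev(month: int) -> str:
    """Format a month abbreviation with strftime(3) under the active LC_TIME."""
    # Only the month field matters for "%b"; the other fields are fillers.
    return time.strftime("%b", (2014, month, 1, 0, 0, 0, 0, 1, 0))


def opt_in_to_user_locale() -> str:
    """Explicitly ask for the user's LC_TIME, like setlocale(LC_TIME, "") in C."""
    try:
        return locale.setlocale(locale.LC_TIME, "")
    except locale.Error:
        # The environment names a locale that is not installed: stay in "C".
        return locale.setlocale(locale.LC_TIME, "C")
```

Before opt_in_to_user_locale() is called, current_time_locale() should
report "C" and month_abbrev(3) should give "Mar" whatever the
environment says. Only after the explicit call can the output change,
and only if the requested locale is actually available on the system.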
Re: Support for LC_TIME
Hi,

Marc Espie wrote on Thu, May 08, 2014 at 07:20:52PM +0200:
> On Thu, May 08, 2014 at 12:07:30PM +0200, Stefan Sperling wrote:
>> On Wed, May 07, 2014 at 07:44:51PM +0200, Ingo Schwarze wrote:
>>> While LC_CTYPE and LC_COLLATE make some sense, LC_MONETARY,
>>> LC_NUMERIC, and LC_TIME are badly overengineered, pointless bloat,
>>> causing nothing but surprising, erratic behaviour and portability
>>> problems when trying to parse output from programs.
>>>
>>> I think this should be rejected outright and you should stop wasting
>>> your time on it.
>>
>> They make sense for systems that try to provide full i18n.
>> Of course, we don't try to provide i18n, at least not for the base
>> system which is English only. So they don't really make sense
>> *for OpenBSD*.
>
> ???
>
> Basic support for that stuff makes sense, as part of a *full* libc.
>
> Not surprisingly, Antoine is for providing LC_* support. So am I.
> This has little to do with base OpenBSD, everything to do with enough
> stuff to be able to compile reasonably portable software on OpenBSD
> without needing to patch left and right.

I don't see how any software might need patching if we continue to
ignore LC_TIME, just like we do now. It's just as if the user never
sets LC_TIME, which the standard specifically says *any* software must
cope with.

> As for portability issues: programs stay with the C locale *in any case*
> unless they do setlocale() right at the start,

And that's what arch(1), at(1), awk(1), basename(1), calendar(1),
cat(1), chmod(1), cmp(1), cp(1), cron(8), csh(1), cut(1), date(1),
dig(1), dirname(1), env(1), expr(1), fmt(1), getconf(1), less(1),
logname(1), mandoc(1), mg(1), mkdir(1), mknod(8), nice(1), nl(1),
printf(1), rm(1), rmdir(1), sleep(1), sftp(1), sort(1), sudo(8),
tee(1), touch(1), tmux(1), uname(1), uudecode(1), vi(1), wc(1),
which(1), who(1), xargs(1) already do, right now.

Reliable and secure shell scripting will certainly be fun in that
LC_TIME-ridden world.

> in which case they explicitly say yes, I want to be localized.
> So, from that point of view the portability issues are minimal
> (yes, I'm aware of the can of worms that threads+locale may open).
>
>> That said, I don't have a general problem with adding other locale
>> categories. I believe LC_TIME would provide a useful testbed for
>> eventually switching all our locales to the localedef format
>> (including LC_CTYPE). Alas, the proposed diff does something else,
>> and unfortunately I don't have enough time for a detailed rabbit hole
>> discussion and review with a lot of back-and-forth that we had when
>> discussing similar diffs in the past.
>
> THAT on the other hand is the issue at hand... chronic time shortage
> to be certain that what we do for locales isn't dangerous...

I don't doubt it *will* cause trouble even if it were done in the
so-called right way, because it's the basic design that is broken,
not just some implementation.

The concept is utterly wrong because it does i18n *at the wrong level*,
that is, not just in high level graphical user interfaces, where it is
merely annoying but doesn't break much, but also at the system level,
where it is nothing but harmful. And that layering violation is a
direct consequence of having this code at the wrong level. No wonder
it breaks everything if it infects the C library including such
functions as (according to POSIX)

 - LC_NUMERIC changing the radix character in
   strtod(3), printf(3), scanf(3)
 - LC_TIME changing what strftime(3), strptime(3) and getdate(3) do,
   up to including non-ASCII characters in library system messages
 - LC_MESSAGES changing what strerror(3) does
 - ...

Look here:

  schwarze@donnerwolke:~$ uname
  Linux
  schwarze@donnerwolke:~$ locale
  LANG=
  LANGUAGE=
  LC_CTYPE=de_DE.UTF-8
  LC_NUMERIC=de_DE.UTF-8
  LC_TIME=de_DE.UTF-8
  LC_COLLATE=de_DE.UTF-8
  LC_MONETARY=de_DE.UTF-8
  LC_MESSAGES=de_DE.UTF-8
  LC_PAPER=de_DE.UTF-8
  LC_NAME=de_DE.UTF-8
  LC_ADDRESS=de_DE.UTF-8
  LC_TELEPHONE=de_DE.UTF-8
  LC_MEASUREMENT=de_DE.UTF-8
  LC_IDENTIFICATION=de_DE.UTF-8
  LC_ALL=de_DE.UTF-8
  schwarze@donnerwolke:~$ ls -l .xsession-errors
  -rw--- 1 schwarze schwarze 89221 28. MÃ €r 00:04 .xsession-errors

(Blank inserted for clarity).

Good luck parsing such abominations, or as a system administrator,
handling problem reports from users when such stuff causes scripts
to break.

Then again, given that this isn't going forward right now anyway,
maybe there is no need to waste time fighting back just yet.

Yours,
  Ingo
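
The garbled month in the ls -l transcript above can probably be
reproduced as follows. This is a rough sketch under two assumptions
that the message does not state: that ls printed the abbreviated month
as "Mär" encoded in UTF-8, and that the terminal decoded the bytes as
ISO 8859-15. The function names are hypothetical, and Python is used
only for brevity. The parser stands in for any script that expects
C-locale month names in program output.

```python
from typing import Optional

# Month abbreviations as strftime(3) prints them for "%b" in the C locale.
C_MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
            "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def parse_c_month(field: str) -> Optional[int]:
    """Return 1..12 for a C-locale month abbreviation, otherwise None."""
    try:
        return C_MONTHS.index(field) + 1
    except ValueError:
        return None


def misdecode(text: str, terminal_charset: str = "iso8859_15") -> str:
    """Render UTF-8 output the way a terminal using another charset would.

    The ISO 8859-15 default is a guess at the terminal in the transcript.
    """
    return text.encode("utf-8").decode(terminal_charset)
```

With these definitions, parse_c_month("Mar") gives 3, while
misdecode("Mär") gives "MÃ€r" (the transcript's string minus the blank
that was inserted for clarity) and parse_c_month() returns None for it.
A script that has to read such output usually runs the command with
LC_ALL=C in its environment for that reason.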
Re: Support for LC_TIME
Hi,

POSIX doesn't require support for any locales except POSIX and C.

While LC_CTYPE and LC_COLLATE make some sense, LC_MONETARY,
LC_NUMERIC, and LC_TIME are badly overengineered, pointless bloat,
causing nothing but surprising, erratic behaviour and portability
problems when trying to parse output from programs.

I think this should be rejected outright and you should stop wasting
your time on it.

Yours,
  Ingo
Re: Support for LC_TIME
On Wed, May 07, 2014 at 07:44:51PM +0200, Ingo Schwarze wrote:
> Hi,
>
> POSIX doesn't require support for any locales except POSIX and C.
>
> While LC_CTYPE and LC_COLLATE make some sense, LC_MONETARY,
> LC_NUMERIC, and LC_TIME are badly overengineered, pointless bloat,
> causing nothing but surprising, erratic behaviour and portability
> problems when trying to parse output from programs.
>
> I think this should be rejected outright and you should stop wasting
> your time on it.

Yeah, I fully disagree.

--
Antoine