Re: Support for LC_TIME

2014-05-12 Thread vtamara

On 2014-05-08 13:20, Marc Espie wrote:
 As for portability issues: programs stay with the C locale *in any case*
 unless they do setlocale() right at the start, in which case they
 explicitly say yes, I want to be localized. So, from that point of view
 the portability issues are minimal (yes, I'm aware of the can of worms
 that threads+locale may open).

xlocale addresses that.

  That said, I don't have a general problem with adding other locale
  categories. I believe LC_TIME would provide a useful testbed for
  eventually switching all our locales to the localedef format
  (including LC_CTYPE). Alas, the proposed diff does something else,
  and unfortunately I don't have enough time for a detailed rabbit
  hole discussion and review with a lot of back-and-forth that we
  had when discussing similar diffs in the past.

 THAT on the other hand is the issue at hand... chronic time shortage
 to be certain that what we do for locales isn't dangerous...


I would like a little clarification about something else that Stefan
is talking about. I have been looking for ways to implement full
xlocale and full LC_* support, which is also part of libc, especially
to support Spanish well. The tables with collation data, formats for
numerical and monetary quantities, and time formats for different
countries that I have been using come from FreeBSD, since they are BSD
licensed, cover a lot of countries, and are fairly well developed.


So when I have sent more than is immediately needed, it was to make a
future implementation of xlocale or other LC_* categories easier, or to
use the FreeBSD tables (or which tables do you recommend I use? Are
there tables in localedef format with a BSD-compatible license
available somewhere?). I have not been adding anything to weaken
security. I understand any change can lead to security problems, and I
thank and appreciate the time any developer spends auditing. I have
implemented all the suggestions that Stefan has made to me in the past.
(Everything I have sent, and the suggestions I have received from
Stefan and other developers, I have included with credit in adJ:
https://github.com/pasosdeJesus/adJ/tree/OBSD_CURRENT/arboldes/usr/src
).


The LC_TIME implementation is, IMHO, shorter than the other LC_*
categories and therefore faster to audit, so if you want I can try to
send very short diffs in separate emails.





Re: Support for LC_TIME

2014-05-12 Thread vtamara

Answering a bad question I asked (sorry):

On 2014-05-12 05:55, vtamara wrote:

 (or what tables do you recommend me to use? are there tables in
 localedef format and BSD-license compatible available somewhere?)


http://unicode.org/cldr/trac/browser/tags/release-1-9/posix/

However, for the moment I don't have the time to implement a parser for
localedef and use the CLDR tables. I hope to find time for that in the
future (hopefully for 5.7).






Re: Support for LC_TIME

2014-05-12 Thread Stefan Sperling
On Mon, May 12, 2014 at 05:55:55AM -0400, vtamara wrote:
 I would like a little clarification about something else that Stefan is
 talking about.

In my dream world, I would like a locale implementation that follows
the POSIX standard, supports multibyte characters throughout, avoids
file formats FreeBSD made up in 1995 (POSIX specifies a file format
for localedef that we could use), and uses data provided by unicode.org
as much as possible (e.g. it might be possible to derive all our LC_CTYPE
data from there in some automated fashion to ease maintenance).
 
 All the suggestions that Stefan has asked me in the past, I have implemented
 (Everything I have sent and the suggestions I have received from Stefan and
 other developers I have included with credit in adJ:
 https://github.com/pasosdeJesus/adJ/tree/OBSD_CURRENT/arboldes/usr/src ).

It looks like, at this point in time, and maybe forever, there aren't
any developers who are interested in spending their time on driving
OpenBSD's locales in the direction you want to go.

I spent a lot of time on your LC_COLLATE diffs, and unfortunately it
was just you and me without anyone else participating (i.e. there was
a general lack of interest, so we were working in a vacuum, which is bad).
When I realised your proposed LC_COLLATE changes (ported from FreeBSD's
1995 implementation) couldn't support multibyte outside the latin1 range
I gave up because that's too far away from my dream world.

At the moment I have several other things I'd like to work on so I'm
not going to dive into reviewing your diffs because doing so would take
time away from these other things.

Note that there is a GSoC student over at FreeBSD who is sinking their
teeth into LC_COLLATE. Perhaps you would have more fun and better results
trying to help out there?
https://www.google-melange.com/gsoc/project/details/google/gsoc2014/ghostmansd/570483752256?PageSpeed=noscript



Re: Support for LC_TIME

2014-05-08 Thread Stefan Sperling
On Wed, May 07, 2014 at 07:44:51PM +0200, Ingo Schwarze wrote:
 While LC_CTYPE and LC_COLLATE make some sense, LC_MONETARY, LC_NUMERIC,
 and LC_TIME are badly overengineered, pointless bloat, causing nothing
 but surprising, erratic behaviour and portability problems when trying
 to parse output from programs.  I think this should be rejected outright
 and you should stop wasting your time on it.

They make sense for systems that try to provide full i18n.
Of course, we don't try to provide i18n, at least not for the base system
which is English only. So they don't really make sense *for OpenBSD*.

That said, I don't have a general problem with adding other locale
categories. I believe LC_TIME would provide a useful testbed for
eventually switching all our locales to the localedef format
(including LC_CTYPE). Alas, the proposed diff does something else,
and unfortunately I don't have enough time for a detailed rabbit
hole discussion and review with a lot of back-and-forth that we
had when discussing similar diffs in the past.



Re: Support for LC_TIME

2014-05-08 Thread Marc Espie
On Thu, May 08, 2014 at 12:07:30PM +0200, Stefan Sperling wrote:
 On Wed, May 07, 2014 at 07:44:51PM +0200, Ingo Schwarze wrote:
  While LC_CTYPE and LC_COLLATE make some sense, LC_MONETARY, LC_NUMERIC,
  and LC_TIME are badly overengineered, pointless bloat, causing nothing
  but surprising, erratic behaviour and portability problems when trying
  to parse output from programs.  I think this should be rejected outright
  and you should stop wasting your time on it.
 
 They make sense for systems that try to provide full i18n.
 Of course, we don't try to provide i18n, at least not for the base system
 which is English only. So they don't really make sense *for OpenBSD*.

???

Basic support for that stuff makes sense, as part of a *full* libc.
Not surprisingly, Antoine is for providing LC_* support. So am I.

This has little to do with base OpenBSD, everything to do with enough
stuff to be able to compile reasonable portable software on OpenBSD 
without needing to patch left and right.

As for portability issues: programs stay with the C locale *in any case*
unless they do setlocale() right at the start, in which case they
explicitly say yes, I want to be localized. So, from that point of view
the portability issues are minimal (yes, I'm aware of the can of worms
that threads+locale may open).
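
The opt-in behaviour described above can be sketched in C. This is a
minimal illustration, not code from the proposed diff; the helper names
lc_time_is_c, format_fixed_date and opt_in_to_user_locale are
hypothetical. It relies only on the standard setlocale(3) and
strftime(3): a program that never calls setlocale() keeps the C locale,
whatever LC_TIME is set to in the environment.

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Returns 1 while LC_TIME is still the default C/POSIX locale. */
int
lc_time_is_c(void)
{
	/* A NULL second argument only queries, it changes nothing. */
	const char *name = setlocale(LC_TIME, NULL);

	return name != NULL &&
	    (strcmp(name, "C") == 0 || strcmp(name, "POSIX") == 0);
}

/*
 * Format Thursday, 8 May 2014 with locale-dependent day and month
 * names.  strftime() reads the fields directly, so no mktime() call
 * is needed.  Returns the number of characters written.
 */
size_t
format_fixed_date(char *buf, size_t len)
{
	struct tm tm = {0};

	assert(buf != NULL);
	tm.tm_year = 2014 - 1900;
	tm.tm_mon = 4;		/* May, months count from 0 */
	tm.tm_mday = 8;
	tm.tm_wday = 4;		/* Thursday, days count from Sunday */
	return strftime(buf, len, "%a %b %d %Y", &tm);
}

/*
 * The explicit opt-in: adopt whatever the environment (LC_ALL, LC_*,
 * LANG) asks for.  Returns 1 on success, 0 if the request failed.
 */
int
opt_in_to_user_locale(void)
{
	return setlocale(LC_ALL, "") != NULL;
}
```

Called before any setlocale() call, format_fixed_date() yields
"Thu May 08 2014" even when the environment has LC_TIME=de_DE.UTF-8.
Only after opt_in_to_user_locale() succeeds may the day and month names
change, and then only on a system whose libc implements LC_TIME and has
that locale installed.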


 That said, I don't have a general problem with adding other locale
 categories. I believe LC_TIME would provide a useful testbed for
 eventually switching all our locales to the localedef format
 (including LC_CTYPE). Alas, the proposed diff does something else,
 and unfortunately I don't have enough time for a detailed rabbit
 hole discussion and review with a lot of back-and-forth that we
 had when discussing similar diffs in the past.

THAT on the other hand is the issue at hand... chronic time shortage to be
certain that what we do for locales isn't dangerous...



Re: Support for LC_TIME

2014-05-08 Thread Ingo Schwarze
Hi,

Marc Espie wrote on Thu, May 08, 2014 at 07:20:52PM +0200:
 On Thu, May 08, 2014 at 12:07:30PM +0200, Stefan Sperling wrote:
 On Wed, May 07, 2014 at 07:44:51PM +0200, Ingo Schwarze wrote:

 While LC_CTYPE and LC_COLLATE make some sense, LC_MONETARY, LC_NUMERIC,
 and LC_TIME are badly overengineered, pointless bloat, causing nothing
 but surprising, erratic behaviour and portability problems when trying
 to parse output from programs.  I think this should be rejected outright
 and you should stop wasting your time on it.

 They make sense for systems that try to provide full i18n.
 Of course, we don't try to provide i18n, at least not for the
 base system which is English only.  So they don't really make
 sense *for OpenBSD*.

 ???
 
 Basic support for that stuff makes sense, as part of a *full* libc.
 Not surprisingly, Antoine is for providing LC_* support. So am I.
 
 This has little to do with base OpenBSD, everything to do with enough
 stuff to be able to compile reasonable portable software on OpenBSD 
 without needing to patch left and right.

I don't see how any software might need patching if we continue
to ignore LC_TIME, just like we do now.  It's just as if the user
never sets LC_TIME, which the standard specifically says *any*
software must cope with.

 As for portability issues: programs stay with the C locale *in any case*
 unless they do setlocale() right at the start,

And that's what arch(1), at(1), awk(1), basename(1), calendar(1),
cat(1), chmod(1), cmp(1), cp(1), cron(8), csh(1), cut(1), date(1),
dig(1), dirname(1), env(1), expr(1), fmt(1), getconf(1), less(1),
logname(1), mandoc(1), mg(1), mkdir(1), mknod(8), nice(1), nl(1),
printf(1), rm(1), rmdir(1), sleep(1), sftp(1), sort(1), sudo(8),
tee(1), touch(1), tmux(1), uname(1), uudecode(1), vi(1), wc(1),
which(1), who(1), xargs(1) already do, right now.

Reliable and secure shell scripting will certainly be fun in
that LC_TIME-ridden world.

 in which case they explicitly say yes, I want to be localized.
 So, from that point of view the portability issues are minimal
 (yes, I'm aware of the can of worms that threads+locale may open).

 That said, I don't have a general problem with adding other locale
 categories.  I believe LC_TIME would provide a useful testbed for
 eventually switching all our locales to the localedef format
 (including LC_CTYPE). Alas, the proposed diff does something else,
 and unfortunately I don't have enough time for a detailed rabbit
 hole discussion and review with a lot of back-and-forth that we
 had when discussing similar diffs in the past.

 THAT on the other hand is the issue at hand... chronic time shortage
 to be certain that what we do for locales isn't dangerous...

I don't doubt it *will* cause trouble even if it were done in the
so-called right way, because it's the basic design that is broken,
not just some implementation.  The concept is utterly wrong because
it does i18n *at the wrong level*, that is, not just in high level
graphical user interfaces, where it is merely annoying but doesn't
break much, but also at the system level, where it is nothing but
harmful.  And that layering violation is a direct consequence of
having this code at the wrong level.  No wonder it breaks everything
if it infects the C library including such functions as (according
to POSIX)

 - LC_NUMERIC changing the radix character in strtod(3), printf(3), scanf(3)
 - LC_TIME changing what strftime(3), strptime(3) and getdate(3) do,
   up to including non-ASCII characters into library system messages
 - LC_MESSAGES changing what strerror(3) does
 - ...
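
The LC_NUMERIC point can be made concrete with a short C sketch. The
helper names current_radix and parse_number are hypothetical;
localeconv(3) and strtod(3) are the standard calls. It shows what a
program sees in the C locale, and the comments note what would change
once LC_NUMERIC selects a locale with a comma radix (assuming such a
locale, e.g. de_DE.UTF-8, is installed and supported by the libc).

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>
#include <stdlib.h>

/* Radix character of the current LC_NUMERIC locale ('.' in the C locale). */
char
current_radix(void)
{
	const struct lconv *lc = localeconv();

	return lc->decimal_point[0];
}

/*
 * Parse a leading number with strtod() and report how many bytes were
 * consumed.  strtod() honours the LC_NUMERIC radix character, so the
 * same input can parse differently after setlocale().
 */
double
parse_number(const char *text, size_t *consumed)
{
	char *end;
	double value;

	assert(text != NULL && consumed != NULL);
	value = strtod(text, &end);
	*consumed = (size_t)(end - text);
	return value;
}
```

In the C locale, current_radix() is '.', parse_number("1.5", &n)
returns 1.5 with n == 3, and parse_number("1,5", &n) stops at the comma,
returning 1.0 with n == 1. Under a comma-radix locale the two inputs
would swap roles, which is the kind of surprise described above.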

Look here:

schwarze@donnerwolke:~$ uname
Linux
schwarze@donnerwolke:~$ locale
LANG=
LANGUAGE=
LC_CTYPE=de_DE.UTF-8
LC_NUMERIC=de_DE.UTF-8
LC_TIME=de_DE.UTF-8
LC_COLLATE=de_DE.UTF-8
LC_MONETARY=de_DE.UTF-8
LC_MESSAGES=de_DE.UTF-8
LC_PAPER=de_DE.UTF-8
LC_NAME=de_DE.UTF-8
LC_ADDRESS=de_DE.UTF-8
LC_TELEPHONE=de_DE.UTF-8
LC_MEASUREMENT=de_DE.UTF-8
LC_IDENTIFICATION=de_DE.UTF-8
LC_ALL=de_DE.UTF-8
schwarze@donnerwolke:~$ ls -l .xsession-errors
-rw--- 1 schwarze schwarze 89221 28. MÃ €r 00:04 .xsession-errors

(Blank inserted for clarity).  Good luck parsing such abominations,
or as a system administrator, handling problem reports from users when
such stuff causes scripts to break.
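
For output that scripts are meant to parse, one way to sidestep the
problem is to use only numeric strftime(3) conversions, which have no
day or month names for LC_TIME to replace. A minimal sketch, with
hypothetical helper names format_parseable and example_time, and an
assumed year (the ls output above does not show one):

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/*
 * Format a broken-down time as YYYY-MM-DDTHH:MM:SS.  Every conversion
 * used here produces plain decimal digits, so the result contains no
 * localized names.  Returns the number of characters written.
 */
size_t
format_parseable(const struct tm *tm, char *buf, size_t len)
{
	assert(tm != NULL && buf != NULL);
	return strftime(buf, len, "%Y-%m-%dT%H:%M:%S", tm);
}

/* The timestamp from the ls example: 28 March, 00:04 (year assumed). */
struct tm
example_time(void)
{
	struct tm tm = {0};

	tm.tm_year = 2014 - 1900;	/* assumption, not in the ls output */
	tm.tm_mon = 2;			/* March, months count from 0 */
	tm.tm_mday = 28;
	tm.tm_hour = 0;
	tm.tm_min = 4;
	return tm;
}
```

With these inputs format_parseable() writes "2014-03-28T00:04:00". This
only helps programs that choose their own format strings; it does not
change what existing tools such as ls(1) print.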

Then again, given that this isn't going forward right now anyway,
maybe there is no need to waste time fighting back just yet.

Yours,
  Ingo



Re: Support for LC_TIME

2014-05-07 Thread Ingo Schwarze
Hi,

POSIX doesn't require support for any locales except POSIX and C.

While LC_CTYPE and LC_COLLATE make some sense, LC_MONETARY, LC_NUMERIC,
and LC_TIME are badly overengineered, pointless bloat, causing nothing
but surprising, erratic behaviour and portability problems when trying
to parse output from programs.  I think this should be rejected outright
and you should stop wasting your time on it.

Yours,
  Ingo



Re: Support for LC_TIME

2014-05-07 Thread Antoine Jacoutot
On Wed, May 07, 2014 at 07:44:51PM +0200, Ingo Schwarze wrote:
 Hi,
 
 POSIX doesn't require support for any locales except POSIX and C.
 
 While LC_CTYPE and LC_COLLATE make some sense, LC_MONETARY, LC_NUMERIC,
 and LC_TIME are badly overengineered, pointless bloat, causing nothing
 but surprising, erratic behaviour and portability problems when trying
 to parse output from programs.  I think this should be rejected outright
 and you should stop wasting your time on it.

Yeah, I fully disagree.

-- 
Antoine