Re: man.conf mandoc -Tlocale

2014-02-16 Thread Ingo Schwarze
Hi Ted,

Ted Unangst wrote on Fri, Feb 14, 2014 at 12:42:20PM -0500:
 On Fri, Feb 14, 2014 at 14:02, Ingo Schwarze wrote:

 I even considered switching the mandoc(1) default from -Tascii to
 -Tlocale in general, but forgot about it again.  If you like the
 idea, that would be something to do after unlock; it might require
 explicitly giving the -Tascii option in some build system and similar
 contexts.
 
 I think -Tlocale might be a saner default than -Tascii nowadays.
 People who don't want UTF-8 shouldn't have it in their LC_CTYPE,
 and it's hard to see why people who do want it and have it in their
 LC_CTYPE should be forced to give -Tlocale or something similar
 to each and every utility they call.

 Inclined to agree, but I wanted to try the conservative approach for
 this release.

I fully understand that, and i'm not strongly opposed to your less
intrusive suggestion, not even to putting it in before the release.

However, there are two (weak) reasons why you might decide to wait
even with this less intrusive step, anyway.

 1. I asked around a bit and Thomas Klausner (NetBSD) mentioned
that both groff and mandoc format bare, unescaped ASCII minus
characters (`-', 0x2d) found in the input stream as the
three-byte UTF-8 sequence 0xe2 0x80 0x93 in the output stream
when running with -Tutf8 or with -Tlocale and LC_CTYPE=*_*.UTF-8.
That can be annoying when trying to copy and paste code examples
from formatted manual pages.  Maybe we should not rush this in
but allow more time to decide whether we dislike that quirk
and maybe devise mitigations.  More similar issues might hide
under some rocks in the vicinity.

 2. If we hope to switch the mandoc default anyway, switching
/etc/man.conf back and forth in the process maybe just
gratuitously exercises sysmerge(8) on people's machines.
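
For the copy-and-paste concern in point 1, one way to check formatted
output for such byte sequences is to scan it for non-ASCII characters.
The following is only a minimal sketch: find_non_ascii is a
hypothetical helper, the sample bytes are made up for illustration, and
nothing here is taken from groff or mandoc.

```python
def find_non_ascii(data: bytes):
    """Return (byte offset, character) pairs for every non-ASCII
    character in UTF-8 encoded data."""
    hits = []
    offset = 0
    for ch in data.decode("utf-8"):
        encoded = ch.encode("utf-8")
        if len(encoded) > 1:
            # Multi-byte UTF-8 sequence, e.g. 0xe2 0x80 0x93 (U+2013).
            hits.append((offset, ch))
        offset += len(encoded)
    return hits


# Made-up sample: a command line where 0x2d was replaced by U+2013.
sample = b"ls \xe2\x80\x93l /tmp\n"
for offset, ch in find_non_ascii(sample):
    print(f"byte {offset}: U+{ord(ch):04X}")
```

Feeding it the captured output of a formatter run would show whether
anything other than plain ASCII ends up in code examples.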

 I don't know how many places the output of mandoc is
 saved for later.

Few, probably, because mandoc(1) is fast enough that we run it on
demand whenever possible and usually avoid preformatting anything
during builds, or where we do preformat, we use groff for that, anyway.
But even missing one single instance that is hiding somewhere
would just be pointless disruption in a release.

Yours,
  Ingo



Re: man.conf mandoc -Tlocale

2014-02-16 Thread Marc Espie
On Sun, Feb 16, 2014 at 03:11:07PM +0100, Ingo Schwarze wrote:
 Few, probably, because mandoc(1) is fast enough that we run it on
 demand whenever possible and usually avoid preformatting anything
 during builds, or where we do preformat, we use groff for that, anyway.
 But even missing one single instance that is hiding somewhere
 would just be pointless disruption in a release.

We can try specifically poisoning mandoc in ports land, so that you would
see whenever it's invoked.

Don't remember if you're familiar with that, but it's fairly trivial to do:
the ports tree runs all the builds with PATH starting with
${WRKDIR}/bin

You can poison mandoc very easily around line 2477 of bsd.port.mk.
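
The general idea of poisoning a command through a directory at the
front of PATH can be sketched as below. This is not what bsd.port.mk
does; poison() and the log file are hypothetical names chosen for
illustration, and a temporary directory stands in for ${WRKDIR}/bin.

```python
import os
import shlex
import subprocess
import tempfile


def poison(bindir, name, logfile):
    """Create bindir/name as a shell stub that logs its arguments
    to logfile and then fails."""
    os.makedirs(bindir, exist_ok=True)
    path = os.path.join(bindir, name)
    with open(path, "w") as stub:
        stub.write("#!/bin/sh\n")
        stub.write(f'echo "poisoned {name} called: $*" >> '
                   f"{shlex.quote(logfile)}\n")
        stub.write("exit 1\n")
    os.chmod(path, 0o755)
    return path


def run_with_poison(command, bindir):
    """Run a shell command with bindir placed first in PATH."""
    env = dict(os.environ)
    env["PATH"] = bindir + os.pathsep + env.get("PATH", "/usr/bin:/bin")
    return subprocess.run(["/bin/sh", "-c", command], env=env).returncode


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as wrkdir:
        bindir = os.path.join(wrkdir, "bin")
        log = os.path.join(wrkdir, "poison.log")
        poison(bindir, "mandoc", log)
        status = run_with_poison("mandoc -Tascii foo.1", bindir)
        print(status, open(log).read(), end="")
```

Any build step that calls mandoc then fails visibly and leaves a line
in the log, which is how a hidden invocation would be noticed.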



Re: man.conf mandoc -Tlocale

2014-02-16 Thread Ted Unangst
On Sun, Feb 16, 2014 at 22:19, Ingo Schwarze wrote:

 Ingo Schwarze wrote on Sun, Feb 16, 2014 at 03:11:07PM +0100:
 
  1. I asked around a bit and Thomas Klausner (NetBSD) mentioned
 that both groff and mandoc format bare, unescaped ASCII minus
 characters (`-', 0x2d) found in the input stream as the
 three-byte UTF-8 sequence 0xe2 0x80 0x93 in the output stream
 when running with -Tutf8 or with -Tlocale and LC_CTYPE=*_*.UTF-8.
 
 Dmitrij D. Czarkoff just pointed out to me in private mail that
 this isn't true at all.  I misunderstood what Thomas said.

Damn. I was going to count that as a feature to teach people to not
blindly copy and paste samples.



Re: man.conf mandoc -Tlocale

2014-02-14 Thread Stefan Sperling
On Thu, Feb 13, 2014 at 09:22:04PM -0500, Ted Unangst wrote:
 About 20 years after the invention of utf-8, I've decided to see what
 all the fuss is about and experiment with uxterm and whatnot.
 Naturally, this means I want to see sweet fancy quotes in all my man
 pages instead of the lame ``fake'' quotes. Convincing mandoc
 to give me what I want, however, requires a command line option. But
 what about all those old school ascii only terminals I still sometimes
 use?
 
 mandoc fortunately has an option -Tlocale, which will pick between
 ascii and utf8 based on environment. Perfect! Let's use it.
 
 Tested to work as expected in uxterm. Tested to change nothing in a
 regular xterm by default (no LC_CTYPE set).

I don't see any problem from the locale side, so OK with me.
But I cannot speak for the man side of things.

 Index: man.conf
 ===
 RCS file: /cvs/src/etc/man.conf,v
 retrieving revision 1.18
 diff -u -p -r1.18 man.conf
 --- man.conf  13 Jul 2013 20:21:52 -  1.18
 +++ man.conf  14 Feb 2014 02:14:29 -
 @@ -16,15 +16,15 @@ _subdir   {cat,man}1 {cat,man}8 {cat,man}
  _suffix  .0
  _build   .0.Z            /usr/bin/zcat %s
  _build   .0.gz           /usr/bin/gzcat %s
 -_build   .[1-9n]         /usr/bin/mandoc %s
 -_build   .[1-9n].Z       /usr/bin/zcat %s | /usr/bin/mandoc
 -_build   .[1-9n].gz      /usr/bin/gzcat %s | /usr/bin/mandoc
 -_build   .[1-9][a-z]     /usr/bin/mandoc %s
 -_build   .[1-9][a-z].Z   /usr/bin/zcat %s | /usr/bin/mandoc
 -_build   .[1-9][a-z].gz  /usr/bin/gzcat %s | /usr/bin/mandoc
 -_build   .tbl            /usr/bin/mandoc %s
 -_build   .tbl.Z          /usr/bin/zcat %s | /usr/bin/mandoc
 -_build   .tbl.gz         /usr/bin/gzcat %s | /usr/bin/mandoc
 +_build   .[1-9n]         /usr/bin/mandoc -Tlocale %s
 +_build   .[1-9n].Z       /usr/bin/zcat %s | /usr/bin/mandoc -Tlocale
 +_build   .[1-9n].gz      /usr/bin/gzcat %s | /usr/bin/mandoc -Tlocale
 +_build   .[1-9][a-z]     /usr/bin/mandoc -Tlocale %s
 +_build   .[1-9][a-z].Z   /usr/bin/zcat %s | /usr/bin/mandoc -Tlocale
 +_build   .[1-9][a-z].gz  /usr/bin/gzcat %s | /usr/bin/mandoc -Tlocale
 +_build   .tbl            /usr/bin/mandoc -Tlocale %s
 +_build   .tbl.Z          /usr/bin/zcat %s | /usr/bin/mandoc -Tlocale
 +_build   .tbl.gz         /usr/bin/gzcat %s | /usr/bin/mandoc -Tlocale
  
  # Sections and their directories.
  # All paths ending in '/' are the equivalent of entries specifying that



Re: man.conf mandoc -Tlocale

2014-02-14 Thread Anthony J. Bentley
On Thu, Feb 13, 2014 at 7:22 PM, Ted Unangst t...@tedunangst.com wrote:
 mandoc fortunately has an option -Tlocale, which will pick between
 ascii and utf8 based on environment. Perfect! Let's use it.

 Tested to work as expected in uxterm. Tested to change nothing in a
 regular xterm by default (no LC_CTYPE set).

I've been using this exact man.conf (with LC_CTYPE=en_US.UTF-8) since
December 2012. It would be nice to have as the default. OK here...



Re: man.conf mandoc -Tlocale

2014-02-14 Thread Ingo Schwarze
Hi Ted,

Ted Unangst wrote on Thu, Feb 13, 2014 at 09:22:04PM -0500:

 About 20 years after the invention of utf-8, I've decided to see what
 all the fuss is about and experiment with uxterm and whatnot.
 Naturally, this means I want to see sweet fancy quotes in all my man
 pages instead of the lame ``fake'' quotes. Convincing mandoc
 to give me what I want, however, requires a command line option. But
 what about all those old school ascii only terminals I still sometimes
 use?
 
 mandoc fortunately has an option -Tlocale, which will pick between
 ascii and utf8 based on environment. Perfect! Let's use it.
 
 Tested to work as expected in uxterm. Tested to change nothing in a
 regular xterm by default (no LC_CTYPE set).

Even though i don't use it, i'm not opposed to your patch.
I think it makes sense.

I even considered switching the mandoc(1) default from -Tascii to
-Tlocale in general, but forgot about it again.  If you like the
idea, that would be something to do after unlock; it might require
explicitly giving the -Tascii option in some build system and similar
contexts.

I think -Tlocale might be a saner default than -Tascii nowadays.
People who don't want UTF-8 shouldn't have it in their LC_CTYPE,
and it's hard to see why people who do want it and have it in their
LC_CTYPE should be forced to give -Tlocale or something similar
to each and every utility they call.

What do you think?
  Ingo


 Index: man.conf
 ===
 RCS file: /cvs/src/etc/man.conf,v
 retrieving revision 1.18
 diff -u -p -r1.18 man.conf
 --- man.conf  13 Jul 2013 20:21:52 -  1.18
 +++ man.conf  14 Feb 2014 02:14:29 -
 @@ -16,15 +16,15 @@ _subdir   {cat,man}1 {cat,man}8 {cat,man}
  _suffix  .0
  _build   .0.Z            /usr/bin/zcat %s
  _build   .0.gz           /usr/bin/gzcat %s
 -_build   .[1-9n]         /usr/bin/mandoc %s
 -_build   .[1-9n].Z       /usr/bin/zcat %s | /usr/bin/mandoc
 -_build   .[1-9n].gz      /usr/bin/gzcat %s | /usr/bin/mandoc
 -_build   .[1-9][a-z]     /usr/bin/mandoc %s
 -_build   .[1-9][a-z].Z   /usr/bin/zcat %s | /usr/bin/mandoc
 -_build   .[1-9][a-z].gz  /usr/bin/gzcat %s | /usr/bin/mandoc
 -_build   .tbl            /usr/bin/mandoc %s
 -_build   .tbl.Z          /usr/bin/zcat %s | /usr/bin/mandoc
 -_build   .tbl.gz         /usr/bin/gzcat %s | /usr/bin/mandoc
 +_build   .[1-9n]         /usr/bin/mandoc -Tlocale %s
 +_build   .[1-9n].Z       /usr/bin/zcat %s | /usr/bin/mandoc -Tlocale
 +_build   .[1-9n].gz      /usr/bin/gzcat %s | /usr/bin/mandoc -Tlocale
 +_build   .[1-9][a-z]     /usr/bin/mandoc -Tlocale %s
 +_build   .[1-9][a-z].Z   /usr/bin/zcat %s | /usr/bin/mandoc -Tlocale
 +_build   .[1-9][a-z].gz  /usr/bin/gzcat %s | /usr/bin/mandoc -Tlocale
 +_build   .tbl            /usr/bin/mandoc -Tlocale %s
 +_build   .tbl.Z          /usr/bin/zcat %s | /usr/bin/mandoc -Tlocale
 +_build   .tbl.gz         /usr/bin/gzcat %s | /usr/bin/mandoc -Tlocale
  
  # Sections and their directories.
  # All paths ending in '/' are the equivalent of entries specifying that
 



Re: man.conf mandoc -Tlocale

2014-02-14 Thread Christian Weisgerber
On 2014-02-14, Ingo Schwarze schwa...@usta.de wrote:

 I even considered switching the mandoc(1) default from -Tascii to
 -Tlocale in general, but forgot about it again.  If you like the
 idea, that would be something to do after unlock;

I like that, but...

 it might require explicitly giving the -Tascii option in some
 build system and similar contexts.

... we have to make sure that when mandoc is run at build time, the
output doesn't depend on the user's locale.
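
One way a build step could keep its output independent of the user's
locale is to pass -Tascii explicitly and also scrub the locale
variables from the environment. The sketch below only assembles the
command line and environment; build_time_mandoc is a hypothetical
helper name, not part of any existing build system.

```python
import os


def build_time_mandoc(source, environ=None):
    """Return (argv, env) for a mandoc run whose output does not
    depend on the caller's locale settings."""
    env = dict(os.environ if environ is None else environ)
    # Drop LANG and every LC_* variable inherited from the user.
    for key in list(env):
        if key == "LANG" or key.startswith("LC_"):
            del env[key]
    # Pin the locale to C for good measure.
    env["LC_ALL"] = "C"
    # Request ASCII output explicitly instead of relying on a default.
    argv = ["mandoc", "-Tascii", source]
    return argv, env


argv, env = build_time_mandoc(
    "foo.1", {"PATH": "/usr/bin", "LC_CTYPE": "en_US.UTF-8"})
print(argv, env)
```

The returned pair could be handed to subprocess.run(argv, env=env).
Giving -Tascii explicitly keeps the result stable even if the mandoc
default were to change to -Tlocale later.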

-- 
Christian naddy Weisgerber  na...@mips.inka.de



Re: man.conf mandoc -Tlocale

2014-02-14 Thread Ted Unangst
On Fri, Feb 14, 2014 at 14:02, Ingo Schwarze wrote:

 I even considered switching the mandoc(1) default from -Tascii to
 -Tlocale in general, but forgot about it again.  If you like the
 idea, that would be something to do after unlock; it might require
 explicitly giving the -Tascii option in some build system and similar
 contexts.
 
 I think -Tlocale might be a saner default than -Tascii nowadays.
 People who don't want UTF-8 shouldn't have it in their LC_CTYPE,
 and it's hard to see why people who do want it and have it in their
 LC_CTYPE should be forced to give -Tlocale or something similar
 to each and every utility they call.

Inclined to agree, but I wanted to try the conservative approach for
this release. I don't know how many places the output of mandoc is
saved for later.



man.conf mandoc -Tlocale

2014-02-13 Thread Ted Unangst
About 20 years after the invention of utf-8, I've decided to see what
all the fuss is about and experiment with uxterm and whatnot.
Naturally, this means I want to see sweet fancy quotes in all my man
pages instead of the lame ``fake'' quotes. Convincing mandoc
to give me what I want, however, requires a command line option. But
what about all those old school ascii only terminals I still sometimes
use?

mandoc fortunately has an option -Tlocale, which will pick between
ascii and utf8 based on environment. Perfect! Let's use it.
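
As a rough illustration of what "based on environment" means here, the
sketch below approximates the choice using the usual POSIX precedence
of LC_ALL, then LC_CTYPE, then LANG. It is an assumption-laden
approximation: mandoc itself asks the C library about the locale rather
than parsing variables, and pick_output_mode is a hypothetical name.

```python
def pick_output_mode(environ):
    """Approximate the -Tlocale choice: 'utf8' if the effective
    LC_CTYPE names a UTF-8 codeset, otherwise 'ascii'."""
    # POSIX precedence: the first of these that is set and non-empty wins.
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = environ.get(name)
        if value:
            break
    else:
        return "ascii"  # nothing set: the default C locale
    # Locale names look like language_TERRITORY.codeset@modifier.
    codeset = value.partition(".")[2].partition("@")[0]
    if codeset.upper().replace("-", "") == "UTF8":
        return "utf8"
    return "ascii"


print(pick_output_mode({"LC_CTYPE": "en_US.UTF-8"}))
print(pick_output_mode({}))
```

With no locale variables set, the sketch falls back to ascii, which
matches the "changes nothing in a regular xterm" observation.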

Tested to work as expected in uxterm. Tested to change nothing in a
regular xterm by default (no LC_CTYPE set).

Index: man.conf
===
RCS file: /cvs/src/etc/man.conf,v
retrieving revision 1.18
diff -u -p -r1.18 man.conf
--- man.conf  13 Jul 2013 20:21:52 -  1.18
+++ man.conf  14 Feb 2014 02:14:29 -
@@ -16,15 +16,15 @@ _subdir {cat,man}1 {cat,man}8 {cat,man}
 _suffix  .0
 _build   .0.Z            /usr/bin/zcat %s
 _build   .0.gz           /usr/bin/gzcat %s
-_build   .[1-9n]         /usr/bin/mandoc %s
-_build   .[1-9n].Z       /usr/bin/zcat %s | /usr/bin/mandoc
-_build   .[1-9n].gz      /usr/bin/gzcat %s | /usr/bin/mandoc
-_build   .[1-9][a-z]     /usr/bin/mandoc %s
-_build   .[1-9][a-z].Z   /usr/bin/zcat %s | /usr/bin/mandoc
-_build   .[1-9][a-z].gz  /usr/bin/gzcat %s | /usr/bin/mandoc
-_build   .tbl            /usr/bin/mandoc %s
-_build   .tbl.Z          /usr/bin/zcat %s | /usr/bin/mandoc
-_build   .tbl.gz         /usr/bin/gzcat %s | /usr/bin/mandoc
+_build   .[1-9n]         /usr/bin/mandoc -Tlocale %s
+_build   .[1-9n].Z       /usr/bin/zcat %s | /usr/bin/mandoc -Tlocale
+_build   .[1-9n].gz      /usr/bin/gzcat %s | /usr/bin/mandoc -Tlocale
+_build   .[1-9][a-z]     /usr/bin/mandoc -Tlocale %s
+_build   .[1-9][a-z].Z   /usr/bin/zcat %s | /usr/bin/mandoc -Tlocale
+_build   .[1-9][a-z].gz  /usr/bin/gzcat %s | /usr/bin/mandoc -Tlocale
+_build   .tbl            /usr/bin/mandoc -Tlocale %s
+_build   .tbl.Z          /usr/bin/zcat %s | /usr/bin/mandoc -Tlocale
+_build   .tbl.gz         /usr/bin/gzcat %s | /usr/bin/mandoc -Tlocale
 
 # Sections and their directories.
 # All paths ending in '/' are the equivalent of entries specifying that