Re: pfctl interprets "# ... \" as multi-line comment and can skip rules

2016-01-17 Thread samt

On 16/01/2016 5:52 PM, Theo de Raadt wrote:

I've been using pf for years and really like it.  I accidentally discovered
some undesirable behavior from the rule parser that caused some rules to be
skipped.  This has happened to me twice and there was much hair pulling.

The short version is rules starting with # but ending in \ get treated as a
multi-line comment instead of a single-line comment and it has the risk of
silently ignoring a wanted rule immediately below.  This does not match
the behavior I'd expect, for example a line starting with # is entirely
ignored in /bin/sh:

# echo this is a comment \
echo this is not a comment \
or is it?

# sh test.sh
this is not a comment or is it?


But in pf.conf:

pf is not sh.
it isn't cpp either.
nor is it m4.


I try to keep my firewall rules less than 80 chars in case I need to edit
them on a dumb terminal.  Sometimes I end up duplicating a continued line to
make changes to an alternate copy and comment out the original, but if the newly
commented out line ends in a backslash, my intended replacement is ignored.
I think pfctl should act like sh and ignore a line entirely if it begins with
a comment.  Thanks for your consideration.

Unfortunately, it is too late to make such a change in the parser.

That ship has sailed.

.


I can feel Joseph's pain, and had the same pain of ignorance myself.

The behaviour is expected and documented.

http://www.openbsd.org/cgi-bin/man.cgi/OpenBSD-current/man5/pf.conf.5?query=pf%2econf=5

The web manual interface indicates this documentation has been since 
OpenBSD 5.0


The current line can be extended over multiple lines using a
backslash (‘\’). Comments can be put anywhere in the file using a
hash mark (‘#’), and extend to the end of the current line. Care
should be taken when commenting out multi-line text: the comment is
effective until the end of the entire block.

and explained at various times on this list. 
http://thread.gmane.org/gmane.os.openbsd.misc/224709/focus=224743



Sam T.
http://www.nomoa.com/bsd/



Proper printf attribute in less(1)

2016-01-17 Thread Michael McConville
Michael Reed and I noticed the straggling lintism PRINTFLIKE1 in
less(1). Should it be replaced with an attribute? If so, am I doing this
right?


Index: funcs.h
===
RCS file: /cvs/src/usr.bin/less/funcs.h,v
retrieving revision 1.17
diff -u -p -r1.17 funcs.h
--- funcs.h 15 Jan 2016 22:22:38 -  1.17
+++ funcs.h 18 Jan 2016 03:34:55 -
@@ -10,8 +10,8 @@ struct mlist;
 struct loption;
 
 void *ecalloc(int, unsigned int);
-/*PRINTFLIKE1*/
-char *easprintf(const char *, ...);
+char *easprintf(const char *, ...)
+__attribute__((__format__(printf, 1, 2)));
 char *estrdup(const char *);
 char *skipsp(char *);
 int sprefix(char *, char *, int);



Re: [patch] ls + utf-8 support

2016-01-17 Thread Michael McConville
Martin Natano wrote:
> I agree with Ingo, ls(1) shouldn't generate unsafe output. Regardless
> of whether the output is to a terminal or some other file.

While POSIX is not holy law, doing what you suggest would probably be a
violation. See the description of the -q option:

http://pubs.opengroup.org/onlinepubs/9699919799/utilities/ls.html

That combined with Stuart's mentioned legitimate uses makes me think
that this could cause subtle and unexpected problems. I do see the
danger of allowing raw bytestring redirection, though.



Re: [PATCH] uname, arch/machine -> %c, %a update in PKG_PATH

2016-01-17 Thread Michael McConville
Raf Czlonka wrote:
> Hi all,
> 
> Given that PKG_PATH and pkg.conf(5)'s installpath, now supports %c, %a,
> etc. sequences, it might be worth advertising it a bit more by changing
> all relevant uname(1), arch(1)/machine(1) occurrences or (hard-coded
> release versions or hardware architectures for that matter) in the
> documentation.
> 
> While there, I have also taken the liberty of changing ftp.openbsd.org
> to your.local.mirror and ftp to http in packages(7) to keep it
> consistent with other examples.
> 
> Main benefits:
> - as the sequences themselves - not need to hard-code the values
> - no need to run uname, arch/machine is sub-shells any more
> - short and sweet

ok mmcc@

> Index: share/man/man7/packages.7
> ===
> RCS file: /cvs/src/share/man/man7/packages.7,v
> retrieving revision 1.40
> diff -u -p -r1.40 packages.7
> --- share/man/man7/packages.7 24 Oct 2015 08:44:49 -  1.40
> +++ share/man/man7/packages.7 11 Jan 2016 14:26:49 -
> @@ -240,7 +240,7 @@ are supported: pointing
>  .Ev PKG_PATH
>  to a distant package repository, e.g.,
>  .Bd -literal -offset 1n
> -# export PKG_PATH=ftp://ftp.openbsd.org/pub/OpenBSD/5.2/packages/i386/
> +# export PKG_PATH=http://your.local.mirror/pub/OpenBSD/%c/packages/%a/
>  .Ed
>  .Pp
>  will let
> Index: faq/faq15.html
> ===
> RCS file: /cvs/www/faq/faq15.html,v
> retrieving revision 1.116
> diff -u -p -r1.116 faq15.html
> --- faq/faq15.html23 Nov 2015 03:15:50 -  1.116
> +++ faq/faq15.html11 Jan 2016 14:29:33 -
> @@ -203,13 +203,13 @@ A list of possible locations to fetch pa
>  Example 1: fetching from your CD-ROM,
>  assuming you mounted it on /mnt/cdrom
>  
> -$ export PKG_PATH=/mnt/cdrom/$(uname -r)/packages/$(machine -a)/
> +$ export PKG_PATH=/mnt/cdrom/%c/packages/%a/
>  
>  
>  
>  Example 2: fetching from a nearby mirror
>  
> -$ export PKG_PATH=http://your.local.mirror/pub/OpenBSD/$(uname 
> -r)/packages/$(machine -a)/
> +$ export PKG_PATH=http://your.local.mirror/pub/OpenBSD/%c/packages/%a/
>  
>  
>  
> @@ -404,7 +404,7 @@ HTTP, or SCP locations.
>  Let's consider installation via HTTP in the next example:
>  
>  
> -# pkg_add http://your.local.mirror/pub/OpenBSD/$(uname 
> -r)/packages/$(machine -a)/screen-4.0.3p3.tgz
> +# pkg_add 
> http://your.local.mirror/pub/OpenBSD/%c/packages/%a/screen-4.0.3p3.tgz
>  screen-4.0.3p3: complete
>  
>  
> Index: faq/faq9.html
> ===
> RCS file: /cvs/www/faq/faq9.html,v
> retrieving revision 1.116
> diff -u -p -r1.116 faq9.html
> --- faq/faq9.html 23 Nov 2015 03:16:31 -  1.116
> +++ faq/faq9.html 11 Jan 2016 14:29:33 -
> @@ -362,7 +362,7 @@ To find out more about the packages and 
>  To install the above mentioned package you would issue
>  
>  
> -# export PKG_PATH=http://your.local.mirror/pub/OpenBSD/$(uname 
> -r)/packages/$(uname -m)/
> +# export PKG_PATH=http://your.local.mirror/pub/OpenBSD/%c/packages/%a/
>  # pkg_add fedora_base
>  
>  
> Index: faq/pf/example1.html
> ===
> RCS file: /cvs/www/faq/pf/example1.html,v
> retrieving revision 1.63
> diff -u -p -r1.63 example1.html
> --- faq/pf/example1.html  10 Jan 2016 01:28:23 -  1.63
> +++ faq/pf/example1.html  11 Jan 2016 14:29:33 -
> @@ -346,7 +346,7 @@ to use (8.8.8.8@53, for example
>  installing the tool and choosing a resolver.
>  
>  
> -# export PKG_PATH=http://ftp.openbsd.org/pub/OpenBSD/$(uname 
> -r)/packages/$(uname -m)
> +# export PKG_PATH=http://your.local.mirror/pub/OpenBSD/%c/packages/%a/
>  # pkg_add dnscrypt-proxy
>  # rcctl enable dnscrypt_proxy
>  
> Index: faq/ports/guide.html
> ===
> RCS file: /cvs/www/faq/ports/guide.html,v
> retrieving revision 1.46
> diff -u -p -r1.46 guide.html
> --- faq/ports/guide.html  21 Dec 2015 16:35:48 -  1.46
> +++ faq/ports/guide.html  11 Jan 2016 14:29:33 -
> @@ -563,7 +563,7 @@ When dealing with multi-packages, it may
>  >pkg_add(1) and
>   href="http://www.openbsd.org/cgi-bin/man.cgi?sektion=1query=pkg_delete;
>  >pkg_delete(1) directly,
> -setting PKG_PATH to /usr/ports/packages/`arch -s`/all/ in 
> the
> +setting PKG_PATH to /usr/ports/packages/%a/all/ in the
>  environment.
>  
>  
> 



Re: [patch] ls + utf-8 support

2016-01-17 Thread Ted Unangst
Ingo Schwarze wrote:
> The old ls(1) also weeded out non-printable bytes, in particular
> control codes.

The old ls only had this behavior for terminals however. Redirecting to a file
or pipe would always output the original bytes.



Re: [patch] ls + utf-8 support

2016-01-17 Thread Stuart Henderson
On 2016/01/17 14:29, Ted Unangst wrote:
> Ingo Schwarze wrote:
> > The old ls(1) also weeded out non-printable bytes, in particular
> > control codes.
> 
> The old ls only had this behavior for terminals however. Redirecting to a file
> or pipe would always output the original bytes.

I've used this a few times in the past, for example "ls | hexdump -C"
or .."| vis", to find out what the characters used in some filename are.
I'd find it surprising for this to not work.



Re: Perl 5.22.1 testing request + issue on alpha

2016-01-17 Thread Miod Vallat
> I have run into a strange issue on alpha that I'm still tracking down.
> I fear this has interrupted me too long to get 5.22 in for OpenBSD 5.9,
> but maybe we can get ahead of the curve and be ready after unlock.
>
> Previously, NaN + 1 looked like this:
> $ perl -we 'print "NaN" + 1'
> -nan
>
> Due to improvements in the Inf/NaN code, 5.22 should get:
> $ perl -we 'print "NaN" + 1' 
> NaN
>
> But for some reason on alpha NaN isn't special and we instead get:
> $ ./perl -we 'print "NaN" + 1'
> 1

You might want to try this compiler diff on alpha.





When compiling with optimization enabled and ieee-style floating point, the
compiler tries to insert asynchronous fpu trap synchronization barriers as
late as possible.

Unfortunately, the logic does not take into account the store of a
floating-point result into memory as something requiring a barrier, which
leads to incorrect behaviour on alpha processors without the ``precise
arithmetic trap'' extension.

Index: alpha.c
===
RCS file: /OpenBSD/src/gnu/gcc/gcc/config/alpha/alpha.c,v
retrieving revision 1.4
diff -u -p -r1.4 alpha.c
--- alpha.c 20 Dec 2012 13:58:06 -  1.4
+++ alpha.c 17 Jan 2016 19:42:44 -
@@ -8721,11 +8721,15 @@ summarize_insn (rtx x, struct shadow_sum
result of an instruction that might generate an UNPREDICTABLE
result.
 
-   (c) Within the trap shadow, no register may be used more than once
+   (c) Within the trap shadow, the destination register of the potentially
+   trapping instruction may not be used as an input, for its value would be
+   UNPREDICTABLE.
+
+   (d) Within the trap shadow, no register may be used more than once
as a destination register.  (This is to make life easier for the
trap-handler.)
 
-   (d) The trap shadow may not include any branch instructions.  */
+   (e) The trap shadow may not include any branch instructions.  */
 
 static void
 alpha_handle_trap_shadows (void)
@@ -8797,7 +8801,7 @@ alpha_handle_trap_shadows (void)
  if ((sum.defd.i & shadow.defd.i)
  || (sum.defd.fp & shadow.defd.fp))
{
- /* (c) would be violated */
+ /* (d) would be violated */
  goto close_shadow;
}
 
@@ -8820,11 +8824,19 @@ alpha_handle_trap_shadows (void)
 
  goto close_shadow;
}
+
+ if ((sum.used.i & shadow.defd.i)
+ || (sum.used.fp & shadow.defd.fp))
+   {
+ /* (c) would be violated */
+ goto close_shadow;
+   }
  break;
 
case JUMP_INSN:
case CALL_INSN:
case CODE_LABEL:
+ /* (e) would be violated */
  goto close_shadow;
 
default:



Re: [patch] ls + utf-8 support

2016-01-17 Thread Martin Natano
I agree with Ingo, ls(1) shouldn't generate unsafe output. Regardless
of whether the output is to a terminal or some other file.

cheers,
natano



Re: [patch] ls + utf-8 support

2016-01-17 Thread Ingo Schwarze
And now with the patch...

- Forwarded message from Ingo Schwarze  -

From: Ingo Schwarze 
Date: Sun, 17 Jan 2016 21:37:56 +0100
To: Stuart Henderson , Ted Unangst 
Cc: Martijn van Duren , tech@openbsd.org
Subject: Re: [patch] ls + utf-8 support

Hi,

Stuart Henderson wrote on Sun, Jan 17, 2016 at 07:46:23PM +:
> On 2016/01/17 14:29, Ted Unangst wrote:
>> Ingo Schwarze wrote:

>>> The old ls(1) also weeded out non-printable bytes, in particular
>>> control codes.

>> The old ls only had this behavior for terminals however.
>> Redirecting to a file or pipe would always output the original bytes.

> I've used this a few times in the past, for example "ls | hexdump -C"
> or .."| vis", to find out what the characters used in some filename are.
> I'd find it surprising for this to not work.

Oops.  What we currently have in the tree is broken in that respect,
i broke it, including the -q option.

Current behaviour is:

 * SMALL: fully works, but no UTF-8 support
 * not SMALL:
- LC_CTYPE=C on a tty or with -q: does '?', ok
- LC_CTYPE=en_US.UTF-8 on a tty or with -q: does '?', ok
- LC_CTYPE=C neither tty nor -q: does '?', wrong
- LC_CTYPE=en_US.UTF-8 neither tty nor -q: does '?', wrong

The following patch fixes the last two cases.
It is similar in spirit to what Martijn originally sent,
but fixes two issues with his patch:

 1) Do not invent a new global variable, use the existing f_nonprint.
 2) For valid, but non-printable codepoints, print all bytes of the
codepoint's encoding rather than just the first byte.

Should i commit this?

Yours,
  Ingo

- End forwarded message -

Index: utf8.c
===
RCS file: /cvs/src/bin/ls/utf8.c,v
retrieving revision 1.1
diff -u -p -r1.1 utf8.c
--- utf8.c  1 Dec 2015 18:36:13 -   1.1
+++ utf8.c  17 Jan 2016 20:13:51 -
@@ -21,6 +21,8 @@
 #include 
 #include 
 
+extern int f_nonprint;
+
 int
 mbsprint(const char *mbs, int print)
 {
@@ -33,12 +35,16 @@ mbsprint(const char *mbs, int print)
if ((len = mbtowc(, mbs, MB_CUR_MAX)) == -1) {
(void)mbtowc(NULL, NULL, MB_CUR_MAX);
if (print)
-   putchar('?');
+   putchar(f_nonprint ? '?' : *mbs);
total_width++;
len = 1;
} else if ((width = wcwidth(wc)) == -1) {
-   if (print)
-   putchar('?');
+   if (print) {
+   if (f_nonprint)
+   putchar('?');
+   else
+   fwrite(mbs, 1, len, stdout);
+   }
total_width++;
} else {
if (print)



Re: [patch] ls + utf-8 support

2016-01-17 Thread Ingo Schwarze
Hi,

Stuart Henderson wrote on Sun, Jan 17, 2016 at 07:46:23PM +:
> On 2016/01/17 14:29, Ted Unangst wrote:
>> Ingo Schwarze wrote:

>>> The old ls(1) also weeded out non-printable bytes, in particular
>>> control codes.

>> The old ls only had this behavior for terminals however.
>> Redirecting to a file or pipe would always output the original bytes.

> I've used this a few times in the past, for example "ls | hexdump -C"
> or .."| vis", to find out what the characters used in some filename are.
> I'd find it surprising for this to not work.

Oops.  What we currently have in the tree is broken in that respect,
i broke it, including the -q option.

Current behaviour is:

 * SMALL: fully works, but no UTF-8 support
 * not SMALL:
- LC_CTYPE=C on a tty or with -q: does '?', ok
- LC_CTYPE=en_US.UTF-8 on a tty or with -q: does '?', ok
- LC_CTYPE=C neither tty nor -q: does '?', wrong
- LC_CTYPE=en_US.UTF-8 neither tty nor -q: does '?', wrong

The following patch fixes the last two cases.
It is similar in spirit to what Martijn originally sent,
but fixes two issues with his patch:

 1) Do not invent a new global variable, use the existing f_nonprint.
 2) For valid, but non-printable codepoints, print all bytes of the
codepoint's encoding rather than just the first byte.

Should i commit this?

Yours,
  Ingo



morse(6) update

2016-01-17 Thread Stuart Henderson
Update for the current spec which includes an official prosign for @
(I'm not sure where the old one we had came from but it's not the right
thing to use nowadays) and distinct parentheses.

OK?

Index: morse.6
===
RCS file: /cvs/src/games/morse/morse.6,v
retrieving revision 1.1
diff -u -p -r1.1 morse.6
--- morse.6 7 Nov 2014 22:17:49 -   1.1
+++ morse.6 17 Jan 2016 23:45:01 -
@@ -71,3 +71,9 @@ Produce dots and dashes rather than word
 .%R "Operational provisions for the international public telegram service"
 .%O Division B, I. Morse code
 .Re
+.Rs
+.%I ITU-R M.1677-1
+.%R International Morse code
+.%D 2009
+.%U http://www.itu.int/rec/R-REC-M.1677-1-200910-I/
+.Re
Index: morse.c
===
RCS file: /cvs/src/games/morse/morse.c,v
retrieving revision 1.17
diff -u -p -r1.17 morse.c
--- morse.c 23 Oct 2015 02:01:15 -  1.17
+++ morse.c 17 Jan 2016 23:45:01 -
@@ -89,14 +89,14 @@ struct punc {
{ '-', "--" },
{ ':', "---..." },
{ ';', "-.-.-." },
-   { '(', "-.--.-." }, /* When converting from Morse, can't tell */
-   { ')', "-.--.-." }, /* '(' and ')' apart  */
+   { '(', "-.--." },
+   { ')', "-.--.-" },
{ '"', ".-..-." },
{ '`', ".-..-." },
{ '\'', ".." },
{ '+', ".-.-." },   /* AR */
{ '=', "-...-" },   /* BT */
-   { '@', "...-.-" },  /* SK */
+   { '@', ".--.-." },  /* AC */
{ '\0', NULL }
 };
 



Re: morse(6) update

2016-01-17 Thread Aaron Bieber

Stuart Henderson writes:

> Update for the current spec which includes an official prosign for @
> (I'm not sure where the old one we had came from but it's not the right
> thing to use nowadays) and distinct parentheses.
>
> OK?

OK abieber@ if you haven't gotten one yet.

>
> Index: morse.6
> ===
> RCS file: /cvs/src/games/morse/morse.6,v
> retrieving revision 1.1
> diff -u -p -r1.1 morse.6
> --- morse.6   7 Nov 2014 22:17:49 -   1.1
> +++ morse.6   17 Jan 2016 23:45:01 -
> @@ -71,3 +71,9 @@ Produce dots and dashes rather than word
>  .%R "Operational provisions for the international public telegram service"
>  .%O Division B, I. Morse code
>  .Re
> +.Rs
> +.%I ITU-R M.1677-1
> +.%R International Morse code
> +.%D 2009
> +.%U http://www.itu.int/rec/R-REC-M.1677-1-200910-I/
> +.Re
> Index: morse.c
> ===
> RCS file: /cvs/src/games/morse/morse.c,v
> retrieving revision 1.17
> diff -u -p -r1.17 morse.c
> --- morse.c   23 Oct 2015 02:01:15 -  1.17
> +++ morse.c   17 Jan 2016 23:45:01 -
> @@ -89,14 +89,14 @@ struct punc {
>   { '-', "--" },
>   { ':', "---..." },
>   { ';', "-.-.-." },
> - { '(', "-.--.-." }, /* When converting from Morse, can't tell */
> - { ')', "-.--.-." }, /* '(' and ')' apart  */
> + { '(', "-.--." },
> + { ')', "-.--.-" },
>   { '"', ".-..-." },
>   { '`', ".-..-." },
>   { '\'', ".." },
>   { '+', ".-.-." },   /* AR */
>   { '=', "-...-" },   /* BT */
> - { '@', "...-.-" },  /* SK */
> + { '@', ".--.-." },  /* AC */
>   { '\0', NULL }
>  };
>  



Re: [PATCH] mount_tmpfs -P option for populating after mounting (simplified)

2016-01-17 Thread bytevolcano
ping.

On Sun, 13 Sep 2015 14:50:19 +0100
bytevolc...@safe-mail.net wrote:

> Hello,
> 
> Although a patch like this was submitted before, this one is more
> "focused". It may be useful for those moving from MFS who use this
> option to pre-populate an MFS mount with a collection of files. This
> will refuse to work if the path specified for -P is not a directory.
> If the copy is unsuccessful, unmount and inform the user.
> 
> A few questions on this though:
> 
> 1. Is there a better way than executing pax, to copy all the files in
> a directory?
> 
> 2. The do_exec() function was sort of copied from sbin/newfs/newfs.c.
> It doesn't feel like the best way of handling this, but some research
> suggests it is better than calling system() due to security issues,
> one that springs to mind is a malformed input directory to the -P
> option. Is there a simple way to sanitise the input for use with
> system()?
> 
> 3. I noticed a pattern of functions that have about 2-3 lines of code
> in them. Is it best to keep things that way, or is it best to merge
> them into a single function (eg. copy_dir() merged into the
> mount_tmpfs() function)?
> 
> 4. Are variable declarations to be at the beginning of a function
> block, or the beginning of the immediate block? I feel limiting scope
> is a good idea, but would like some feedback on this as I cannot see
> anything in style(9).
> 
> 5. Is mount_tmpfs.h really necessary, given nothing in the tree seems
> to use anything related to the mount_tmpfs() function which is
> internal to the mount_tmpfs binary?
> 
> 
>  -
> 
> Index: sbin/mount_tmpfs/mount_tmpfs.8
> ===
> RCS file: /cvs/src/sbin/mount_tmpfs/mount_tmpfs.8,v
> retrieving revision 1.4
> diff -u -p -r1.4 mount_tmpfs.8
> --- sbin/mount_tmpfs/mount_tmpfs.816 Nov 2014 02:22:10
> - 1.4 +++ sbin/mount_tmpfs/mount_tmpfs.8  13 Sep
> 2015 13:03:34 - @@ -41,6 +41,7 @@
>  .Op Fl m Ar mode
>  .Op Fl n Ar nodes
>  .Op Fl o Ar options
> +.Op Fl P Ar directory
>  .Op Fl s Ar size
>  .Op Fl u Ar user
>  .Ar tmpfs
> @@ -80,6 +81,8 @@ flag followed by a comma-separated strin
>  See the
>  .Xr mount 8
>  man page for possible options and their meanings.
> +.It Fl P Ar directory
> +Populate the created tmpfs file system with the contents of the
> directory. .It Fl s Ar size
>  Specifies the total file system size in bytes.
>  If zero is given (the default), the available amount of memory
> (including @@ -136,6 +139,10 @@ and
>  .Ox 5.5 .
>  .Sh CAVEATS
>  The update of mount options (through mount -u) is currently not
> supported. +The
> +.Fl P
> +option will produce an error if the mount is read-only, or if the
> files +cannot be copied from the specified template directory.
>  .Sh BUGS
>  File system meta-data is not pageable.
>  If there is not enough main memory to hold this information, the
> system may Index: sbin/mount_tmpfs/mount_tmpfs.c
> ===
> RCS file: /cvs/src/sbin/mount_tmpfs/mount_tmpfs.c,v
> retrieving revision 1.5
> diff -u -p -r1.5 mount_tmpfs.c
> --- sbin/mount_tmpfs/mount_tmpfs.c16 Jan 2015 06:39:59
> - 1.5 +++ sbin/mount_tmpfs/mount_tmpfs.c  13 Sep
> 2015 13:03:34 - @@ -34,6 +34,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  #include 
> @@ -41,6 +42,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -68,12 +70,15 @@ static inta_num(const char *, const cha
>  static mode_ta_mask(const char *);
>  static void  pathadj(const char *, char *);
>  
> +static int do_exec(const char *, const char *, char *const[]);
> +static int copy_dir(char *, char *);
> +
>  /*
> -
> */ void
>  mount_tmpfs_parseargs(int argc, char *argv[],
>   struct tmpfs_args *args, int *mntflags,
> - char *canon_dev, char *canon_dir)
> + char *canon_dev, char *canon_dir, char *pop_dir)
>  {
>   int gidset, modeset, uidset; /* Ought to be 'bool'. */
>   int ch;
> @@ -95,7 +100,7 @@ mount_tmpfs_parseargs(int argc, char *ar
>   modeset = 0; mode = 0;
>  
>   optind = optreset = 1;
> - while ((ch = getopt(argc, argv, "g:m:n:o:s:u:")) != -1 ) {
> + while ((ch = getopt(argc, argv, "P:g:m:n:o:s:u:")) != -1 ) {
>   switch (ch) {
>   case 'g':
>   gid = a_gid(optarg);
> @@ -131,6 +136,10 @@ mount_tmpfs_parseargs(int argc, char *ar
>   uidset = 1;
>   break;
>  
> + case 'P':
> + strlcpy(pop_dir, optarg, PATH_MAX);
> + break;
> +
>   case '?':
>   default:
>   usage();
> @@ -161,7 +170,8 @@ usage(void)
>   extern char *__progname;
>   (void)fprintf(stderr,
>   "usage: %s [-g group] [-m mode] [-n nodes] [-o 

Re: [patch] ls + utf-8 support

2016-01-17 Thread Ingo Schwarze
Hi Martijn,

Martijn van Duren wrote on Sun, Jan 17, 2016 at 12:58:38PM +0100:

> I've come across a fair amount of malformed file names by all sorts
> of causes.  Be it malware or just human error.  When such a malformed
> character is in an inconvenient place and can't be auto-completed
> I usually fix this by something like the following:
>
> $ cd "`ls | tail -1`"
> ksh: cd: /home/martijn/Muziek/Motrhead/N?? Sleep at All - No such file
> or directory
> $ cd "`/usr/src/bin/ls/ls | tail -1`"

Why not just

 $ cd N*Sleep*

That seems simpler to me.
Sure, if you have more than one file in the same directory showing
this problem, and the names are too similar to be independently
globbed, and you want to keep them, it's a bit more work:

 $ i=1; for f in N*Sleep*; do mv $f tmp$i; i=$((i+1)); done

I don't see a real reason to use ls(1) in such situations.

> My patch maintains the question marks when stdout is a tty, but returns
> the original byte otherwise.  Afaik the only logical use for the length
> is when doing formatted output, which is only when printing to a tty.

The rationale for weeding out invalid bytes and non-printable
characters is not that we don't know how many display columns the
terminal might use to represent them.  The rationale is that they
might cause the terminal to change state, to interpret them as
in-band control codes.

> This doesn't solve the case when ls is run over ssh -t and the content
> is redirected client-side, but you can't win them all.

Indeed.  I worry that might result in security violations.

The old ls(1) also weeded out non-printable bytes, in particular
control codes.

Even though this question is only tangentially related to UTF-8:
Do we want to change that?  Should ls(1) sometimes pass through
bytes that might be control codes for some terminals?  Imposing
the responsibility that non-isatty(3) ls(1) output never ends up
on a terminal on the user?

I tend to think "no".  I wouldn't want my ls(1) to sometimes generate
output that isn't safe on a terminal.  I doubt i would get into the
habit of considering whether or not ls(1) is safe to run in a given
situation.  I'd rather have it just be safe, always.  And deal with
the occasional insane file name with different tools.

What do people think?
  Ingo


> Index: ls.c
> ===
> RCS file: /cvs/src/bin/ls/ls.c,v
> retrieving revision 1.44
> diff -u -p -r1.44 ls.c
> --- ls.c  1 Dec 2015 18:36:13 -   1.44
> +++ ls.c  17 Jan 2016 10:57:03 -
> @@ -94,6 +94,7 @@ int f_type; /* add type character for 
>  int f_typedir;   /* add type character for directories */
>  
>  int rval;
> +int istty = 0;
>  
>  int
>  ls_main(int argc, char *argv[])
> @@ -110,6 +111,7 @@ ls_main(int argc, char *argv[])
>  
>   /* Terminal defaults to -Cq, non-terminal defaults to -1. */
>   if (isatty(STDOUT_FILENO)) {
> + istty = 1;
>   if ((p = getenv("COLUMNS")) != NULL)
>   width = strtonum(p, 1, INT_MAX, NULL);
>   if (width == 0 &&
> Index: utf8.c
> ===
> RCS file: /cvs/src/bin/ls/utf8.c,v
> retrieving revision 1.1
> diff -u -p -r1.1 utf8.c
> --- utf8.c1 Dec 2015 18:36:13 -   1.1
> +++ utf8.c17 Jan 2016 10:57:03 -
> @@ -21,6 +21,8 @@
>  #include 
>  #include 
>  
> +extern int istty;
> +
>  int
>  mbsprint(const char *mbs, int print)
>  {
> @@ -33,12 +35,12 @@ mbsprint(const char *mbs, int print)
>   if ((len = mbtowc(, mbs, MB_CUR_MAX)) == -1) {
>   (void)mbtowc(NULL, NULL, MB_CUR_MAX);
>   if (print)
> - putchar('?');
> + putchar(istty ? '?' : *mbs);
>   total_width++;
>   len = 1;
>   } else if ((width = wcwidth(wc)) == -1) {
>   if (print)
> - putchar('?');
> + putchar(istty ? '?' : *mbs);
>   total_width++;
>   } else {
>   if (print)



Re: start UTF-8 support for ksh(1) vi editing mode

2016-01-17 Thread Ingo Schwarze
Hi Martin, hi Theo,

thank you for your feedback!

Martijn van Duren wrote on Sun, Jan 17, 2016 at 12:06:26PM +0100:

> When applying this patch and running in an environment without an
> UTF-8 LC_ALL or LC_CTYPE the isu8cont gives the reverse problem of
> the current ksh having UTF-8 filenames with UTF-8 enabled.
> It becomes impossible to properly remove the utf-8 characters where
> the screen is badly redrawn, while in the old situation these could
> be removed byte-wise when in POSIX mode.
>
> Example (caret is cursor):
> $ cd ./Muziek/Mot??rhead/
>  ^
> 
> $ cd ./Muziek/Mot??rhead/
>^
> 
> $ ccd ./Muziek/Mot??rhead
> 
> Since (afaik) utf-8 isn't supported on the console this could become an
> annoying situation in some situations.

All those seem valid concerns.  So we should only change behaviour
based on the locale(1), which means that the shell has to call
setlocale(3) - of course only in the non-ramdisk case.

Patch updated in two places:

 1. Call setlocale(3) in main().
 2. Check the locale in isu8cont().

> I reckon it would be cleaner to do it like ls where the character
> is first pulled through mbtowc and based on it's return printed or
> presented as an ? per byte, or at the very least treated as a single
> byte.

Right now, i'm trying to be extremely conservative and not use any
multibyte library functions.  But yes, ultimately, that will be
needed, in particular to support zero-width and double-width
characters.  But i don't deem the time ripe to attempt that just yet.


Theo (tb@) pointed out in private that entering a two-byte UTF-8
character produces the following echo to the terminal with the
first version of the patch:  first byte, backspace, first byte,
second byte.  While that works at least on some terminals, that's
certainly not ideal.  The following updated version of the patch
fixes this by keeping minimal state in display() and avoiding
the backup and reprint of the start byte if nothing unrelated
happened in between.  No changes outside display().

More testing and feedback is certainly welcome.

Yours,
  Ingo


Index: main.c
===
RCS file: /cvs/src/bin/ksh/main.c,v
retrieving revision 1.78
diff -u -p -r1.78 main.c
--- main.c  30 Dec 2015 09:07:00 -  1.78
+++ main.c  17 Jan 2016 16:40:03 -
@@ -8,6 +8,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -153,6 +154,8 @@ main(int argc, char *argv[])
kshname = argv[0];
 
 #ifndef MKNOD
+   setlocale(LC_CTYPE, "");
+
if (pledge("stdio rpath wpath cpath fattr flock getpw proc exec tty",
NULL) == -1) {
perror("pledge");
Index: vi.c
===
RCS file: /cvs/src/bin/ksh/vi.c,v
retrieving revision 1.39
diff -u -p -r1.39 vi.c
--- vi.c22 Dec 2015 08:39:26 -  1.39
+++ vi.c17 Jan 2016 16:40:03 -
@@ -12,6 +12,7 @@
 #include   /* completion */
 
 #include 
+#include 
 #include 
 
 #include "sh.h"
@@ -21,11 +22,11 @@
 #define CTRL(c)(c & 0x1f)
 
 struct edstate {
-   int winleft;
-   char*cbuf;
-   int cbufsize;
-   int linelen;
-   int cursor;
+   char*cbuf;  /* main buffer to build the command line */
+   int cbufsize;   /* number of bytes allocated for cbuf */
+   int linelen;/* current number of bytes in cbuf */
+   int winleft;/* first byte# in cbuf to be displayed */
+   int cursor; /* byte# in cbuf having the cursor */
 };
 
 
@@ -68,6 +69,7 @@ static void   vi_pprompt(int);
 static voidvi_error(void);
 static voidvi_macro_reset(void);
 static int x_vi_putbuf(const char *, size_t);
+static int isu8cont(unsigned char);
 
 #define C_ 0x1 /* a valid command that isn't a M_, E_, U_ */
 #define M_ 0x2 /* movement command (h, l, etc.) */
@@ -148,7 +150,7 @@ static void restore_edstate(struct edst
 static voidfree_edstate(struct edstate *old);
 
 static struct edstate  ebuf;
-static struct edstate  undobuf = { 0, undocbuf, CMDLEN, 0, 0 };
+static struct edstate  undobuf = { undocbuf, CMDLEN, 0, 0, 0 };
 
 static struct edstate  *es;/* current editor state */
 static struct edstate  *undo;
@@ -157,7 +159,7 @@ static char ibuf[CMDLEN];   /* input buff
 static int first_insert;   /* set when starting in insert mode */
 static int saved_inslen;   /* saved inslen for first insert */
 static int inslen; /* length of input buffer */
-static int srchlen;/* length of current search pattern */
+static int srchlen;/* number of bytes in search pattern */
 static charybuf[CMDLEN];   /* yank buffer */
 static int yanklen;/* length of yank buffer