grow the number of pool items a cpu can cache on demand

2017-06-15 Thread David Gwynne
the pool cache code counts the number of times the mutex on the
global info was already held, which we can interpret as demand. in
response to this demand we can grow the number of items a cpu can
cache, which in return should reduce the codemand/contention on the
global info.

the contention is measured in the gc tick task, which runs once a
second. if the cache mtx was contended more than 8 times in that
interval, we consider growing the list length.

rather than double the lists, this adds 8 each time.

ive placed a limit on how much the list can grow to try and avoid
having cpus starve each other. basically, we look at how many pool
items the current set of pages can provide, and compare that to how
many items the cpus could cache if we grew the list.

this means that the length of the lists are bound by the amount of
memory available to the pools, and is proportional to the number
of cpus in the system.

this semantic may be wrong or bad, but id like to put it in the
tree and see how we go. this is the last major chunk of pool work
for a while, which should mean if it doesnt work out it will be
simple to revert or disable.

NAME LEN   NL  NGC  CPU REQREL   LREQ   LREL
mbufpl56   14   38049972265  164731885   20152420679
  1333760577886458 544004   6510
  270548537   26047463 979997  25398
  363359512   22846575 871000  23154
  4 59783371821032  80711   4033
  5  157443  69606   2032212
  6   12656   5297195 29
  72526   2126 19 11
mtagpl 8000   0  0  0  0
  1   0  0  0  0
  2   0  0  0  0
  3   0  0  0  0
  4   0  0  0  0
  5   0  0  0  0
  6   0  0  0  0
  7   0  0  0  0
mcl2k  81   220   8 11  0  0
  12458   1031179  0
  22288   3258 13133
  32253   3108 17123
  41061873 36 12
  5  47 55  1  1
  6   6  7  0  0
  7   1  5  0  0
mcl2k2563   27049935396   73111701468  1
  1 8656802697  0 223701
  2 602   22136816  0 774735
  3 595   19413129  0 655773
  4 1611515461  0  45021
  5   2  57970  0   2071
  6   0   4204  0153
  7   0   1656  0 43
mcl4k  84   530  93 64 10  5
  1  120770  43303  11392   1708
  2   98822 155361   7703  14770
  3  111419 135419   9456  12454
  4   24665  21605   1670   1287
  5 434942 21 83
  6  57 78  3  5
  7   4 19  0  0
mcl8k  8   14   6301154188142 20
  1  796165 297379  93094  30745
  2  7847311153063  91364 137405
  3  8418661057558  98997 125957
  4  247836 161689  28512  17743
  53796   5849434689
  6 

smtpd session hang

2017-06-15 Thread Henri Kemppainen
I had a little debugging session with awolk@ over at #openbsd-daily.  His
smtpd would over time end up with hung sessions that never timeout.

The problem is related to the data_io path's congestion control which
may pause the session.  In this case the io system will not wait for
read events and as such will not have a chance to timeout until it is
resumed.

If the pause happens when a full message is just about to pass through
the data_io path, the session is never resumed, even though there is
obviously no more congestion and the session should be reading more
input from the client again.

A debug trace excerpt shows the course of events:

mtp: 0xe54baa1e000: IO_DATAIN 
debug: smtp: 0xe54baa1e000: filter congestion: pausing session
smtp: 0xe54baa1e000 (data): IO_LOWAT 
debug: smtp: 0xe54baa1e000: data io done (259146 bytes)
debug: 0xe54baa1e000: end of message, msgflags=0x
smtp: 0xe54baa1e000: >>> 250 2.0.0: 4f36f9e3 Message accepted for delivery
smtp: 0xe54baa1e000: STATE_BODY -> STATE_HELO
smtp: 0xe54baa1e000: IO_LOWAT 

>From this point on, session 0xe54baa1e000 and its io 0xe551d0d5000
(which has the pause_in flag) are never seen again in the trace, and
fstat shows a corresponding connection to smtpd that never goes away.

The proposed fix is to always resume the session if the data_io path
hits the low water mark.

Mr. Wolk tested this diff against smtpd on 6.1 as well as a against
-current version of smtpd (compiled on the same system running 6.1).


Index: usr.sbin/smtpd/smtp_session.c
===
RCS file: /cvs/src/usr.sbin/smtpd/smtp_session.c,v
retrieving revision 1.303
diff -u -p -r1.303 smtp_session.c
--- usr.sbin/smtpd/smtp_session.c   17 May 2017 14:00:06 -  1.303
+++ usr.sbin/smtpd/smtp_session.c   15 Jun 2017 20:28:12 -
@@ -1474,9 +1474,10 @@ smtp_data_io(struct io *io, int evt, voi
break;
 
case IO_LOWAT:
-   if (s->tx->dataeom && io_queued(s->tx->oev) == 0) {
+   if (s->tx->dataeom && io_queued(s->tx->oev) == 0)
smtp_data_io_done(s);
-   } else if (io_paused(s->io, IO_IN)) {
+
+   if (io_paused(s->io, IO_IN)) {
log_debug("debug: smtp: %p: filter congestion over: 
resuming session", s);
io_resume(s->io, IO_IN);
}




hack: fix paths, remove hack.sh

2017-06-15 Thread Michal Mazurek
Fix paths in Makequest /usr/games/lib/questdir -> /var/games/questdir
Remove hack.sh, it is broken since forever and nobody complained
Update the READ_ME
But don't update this part:

   3. The rest of the stuff belonging to hack sits in a subdirectory hackdir
  (on our system /usr/games/lib/hackdir) with modes
   drwx--  3 play  1024 Aug  9 09:03 hackdir

As it states "on our system", i.e. the system of the original authors.

Comments? OK?


Index: games/hack/Makequest
===
RCS file: /cvs/src/games/hack/Makequest,v
retrieving revision 1.3
diff -u -p -r1.3 Makequest
--- games/hack/Makequest28 Jan 2001 23:41:42 -  1.3
+++ games/hack/Makequest15 Jun 2017 21:37:23 -
@@ -9,7 +9,7 @@ TERMLIB = -lcurses
 
 # make hack
 GAME = quest
-GAMEDIR = /usr/games/lib/questdir
+GAMEDIR = /var/games/questdir
 CFLAGS = -g -DQUEST
 HACKCSRC = hack.Decl.c\
hack.apply.c hack.bones.c hack.c hack.cmd.c hack.do.c\
@@ -37,7 +37,7 @@ HSOURCES = hack.h hack.mfndpos.h config.
 
 SOURCES = $(CSOURCES) $(HSOURCES)
 
-AUX = data help hh rumors hack.6 hack.sh
+AUX = data help hh rumors hack.6
 
 DISTR = $(SOURCES) $(AUX) READ_ME Makefile date.h hack.onames.h
 
Index: games/hack/READ_ME
===
RCS file: /cvs/src/games/hack/READ_ME,v
retrieving revision 1.3
diff -u -p -r1.3 READ_ME
--- games/hack/READ_ME  28 Jan 2001 23:41:42 -  1.3
+++ games/hack/READ_ME  15 Jun 2017 21:37:23 -
@@ -31,12 +31,6 @@ Files for hack:
(Some of these contain information on the game,
others are just plain stupid. Additional rumors
are appreciated.)
-   hack.sh A shell script.
-   (We have hack.sh in /usr/games/hack and
-   hack in /usr/games/lib/hackdir/hack and all the other
-   hack stuff in /usr/games/lib/hackdir - perhaps this
-   will make the script clear.
-   There is no need for you to use it.)
READ_ME This file.
Original_READ_ME Jay Fenlason's READ_ME
 
@@ -88,8 +82,6 @@ $ hack [-d hackdir] [maxnrofplayers]
 (for playing) or
 $ hack [-d hackdir] -s [listofusers | limit | all]
 (for seeing part of the scorelist).
-The shell file hack (in this kit called hack.sh) takes care of
-calling hack with the right arguments.
 
 Send complaints, bug reports, suggestions for improvements to
 mcvax!aeb - in real life Andries Brouwer.
Index: games/hack/hack.sh
===
RCS file: games/hack/hack.sh
diff -N games/hack/hack.sh
--- games/hack/hack.sh  16 Mar 2003 21:22:36 -  1.3
+++ /dev/null   1 Jan 1970 00:00:00 -
@@ -1,78 +0,0 @@
-#!/bin/sh
-#  $OpenBSD: hack.sh,v 1.3 2003/03/16 21:22:36 camield Exp $
-#  $NetBSD: hack.sh,v 1.2 1995/03/23 08:31:30 cgd Exp $
-
-#
-# Copyright (c) 1985, Stichting Centrum voor Wiskunde en Informatica,
-# Amsterdam
-# All rights reserved.
-#
-# Redistribution and use in source and binary forms, with or without
-# modification, are permitted provided that the following conditions are
-# met:
-#
-# - Redistributions of source code must retain the above copyright notice,
-# this list of conditions and the following disclaimer.
-#
-# - Redistributions in binary form must reproduce the above copyright
-# notice, this list of conditions and the following disclaimer in the
-# documentation and/or other materials provided with the distribution.
-#
-# - Neither the name of the Stichting Centrum voor Wiskunde en
-# Informatica, nor the names of its contributors may be used to endorse or
-# promote products derived from this software without specific prior
-# written permission.
-#
-# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS
-# IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
-# TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
-# PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER
-# OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
-# EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
-# PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
-# PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
-# LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
-# NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
-# SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
-#
-
-#
-# Copyright (c) 1982 Jay Fenlason 
-# All rights reserved.
-#
-# Redistribution and use in source and binary forms, with or without
-# modification, are permitted provided that the following conditions
-# are met:
-# 1. Redistributions of source code must retain the above copyright
-#notice, this list of 

Re: Add support for rdonly keyword in fstab

2017-06-15 Thread Jérôme FRGACIC

On Wed, 14 Jun 2017 12:49:19 -0700, Klemens Nanni wrote:

fstab(5) has always used ro/rw for this exclusively, which is sufficient
enough imho.


I dare say mount(8) actually has too many ways to say the same thing:
-r
-o ro
-r norw
-o rdonly

How about removing some of those?


I think you're right, that's sound better. ;)



Re: sppp(4)/pppoe(4) dynamic address hack

2017-06-15 Thread Martin Pieuchot
On 12/06/17(Mon) 15:46, Stefan Sperling wrote:
> On Sat, Jun 10, 2017 at 06:14:02PM +0200, Martin Pieuchot wrote:
> > There's no need to fix the routing table, we could something like:
> > 
> > -   if (hisaddr == 1) {
> > +   if (hisaddr < 10) {
> > 
> > Now I think you have a good point that using a flag is better than a 
> > magic address.  But I think the ifconfig(8) interface should be simpler.
> > 
> > What about reusing the 'autoconf' toggle?  Could we say:
> > 
> > # ifconfig pppoe0 inet autoconf
> > 
> > Instead of introducing 'dynaddr' and 'dyndest'?
> 
> 'autoconf' is already used in a context where no addresses are configured
> yet (apart from the mandatory link-local ones, which don't exist as such
> in IPv4). I would rather not overload it.
> 
> I agree that adding new ifconfig(8) commands may not be the ideal solution.
> If the routing table were fixed eventually, such commands may become obsolete.
>
> Here's a shorter diff based on your suggestion above, which bypasses the
> tedious ifconfig(8) design decisions but still allows us to improve the
> current situation.
> 
> With this, we keep the magic address hack (even though we don't really
> like it), allow multiple dynamic pppoe(4) in one box (255 should be
> enough for anyone), document our current hack properly, and don't
> require existing pppoe(4) setups to change their configuration after 6.2.
> 
> Is this an acceptable short term solution?

Yes, but I'm not opposed to the previous diff either, both diffs are
correct and ok from my point of view.

> Index: share/man/man4/pppoe.4
> ===
> RCS file: /cvs/src/share/man/man4/pppoe.4,v
> retrieving revision 1.33
> diff -u -p -r1.33 pppoe.4
> --- share/man/man4/pppoe.422 Mar 2017 21:37:24 -  1.33
> +++ share/man/man4/pppoe.412 Jun 2017 13:20:51 -
> @@ -113,17 +113,19 @@ The physical interface must also be mark
>  .Pp
>  Since this is a PPP interface, the addresses assigned to the interface
>  may change during PPP negotiation.
> -There is no fine grained control available for deciding
> -which addresses are acceptable and which are not.
> -For the local side and the remote address there is exactly one choice:
> -hard coded address or wildcard.
> -If a real address is assigned to one side of the connection,
> -PPP negotiation will only agree to exactly this address.
> -If one side is wildcarded,
> -every address suggested by the peer will be accepted.
> +In the above example, 0.0.0.0 and 0.0.0.1 serve as placeholders for
> +dynamic address configuration.
>  .Pp
> -To wildcard the local address set it to 0.0.0.0; to wildcard the remote
> -address set it to 0.0.0.1.
> +If the local address is set to wildcard address 0.0.0.0, it will
> +be changed to an address suggested by the peer.
> +.Pp
> +If the destination address is set to a wildcard address in the range
> +from 0.0.0.1 to 0.0.0.255, it will be changed to an address suggested
> +by the peer, and if a default route which uses this interface exists
> +the gateway will be changed to the suggested address as well.
> +.Pp
> +Otherwise, PPP negotiation will only agree to exactly the IPv4 addresses
> +which are configured on the interface.
>  .Sh KERNEL OPTIONS
>  .Nm
>  does not interfere with other PPPoE implementations
> @@ -254,3 +256,8 @@ The presence of a
>  .Xr mygate 5
>  file will interfere with the routing table.
>  Make sure this file is either empty or does not exist.
> +.Pp
> +Two
> +.Nm
> +interfaces configured with the same wildcard destination address
> +cannot share a routing table.
> Index: sys/net/if_spppsubr.c
> ===
> RCS file: /cvs/src/sys/net/if_spppsubr.c,v
> retrieving revision 1.164
> diff -u -p -r1.164 if_spppsubr.c
> --- sys/net/if_spppsubr.c 30 May 2017 07:50:37 -  1.164
> +++ sys/net/if_spppsubr.c 12 Jun 2017 13:10:20 -
> @@ -2632,7 +2632,7 @@ sppp_ipcp_tls(struct sppp *sp)
>   sp->ipcp.flags |= IPCP_MYADDR_DYN;
>   sp->ipcp.opts |= (1 << IPCP_OPT_ADDRESS);
>   }
> - if (hisaddr == 1) {
> + if (hisaddr >= 1 && hisaddr <= 255) {
>   /*
>* XXX - remove this hack!
>* remote has no valid address, we need to get one assigned.
> 



Re: install.sub: ieee80211_{scan,config}: Allow quoted SSIDs

2017-06-15 Thread Klemens Nanni

No need for quoting $_nwid within [[ ... ]] since field splitting is not
applied.

Index: install.sub
===
RCS file: /cvs/src/distrib/miniroot/install.sub,v
retrieving revision 1.1014
diff -u -p -r1.1014 install.sub
--- install.sub 3 Jun 2017 22:27:41 -   1.1014
+++ install.sub 15 Jun 2017 14:48:56 -
@@ -1060,10 +1060,9 @@ v6_config() {
# Perform an 802.11 network scan on interface $1.
# The result is cached in $WLANLIST.
ieee80211_scan() {
-   # N.B. Skipping quoted nwid's for now.
[[ -f $WLANLIST ]] ||
ifconfig $1 scan |
-   sed -n 's/^ nwid \([^"]\)/\1/p' >$WLANLIST
+   sed -n 's/^[[:space:]]*nwid //p' >$WLANLIST
cat $WLANLIST
}

@@ -1082,11 +1081,12 @@ ieee80211_config() {
ask_until "Access point? (ESSID, 'any', list# or '?')" "any"
case "$resp" in
+([0-9]))
-   _nwid=$(ieee80211_scan $_if | sed -n "${resp}s/ .*//p")
+   _nwid=$(ieee80211_scan $_if | sed -n ${resp}'{s/ chan 
.*//p;q;}')
[[ -z $_nwid ]] && echo "There is no line $resp."
+   [[ $_nwid = \"*\" ]] && _nwid=${_nwid#\"} 
_nwid=${_nwid%\"}
;;
\?) ieee80211_scan $_if |
-   sed -n 's/^\([^ ]*\) chan .* bssid \([^ ]*\) 
.*$/   \1 (\2)/p' |
+   sed -n 's/^\(.*\) chan .* bssid \([^ ]*\) 
.*$/\1 (\2)/p' |
cat -n | more -c
;;
*)  _nwid=$resp



Re: wsfont: remove iso7/pcvt encoding macros and entries

2017-06-15 Thread Frederic Cambus
On Tue, Jun 13, 2017 at 11:29:59AM +, Miod Vallat wrote:
> 
> > Hi tech@,
> >
> > We do not support iso7 nor pcvt encoding, so remove macro definitions
> > and commented entries.
> 
> If you do that, then you can probably optimize away
> vga_valid_primary_font() in sys/dev/ic/vga.c as well.

Makes sense to me as well, thanks for pointing that out. Diff below.

Comments? OK?

Index: sys/dev/ic/vga.c
===
RCS file: /cvs/src/sys/dev/ic/vga.c,v
retrieving revision 1.68
diff -u -p -r1.68 vga.c
--- sys/dev/ic/vga.c9 Sep 2015 18:23:39 -   1.68
+++ sys/dev/ic/vga.c15 Jun 2017 10:47:09 -
@@ -344,10 +344,6 @@ bad:
  * We want at least ASCII 32..127 be present in the
  * first font slot.
  */
-#define vga_valid_primary_font(f) \
-   (f->encoding == WSDISPLAY_FONTENC_IBM || \
-   f->encoding == WSDISPLAY_FONTENC_ISO)
-
 int
 vga_selectfont(struct vga_config *vc, struct vgascreen *scr, const char *name1,
 const char *name2) /* NULL: take first found */
@@ -363,7 +359,8 @@ vga_selectfont(struct vga_config *vc, st
if (!f || f->height != type->fontheight)
continue;
if (!f1 &&
-   vga_valid_primary_font(f) &&
+   (f->encoding == WSDISPLAY_FONTENC_IBM ||
+f->encoding == WSDISPLAY_FONTENC_ISO) &&
(!name1 || !*name1 ||
 !strncmp(name1, f->name, WSFONT_NAME_SIZE))) {
f1 = f;



gc idle pool cache lists

2017-06-15 Thread David Gwynne
the diff below adds cleanup of idle lists in the cpu caches in pools.

the caches on the cpus move lists of items around instead of
individual items. these lists are moved to the global pool struct
and accumulate there. if you get a burst of work in the pool (eg,
you use a lot of mbufs for a short period) you'll end up with a lot
of lists in the global pool struct that dont get used.

we gc idle pages in vanilla pools and return them to the system,
so this extends that to the cache subsystem.

it does this by timestamping the global list of lists when it has
been empty, and if it hasnt been empty for a while it returns a
list to the pages where they can be gc'ed back to the page allocator.
if the items in a pool are being allocated in bursts over relatively
short periods, eg, think replenishing rx rings on a nic in rate
limited interrupts, then the global lists will be emptied regularly. if we
stop rxing packets, then we wont empty the lists and therefore wont
timestamp them, making them available for gc.

care has be taken when moving items on a cache list back to the
pages so the puts arent counted twice (once when the item is put
on a free list on a cpu cache, and again when moving from the list
back to the pages), so this splits pool_put up. pool_do_put is
solely responsible for putting items back into pool pages. this is
then called from pool_put (which still does the accounting for
normal pools) and the list gc. a neat side effect of this is that
the list gc can return multiple items to the pages while only taking
the lock around the page structures once.

i wrote this in feb, and it's been running solidly for me ever
since. i wanted to implement the systat bits beofre putting it in
though.

the last big pool cache chunk after this is growing the size of the
cache lists based on contention.

ok?

Index: sys/pool.h
===
RCS file: /cvs/src/sys/sys/pool.h,v
retrieving revision 1.70
diff -u -p -r1.70 pool.h
--- sys/pool.h  15 Jun 2017 02:52:30 -  1.70
+++ sys/pool.h  15 Jun 2017 11:28:31 -
@@ -185,11 +185,13 @@ struct pool {
unsigned long   pr_cache_magic[2];
struct mutexpr_cache_mtx;
struct pool_cache_lists
-   pr_cache_lists;
-   u_int   pr_cache_nlist; /* # of lists */
+   pr_cache_lists; /* list of idle item lists */
+   u_int   pr_cache_nlist; /* # of idle lists */
u_int   pr_cache_items; /* target list length */
u_int   pr_cache_contention;
+   int pr_cache_tick;  /* time idle list was empty */
int pr_cache_nout;
+   uint64_tpr_cache_ngc;   /* # of times the gc released a list */
 
u_int   pr_align;
u_int   pr_maxcolors;   /* Cache coloring */
Index: kern/subr_pool.c
===
RCS file: /cvs/src/sys/kern/subr_pool.c,v
retrieving revision 1.212
diff -u -p -r1.212 subr_pool.c
--- kern/subr_pool.c15 Jun 2017 03:50:50 -  1.212
+++ kern/subr_pool.c15 Jun 2017 11:28:31 -
@@ -1,4 +1,4 @@
-/* $OpenBSD: subr_pool.c,v 1.212 2017/06/15 03:50:50 dlg Exp $ */
+/* $OpenBSD: subr_pool.c,v 1.211 2017/06/15 03:48:50 dlg Exp $ */
 /* $NetBSD: subr_pool.c,v 1.61 2001/09/26 07:14:56 chs Exp $   */
 
 /*-
@@ -135,6 +135,7 @@ struct pool_cache {
 void   *pool_cache_get(struct pool *);
 voidpool_cache_put(struct pool *, void *);
 voidpool_cache_destroy(struct pool *);
+voidpool_cache_gc(struct pool *);
 #endif
 voidpool_cache_pool_info(struct pool *, struct kinfo_pool *);
 int pool_cache_info(struct pool *, void *, size_t *);
@@ -156,6 +157,7 @@ void pool_p_free(struct pool *, struct 
 
 voidpool_update_curpage(struct pool *);
 void   *pool_do_get(struct pool *, int, int *);
+voidpool_do_put(struct pool *, void *);
 int pool_chk_page(struct pool *, struct pool_page_header *, int);
 int pool_chk(struct pool *);
 voidpool_get_done(void *, void *);
@@ -711,7 +713,6 @@ pool_do_get(struct pool *pp, int flags, 
 void
 pool_put(struct pool *pp, void *v)
 {
-   struct pool_item *pi = v;
struct pool_page_header *ph, *freeph = NULL;
 
 #ifdef DIAGNOSTIC
@@ -728,6 +729,37 @@ pool_put(struct pool *pp, void *v)
 
mtx_enter(>pr_mtx);
 
+   pool_do_put(pp, v);
+
+   pp->pr_nout--;
+   pp->pr_nput++;
+
+   /* is it time to free a page? */
+   if (pp->pr_nidle > pp->pr_maxpages &&
+   (ph = TAILQ_FIRST(>pr_emptypages)) != NULL &&
+   (ticks - ph->ph_tick) > (hz * pool_wait_free)) {
+   freeph = ph;
+   pool_p_remove(pp, freeph);
+   }
+
+   mtx_leave(>pr_mtx);
+
+   if (freeph != NULL)
+   pool_p_free(pp, freeph);
+
+   if (!TAILQ_EMPTY(>pr_requests)) {
+   mtx_enter(>pr_requests_mtx);
+

Re: vi(1): documenting :s

2017-06-15 Thread Jason McIntyre
On Thu, Jun 15, 2017 at 02:14:46AM -0600, Anthony J. Bentley wrote:
> Hi,
> 
> From vi(1):
> 
>  [range] s[ubstitute] [/pattern/replace/] ??[options] [count] [flags]
>  [range] & [options] [count] [flags]
>  [range] ~ [options] [count] [flags]
>  Make substitutions.  The replace field may contain any of the
>  following sequences:
>  (...snip...)
> 
> There are a couple of issues here.
> 
> First, the command is "s" and only "s"; trying "substitute" results in
> an error.
> 
> Additionally &, ~, and the options field remain unexplained.
> 
> This diff tries to makes things better. ok?
> 

ok by me. note that posix ex(1) does detail a working [s]ubstitute command,
so i'm not sure whether we should support this or not.

if you do make these changes, you may want to move the "s" text up a
little, since these commands are sorted.

jmc

> Index: docs/USD.doc/vi.man/vi.1
> ===
> RCS file: /cvs/src/usr.bin/vi/docs/USD.doc/vi.man/vi.1,v
> retrieving revision 1.67
> diff -u -p -r1.67 vi.1
> --- docs/USD.doc/vi.man/vi.1  15 Jun 2017 06:44:47 -  1.67
> +++ docs/USD.doc/vi.man/vi.1  15 Jun 2017 08:08:49 -
> @@ -356,7 +356,7 @@ matches the end of the word.
>  .It
>  .Sq ~
>  matches the replacement part of the last
> -.Cm substitute
> +.Cm s
>  command.
>  .El
>  .Sh BUFFERS
> @@ -2019,7 +2019,7 @@ commands from a file.
>  .Pp
>  .It Xo
>  .Op Ar range
> -.Cm s Ns Op Cm ubstitute
> +.Cm s
>  .Sm off
>  .Op / Ar pattern No / Ar replace  No /
>  .Sm on
> @@ -2041,7 +2041,27 @@ commands from a file.
>  .Op Ar count
>  .Op Ar flags
>  .Xc
> -Make substitutions.
> +Substitute the regular expression
> +.Ar pattern
> +with
> +.Ar replace .
> +When invoked as
> +.Cm & ,
> +or if
> +.Bq Ns / Ns Ar pattern Ns / Ns Ar replace Ns /
> +is omitted,
> +.Ar pattern
> +and
> +.Ar replace
> +from the most recent
> +.Cm s
> +command are used.
> +.Cm ~
> +behaves like
> +.Cm & ,
> +except the pattern used is the most recent regular expression used by any
> +command.
> +.Pp
>  The
>  .Ar replace
>  field may contain any of the following sequences:
> @@ -2051,13 +2071,13 @@ The text matched by
>  .Ar pattern .
>  .It Sq \(a~
>  The replacement part of the previous
> -.Cm substitute
> +.Cm s
>  command.
>  .It Sq %
>  If this is the entire
>  .Ar replace
>  pattern, the replacement part of the previous
> -.Cm substitute
> +.Cm s
>  command.
>  .It Sq \e#
>  Where
> @@ -2082,6 +2102,18 @@ to be converted to uppercase.
>  Causes the next character to be converted to uppercase.
>  .El
>  .Pp
> +The
> +.Ar options
> +field may contain any of the following characters:
> +.Bl -tag -width Ds
> +.It Sq c
> +Prompt for confirmation before each replacement is done.
> +.It Sq g
> +Replace all instances of
> +.Ar pattern
> +in a line, not just the first.
> +.El
> +.Pp
>  .It Xo
>  .Cm su Ns Op Cm spend Ns
>  .Op Cm !\&
> @@ -2291,7 +2323,9 @@ Remember the values of the
>  and
>  .Sq g
>  suffixes to the
> -.Cm substitute
> +.Cm s , &
> +and
> +.Cm ~
>  commands, instead of initializing them as unset for each new command.
>  .It Cm escapetime Bq 1
>  The tenths of a second
> 



vi(1): documenting :s

2017-06-15 Thread Anthony J. Bentley
Hi,

>From vi(1):

 [range] s[ubstitute] [/pattern/replace/]  [options] [count] [flags]
 [range] & [options] [count] [flags]
 [range] ~ [options] [count] [flags]
 Make substitutions.  The replace field may contain any of the
 following sequences:
 (...snip...)

There are a couple of issues here.

First, the command is "s" and only "s"; trying "substitute" results in
an error.

Additionally &, ~, and the options field remain unexplained.

This diff tries to makes things better. ok?

Index: docs/USD.doc/vi.man/vi.1
===
RCS file: /cvs/src/usr.bin/vi/docs/USD.doc/vi.man/vi.1,v
retrieving revision 1.67
diff -u -p -r1.67 vi.1
--- docs/USD.doc/vi.man/vi.115 Jun 2017 06:44:47 -  1.67
+++ docs/USD.doc/vi.man/vi.115 Jun 2017 08:08:49 -
@@ -356,7 +356,7 @@ matches the end of the word.
 .It
 .Sq ~
 matches the replacement part of the last
-.Cm substitute
+.Cm s
 command.
 .El
 .Sh BUFFERS
@@ -2019,7 +2019,7 @@ commands from a file.
 .Pp
 .It Xo
 .Op Ar range
-.Cm s Ns Op Cm ubstitute
+.Cm s
 .Sm off
 .Op / Ar pattern No / Ar replace  No /
 .Sm on
@@ -2041,7 +2041,27 @@ commands from a file.
 .Op Ar count
 .Op Ar flags
 .Xc
-Make substitutions.
+Substitute the regular expression
+.Ar pattern
+with
+.Ar replace .
+When invoked as
+.Cm & ,
+or if
+.Bq Ns / Ns Ar pattern Ns / Ns Ar replace Ns /
+is omitted,
+.Ar pattern
+and
+.Ar replace
+from the most recent
+.Cm s
+command are used.
+.Cm ~
+behaves like
+.Cm & ,
+except the pattern used is the most recent regular expression used by any
+command.
+.Pp
 The
 .Ar replace
 field may contain any of the following sequences:
@@ -2051,13 +2071,13 @@ The text matched by
 .Ar pattern .
 .It Sq \(a~
 The replacement part of the previous
-.Cm substitute
+.Cm s
 command.
 .It Sq %
 If this is the entire
 .Ar replace
 pattern, the replacement part of the previous
-.Cm substitute
+.Cm s
 command.
 .It Sq \e#
 Where
@@ -2082,6 +2102,18 @@ to be converted to uppercase.
 Causes the next character to be converted to uppercase.
 .El
 .Pp
+The
+.Ar options
+field may contain any of the following characters:
+.Bl -tag -width Ds
+.It Sq c
+Prompt for confirmation before each replacement is done.
+.It Sq g
+Replace all instances of
+.Ar pattern
+in a line, not just the first.
+.El
+.Pp
 .It Xo
 .Cm su Ns Op Cm spend Ns
 .Op Cm !\&
@@ -2291,7 +2323,9 @@ Remember the values of the
 and
 .Sq g
 suffixes to the
-.Cm substitute
+.Cm s , &
+and
+.Cm ~
 commands, instead of initializing them as unset for each new command.
 .It Cm escapetime Bq 1
 The tenths of a second



Re: sed(1): missing NUL in pattern space

2017-06-15 Thread Otto Moerbeek
On Tue, Jun 13, 2017 at 10:08:11AM +, kshe wrote:

> On Sat, 10 Jun 2017 10:25:27 +, Otto Moerbeek wrote:
> > Thanks for the analysis and diff, I hope to get a chanche to think
> > about this soon. At least I'll make sure this diff is not forgotten,
> > -Otto
> 
> I have seen the tests that you recently added to the tree; in the mean
> time, however, while I continued experimenting and reading the source, I
> discovered more bugs (as expected), as well as other trivial issues
> which might be worthy of note before committing any real fixes.
> 
> First of all, my comprehension of the `c' command has slightly changed
> since my last message: I now think its behaviour should be aligned with
> that of `p', `P', `w' and others, all of which produce no output after
> `c', treating a deleted pattern space as truly inexistent, and not
> merely as an empty one.  It would hence appear more coherent to add
> 
>   if (pd)
>   break;
> 
> before calling lputs(), as is done in many other places.  This would
> then change the expected output of one of the newly added tests.
> 
> In general though, that `c' command is a big mess.  It is supposed to
> delete the pattern space: in other implementations of sed that I have
> tested, that means it will essentially behave like `i' followed by `d'.
> In BSD's sed, however, it continues processing commands...  Here's one
> of the many bugs that arise because of that:
> 
>   $ echo abc | sed 'c\
>   > text
>   > s/.*/x/
>   > p
>   > s/.*/y/
>   > p'
>   text
>   x
> 
> So why isn't `y' printed too?  In comparison, using flags instead of
> standalone commands:
> 
>   $ echo abc | sed 'c\
>   > text
>   > s/.*/x/p
>   > s/.*/y/p'
>   text
>   x
>   y
> 
> Here, we do get a `y', but only one (there should be two given that `-n'
> has not been supplied to suppress normal output).  Since at least `p' as
> a flag given to `s' appears to work, let us try the same with `w':
> 
>   $ echo abc | sed 'c\
>   > text
>   > s/.*/x/w /tmp/sed.out
>   > s/.*/y/w /tmp/sed.out'
>   text
>   $ cat /tmp/sed.out
>   x
> 
> But this time it doesn't work.  Now add a third `s' command...
> 
>   $ echo abc | sed 'c\
>   > text
>   > s/.*/x/w /tmp/sed.out
>   > s/.*/y/w /tmp/sed.out
>   > s/.*/z/w /tmp/sed.out'
>   text
>   z
>   $ cat /tmp/sed.out
>   x
>   z
> 
> And it works.  Well, not really: we do see a `z', but still no `y'.
> Indeed, this does not make any sense.
> 
> The reason for this strange behaviour is nevertheless easy to figure out
> when looking at how `s' is implemented, keeping in mind that the
> substitute space is globally defined.  In any case, even if one was to
> fix that particular bug, I would probably be able to find more.  This is
> why I believe that the only sane solution is that, just like `d', `c'
> should simply
> 
>   goto new;
> 
> which would, at the same time, obviate the need for the `deleted' field
> of the SPACE structure, subsequently allowing for substantial cleanups.
> 
> Also, aside of that `c' mess, there is still the fact that NUL bytes
> originally present in the input are not well handled at all.  Of course,
> `l' cannot print past them (whereas `p' can), which is a shame; but
> significantly worse is the inconsistent behaviour of `s' in that case.
> At first, it seems to work well (in the output, `^@' represents a NUL):
> 
>   $ printf 'x\0y' | sed 's/x/_/'
>   _^@y
> 
> It can even handle some empty matches perfectly:
> 
>   $ printf 'x\0y' | sed 's/[[:<:]]/_/' # pattern space survives
>   _x^@y
> 
> But not all of them:
> 
>   $ printf 'x\0y' | sed 's/[[:>:]]/_/' # pattern space is cut off
>   x_
>   $ printf 'x\0y' | sed 's/[[:>:]]/_/2' # likewise
>   x
> 
> One might argue that NUL bytes are not supposed to happen in textual
> input, so that these examples are irrelevant; the real problem however
> is that it also affects newlines, thus perfectly valid text:
> 
>   $ printf 'x\ny' | sed 'N;s/[[:<:]]/_/'
>   _x
>   y
>   $ printf 'x\ny' | sed 'N;s/[[:>:]]/_/'
>   x_
>   $ printf 'x\ny' | sed 'N;s/[[:>:]]/_/2'
>   x
> 
> So maybe substitute() should test for (slen == 0) instead of the
> questionable (*s == '\0' || *s == '\n').  That would seem more correct
> to me, but this is only a guess since I must admit that I don't fully
> understand that part of the code yet.  Here is a diff, for instance,
> that fixes all the above issues.
> 
> --- a/src/usr.bin/sed/process.c   Wed Jun  7 09:56:20 2017
> +++ b/src/usr.bin/sed/process.c   Tue Jun 13 07:21:59 2017
> @@ -387,14 +388,11 @@ substitute(struct s_command *cp)
>* and at the end of the line, terminate.
>*/
>   if (match[0].rm_so == match[0].rm_eo) {
> - if (*s == '\0' || *s == '\n')
> - slen