grow the number of pool items a cpu can cache on demand
the pool cache code counts the number of times the mutex on the global info was already held, which we can interpret as demand. in response to this demand we can grow the number of items a cpu can cache, which in return should reduce the codemand/contention on the global info. the contention is measured in the gc tick task, which runs once a second. if the cache mtx was contended more than 8 times in that interval, we consider growing the list length. rather than double the lists, this adds 8 each time. ive placed a limit on how much the list can grow to try and avoid having cpus starve each other. basically, we look at how many pool items the current set of pages can provide, and compare that to how many items the cpus could cache if we grew the list. this means that the length of the lists are bound by the amount of memory available to the pools, and is proportional to the number of cpus in the system. this semantic may be wrong or bad, but id like to put it in the tree and see how we go. this is the last major chunk of pool work for a while, which should mean if it doesnt work out it will be simple to revert or disable. NAME LEN NL NGC CPU REQREL LREQ LREL mbufpl56 14 38049972265 164731885 20152420679 1333760577886458 544004 6510 270548537 26047463 979997 25398 363359512 22846575 871000 23154 4 59783371821032 80711 4033 5 157443 69606 2032212 6 12656 5297195 29 72526 2126 19 11 mtagpl 8000 0 0 0 0 1 0 0 0 0 2 0 0 0 0 3 0 0 0 0 4 0 0 0 0 5 0 0 0 0 6 0 0 0 0 7 0 0 0 0 mcl2k 81 220 8 11 0 0 12458 1031179 0 22288 3258 13133 32253 3108 17123 41061873 36 12 5 47 55 1 1 6 6 7 0 0 7 1 5 0 0 mcl2k2563 27049935396 73111701468 1 1 8656802697 0 223701 2 602 22136816 0 774735 3 595 19413129 0 655773 4 1611515461 0 45021 5 2 57970 0 2071 6 0 4204 0153 7 0 1656 0 43 mcl4k 84 530 93 64 10 5 1 120770 43303 11392 1708 2 98822 155361 7703 14770 3 111419 135419 9456 12454 4 24665 21605 1670 1287 5 434942 21 83 6 57 78 3 5 7 4 19 0 0 mcl8k 8 14 6301154188142 20 1 796165 297379 93094 30745 2 7847311153063 91364 137405 3 8418661057558 98997 125957 4 247836 161689 28512 17743 53796 5849434689 6
smtpd session hang
I had a little debugging session with awolk@ over at #openbsd-daily. His smtpd would over time end up with hung sessions that never timeout. The problem is related to the data_io path's congestion control which may pause the session. In this case the io system will not wait for read events and as such will not have a chance to timeout until it is resumed. If the pause happens when a full message is just about to pass through the data_io path, the session is never resumed, even though there is obviously no more congestion and the session should be reading more input from the client again. A debug trace excerpt shows the course of events: mtp: 0xe54baa1e000: IO_DATAIN debug: smtp: 0xe54baa1e000: filter congestion: pausing session smtp: 0xe54baa1e000 (data): IO_LOWAT debug: smtp: 0xe54baa1e000: data io done (259146 bytes) debug: 0xe54baa1e000: end of message, msgflags=0x smtp: 0xe54baa1e000: >>> 250 2.0.0: 4f36f9e3 Message accepted for delivery smtp: 0xe54baa1e000: STATE_BODY -> STATE_HELO smtp: 0xe54baa1e000: IO_LOWAT >From this point on, session 0xe54baa1e000 and its io 0xe551d0d5000 (which has the pause_in flag) are never seen again in the trace, and fstat shows a corresponding connection to smtpd that never goes away. The proposed fix is to always resume the session if the data_io path hits the low water mark. Mr. Wolk tested this diff against smtpd on 6.1 as well as a against -current version of smtpd (compiled on the same system running 6.1). Index: usr.sbin/smtpd/smtp_session.c === RCS file: /cvs/src/usr.sbin/smtpd/smtp_session.c,v retrieving revision 1.303 diff -u -p -r1.303 smtp_session.c --- usr.sbin/smtpd/smtp_session.c 17 May 2017 14:00:06 - 1.303 +++ usr.sbin/smtpd/smtp_session.c 15 Jun 2017 20:28:12 - @@ -1474,9 +1474,10 @@ smtp_data_io(struct io *io, int evt, voi break; case IO_LOWAT: - if (s->tx->dataeom && io_queued(s->tx->oev) == 0) { + if (s->tx->dataeom && io_queued(s->tx->oev) == 0) smtp_data_io_done(s); - } else if (io_paused(s->io, IO_IN)) { + + if (io_paused(s->io, IO_IN)) { log_debug("debug: smtp: %p: filter congestion over: resuming session", s); io_resume(s->io, IO_IN); }
hack: fix paths, remove hack.sh
Fix paths in Makequest /usr/games/lib/questdir -> /var/games/questdir Remove hack.sh, it is broken since forever and nobody complained Update the READ_ME But don't update this part: 3. The rest of the stuff belonging to hack sits in a subdirectory hackdir (on our system /usr/games/lib/hackdir) with modes drwx-- 3 play 1024 Aug 9 09:03 hackdir As it states "on our system", i.e. the system of the original authors. Comments? OK? Index: games/hack/Makequest === RCS file: /cvs/src/games/hack/Makequest,v retrieving revision 1.3 diff -u -p -r1.3 Makequest --- games/hack/Makequest28 Jan 2001 23:41:42 - 1.3 +++ games/hack/Makequest15 Jun 2017 21:37:23 - @@ -9,7 +9,7 @@ TERMLIB = -lcurses # make hack GAME = quest -GAMEDIR = /usr/games/lib/questdir +GAMEDIR = /var/games/questdir CFLAGS = -g -DQUEST HACKCSRC = hack.Decl.c\ hack.apply.c hack.bones.c hack.c hack.cmd.c hack.do.c\ @@ -37,7 +37,7 @@ HSOURCES = hack.h hack.mfndpos.h config. SOURCES = $(CSOURCES) $(HSOURCES) -AUX = data help hh rumors hack.6 hack.sh +AUX = data help hh rumors hack.6 DISTR = $(SOURCES) $(AUX) READ_ME Makefile date.h hack.onames.h Index: games/hack/READ_ME === RCS file: /cvs/src/games/hack/READ_ME,v retrieving revision 1.3 diff -u -p -r1.3 READ_ME --- games/hack/READ_ME 28 Jan 2001 23:41:42 - 1.3 +++ games/hack/READ_ME 15 Jun 2017 21:37:23 - @@ -31,12 +31,6 @@ Files for hack: (Some of these contain information on the game, others are just plain stupid. Additional rumors are appreciated.) - hack.sh A shell script. - (We have hack.sh in /usr/games/hack and - hack in /usr/games/lib/hackdir/hack and all the other - hack stuff in /usr/games/lib/hackdir - perhaps this - will make the script clear. - There is no need for you to use it.) READ_ME This file. Original_READ_ME Jay Fenlason's READ_ME @@ -88,8 +82,6 @@ $ hack [-d hackdir] [maxnrofplayers] (for playing) or $ hack [-d hackdir] -s [listofusers | limit | all] (for seeing part of the scorelist). -The shell file hack (in this kit called hack.sh) takes care of -calling hack with the right arguments. Send complaints, bug reports, suggestions for improvements to mcvax!aeb - in real life Andries Brouwer. Index: games/hack/hack.sh === RCS file: games/hack/hack.sh diff -N games/hack/hack.sh --- games/hack/hack.sh 16 Mar 2003 21:22:36 - 1.3 +++ /dev/null 1 Jan 1970 00:00:00 - @@ -1,78 +0,0 @@ -#!/bin/sh -# $OpenBSD: hack.sh,v 1.3 2003/03/16 21:22:36 camield Exp $ -# $NetBSD: hack.sh,v 1.2 1995/03/23 08:31:30 cgd Exp $ - -# -# Copyright (c) 1985, Stichting Centrum voor Wiskunde en Informatica, -# Amsterdam -# All rights reserved. -# -# Redistribution and use in source and binary forms, with or without -# modification, are permitted provided that the following conditions are -# met: -# -# - Redistributions of source code must retain the above copyright notice, -# this list of conditions and the following disclaimer. -# -# - Redistributions in binary form must reproduce the above copyright -# notice, this list of conditions and the following disclaimer in the -# documentation and/or other materials provided with the distribution. -# -# - Neither the name of the Stichting Centrum voor Wiskunde en -# Informatica, nor the names of its contributors may be used to endorse or -# promote products derived from this software without specific prior -# written permission. -# -# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS -# IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED -# TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A -# PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER -# OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, -# EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, -# PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR -# PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF -# LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING -# NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS -# SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. -# - -# -# Copyright (c) 1982 Jay Fenlason-# All rights reserved. -# -# Redistribution and use in source and binary forms, with or without -# modification, are permitted provided that the following conditions -# are met: -# 1. Redistributions of source code must retain the above copyright -#notice, this list of
Re: Add support for rdonly keyword in fstab
On Wed, 14 Jun 2017 12:49:19 -0700, Klemens Nanni wrote: fstab(5) has always used ro/rw for this exclusively, which is sufficient enough imho. I dare say mount(8) actually has too many ways to say the same thing: -r -o ro -r norw -o rdonly How about removing some of those? I think you're right, that's sound better. ;)
Re: sppp(4)/pppoe(4) dynamic address hack
On 12/06/17(Mon) 15:46, Stefan Sperling wrote: > On Sat, Jun 10, 2017 at 06:14:02PM +0200, Martin Pieuchot wrote: > > There's no need to fix the routing table, we could something like: > > > > - if (hisaddr == 1) { > > + if (hisaddr < 10) { > > > > Now I think you have a good point that using a flag is better than a > > magic address. But I think the ifconfig(8) interface should be simpler. > > > > What about reusing the 'autoconf' toggle? Could we say: > > > > # ifconfig pppoe0 inet autoconf > > > > Instead of introducing 'dynaddr' and 'dyndest'? > > 'autoconf' is already used in a context where no addresses are configured > yet (apart from the mandatory link-local ones, which don't exist as such > in IPv4). I would rather not overload it. > > I agree that adding new ifconfig(8) commands may not be the ideal solution. > If the routing table were fixed eventually, such commands may become obsolete. > > Here's a shorter diff based on your suggestion above, which bypasses the > tedious ifconfig(8) design decisions but still allows us to improve the > current situation. > > With this, we keep the magic address hack (even though we don't really > like it), allow multiple dynamic pppoe(4) in one box (255 should be > enough for anyone), document our current hack properly, and don't > require existing pppoe(4) setups to change their configuration after 6.2. > > Is this an acceptable short term solution? Yes, but I'm not opposed to the previous diff either, both diffs are correct and ok from my point of view. > Index: share/man/man4/pppoe.4 > === > RCS file: /cvs/src/share/man/man4/pppoe.4,v > retrieving revision 1.33 > diff -u -p -r1.33 pppoe.4 > --- share/man/man4/pppoe.422 Mar 2017 21:37:24 - 1.33 > +++ share/man/man4/pppoe.412 Jun 2017 13:20:51 - > @@ -113,17 +113,19 @@ The physical interface must also be mark > .Pp > Since this is a PPP interface, the addresses assigned to the interface > may change during PPP negotiation. > -There is no fine grained control available for deciding > -which addresses are acceptable and which are not. > -For the local side and the remote address there is exactly one choice: > -hard coded address or wildcard. > -If a real address is assigned to one side of the connection, > -PPP negotiation will only agree to exactly this address. > -If one side is wildcarded, > -every address suggested by the peer will be accepted. > +In the above example, 0.0.0.0 and 0.0.0.1 serve as placeholders for > +dynamic address configuration. > .Pp > -To wildcard the local address set it to 0.0.0.0; to wildcard the remote > -address set it to 0.0.0.1. > +If the local address is set to wildcard address 0.0.0.0, it will > +be changed to an address suggested by the peer. > +.Pp > +If the destination address is set to a wildcard address in the range > +from 0.0.0.1 to 0.0.0.255, it will be changed to an address suggested > +by the peer, and if a default route which uses this interface exists > +the gateway will be changed to the suggested address as well. > +.Pp > +Otherwise, PPP negotiation will only agree to exactly the IPv4 addresses > +which are configured on the interface. > .Sh KERNEL OPTIONS > .Nm > does not interfere with other PPPoE implementations > @@ -254,3 +256,8 @@ The presence of a > .Xr mygate 5 > file will interfere with the routing table. > Make sure this file is either empty or does not exist. > +.Pp > +Two > +.Nm > +interfaces configured with the same wildcard destination address > +cannot share a routing table. > Index: sys/net/if_spppsubr.c > === > RCS file: /cvs/src/sys/net/if_spppsubr.c,v > retrieving revision 1.164 > diff -u -p -r1.164 if_spppsubr.c > --- sys/net/if_spppsubr.c 30 May 2017 07:50:37 - 1.164 > +++ sys/net/if_spppsubr.c 12 Jun 2017 13:10:20 - > @@ -2632,7 +2632,7 @@ sppp_ipcp_tls(struct sppp *sp) > sp->ipcp.flags |= IPCP_MYADDR_DYN; > sp->ipcp.opts |= (1 << IPCP_OPT_ADDRESS); > } > - if (hisaddr == 1) { > + if (hisaddr >= 1 && hisaddr <= 255) { > /* >* XXX - remove this hack! >* remote has no valid address, we need to get one assigned. >
Re: install.sub: ieee80211_{scan,config}: Allow quoted SSIDs
No need for quoting $_nwid within [[ ... ]] since field splitting is not applied. Index: install.sub === RCS file: /cvs/src/distrib/miniroot/install.sub,v retrieving revision 1.1014 diff -u -p -r1.1014 install.sub --- install.sub 3 Jun 2017 22:27:41 - 1.1014 +++ install.sub 15 Jun 2017 14:48:56 - @@ -1060,10 +1060,9 @@ v6_config() { # Perform an 802.11 network scan on interface $1. # The result is cached in $WLANLIST. ieee80211_scan() { - # N.B. Skipping quoted nwid's for now. [[ -f $WLANLIST ]] || ifconfig $1 scan | - sed -n 's/^ nwid \([^"]\)/\1/p' >$WLANLIST + sed -n 's/^[[:space:]]*nwid //p' >$WLANLIST cat $WLANLIST } @@ -1082,11 +1081,12 @@ ieee80211_config() { ask_until "Access point? (ESSID, 'any', list# or '?')" "any" case "$resp" in +([0-9])) - _nwid=$(ieee80211_scan $_if | sed -n "${resp}s/ .*//p") + _nwid=$(ieee80211_scan $_if | sed -n ${resp}'{s/ chan .*//p;q;}') [[ -z $_nwid ]] && echo "There is no line $resp." + [[ $_nwid = \"*\" ]] && _nwid=${_nwid#\"} _nwid=${_nwid%\"} ;; \?) ieee80211_scan $_if | - sed -n 's/^\([^ ]*\) chan .* bssid \([^ ]*\) .*$/ \1 (\2)/p' | + sed -n 's/^\(.*\) chan .* bssid \([^ ]*\) .*$/\1 (\2)/p' | cat -n | more -c ;; *) _nwid=$resp
Re: wsfont: remove iso7/pcvt encoding macros and entries
On Tue, Jun 13, 2017 at 11:29:59AM +, Miod Vallat wrote: > > > Hi tech@, > > > > We do not support iso7 nor pcvt encoding, so remove macro definitions > > and commented entries. > > If you do that, then you can probably optimize away > vga_valid_primary_font() in sys/dev/ic/vga.c as well. Makes sense to me as well, thanks for pointing that out. Diff below. Comments? OK? Index: sys/dev/ic/vga.c === RCS file: /cvs/src/sys/dev/ic/vga.c,v retrieving revision 1.68 diff -u -p -r1.68 vga.c --- sys/dev/ic/vga.c9 Sep 2015 18:23:39 - 1.68 +++ sys/dev/ic/vga.c15 Jun 2017 10:47:09 - @@ -344,10 +344,6 @@ bad: * We want at least ASCII 32..127 be present in the * first font slot. */ -#define vga_valid_primary_font(f) \ - (f->encoding == WSDISPLAY_FONTENC_IBM || \ - f->encoding == WSDISPLAY_FONTENC_ISO) - int vga_selectfont(struct vga_config *vc, struct vgascreen *scr, const char *name1, const char *name2) /* NULL: take first found */ @@ -363,7 +359,8 @@ vga_selectfont(struct vga_config *vc, st if (!f || f->height != type->fontheight) continue; if (!f1 && - vga_valid_primary_font(f) && + (f->encoding == WSDISPLAY_FONTENC_IBM || +f->encoding == WSDISPLAY_FONTENC_ISO) && (!name1 || !*name1 || !strncmp(name1, f->name, WSFONT_NAME_SIZE))) { f1 = f;
gc idle pool cache lists
the diff below adds cleanup of idle lists in the cpu caches in pools. the caches on the cpus move lists of items around instead of individual items. these lists are moved to the global pool struct and accumulate there. if you get a burst of work in the pool (eg, you use a lot of mbufs for a short period) you'll end up with a lot of lists in the global pool struct that dont get used. we gc idle pages in vanilla pools and return them to the system, so this extends that to the cache subsystem. it does this by timestamping the global list of lists when it has been empty, and if it hasnt been empty for a while it returns a list to the pages where they can be gc'ed back to the page allocator. if the items in a pool are being allocated in bursts over relatively short periods, eg, think replenishing rx rings on a nic in rate limited interrupts, then the global lists will be emptied regularly. if we stop rxing packets, then we wont empty the lists and therefore wont timestamp them, making them available for gc. care has be taken when moving items on a cache list back to the pages so the puts arent counted twice (once when the item is put on a free list on a cpu cache, and again when moving from the list back to the pages), so this splits pool_put up. pool_do_put is solely responsible for putting items back into pool pages. this is then called from pool_put (which still does the accounting for normal pools) and the list gc. a neat side effect of this is that the list gc can return multiple items to the pages while only taking the lock around the page structures once. i wrote this in feb, and it's been running solidly for me ever since. i wanted to implement the systat bits beofre putting it in though. the last big pool cache chunk after this is growing the size of the cache lists based on contention. ok? Index: sys/pool.h === RCS file: /cvs/src/sys/sys/pool.h,v retrieving revision 1.70 diff -u -p -r1.70 pool.h --- sys/pool.h 15 Jun 2017 02:52:30 - 1.70 +++ sys/pool.h 15 Jun 2017 11:28:31 - @@ -185,11 +185,13 @@ struct pool { unsigned long pr_cache_magic[2]; struct mutexpr_cache_mtx; struct pool_cache_lists - pr_cache_lists; - u_int pr_cache_nlist; /* # of lists */ + pr_cache_lists; /* list of idle item lists */ + u_int pr_cache_nlist; /* # of idle lists */ u_int pr_cache_items; /* target list length */ u_int pr_cache_contention; + int pr_cache_tick; /* time idle list was empty */ int pr_cache_nout; + uint64_tpr_cache_ngc; /* # of times the gc released a list */ u_int pr_align; u_int pr_maxcolors; /* Cache coloring */ Index: kern/subr_pool.c === RCS file: /cvs/src/sys/kern/subr_pool.c,v retrieving revision 1.212 diff -u -p -r1.212 subr_pool.c --- kern/subr_pool.c15 Jun 2017 03:50:50 - 1.212 +++ kern/subr_pool.c15 Jun 2017 11:28:31 - @@ -1,4 +1,4 @@ -/* $OpenBSD: subr_pool.c,v 1.212 2017/06/15 03:50:50 dlg Exp $ */ +/* $OpenBSD: subr_pool.c,v 1.211 2017/06/15 03:48:50 dlg Exp $ */ /* $NetBSD: subr_pool.c,v 1.61 2001/09/26 07:14:56 chs Exp $ */ /*- @@ -135,6 +135,7 @@ struct pool_cache { void *pool_cache_get(struct pool *); voidpool_cache_put(struct pool *, void *); voidpool_cache_destroy(struct pool *); +voidpool_cache_gc(struct pool *); #endif voidpool_cache_pool_info(struct pool *, struct kinfo_pool *); int pool_cache_info(struct pool *, void *, size_t *); @@ -156,6 +157,7 @@ void pool_p_free(struct pool *, struct voidpool_update_curpage(struct pool *); void *pool_do_get(struct pool *, int, int *); +voidpool_do_put(struct pool *, void *); int pool_chk_page(struct pool *, struct pool_page_header *, int); int pool_chk(struct pool *); voidpool_get_done(void *, void *); @@ -711,7 +713,6 @@ pool_do_get(struct pool *pp, int flags, void pool_put(struct pool *pp, void *v) { - struct pool_item *pi = v; struct pool_page_header *ph, *freeph = NULL; #ifdef DIAGNOSTIC @@ -728,6 +729,37 @@ pool_put(struct pool *pp, void *v) mtx_enter(>pr_mtx); + pool_do_put(pp, v); + + pp->pr_nout--; + pp->pr_nput++; + + /* is it time to free a page? */ + if (pp->pr_nidle > pp->pr_maxpages && + (ph = TAILQ_FIRST(>pr_emptypages)) != NULL && + (ticks - ph->ph_tick) > (hz * pool_wait_free)) { + freeph = ph; + pool_p_remove(pp, freeph); + } + + mtx_leave(>pr_mtx); + + if (freeph != NULL) + pool_p_free(pp, freeph); + + if (!TAILQ_EMPTY(>pr_requests)) { + mtx_enter(>pr_requests_mtx); +
Re: vi(1): documenting :s
On Thu, Jun 15, 2017 at 02:14:46AM -0600, Anthony J. Bentley wrote: > Hi, > > From vi(1): > > [range] s[ubstitute] [/pattern/replace/] ??[options] [count] [flags] > [range] & [options] [count] [flags] > [range] ~ [options] [count] [flags] > Make substitutions. The replace field may contain any of the > following sequences: > (...snip...) > > There are a couple of issues here. > > First, the command is "s" and only "s"; trying "substitute" results in > an error. > > Additionally &, ~, and the options field remain unexplained. > > This diff tries to makes things better. ok? > ok by me. note that posix ex(1) does detail a working [s]ubstitute command, so i'm not sure whether we should support this or not. if you do make these changes, you may want to move the "s" text up a little, since these commands are sorted. jmc > Index: docs/USD.doc/vi.man/vi.1 > === > RCS file: /cvs/src/usr.bin/vi/docs/USD.doc/vi.man/vi.1,v > retrieving revision 1.67 > diff -u -p -r1.67 vi.1 > --- docs/USD.doc/vi.man/vi.1 15 Jun 2017 06:44:47 - 1.67 > +++ docs/USD.doc/vi.man/vi.1 15 Jun 2017 08:08:49 - > @@ -356,7 +356,7 @@ matches the end of the word. > .It > .Sq ~ > matches the replacement part of the last > -.Cm substitute > +.Cm s > command. > .El > .Sh BUFFERS > @@ -2019,7 +2019,7 @@ commands from a file. > .Pp > .It Xo > .Op Ar range > -.Cm s Ns Op Cm ubstitute > +.Cm s > .Sm off > .Op / Ar pattern No / Ar replace No / > .Sm on > @@ -2041,7 +2041,27 @@ commands from a file. > .Op Ar count > .Op Ar flags > .Xc > -Make substitutions. > +Substitute the regular expression > +.Ar pattern > +with > +.Ar replace . > +When invoked as > +.Cm & , > +or if > +.Bq Ns / Ns Ar pattern Ns / Ns Ar replace Ns / > +is omitted, > +.Ar pattern > +and > +.Ar replace > +from the most recent > +.Cm s > +command are used. > +.Cm ~ > +behaves like > +.Cm & , > +except the pattern used is the most recent regular expression used by any > +command. > +.Pp > The > .Ar replace > field may contain any of the following sequences: > @@ -2051,13 +2071,13 @@ The text matched by > .Ar pattern . > .It Sq \(a~ > The replacement part of the previous > -.Cm substitute > +.Cm s > command. > .It Sq % > If this is the entire > .Ar replace > pattern, the replacement part of the previous > -.Cm substitute > +.Cm s > command. > .It Sq \e# > Where > @@ -2082,6 +2102,18 @@ to be converted to uppercase. > Causes the next character to be converted to uppercase. > .El > .Pp > +The > +.Ar options > +field may contain any of the following characters: > +.Bl -tag -width Ds > +.It Sq c > +Prompt for confirmation before each replacement is done. > +.It Sq g > +Replace all instances of > +.Ar pattern > +in a line, not just the first. > +.El > +.Pp > .It Xo > .Cm su Ns Op Cm spend Ns > .Op Cm !\& > @@ -2291,7 +2323,9 @@ Remember the values of the > and > .Sq g > suffixes to the > -.Cm substitute > +.Cm s , & > +and > +.Cm ~ > commands, instead of initializing them as unset for each new command. > .It Cm escapetime Bq 1 > The tenths of a second >
vi(1): documenting :s
Hi, >From vi(1): [range] s[ubstitute] [/pattern/replace/] [options] [count] [flags] [range] & [options] [count] [flags] [range] ~ [options] [count] [flags] Make substitutions. The replace field may contain any of the following sequences: (...snip...) There are a couple of issues here. First, the command is "s" and only "s"; trying "substitute" results in an error. Additionally &, ~, and the options field remain unexplained. This diff tries to makes things better. ok? Index: docs/USD.doc/vi.man/vi.1 === RCS file: /cvs/src/usr.bin/vi/docs/USD.doc/vi.man/vi.1,v retrieving revision 1.67 diff -u -p -r1.67 vi.1 --- docs/USD.doc/vi.man/vi.115 Jun 2017 06:44:47 - 1.67 +++ docs/USD.doc/vi.man/vi.115 Jun 2017 08:08:49 - @@ -356,7 +356,7 @@ matches the end of the word. .It .Sq ~ matches the replacement part of the last -.Cm substitute +.Cm s command. .El .Sh BUFFERS @@ -2019,7 +2019,7 @@ commands from a file. .Pp .It Xo .Op Ar range -.Cm s Ns Op Cm ubstitute +.Cm s .Sm off .Op / Ar pattern No / Ar replace No / .Sm on @@ -2041,7 +2041,27 @@ commands from a file. .Op Ar count .Op Ar flags .Xc -Make substitutions. +Substitute the regular expression +.Ar pattern +with +.Ar replace . +When invoked as +.Cm & , +or if +.Bq Ns / Ns Ar pattern Ns / Ns Ar replace Ns / +is omitted, +.Ar pattern +and +.Ar replace +from the most recent +.Cm s +command are used. +.Cm ~ +behaves like +.Cm & , +except the pattern used is the most recent regular expression used by any +command. +.Pp The .Ar replace field may contain any of the following sequences: @@ -2051,13 +2071,13 @@ The text matched by .Ar pattern . .It Sq \(a~ The replacement part of the previous -.Cm substitute +.Cm s command. .It Sq % If this is the entire .Ar replace pattern, the replacement part of the previous -.Cm substitute +.Cm s command. .It Sq \e# Where @@ -2082,6 +2102,18 @@ to be converted to uppercase. Causes the next character to be converted to uppercase. .El .Pp +The +.Ar options +field may contain any of the following characters: +.Bl -tag -width Ds +.It Sq c +Prompt for confirmation before each replacement is done. +.It Sq g +Replace all instances of +.Ar pattern +in a line, not just the first. +.El +.Pp .It Xo .Cm su Ns Op Cm spend Ns .Op Cm !\& @@ -2291,7 +2323,9 @@ Remember the values of the and .Sq g suffixes to the -.Cm substitute +.Cm s , & +and +.Cm ~ commands, instead of initializing them as unset for each new command. .It Cm escapetime Bq 1 The tenths of a second
Re: sed(1): missing NUL in pattern space
On Tue, Jun 13, 2017 at 10:08:11AM +, kshe wrote: > On Sat, 10 Jun 2017 10:25:27 +, Otto Moerbeek wrote: > > Thanks for the analysis and diff, I hope to get a chanche to think > > about this soon. At least I'll make sure this diff is not forgotten, > > -Otto > > I have seen the tests that you recently added to the tree; in the mean > time, however, while I continued experimenting and reading the source, I > discovered more bugs (as expected), as well as other trivial issues > which might be worthy of note before committing any real fixes. > > First of all, my comprehension of the `c' command has slightly changed > since my last message: I now think its behaviour should be aligned with > that of `p', `P', `w' and others, all of which produce no output after > `c', treating a deleted pattern space as truly inexistent, and not > merely as an empty one. It would hence appear more coherent to add > > if (pd) > break; > > before calling lputs(), as is done in many other places. This would > then change the expected output of one of the newly added tests. > > In general though, that `c' command is a big mess. It is supposed to > delete the pattern space: in other implementations of sed that I have > tested, that means it will essentially behave like `i' followed by `d'. > In BSD's sed, however, it continues processing commands... Here's one > of the many bugs that arise because of that: > > $ echo abc | sed 'c\ > > text > > s/.*/x/ > > p > > s/.*/y/ > > p' > text > x > > So why isn't `y' printed too? In comparison, using flags instead of > standalone commands: > > $ echo abc | sed 'c\ > > text > > s/.*/x/p > > s/.*/y/p' > text > x > y > > Here, we do get a `y', but only one (there should be two given that `-n' > has not been supplied to suppress normal output). Since at least `p' as > a flag given to `s' appears to work, let us try the same with `w': > > $ echo abc | sed 'c\ > > text > > s/.*/x/w /tmp/sed.out > > s/.*/y/w /tmp/sed.out' > text > $ cat /tmp/sed.out > x > > But this time it doesn't work. Now add a third `s' command... > > $ echo abc | sed 'c\ > > text > > s/.*/x/w /tmp/sed.out > > s/.*/y/w /tmp/sed.out > > s/.*/z/w /tmp/sed.out' > text > z > $ cat /tmp/sed.out > x > z > > And it works. Well, not really: we do see a `z', but still no `y'. > Indeed, this does not make any sense. > > The reason for this strange behaviour is nevertheless easy to figure out > when looking at how `s' is implemented, keeping in mind that the > substitute space is globally defined. In any case, even if one was to > fix that particular bug, I would probably be able to find more. This is > why I believe that the only sane solution is that, just like `d', `c' > should simply > > goto new; > > which would, at the same time, obviate the need for the `deleted' field > of the SPACE structure, subsequently allowing for substantial cleanups. > > Also, aside of that `c' mess, there is still the fact that NUL bytes > originally present in the input are not well handled at all. Of course, > `l' cannot print past them (whereas `p' can), which is a shame; but > significantly worse is the inconsistent behaviour of `s' in that case. > At first, it seems to work well (in the output, `^@' represents a NUL): > > $ printf 'x\0y' | sed 's/x/_/' > _^@y > > It can even handle some empty matches perfectly: > > $ printf 'x\0y' | sed 's/[[:<:]]/_/' # pattern space survives > _x^@y > > But not all of them: > > $ printf 'x\0y' | sed 's/[[:>:]]/_/' # pattern space is cut off > x_ > $ printf 'x\0y' | sed 's/[[:>:]]/_/2' # likewise > x > > One might argue that NUL bytes are not supposed to happen in textual > input, so that these examples are irrelevant; the real problem however > is that it also affects newlines, thus perfectly valid text: > > $ printf 'x\ny' | sed 'N;s/[[:<:]]/_/' > _x > y > $ printf 'x\ny' | sed 'N;s/[[:>:]]/_/' > x_ > $ printf 'x\ny' | sed 'N;s/[[:>:]]/_/2' > x > > So maybe substitute() should test for (slen == 0) instead of the > questionable (*s == '\0' || *s == '\n'). That would seem more correct > to me, but this is only a guess since I must admit that I don't fully > understand that part of the code yet. Here is a diff, for instance, > that fixes all the above issues. > > --- a/src/usr.bin/sed/process.c Wed Jun 7 09:56:20 2017 > +++ b/src/usr.bin/sed/process.c Tue Jun 13 07:21:59 2017 > @@ -387,14 +388,11 @@ substitute(struct s_command *cp) >* and at the end of the line, terminate. >*/ > if (match[0].rm_so == match[0].rm_eo) { > - if (*s == '\0' || *s == '\n') > - slen