Don't check `so_pcb' with PR_WANTRCVD flag

2022-08-27 Thread Vitaliy Makkoveev
tcp(4) sockets are the only sockets which could have NULL `so_pcb' and we handle this case within tcp_rcvd() handler. Index: sys/kern/uipc_socket.c === RCS file: /cvs/src/sys/kern/uipc_socket.c,v retrieving revision 1.285 diff -u -p -

move PRU_ABORT request to (*pru_abort)()

2022-08-27 Thread Vitaliy Makkoveev
PRU_ABORT is another candidate to change return type to void. Also actually we abort only the sockets which are linked to `so_q' or `so_q0' queues of listening socket. Such sockets have no corresponding file descriptor and are not accessed from userland, so PRU_ABORT used to destroy them on listeni

Re: remove pr_output

2022-08-28 Thread Vitaliy Makkoveev
> On 28 Aug 2022, at 20:48, Alexander Bluhm wrote: > > Hi, > > Since we have no raw_usrreq anymore, we can retire pr_output. > pfkeyv2 and route can call their output functions directly. > > ok? > ok mvs@ > bluhm > > Index: net/pfkeyv2.c >

move PRU_SENSE request to (*pru_sense)()

2022-08-28 Thread Vitaliy Makkoveev
Another candidate for future refactoring. Except the tcp(4) and unix(4) cases we do nothing with passed `ub', but in all cases we return no error. Index: sys/kern/uipc_usrreq.c === RCS file: /cvs/src/sys/kern/uipc_usrreq.c,v retrievin

Re: move PRU_SENSE request to (*pru_sense)()

2022-08-28 Thread Vitaliy Makkoveev
On Sun, Aug 28, 2022 at 10:42:11PM +0200, Alexander Bluhm wrote: > On Sun, Aug 28, 2022 at 10:51:31PM +0300, Vitaliy Makkoveev wrote: > > Another candidate for future refactoring. Except the tcp(4) and unix(4) > > cases we do nothing with passed `ub', but in all cases we

move PRU_RCVOOB request to (*pru_rcvoob)()

2022-08-28 Thread Vitaliy Makkoveev
Index: sys/kern/uipc_usrreq.c === RCS file: /cvs/src/sys/kern/uipc_usrreq.c,v retrieving revision 1.178 diff -u -p -r1.178 uipc_usrreq.c --- sys/kern/uipc_usrreq.c 28 Aug 2022 21:35:11 - 1.178 +++ sys/kern/uipc_usrreq.c

move PRU_SENDOOB request to (*pru_sendoob)()

2022-08-29 Thread Vitaliy Makkoveev
PRU_SENDOOB request always consumes passed `top' and `control' mbufs. We don't want to have dummy m_freem(9) handlers for all protocols, so we release passed mbufs in the pru_sendoob() EOPNOTSUPP error path. Also we had the `control' mbuf(9) leak in the tcp(4) PRU_SENDOOB error path, which was fix
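
The "always consumes" rule can be pictured with a small userland sketch. This is not the kernel code: `oob_sock`, `oob_ops`, `oob_send()` and `stream_sendoob()` are hypothetical names, and plain malloc(3)/free(3) buffers stand in for mbufs and m_freem(9). The only point is that the wrapper releases both buffers itself when the protocol has no handler, so protocols without out-of-band support need no dummy handler.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

struct oob_sock;

/* Hypothetical per-protocol handler table; sendoob may be NULL. */
struct oob_ops {
	int	(*sendoob)(struct oob_sock *, void *top, void *control);
};

struct oob_sock {
	const struct oob_ops	*ops;
	int			 sent;	/* number of handled requests */
};

/* A handler owns both buffers and must release them on every path. */
static int
stream_sendoob(struct oob_sock *so, void *top, void *control)
{
	so->sent++;
	free(top);
	free(control);
	return 0;
}

/*
 * Wrapper: the caller gives up ownership of `top' and `control' no
 * matter what, so the unsupported path frees them here.
 */
static int
oob_send(struct oob_sock *so, void *top, void *control)
{
	if (so->ops->sendoob == NULL) {
		free(top);
		free(control);
		return EOPNOTSUPP;
	}
	return (*so->ops->sendoob)(so, top, control);
}
```

With this shape the caller never frees the buffers after the call, whatever the return value.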

Re: udp pcb mutex

2022-08-29 Thread Vitaliy Makkoveev
> On 29 Aug 2022, at 20:34, Alexander Bluhm wrote: > > Hi, > > The diff below is needed to protect the receive socket buffer in > UDP input with per PCB mutex. > > With that, parallel UDP input and soreceive can be activated. There > are still issues with socket splicing and maybe pipex. So I

Re: refactor pcb lookup

2022-08-29 Thread Vitaliy Makkoveev
Looks good to me. > On 29 Aug 2022, at 14:15, Alexander Bluhm wrote: > > Anyone? > > On Sat, Aug 20, 2022 at 03:24:28PM +0200, Alexander Bluhm wrote: >> Hi, >> >> Can we rename the function in_pcbhashlookup() to in_pcblookup()? >> Then we have in_pcblookup() and in_pcblookup_listen() as pu

Re: pipex syzkaller keylen

2022-08-30 Thread Vitaliy Makkoveev
On Tue, Aug 30, 2022 at 03:41:29PM +0200, Alexander Bluhm wrote: > Hi, > > It looks like syzkaller has found a missing input validation in pipex. > > https://syzkaller.appspot.com/bug?id=c7ac769bd7ee15549b8a2be188bcee07d98a5357 > > As I have no pipex setup, can anyone test this diff please? > o

move PRU_CONNECT2 request to (*pru_connect2)()

2022-08-31 Thread Vitaliy Makkoveev
Index: sys/kern/uipc_usrreq.c === RCS file: /cvs/src/sys/kern/uipc_usrreq.c,v retrieving revision 1.181 diff -u -p -r1.181 uipc_usrreq.c --- sys/kern/uipc_usrreq.c 31 Aug 2022 21:23:02 - 1.181 +++ sys/kern/uipc_usrreq.c

Re: move PRU_CONNECT2 request to (*pru_connect2)()

2022-09-01 Thread Vitaliy Makkoveev
On Thu, Sep 01, 2022 at 05:59:44PM +0200, Alexander Bluhm wrote: > On Thu, Sep 01, 2022 at 01:27:18AM +0300, Vitaliy Makkoveev wrote: > > +int > > +uipc_connect2(struct socket *so, struct socket *so2) > > +{ > > + struct unpcb *unp = sotounpcb(so), *unp2 = sotoun

move PRU_CONTROL request to (*pru_control)()

2022-09-01 Thread Vitaliy Makkoveev
The 'proc *' is not used for PRU_CONTROL request, so remove it from pru_control() wrapper. I want to use existing in{6,}_control for tcp(4) and udp(4) sockets, so for inet6 case I introduced `tcp6_usrreqs' and `udp6_usrreqs' structures. I also want to use them for the following PRU_SOCKADDR and PR

Re: protocol attach wait

2022-09-01 Thread Vitaliy Makkoveev
On Thu, Sep 01, 2022 at 09:00:50PM +0200, Alexander Bluhm wrote: > On Mon, Aug 15, 2022 at 05:12:22PM +0200, Alexander Bluhm wrote: > > System calls should not fail due to temporary memory shortage in > > malloc(9) or pool_get(9). > > > > Pass down a wait flag to pru_attach(). During syscall sock

Re: protocol attach wait

2022-09-01 Thread Vitaliy Makkoveev
On Thu, Sep 01, 2022 at 10:58:49PM +0300, Vitaliy Makkoveev wrote: > On Thu, Sep 01, 2022 at 09:00:50PM +0200, Alexander Bluhm wrote: > > On Mon, Aug 15, 2022 at 05:12:22PM +0200, Alexander Bluhm wrote: > > > System calls should not fail due to temporary memory shortage in

Move PRU_SOCKADDR and PRU_PEERADDR requests to corresponding pru_wrappers

2022-09-02 Thread Vitaliy Makkoveev
Introduce in{,6}_sockaddr() and in{,6}_peeraddr() functions, and use them for all except tcp(4) sockets. Use tcp_sockaddr() and tcp_peeraddr() functions to keep debug ability. The key management and route domain sockets return EINVAL error for PRU_SOCKADDR request, so keep this behaviour for a wh

Remove "#ifdef INET6" block from in_setpeeraddr()

2022-09-02 Thread Vitaliy Makkoveev
We always call in6_setpeeraddr() and never call in_setpeeraddr() for the inet6 case. Index: sys/netinet/in_pcb.c === RCS file: /cvs/src/sys/netinet/in_pcb.c,v retrieving revision 1.273 diff -u -p -r1.273 in_pcb.c --- sys/netinet/in_pc

Re: Move PRU_SOCKADDR and PRU_PEERADDR requests to corresponding pru_wrappers

2022-09-02 Thread Vitaliy Makkoveev
On Fri, Sep 02, 2022 at 10:48:56PM +0200, Alexander Bluhm wrote: > On Fri, Sep 02, 2022 at 05:56:33PM +0300, Vitaliy Makkoveev wrote: > > Introduce in{,6}_sockaddr() and in{,6}_peeraddr() functions, and use > > them for all except tcp(4) sockets. Use tcp_sockaddr() and > > tcp

Re: Remove "#ifdef INET6" block from in_setpeeraddr()

2022-09-02 Thread Vitaliy Makkoveev
On Fri, Sep 02, 2022 at 10:46:12PM +0200, Alexander Bluhm wrote: > On Fri, Sep 02, 2022 at 06:01:20PM +0300, Vitaliy Makkoveev wrote: > > We always call in6_setpeeraddr() and never call in_setpeeraddr() for the > > inet6 case. > > Should we do it the other way around? >

Re: protocol attach wait

2022-09-02 Thread Vitaliy Makkoveev
On Fri, Sep 02, 2022 at 11:56:02AM +0200, Alexander Bluhm wrote: > On Thu, Sep 01, 2022 at 11:04:19PM +0300, Vitaliy Makkoveev wrote: > > On Thu, Sep 01, 2022 at 10:58:49PM +0300, Vitaliy Makkoveev wrote: > > > On Thu, Sep 01, 2022 at 09:00:50PM +0200, Alexander Bluhm wrote: >

move PRU_PEERADDR request to (*pru_peeraddr)()

2022-09-03 Thread Vitaliy Makkoveev
The last one. As for the previous PRU_SOCKADDR, introduce in{,6}_peeraddr() and use it for inet and inet6 sockets, except tcp(4). Also remove *_usrreq() handlers. This makes diff bigger, but I guess we don't want to commit code like below: rip_usrreq(struct socket *so, int req, struct mbuf *m, st

Re: soreceive with shared netlock

2022-09-03 Thread Vitaliy Makkoveev
> On 4 Sep 2022, at 00:56, Alexander Bluhm wrote: > > On Sat, Sep 03, 2022 at 11:20:17PM +0200, Hrvoje Popovski wrote: >> with this diff while booting I'm getting this witness trace > > It is not related to soreceive() diff, but TCP diff I committed > before. I forgot a mutex initialization which

Re: soreceive with shared netlock

2022-09-03 Thread Vitaliy Makkoveev
> On 3 Sep 2022, at 23:47, Alexander Bluhm wrote: > > Hi, > > The next small step towards parallel network stack is to use shared > netlock in soreceive(). The UDP and IP divert layer provide locking > of the PCB. If that is possible, use shared instead of exclusive > netlock in soreceive().

Change pru_rcvd() return type to the type of void

2022-09-10 Thread Vitaliy Makkoveev
We have no interest in the pru_rcvd() return value. Also, we call pru_rcvd() only if the socket's protocol has the PR_WANTRCVD flag set. Such sockets are route domain, tcp(4) and unix(4) sockets. This diff keeps the PR_WANTRCVD check. On the other hand we could always call pru_rcvd() and do "pru_rcvd != NULL

soreceive() with shared netlock for raw sockets

2022-09-10 Thread Vitaliy Makkoveev
As it was done for udp and divert sockets. Index: sys/netinet/ip_var.h === RCS file: /cvs/src/sys/netinet/ip_var.h,v retrieving revision 1.104 diff -u -p -r1.104 ip_var.h --- sys/netinet/ip_var.h3 Sep 2022 22:43:38 -

Change pru_abort() return type to the type of void and make pru_abort optional

2022-09-17 Thread Vitaliy Makkoveev
We have no interest in the pru_abort() return value. Also we call it only through soabort(), which is a dummy pru_abort() wrapper and has no return value. Also only the connection oriented sockets need to implement the (*pru_abort)() handler. Such sockets are tcp(4) and unix(4) sockets, so we could remove ex
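
A minimal sketch of what an optional handler with a void return type looks like, assuming a NULL-checked function pointer. The names `sock_sketch`, `usrreqs_sketch`, `stream_abort()` and `so_abort_sketch()` are hypothetical and only illustrate the pattern; they are not the structures or functions from the diff.

```c
#include <stddef.h>

struct sock_sketch;

/* Hypothetical handler table; `abort' may be left NULL. */
struct usrreqs_sketch {
	void	(*abort)(struct sock_sketch *);
};

struct sock_sketch {
	const struct usrreqs_sketch	*reqs;
	int				 aborted;
};

/* Handler for a connection oriented protocol. */
static void
stream_abort(struct sock_sketch *so)
{
	so->aborted = 1;
}

/* Wrapper: nothing to return, and a missing handler is simply skipped. */
static void
so_abort_sketch(struct sock_sketch *so)
{
	if (so->reqs->abort != NULL)
		(*so->reqs->abort)(so);
}
```

Protocols that never sit on a listening socket's queue can then leave the pointer unset instead of carrying an empty handler.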

Re: Change pru_abort() return type to the type of void and make pru_abort optional

2022-10-06 Thread Vitaliy Makkoveev
ping > On 17 Sep 2022, at 22:44, Vitaliy Makkoveev wrote: > > We have no interest on pru_abort() return value. Also we call it only > through soabort() which is dummy pru_abort() wrapper and has no return > value. > > Also only the connection oriented sockets need to im

Re: loop_clone_destroy: read ifnetlist/rdomain with shared net lock

2022-10-21 Thread Vitaliy Makkoveev
On Fri, Oct 21, 2022 at 10:28:04AM +, Klemens Nanni wrote: > All interface ioctls always run with the kernel lock anyway, so this > doesn't make a difference, except that it reflects how ifnetlist is not > modified. > > OK? > netlock could be completely dropped here. > diff --git a/sys/net/

Re: loop_clone_destroy: read ifnetlist/rdomain with shared net lock

2022-10-21 Thread Vitaliy Makkoveev
On Fri, Oct 21, 2022 at 12:14:16PM +, Klemens Nanni wrote: > On Fri, Oct 21, 2022 at 03:02:33PM +0300, Vitaliy Makkoveev wrote: > > netlock could be completely dropped here. > > We could probably drop the net lock around ifnetlist wherever the kernel > lock is currently he

pppx(4): decrease netlock pressure in pppxioctl()

2022-11-01 Thread Vitaliy Makkoveev
Push netlock down to pppx_add_session(). The 'pppx_if' structure has the `pxi_ready' member to prevent access to incomplete `pxi', so we don't need to hold netlock during the whole initialisation process. This removes potential PR_WAITOK/M_WAITOK allocations impact on packet processing. Also this removes

pflow(4): make `so' dereference safe

2022-11-04 Thread Vitaliy Makkoveev
Each pflow(4) interface has associated socket, referenced as sc->so. We set this socket in pflowioctl() which is called with both kernel and net locks held. In the pflow_output_process() task we do sc->so dereference, which is protected by kernel lock. But the sosend(), called deeper by pflow_outpu

Split `uipc_dgram_usrreqs' out from `uipc_usrreqs'

2022-11-05 Thread Vitaliy Makkoveev
guenther@ proposed to split out handlers for SOCK_DGRAM unix(4) sockets from SOCK_STREAM and SOCK_SEQPACKET. The diff below introduces `uipc_dgram_usrreqs' to store pointers to dgram specific handlers. The dgram pru_shutdown and pru_send handlers were split into uipc_dgram_shutdown() and uipc_dg

Re: push kernel lock inside ifioctl_get()

2022-11-08 Thread Vitaliy Makkoveev
The `if_cloners' list is immutable. You don't need kernel lock around if_clone_list() call. > case SIOCIFGCLONERS: > + KERNEL_LOCK(); > error = if_clone_list((struct if_clonereq *)data); > + KERNEL_UNLOCK(); > return (error); With this fi

Re: push kernel lock inside ifioctl_get()

2022-11-08 Thread Vitaliy Makkoveev
No reason to keep kernel lock around if_clone_list() call. > On 8 Nov 2022, at 20:27, Klemens Nanni wrote: > > On Tue, Nov 08, 2022 at 08:04:16PM +0300, Vitaliy Makkoveev wrote: >> The `if_cloners' list is immutable. You don't need kernel lock >> around if_clone_lis

Re: Document ifc_list immutability

2022-11-08 Thread Vitaliy Makkoveev
> On 8 Nov 2022, at 21:08, Klemens Nanni wrote: > > Now properly. How about a single comment block at the top instead of > repeating it for every struct? > > You forgot to mark [I] `if_cloners' within net/if.c. > diff --git a/sys/net/if_var.h b/sys/net/if_var.h > index 28514a0bfcd..6272be882

Re: Document ifc_list immutability

2022-11-08 Thread Vitaliy Makkoveev
> On 8 Nov 2022, at 21:26, Klemens Nanni wrote: > > On Tue, Nov 08, 2022 at 09:18:47PM +0300, Vitaliy Makkoveev wrote: >>> On 8 Nov 2022, at 21:08, Klemens Nanni wrote: >>> >>> Now properly. How about a single comment block at the top instead

Re: Document ifc_list immutability

2022-11-08 Thread Vitaliy Makkoveev
ok > On 8 Nov 2022, at 21:38, Klemens Nanni wrote: > > On Tue, Nov 08, 2022 at 09:34:36PM +0300, Vitaliy Makkoveev wrote: >>> On 8 Nov 2022, at 21:26, Klemens Nanni wrote: >>> >>> On Tue, Nov 08, 2022 at 09:18:47PM +0300, Vitaliy Makkoveev wrote: >>

Re: pppx(4): decrease netlock pressure in pppxioctl()

2022-11-09 Thread Vitaliy Makkoveev
ping... On Tue, Nov 01, 2022 at 03:16:02PM +0300, Vitaliy Makkoveev wrote: > Push netlock down to pppx_add_session(). The 'pppx_if' structure has > the `pxi_ready' member to prevent access to incomplete `pxi', so we > don't need to hold netlock during all init

Re: pflow(4): make `so' dereference safe

2022-11-10 Thread Vitaliy Makkoveev
ping... On Fri, Nov 04, 2022 at 10:04:35PM +0300, Vitaliy Makkoveev wrote: > Each pflow(4) interface has associated socket, referenced as sc->so. We > set this socket in pflowioctl() which is called with both kernel and net > locks held. In the pflow_output_process() task we do sc->

Merge uipc_bind() with unp_bind()

2022-11-14 Thread Vitaliy Makkoveev
uipc_bind() only calls unp_bind(). Also it is the only caller of unp_bind(). Index: sys/kern/uipc_usrreq.c === RCS file: /cvs/src/sys/kern/uipc_usrreq.c,v retrieving revision 1.192 diff -u -p -r1.192 uipc_usrreq.c --- sys/kern/uipc_us

Re: Merge uipc_bind() with unp_bind()

2022-11-14 Thread Vitaliy Makkoveev
On Mon, Nov 14, 2022 at 09:28:34AM +, Klemens Nanni wrote: > On Mon, Nov 14, 2022 at 12:11:46PM +0300, Vitaliy Makkoveev wrote: > > uipc_bind() only calls unp_bind(). Also it is the only caller of > > unp_bind(). > > For *_bind() alone this looks like zapping a useless i

Turn sowriteable(), sballoc() and sbfree() macro to inline functions

2022-11-14 Thread Vitaliy Makkoveev
We already have soreadable() as an inline function, but the corresponding sowriteable() is still a macro. Also there's no reason to keep sballoc() and sbfree() as macros. Index: sys/sys/protosw.h === RCS file: /cvs/src/sys/sys/protosw.h,
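
The general macro-to-inline conversion can be sketched as follows. The structure and both helpers are hypothetical stand-ins, not the real sowriteable(), sballoc() or sbfree() definitions; they only show that the inline form keeps the same arithmetic while gaining a typed argument that is evaluated once.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a socket buffer. */
struct sockbuf_sketch {
	size_t	cc;	/* bytes currently stored */
	size_t	hiwat;	/* capacity limit */
};

/* Macro form: no type checking, `sb' is expanded several times. */
#define SB_SPACE_MACRO(sb) \
	((sb)->hiwat > (sb)->cc ? (sb)->hiwat - (sb)->cc : 0)

/* Inline form: same result, typed argument, evaluated exactly once. */
static inline size_t
sb_space(const struct sockbuf_sketch *sb)
{
	return sb->hiwat > sb->cc ? sb->hiwat - sb->cc : 0;
}
```

Passing the wrong pointer type to sb_space() is diagnosed by the compiler, while the macro would accept any expression that happens to have `hiwat' and `cc' members.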

Re: Turn sowriteable(), sballoc() and sbfree() macro to inline functions

2022-11-14 Thread Vitaliy Makkoveev
On Mon, Nov 14, 2022 at 12:00:28PM +, Klemens Nanni wrote: > On Mon, Nov 14, 2022 at 02:14:27PM +0300, Vitaliy Makkoveev wrote: > > We have soreadable() already presented as inline function, but > > corresponding sowriteable() is still macro. Also it's no reason to ke

Re: Push kernel lock into pru_control()

2022-11-14 Thread Vitaliy Makkoveev
> On 10 Nov 2022, at 13:54, Klemens Nanni wrote: > > Purely mechanical, then in6_control() and in_control() can be pushed > further individually. > > Feedback? OK? SS_PRIV is immutable, no reason to check it with kernel lock held. > --- > sys/kern/sys_socket.c | 2 -- > sys/netinet/in.c |

The next step of netlock removal from pppx(4)

2022-11-20 Thread Vitaliy Makkoveev
Kernel lock is always taken when we access the `pxd_pxis' lists and `pppx_ifs' tree, so rely on it instead of netlock. The search in `pppx_ifs' tree has no context switch. We also have no context switch between the `pxi' free unit search and tree insertion. Use reference counters to make `pxi' d

Re: lladdr support for netstart/hostname.if

2022-11-22 Thread Vitaliy Makkoveev
On Tue, Nov 22, 2022 at 11:25:55AM +0100, Claudio Jeker wrote: > On Tue, Nov 22, 2022 at 09:25:08AM +, Stuart Henderson wrote: > > Need to query (and set $if, which might be used in route commands etc) I > > think. > > > > I would prefer if people took a step back from configuring interfaces

Re: lladdr support for netstart/hostname.if

2022-11-22 Thread Vitaliy Makkoveev
On Tue, Nov 22, 2022 at 06:28:31AM -0700, Theo de Raadt wrote: > Vitaliy Makkoveev wrote: > > > On Tue, Nov 22, 2022 at 11:25:55AM +0100, Claudio Jeker wrote: > > > On Tue, Nov 22, 2022 at 09:25:08AM +, Stuart Henderson wrote: > > > > Need to query (and set

Re: Push kernel lock into in6_ioctl()

2022-11-23 Thread Vitaliy Makkoveev
On Wed, Nov 23, 2022 at 08:46:41AM +, Klemens Nanni wrote: > Mechanical move that "unlocks" the errno(2) cases. > > This is another step towards more read-only interface ioctls running > with the shared net lock alone. > > Feedback? OK? > Could this be merged with the following non "Mec

Re: Remove unused struct ifnet's *if_afdata[] and struct domain's dom_if{at,de}tach()

2022-11-23 Thread Vitaliy Makkoveev
On Wed, Nov 23, 2022 at 11:09:31AM +, Klemens Nanni wrote: > On Wed, Nov 23, 2022 at 11:04:55AM +, Klemens Nanni wrote: > > > I don't mind them to be two commits but please share both of them at the > > > same time. Because they should hit the tree at the same time. Changing > > > header fi

Re: Inline useless ND_IFINFO() macro

2022-11-23 Thread Vitaliy Makkoveev
On Wed, Nov 23, 2022 at 02:56:27PM +, Klemens Nanni wrote: > A single cast-free struct pointer dereference needs no indirection. > ND_IFINFO() is under _KERNEL. > > OK? > ok mvs@ > diff --git a/sys/netinet6/nd6.c b/sys/netinet6/nd6.c > index 1924c36c813..d6ccfd3a272 100644 > --- a/sys/netin

Re: Let nd6_if{at,de}tach() be void and take an ifp argument

2022-11-23 Thread Vitaliy Makkoveev
On Wed, Nov 23, 2022 at 02:54:08PM +, Klemens Nanni wrote: > Do it like the rest of at/detach routines which modify a struct ifnet > pointer without returning anything. > > OK? > ok mvs@ > diff --git a/sys/net/if.c b/sys/net/if.c > index c30d7e30e4f..3cb8bbf9176 100644 > --- a/sys/net/if.c

Re: splassert on boot

2022-11-23 Thread Vitaliy Makkoveev
On Wed, Nov 23, 2022 at 02:59:05PM -0500, David Hill wrote: > Hello - > > I am seeing splasserts on boot (before kern.splassert=2 can be set) with > -current. > > > > spdmem0 at iic0 addr 0x50: 8GB DDR3 SDRAM PC3-12800 SO-DIMM > isa0 at pcib0 > isadma0 at isa0 > vga0 at isa0 port 0x3b0/48 i

Re: Let nd6_if{at,de}tach() be void and take an ifp argument

2022-11-23 Thread Vitaliy Makkoveev
> On 24 Nov 2022, at 00:47, Claudio Jeker wrote: > > On Wed, Nov 23, 2022 at 02:54:08PM +, Klemens Nanni wrote: >> Do it like the rest of at/detach routines which modify a struct ifnet >> pointer without returning anything. >> >> OK? >> >> diff --git a/sys/net/if.c b/sys/net/if.c >> index c

Re: lladdr support for netstart/hostname.if

2022-11-24 Thread Vitaliy Makkoveev
On Wed, Nov 23, 2022 at 09:36:28PM -0700, Theo de Raadt wrote: > Theo de Raadt wrote: > > > > The other, that if both exist, > > > /etc/hostname.$if will override /etc/hostname.$lladdr. > > > > We do need to decide which one is priority, and document that. > > > > I am still unsure which is bet

Re: lladdr support for netstart/hostname.if

2022-11-24 Thread Vitaliy Makkoveev
On Thu, Nov 24, 2022 at 01:34:30PM +, Stuart Henderson wrote: > On 2022/11/24 14:36, Vitaliy Makkoveev wrote: > > On Wed, Nov 23, 2022 at 09:36:28PM -0700, Theo de Raadt wrote: > > > Theo de Raadt wrote: > > > > > > > > The other, that if both exist,

Re: lladdr support for netstart/hostname.if

2022-11-24 Thread Vitaliy Makkoveev
> On 24 Nov 2022, at 19:20, Theo de Raadt wrote: > >> I like to exclude pseudo devices. The pseudo device list is immutable, >> so we need to get only once during /etc/netstart. > > Why do we need to exclude them? > > The users will learn when to use this, and when not to. > So, I can't use h

Re: pfioctl: drop net lock from DIOCGETIFACES, DIOC{SET,CLR}IFFLAG

2023-05-26 Thread Vitaliy Makkoveev
On Fri, May 26, 2023 at 01:03:13PM +, Klemens Nanni wrote: > snmpd(8) and 'pfctl -s Interfaces' dump pf's internal list of interfaces. > > pf.conf's 'set skip on ifN' and 'pfctl -F all|Reset' set and clear flags, > PFI_IFLAG_SKIP being the only flag. > > (There's no other usage of these ioctl

Re: Relax netlock to shared netlock and push it down to mrt_sysctl_mfc()

2023-05-26 Thread Vitaliy Makkoveev
On Wed, May 17, 2023 at 01:02:58PM +0300, Vitaliy Makkoveev wrote: > mrt_rtwalk_mfcsysctl() performs read-only access to protected data, so > rtable_walk() could be called with shared netlock. > Regardless on sysctl(2) unlocking backout, the netlock around mrt_sysctl_mfc() could be r

Re: Relax netlock to shared netlock and push it down to mrt_sysctl_vif()

2023-05-26 Thread Vitaliy Makkoveev
On Wed, May 17, 2023 at 01:08:52PM +0300, Vitaliy Makkoveev wrote: > Also read-only access to netlock protected data. > Regardless on sysctl(2) unlocking backout, the netlock around mrt_sysctl_vif() could be relaxed to shared netlock. > Index: sys/netinet/i

Re: Relax netlock to shared netlock and push it down to mrt_sysctl_mfc()

2023-05-26 Thread Vitaliy Makkoveev
On Fri, May 26, 2023 at 05:08:06PM +0200, Alexander Bluhm wrote: > On Fri, May 26, 2023 at 05:29:58PM +0300, Vitaliy Makkoveev wrote: > > On Wed, May 17, 2023 at 01:02:58PM +0300, Vitaliy Makkoveev wrote: > > > mrt_rtwalk_mfcsysctl() performs read-only access to

Re: ifconfig rename tcplro

2023-06-06 Thread Vitaliy Makkoveev
On Tue, Jun 06, 2023 at 02:31:52PM +0200, Alexander Bluhm wrote: > Hi, > > I would suggest to rename ifconfig tcprecvoffload to tcplro. Maybe > it's just because I had to type that long name too often. > > With that we have consistent naming: > # ifconfig ix0 tcplro > # sysctl net.inet.tcp.tso=1

Re: ifconfig rename tcplro

2023-06-06 Thread Vitaliy Makkoveev
> On 6 Jun 2023, at 19:37, Chris Cappuccio wrote: > > Jan Klemkow [j.klem...@wemelug.de] wrote: >> On Tue, Jun 06, 2023 at 05:54:31PM +0300, Vitaliy Makkoveev wrote: >>> On Tue, Jun 06, 2023 at 02:31:52PM +0200, Alexander Bluhm wrote: >>>> I would suggest

Re: ifconfig rename tcplro

2023-06-06 Thread Vitaliy Makkoveev
> On 6 Jun 2023, at 20:29, Alexander Bluhm wrote: > > On Tue, Jun 06, 2023 at 05:54:31PM +0300, Vitaliy Makkoveev wrote: >> On Tue, Jun 06, 2023 at 02:31:52PM +0200, Alexander Bluhm wrote: >>> Hi, >>> >>> I would suggest to rename ifconfig tcprecvofflo

Re: ifconfig rename tcplro

2023-06-07 Thread Vitaliy Makkoveev
On Wed, Jun 07, 2023 at 10:19:32AM +1000, David Gwynne wrote: > > > > On 7 Jun 2023, at 06:33, Vitaliy Makkoveev wrote: > > > >> On 6 Jun 2023, at 20:29, Alexander Bluhm wrote: > >> > >> On Tue, Jun 06, 2023 at 05:54:31PM +0300, Vitaliy Makkove

if_detach(): move nd6_ifdetach() out of netlock

2023-06-07 Thread Vitaliy Makkoveev
At this point, the interface is disconnected from everywhere. No need to hold netlock for dummy 'nd_ifinfo' release. Netlock is also not needed for TAILQ_EMPTY(&ifp->if_*hooks) assertions. Index: sys/net/if.c === RCS file: /cvs/src/sy

Kill if_detached_ioctl()

2023-06-07 Thread Vitaliy Makkoveev
At this point the interface is already removed from the list of all interfaces and from the interface index map, and all possible concurrent ioctl() threads have finished. Remove this dead code. Index: sys/net/if.c === RCS file: /cvs/src/s

Re: ifconfig rename tcplro

2023-06-07 Thread Vitaliy Makkoveev
On Wed, Jun 07, 2023 at 01:29:09PM +0200, Alexander Bluhm wrote: > On Wed, Jun 07, 2023 at 12:59:11PM +0300, Vitaliy Makkoveev wrote: > > On Wed, Jun 07, 2023 at 10:19:32AM +1000, David Gwynne wrote: > > > > > > > > > > On 7 Jun 2023, at 06:33, Vitaliy M

Re: inpcb sip hash mutex contention

2023-06-24 Thread Vitaliy Makkoveev
> On 22 Jun 2023, at 22:50, Alexander Bluhm wrote: > > Hi, > > I am working on a diff to run UDP input in parallel. Btrace kstack > analysis shows that SIP hash for PCB lookup is quite expensive. > When running in parallel we get lock contention on the PCB table > mutex. > > So it results in b

Introduce M_IFGROUP type of memory allocation

2023-06-27 Thread Vitaliy Makkoveev
M_TEMP seems unreasonable for interface groups data allocations. Don't forget to recompile systat(1) and vmstat(8) with new sys/malloc.h. Index: sys/net/if.c === RCS file: /cvs/src/sys/net/if.c,v retrieving revision 1.700 diff -u -p

Re: Introduce M_IFGROUP type of memory allocation

2023-06-27 Thread Vitaliy Makkoveev
On Tue, Jun 27, 2023 at 11:09:32AM +, Klemens Nanni wrote: > On Tue, Jun 27, 2023 at 01:32:37PM +0300, Vitaliy Makkoveev wrote: > > M_TEMP seems unreasonable for interface groups data allocations. > > After claudio pointed out the wrong type, I thought of the same name, >

Re: sec(4): route based ipsec vpns

2023-07-04 Thread Vitaliy Makkoveev
On Tue, Jul 04, 2023 at 03:26:30PM +1000, David Gwynne wrote: > tl;dr: this adds sec(4) p2p ip interfaces. Traffic in and out of these > interfaces is protected by IPsec security associations (SAs), but > there's no flows (security policy database (SPD) entries) associated > with these SAs. The pol

Re: tso lo keep mss

2023-07-07 Thread Vitaliy Makkoveev
On Fri, Jul 07, 2023 at 11:48:13AM +0300, Alexander Bluhm wrote: > Hi, > > When we preserve M_TCP_TSO we also must keep ph_mss. In lo(4) > LRO/TSO this logic was missing. As this may be relevant only for > weird pf configs that forward from loopback and ifconfig tcplro is > disabled by default,

Re: Use u_long for struct mstat

2023-07-07 Thread Vitaliy Makkoveev
On Fri, Jul 07, 2023 at 12:31:02PM +0300, YASUOKA Masahiko wrote: > Hi, > > I'd like to expand the counters in struct mbstat from u_short to u_long. > > When I was debugging a mbuf leak, I saw the result of "netstat -m" > --- > 28647 mbufs in use: > 28551 mbufs allocated to data >

Re: route leak nd6 detach

2023-07-09 Thread Vitaliy Makkoveev
> On 9 Jul 2023, at 15:15, Alexander Bluhm wrote: > > Hi, > > While testing my ART reference counting fix, I discovered a rtentry > leak that is triggered by regress/sbin/route and detected with > btrace(8) refcnt. > > The reference returned by rtalloc() must be freed with rtfree() in > all case

Replace selwakeup() with knote(9) in wscons(4) and make filterops mpsafe

2023-07-11 Thread Vitaliy Makkoveev
Use per 'wseventvar' structure `mtx' mutex(9) to protect `put' and `get' circular buffer indexes together with klist data. Not a big deal, but Xorg will not kernel lock while polling keyboard and mouse events. Also removed obsolete selinfo. Feedback, objections, oks? Not related to this diff, bu

Move solock() down to sosetopt()

2023-07-12 Thread Vitaliy Makkoveev
This is a part of my standalone sblock() work. I need this movement because buffers related SO_SND* and SO_RCV* socket options modification should be protected with sblock(). However, standalone sblock() has different lock orders with solock() for receive and send buffers. At least sblock() for `so

sobuf_print(): add `sb_state' output

2023-07-21 Thread Vitaliy Makkoveev
It contains SS_CANTSENDMORE, SS_ISSENDING, SS_CANTRCVMORE and SS_RCVATMARK bits. Also do `sb_flags' output as hex, it contains flags too. Index: sys/kern/uipc_socket.c === RCS file: /cvs/src/sys/kern/uipc_socket.c,v retrieving revisio

Re: inetd echo localhost

2023-07-21 Thread Vitaliy Makkoveev
On Thu, Jul 20, 2023 at 09:57:00PM +0200, Alexander Bluhm wrote: > Hi, > > I wonder why UDP echo does not work with inetd on 127.0.0.1. > > Note that it is default off. One of my regress machines has it > enabled for other tests. There perl dist/Net-Ping/t/510_ping_udp.t > expects that UDP echo

Re: Move solock() down to sosetopt()

2023-07-22 Thread Vitaliy Makkoveev
On Fri, Jul 21, 2023 at 07:38:17PM +0200, Alexander Bluhm wrote: > On Thu, Jul 13, 2023 at 02:22:17AM +0300, Vitaliy Makkoveev wrote: > > This is a part of my standalone sblock() work. I need this movement > > because buffers related SO_SND* and SO_RCV* socket options modificatio

Re: uvm_meter: remove wakeup of proc0

2023-07-29 Thread Vitaliy Makkoveev
On Sat, Jul 29, 2023 at 11:16:14AM +0200, Claudio Jeker wrote: > proc0 aka the swapper does not do anything. So there is no need to wake it > up. Now the problem is that last time this was tried some inteldrm systems > did hang during bootup because the drm code unexpectedly depended on this > wake

Re: uvm_meter: remove wakeup of proc0

2023-07-31 Thread Vitaliy Makkoveev
This is the culprit: schedule_timeout_uninterruptible(long timeout) { tsleep(curproc, PWAIT, "schtou", timeout); return 0; }

Re: uvm_meter: remove wakeup of proc0

2023-07-31 Thread Vitaliy Makkoveev
On Mon, Jul 31, 2023 at 09:49:30PM +0200, Claudio Jeker wrote: > On Mon, Jul 31, 2023 at 08:03:41PM +0300, Vitaliy Makkoveev wrote: > > This is the culprit: > > > > schedule_timeout_uninterruptible(long timeout) > > { > > tsleep(curproc, PWAIT, "s

Re: uvm_meter: remove wakeup of proc0

2023-07-31 Thread Vitaliy Makkoveev
On Mon, Jul 31, 2023 at 10:04:44PM +0200, Claudio Jeker wrote: > On Mon, Jul 31, 2023 at 09:49:30PM +0200, Claudio Jeker wrote: > > On Mon, Jul 31, 2023 at 08:03:41PM +0300, Vitaliy Makkoveev wrote: > > > This is the culprit: > > > > > > schedule_

Re: uvm_meter remove wakeup of swapper

2023-08-01 Thread Vitaliy Makkoveev
On Tue, Aug 01, 2023 at 11:24:01AM +0200, Claudio Jeker wrote: > Now that the issue in inteldrm was resolved we can finally remove this > old wakeup of the swapper. > > OK? ok mvs > -- > :wq Claudio > > Index: uvm_meter.c > === >

sosetopt(): merge SO_SND* with corresponding SO_RCV* cases

2023-08-03 Thread Vitaliy Makkoveev
The only difference is the socket buffer. As bonus, in the future solock() will be easily replaced by sblock() instead pushing it down to each SO_SND* and SO_RCV* case. Index: sys/kern/uipc_socket.c === RCS file: /cvs/src/sys/kern/ui

Re: sosetopt(): merge SO_SND* with corresponding SO_RCV* cases

2023-08-08 Thread Vitaliy Makkoveev
On Tue, Aug 08, 2023 at 10:40:46PM +0200, Alexander Bluhm wrote: > On Fri, Aug 04, 2023 at 12:38:23AM +0300, Vitaliy Makkoveev wrote: > > @@ -1856,6 +1856,9 @@ sosetopt(struct socket *so, int level, i > > case SO_SNDLOWAT: > >

Re: tcp sync cache refcount improvement

2023-09-03 Thread Vitaliy Makkoveev
> On 3 Sep 2023, at 21:08, Alexander Bluhm wrote: > > Hi, > > Avoid a useless increment and decrement of the tcp syn cache > refcount by unexpanding the SYN_CACHE_TIMER_ARM() macro in the timer > callback. > > ok? > ok mvs > bluhm > > Index: netinet/tcp_input.c >

Re: tcp sync cache signed use counter

2023-09-04 Thread Vitaliy Makkoveev
> On 4 Sep 2023, at 16:19, Alexander Bluhm wrote: > > Hi, > > Variable scs_use is basically counting packet insertions to syn > cache, so I would prefer type long to exclude overflow on fast > machines. With the current limits int should be enough, but long > does not hurt. > > It can be negat

Re: tcp sync cache signed use counter

2023-09-04 Thread Vitaliy Makkoveev
> On 4 Sep 2023, at 19:52, Alexander Bluhm wrote: > > On Mon, Sep 04, 2023 at 07:22:03PM +0300, Vitaliy Makkoveev wrote: >>> On 4 Sep 2023, at 16:19, Alexander Bluhm wrote: >>> >>> Hi, >>> >>> Variable scs_use is basically counting pack

Re: tcp sync cache signed use counter

2023-09-04 Thread Vitaliy Makkoveev
> On 4 Sep 2023, at 23:43, Christian Weisgerber wrote: > > Alexander Bluhm: > >> Variable scs_use is basically counting packet insertions to syn >> cache, so I would prefer type long to exclude overflow on fast >> machines. With the current limits int should be enough, but long >> does not hurt

Replace selinfo by klist in vnode structure

2023-09-07 Thread Vitaliy Makkoveev
Remove the remnants of the leftover selinfo from vnode(9) layer. Just mechanical replacement because knote(9) API is already used. I don't want to make klist MP safe with this diff. Headers added where it was required. Disabled tmpfs was also tested. ok? Index: sys/dev/hotplug.c =

Re: Replace selinfo by klist in vnode structure

2023-09-08 Thread Vitaliy Makkoveev
On Fri, Sep 08, 2023 at 07:39:10PM +0200, Alexander Bluhm wrote: > On Thu, Sep 07, 2023 at 10:32:58PM +0300, Vitaliy Makkoveev wrote: > > Remove the remnants of the leftover selinfo from vnode(9) layer. Just > > mechanical replacement because knote(9) API is already used. I don&

fuse(4) device: replace selinfo with klist

2023-09-10 Thread Vitaliy Makkoveev
Replace selinfo remnants with knote(9) API. Mechanical conversion because `fuse_rd_filtops' is left non MP safe. knote_locked(9) is used because the path is covered by kernel lock. We have some places where selinfo is still used. All of them could be mechanically converted in this way and obsolete selwakeu

Make `logread_filtops' mpsafe

2023-09-14 Thread Vitaliy Makkoveev
`log_mtx' mutex(9) already used for message buffer protection, so use it to protect `logread_filtops' too. ok? Index: sys/kern/subr_log.c === RCS file: /cvs/src/sys/kern/subr_log.c,v retrieving revision 1.77 diff -u -p -r1.77 subr_lo

Re: Use counters_read(9) from ddb(4)

2023-09-15 Thread Vitaliy Makkoveev
On Fri, Sep 15, 2023 at 04:18:13PM +0200, Martin Pieuchot wrote: > On 11/09/23(Mon) 21:05, Martin Pieuchot wrote: > > On 06/09/23(Wed) 23:13, Alexander Bluhm wrote: > > > On Wed, Sep 06, 2023 at 12:23:33PM -0500, Scott Cheloha wrote: > > > > On Wed, Sep 06, 2023 at 01:04:19PM +0100, Martin Pieuchot

hyperv(4): use shared netlock to protect if_list and ifa_list walkthrough and data

2023-09-18 Thread Vitaliy Makkoveev
Context switch looks fine here. Index: sys/dev/pv/hypervic.c === RCS file: /cvs/src/sys/dev/pv/hypervic.c,v retrieving revision 1.19 diff -u -p -r1.19 hypervic.c --- sys/dev/pv/hypervic.c 11 Apr 2023 00:45:08 - 1.19 +++

hotplug(4): introduce `hotplug_mtx' mutex(9) and make `hotplugread_filterops' mp safe

2023-09-18 Thread Vitaliy Makkoveev
Also use this mutex to protect `evqueue_head', `evqueue_tail' and `evqueue_count'. Index: sys/dev/hotplug.c === RCS file: /cvs/src/sys/dev/hotplug.c,v retrieving revision 1.23 diff -u -p -r1.23 hotplug.c --- sys/dev/hotplug.c 8 Sep

Re: hotplug(4): introduce `hotplug_mtx' mutex(9) and make `hotplugread_filterops' mp safe

2023-09-18 Thread Vitaliy Makkoveev
On Mon, Sep 18, 2023 at 02:03:08PM +0300, Vitaliy Makkoveev wrote: > Also use this mutex to protect `evqueue_head', `evqueue_tail' and > `evqueue_count'. > Sorry, the right diff: Index: sys/dev/hotplug.c ==

Re: fix a wireguard mbuf leak

2023-09-22 Thread Vitaliy Makkoveev
On Fri, Sep 22, 2023 at 12:21:42PM +0900, YASUOKA Masahiko wrote: > A leak may happen when wgpeer is deleted. > > ok? > ok mvs@ > The state queue should be freed when wg_peer is destroyed. > diff from IIJ. > > Index: sys/net/if_wg.c > =

hotplug(4): simplify buffer cleanup on device close

2023-09-23 Thread Vitaliy Makkoveev
`evqueue' is a simple circular buffer. It's enough to set head equal to tail to make it empty. Index: sys/dev/hotplug.c === RCS file: /cvs/src/sys/dev/hotplug.c,v retrieving revision 1.24 diff -u -p -r1.24 hotplug.c --- sys/dev/ho
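
The head-equals-tail idea can be sketched in userland C. The structure below is a hypothetical ring of ints with an assumed capacity and an assumed item counter, not the hotplug(4) event queue itself; it only shows that emptying a circular buffer needs no per-element work.

```c
#include <stddef.h>

#define EVQ_SIZE	8	/* hypothetical capacity */

struct evq_sketch {
	int	items[EVQ_SIZE];
	size_t	head;	/* next slot to read */
	size_t	tail;	/* next slot to write */
	size_t	count;	/* number of queued items */
};

/* Append an item; returns 0 on success, -1 when the buffer is full. */
static int
evq_put(struct evq_sketch *q, int item)
{
	if (q->count == EVQ_SIZE)
		return -1;
	q->items[q->tail] = item;
	q->tail = (q->tail + 1) % EVQ_SIZE;
	q->count++;
	return 0;
}

/* Remove the oldest item; returns 0 on success, -1 when empty. */
static int
evq_get(struct evq_sketch *q, int *item)
{
	if (q->count == 0)
		return -1;
	*item = q->items[q->head];
	q->head = (q->head + 1) % EVQ_SIZE;
	q->count--;
	return 0;
}

/* Empty the buffer in constant time: no need to walk the items. */
static void
evq_reset(struct evq_sketch *q)
{
	q->head = q->tail;
	q->count = 0;
}
```

After evq_reset() the stale items are still in the array, but they are unreachable and are overwritten by later evq_put() calls.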
