Re: Prevent race in single_thread_set()

2020-12-01 Thread Martin Pieuchot
On 01/12/20(Tue) 10:21, Claudio Jeker wrote: > On Mon, Nov 30, 2020 at 07:19:28PM -0300, Martin Pieuchot wrote: > > On 04/11/20(Wed) 11:19, Martin Pieuchot wrote: > > > Here's a 3rd approach to solve the TOCTOU race in single_thread_set(). > > > The issue being that

Re: Prevent race in single_thread_set()

2020-11-30 Thread Martin Pieuchot
On 04/11/20(Wed) 11:19, Martin Pieuchot wrote: > Here's a 3rd approach to solve the TOCTOU race in single_thread_set(). > The issue being that the lock serializing access to `ps_single' is not > held when calling single_thread_check(). > > The approach below is controversial bec

Use SMR_TAILQ for `ps_threads'

2020-11-30 Thread Martin Pieuchot
Every multi-threaded process keeps a list of threads in `ps_threads'. This list is iterated in interrupt and process context which makes it complicated to protect it with a rwlock. One of the places where such iteration is done is inside the tsleep(9) routines, directly in single_thread_check()

Re: Switch select(2) to kqueue-based implementation

2020-11-30 Thread Martin Pieuchot
On 30/11/20(Mon) 17:06, Visa Hankala wrote: > On Mon, Nov 30, 2020 at 01:28:14PM -0300, Martin Pieuchot wrote: > > I plan to commit this in 3 steps, to ease a possible revert: > > - kevent(2) refactoring > > - introduction of newer kq* APIs > > - dopselect rewrite

Switch select(2) to kqueue-based implementation

2020-11-30 Thread Martin Pieuchot
Now that the kqueue refactoring has been committed, here's once again the diff to modify the internal implementation of {p,}select(2) to query kqfilter handlers instead of poll ones. {p,}poll(2) are left untouched to ease the transition. I plan to commit this in 3 steps, to ease a possible

Correct IPL for UVM pageqlock

2020-11-26 Thread Martin Pieuchot
As reported by AIsha Tammy on bugs@, there's a current recursion possible with the pageqlock: ddb> trace db_enter() at db_enter+0x10 panic(81dcd47c) at panic+0x12a mtx_enter(B219ed00) at mtx_enter+0x81 uvm_objfree(fd8015f2c9a0) at uvm_objfree+0x61

Re: Fewer uvmexp

2020-11-19 Thread Martin Pieuchot
On 19/11/20(Thu) 01:02, Jeremie Courreges-Anglas wrote: > On Wed, Nov 18 2020, Martin Pieuchot wrote: > > While auditing the various uses of the uvmexp fields I came across > > those under #ifdet notyet. May I delete them so I don't have to give > > them some MP l

Re: dt: add kernel function boundary tracing provider

2020-11-19 Thread Martin Pieuchot
Hello Tom, Thanks for sharing your work, that's awesome! On 14/11/20(Sat) 13:13, Tom Rollet wrote: > Here is a diff for dynamic tracing of kernel's functions boundaries. > It's implemented as one of the dt's provider on i386 and amd64. > To activate it, DDBPROF and pseudo device dt must be

Locking of uvm_pageclean()

2020-11-18 Thread Martin Pieuchot
I found another race related to some missing locking, this time around uvm_pageclean(). Diff below fixes the two places in /sys/uvm where the page queue lock should be taken. To prevent further corruption I added some assertions and documented some global data structures that are currently

Fewer uvmexp

2020-11-18 Thread Martin Pieuchot
While auditing the various uses of the uvmexp fields I came across those under #ifdet notyet. May I delete them so I don't have to give them some MP love? Ok? Index: arch/amd64//amd64/cpu.c === RCS file:

Re: uvm_pagealloc() & uvm.free accounting

2020-11-17 Thread Martin Pieuchot
On 17/11/20(Tue) 13:52, Mark Kettenis wrote: > > Date: Tue, 17 Nov 2020 09:32:28 -0300 > > From: Martin Pieuchot > > > > On 17/11/20(Tue) 13:23, Mark Kettenis wrote: > > > > Date: Mon, 16 Nov 2020 10:11:50 -0300 > > > > From: Martin Pieucho

Re: uvm_pagealloc() & uvm.free accounting

2020-11-17 Thread Martin Pieuchot
On 17/11/20(Tue) 13:23, Mark Kettenis wrote: > > Date: Mon, 16 Nov 2020 10:11:50 -0300 > > From: Martin Pieuchot > > > > On 13/11/20(Fri) 21:05, Mark Kettenis wrote: > > > [...] > > > > Careful reviewers will spot an off-by-one change in the check f

uvm_fault: refactoring for case 2 faults

2020-11-17 Thread Martin Pieuchot
Here's another refactoring that moves the remaining logic of uvm_fault() handling lower faults, case 2, to its own function. This logic shouldn't be modified in the first step of unlocking amap & anon and will still be executed under KERNEL_LOCK(). Having a separate function will however help to

Re: uvm_pagealloc() & uvm.free accounting

2020-11-16 Thread Martin Pieuchot
On 13/11/20(Fri) 21:05, Mark Kettenis wrote: > [...] > > Careful reviewers will spot an off-by-one change in the check for > > pagedaemon's reserved memory. My understanding is that it's a bugfix, > > is it correct? > > You mean for uvm_pagealloc(). I'd say yes. But this does mean that > in

uvm_fault: Kill goto Case2

2020-11-13 Thread Martin Pieuchot
Another simple refactoring of uvm_fault() removing a goto, ok? Index: uvm/uvm_fault.c === RCS file: /cvs/src/sys/uvm/uvm_fault.c,v retrieving revision 1.106 diff -u -p -r1.106 uvm_fault.c --- uvm/uvm_fault.c 13 Nov 2020 14:18:25

uvm_pagealloc() & uvm.free accounting

2020-11-13 Thread Martin Pieuchot
The uvmexp.free* variables are read in uvm_pagealloc() & uvm_pglistalloc() before grabbing the `fpageqlock'. `uvmexp.free' is always modified by the pmemrange allocator under the above motioned lock. To avoid races and the chances of missing a wakeup, the diff below move the checks inside the

Re: uvm_fault: is there an anon?

2020-11-13 Thread Martin Pieuchot
On 04/11/20(Wed) 11:04, Martin Pieuchot wrote: > Diff below introduces a helper that looks for existing mapping. The > value returned by this lookup function determines if there's an anon > at the faulting address which tells us if we're dealign with a fault > of type 1 or 2. >

Document art locking fields

2020-11-11 Thread Martin Pieuchot
While discussing the new source address mechanism with denis@, I figured those ought to be documented. Note that `ar_rtableid' is unused and can die. The ART code is actually free from any network knowledge. ok? Index: net/art.c

Re: Use selected source IP when replying to reflecting ICMP

2020-11-08 Thread Martin Pieuchot
On 08/11/20(Sun) 18:05, Denis Fondras wrote: > ICMP error replies are sent from the IP of the interface the packet came in > even > when the source IP was forced with route(8). icmp_reflect() is called without the KERNEL_LOCK(). rtable_getsource() and ifa_ifwithaddr() are not safe to do so. So

sendsig() & sigexit()

2020-11-06 Thread Martin Pieuchot
Diff below moves the various sigexit() from all MD sendsig() to the MI trapsignal(). Apart from the obvious code simplification, this will help with locking as sigexit() does not return. ok? Index: arch/alpha/alpha/machdep.c ===

uvm_fault: split out handling of case 1

2020-11-06 Thread Martin Pieuchot
Diff below moves the logic dealing with faults of case 1A & 1B to its own function. With this, the logic in uvm_fault() now only deals with case 2 and the various if/else/goto dances can be simplified. As for the previous refactoring diffs, the name is taken from NetBSD but the implementation is

uvmfault_unlockall: kill unused anon argument

2020-11-05 Thread Martin Pieuchot
One of the functions call in uvm_fault() passes a non-initialized `oanon' argument. This bug is harmless as long as there is no locking associated to amap & anons. But more importantly an `amap' is passed to the function any given anon should share its lock, so this parameter is redundant. ok

Prevent race in single_thread_set()

2020-11-04 Thread Martin Pieuchot
Here's a 3rd approach to solve the TOCTOU race in single_thread_set(). The issue being that the lock serializing access to `ps_single' is not held when calling single_thread_check(). The approach below is controversial because it extends the scope of the SCHED_LOCK(). On the other hand, the two

uvm_fault: is there an anon?

2020-11-04 Thread Martin Pieuchot
Diff below introduces a helper that looks for existing mapping. The value returned by this lookup function determines if there's an anon at the faulting address which tells us if we're dealign with a fault of type 1 or 2. This small refactoring is part of the current work to separate the code

Turn SCHED_LOCK() into a mutex

2020-11-04 Thread Martin Pieuchot
Diff below removes the recursive attribute of the SCHED_LOCK() by turning it into a IPL_NONE mutex. I'm not intending to commit it yet as it raises multiple questions, see below. This work has been started by art@ more than a decade ago and I'm willing to finish it as I believe it's the easiest

Re: amap: introduce amap_adjref_anons()

2020-10-30 Thread Martin Pieuchot
On 23/10/20(Fri) 10:31, Martin Pieuchot wrote: > More refactoring. This time let's introduce a helper to manipulate > references. The goal is to reduce the upcoming diff adding locking. > > This is extracted from a bigger diff from guenther@ as well as some > bits fr

Re: Please test: switch select(2) to kqfilters

2020-10-30 Thread Martin Pieuchot
On 26/10/20(Mon) 11:57, Scott Cheloha wrote: > On Mon, Oct 12, 2020 at 11:11:36AM +0200, Martin Pieuchot wrote: > [...] > > +/* > > + * Scan the kqueue, blocking if necessary until the target time is reached. > > + * If tsp is NULL we block indefinitely. If tsp->ts_se

amap: introduce amap_adjref_anons()

2020-10-23 Thread Martin Pieuchot
More refactoring. This time let's introduce a helper to manipulate references. The goal is to reduce the upcoming diff adding locking. This is extracted from a bigger diff from guenther@ as well as some bits from NetBSD. ok? Index: uvm/uvm_amap.c

const/C99 & locks for uvm_pagerops

2020-10-20 Thread Martin Pieuchot
Diff below use C99 initializer and constify the various "struct uvm_pagerops" in tree. While here add some KERNEL_ASSERT_LOCKED() to places where the `uobj' locking has been removed and that should be revisited. This is to help a future myself or another developer to look at what needs some

kqueue_scan() refactoring

2020-10-19 Thread Martin Pieuchot
Diff below is the second part of the refactoring [0] that got committed then reverted 6 months ago. The idea of this refactoring is to move the accounting and markers used when collecting events to a context (struct kqueue_scan_state) such that the collecting routine (kqueue_scan()) can be called

UVM fault check

2020-10-19 Thread Martin Pieuchot
uvm_fault() is one of the most contended "entry point" of the kernel. To reduce this contention I'm carefully refactoring this code to be able to push the KERNEL_LOCK() inside the fault handler. The first aim of this project would be to get the upper layer faults (cases 1A and 1B) out of ze big

uao_init() cleanup

2020-10-19 Thread Martin Pieuchot
uao_init() is called from uvm_km_init() which itself is called by uvm_init(). None of the *init() functions in UVM have a guard, so be coherent and remove this one. ok? Index: uvm/uvm_aobj.c === RCS file:

uvm_grow(): serialize updates

2020-10-14 Thread Martin Pieuchot
Getting uvm_fault() out of the KERNEL_LOCK() alone is not enough to reduce the contention due to page faults. A single part of the handler spinning on the lock is enough to hide bugs and increase latency. One recent example is the uvm_map_inentry() check. uvm_grow() is another small function

Re: Please test: switch select(2) to kqfilters

2020-10-12 Thread Martin Pieuchot
On 09/10/20(Fri) 10:38, Martin Pieuchot wrote: > On 02/10/20(Fri) 12:19, Martin Pieuchot wrote: > > Diff below modifies the internal implementation of {p,}select(2) to > > query kqfilter handlers instead of poll ones. > > > > I deliberately left {p,}poll(2) untou

Re: xhci zero length transfers 'leak' one transfer buffer count

2020-10-11 Thread Martin Pieuchot
On 09/10/20(Fri) 12:37, Jonathon Fletcher wrote: > In xhci_xfer_get_trb, the count of transfer buffers in the pipe > (xp->free_trbs) is always decremented but the count of transfer buffers used > in the transfer (xx->ntrb) is not incremented for zero-length transfers. The > result of this is

Re: Please test: switch select(2) to kqfilters

2020-10-09 Thread Martin Pieuchot
On 02/10/20(Fri) 12:19, Martin Pieuchot wrote: > Diff below modifies the internal implementation of {p,}select(2) to > query kqfilter handlers instead of poll ones. > > I deliberately left {p,}poll(2) untouched to ease the transition. > > This diff includes some kqueue ref

Re: amap: KASSERT()s and local variables

2020-10-07 Thread Martin Pieuchot
On 01/10/20(Thu) 14:18, Martin Pieuchot wrote: > Use more KASSERT()s instead of the "if (x) panic()" idiom for sanity > checks and add a couple of local variables to reduce the difference > with NetBSD and help for upcoming locking. deraadt@ mentioned that KASSERT()s are not e

kqueue_scan() refactoring

2020-10-07 Thread Martin Pieuchot
The diff below has already been presented in August [0]. It is the first step at splitting the kqueue refactoring required for the poll and select rewrite. This first iteration introduces the new API with only minimal changes. The only change in behavior is that the markers' `kn_filter' and

uvm/uvm_map.h header cleanup

2020-10-07 Thread Martin Pieuchot
Now that all architectures have been fixed, it's time to remove this... ok? Index: uvm/uvm_map.h === RCS file: /cvs/src/sys/uvm/uvm_map.h,v retrieving revision 1.67 diff -u -p -r1.67 uvm_map.h --- uvm/uvm_map.h 18 Dec 2019

Re: mmap: Do not push KERNEL_LOCK() too far

2020-10-05 Thread Martin Pieuchot
On 03/10/20(Sat) 12:59, Mark Kettenis wrote: > > Date: Fri, 2 Oct 2020 10:32:27 +0200 > > From: Martin Pieuchot > > > > On 01/10/20(Thu) 21:44, Mark Kettenis wrote: > > > > Date: Thu, 1 Oct 2020 14:10:56 +0200 > > > > From: Martin Pieuchot > &g

Re: Please test: switch select(2) to kqfilters

2020-10-03 Thread Martin Pieuchot
On 02/10/20(Fri) 19:09, Scott Cheloha wrote: > On Fri, Oct 02, 2020 at 12:19:35PM +0200, Martin Pieuchot wrote: > > @@ -635,12 +642,39 @@ sys_kevent(struct proc *p, void *v, regi > > goto done; > > } > > > > + > > KQREF(kq); > >

Please test: switch select(2) to kqfilters

2020-10-02 Thread Martin Pieuchot
Diff below modifies the internal implementation of {p,}select(2) to query kqfilter handlers instead of poll ones. I deliberately left {p,}poll(2) untouched to ease the transition. This diff includes some kqueue refactoring from visa@. It is built on top of the changes that went in during the

Re: mmap: Do not push KERNEL_LOCK() too far

2020-10-02 Thread Martin Pieuchot
On 01/10/20(Thu) 21:44, Mark Kettenis wrote: > > Date: Thu, 1 Oct 2020 14:10:56 +0200 > > From: Martin Pieuchot > > > > While studying a bug report from naddy@ in 2017 when testing guenther@'s > > amap/anon locking diff I figured out that we have been too optimi

Re: KASSERT() for VOP_*

2020-10-01 Thread Martin Pieuchot
On 09/09/20(Wed) 08:41, Martin Pieuchot wrote: > This is mostly the same diff that has been backed out months ago with > the VOP_CLOSE() case fixed. VOP_CLOSE() can accept a NULL argument > instead of `curproc' when garbage collecting passed FDs. > > The intent is to stop passing

amap: KASSERT()s and local variables

2020-10-01 Thread Martin Pieuchot
Use more KASSERT()s instead of the "if (x) panic()" idiom for sanity checks and add a couple of local variables to reduce the difference with NetBSD and help for upcoming locking. ok? Index: uvm/uvm_amap.c === RCS file:

mmap: Do not push KERNEL_LOCK() too far

2020-10-01 Thread Martin Pieuchot
While studying a bug report from naddy@ in 2017 when testing guenther@'s amap/anon locking diff I figured out that we have been too optimistic in the !MAP_ANON case. The reported panic involves, I'd guess, a race between fd_getfile() and vref(): panic: vref used where vget required

Garbage fix for USB_GET_FULL_DESC

2020-09-28 Thread Martin Pieuchot
Copy with uiomove(9) the correct size of the descriptor and not a random value from the stack. This is Coverity CID 1497167. As I understand it there's no security impact as the size is always caped by `ufd_size' however the returned descriptor might be corrupted and this can explain why

uvm_swapisfull()

2020-09-25 Thread Martin Pieuchot
Introduce a new helper to check global variables that will need to be protected from concurrent access. The name is taken from NetBSD in order to reduce the difference between our trees. Diff below introduces a syntax change in uvmpd_scan() where the boolean test "<" becomes "!=". This

Re: Push back kernel lock a bit in amd64 pageflttrap()

2020-09-25 Thread Martin Pieuchot
On 24/09/20(Thu) 16:35, Mark Kettenis wrote: > This avoids taking the kernel lock when ci_inatomic is set. This > might speed up inteldrm(4) a bit. Since uvm_grow() still needs the > kernel lock, some reorganization of the code is necessary. > > I'm not sure this actaully has an impact. If we

amap: panic -> KASSERT

2020-09-24 Thread Martin Pieuchot
Convert various "if (x) panic()" idioms into "KASSERT(!x)". The panic message isn't helping for such sanity checks and this help reducing the diff with NetBSD. ok? Index: uvm/uvm_amap.c === RCS file: /cvs/src/sys/uvm/uvm_amap.c,v

Re: uvm: __inline -> inline

2020-09-22 Thread Martin Pieuchot
On 22/09/20(Tue) 10:20, Mark Kettenis wrote: > > Date: Tue, 22 Sep 2020 09:15:00 +0200 > > From: Martin Pieuchot > > > > Spell inline correctly, also reduce the diff with NetBSD for uvm_amap.c > > and uvm_fault.c. > > > > ok? > > In g

pmap_enter(9) doesn't sleep

2020-09-22 Thread Martin Pieuchot
Allocations in the various pmap_enter(9) are done with uvm_pagealloc(9), which sets the UVM_PLA_NOWAIT flag, and/or with pool_get(9) w/ PR_NOWAIT. So the comment below seems outdated to me, ok to kill it? Index: uvm/uvm_fault.c ===

uvm: __inline -> inline

2020-09-22 Thread Martin Pieuchot
Spell inline correctly, also reduce the diff with NetBSD for uvm_amap.c and uvm_fault.c. ok? Index: uvm/uvm_addr.c === RCS file: /cvs/src/sys/uvm/uvm_addr.c,v retrieving revision 1.28 diff -u -p -r1.28 uvm_addr.c --- uvm/uvm_addr.c

Re: sigabort(), p_sigmask & p_siglist

2020-09-16 Thread Martin Pieuchot
On 16/09/20(Wed) 02:08, Theo de Raadt wrote: > Something doesn't feel right. > > db_kill_cmd finds a process, called p, then kills it. In your new diff > calling sigexit, take note of the comment at the top: > > * Force the current process to exit with the specified signal, dumping core > >

Re: KASSERT() for VOP_*

2020-09-16 Thread Martin Pieuchot
On 09/09/20(Wed) 08:41, Martin Pieuchot wrote: > This is mostly the same diff that has been backed out months ago with > the VOP_CLOSE() case fixed. VOP_CLOSE() can accept a NULL argument > instead of `curproc' when garbage collecting passed FDs. > > The intent is to stop passing

Re: sigabort(), p_sigmask & p_siglist

2020-09-16 Thread Martin Pieuchot
On 16/09/20(Wed) 06:09, Miod Vallat wrote: > > > Diff below introduces an helper for sending an uncatchable SIGABRT and > > annotate that `p_siglist' and `p_sigmask' are updated using atomic > > operations. > > Why not use sigexit(p, SIGABRT); for that purpose? That's a better solution indeed.

sigabort(), p_sigmask & p_siglist

2020-09-15 Thread Martin Pieuchot
Diff below introduces an helper for sending an uncatchable SIGABRT and annotate that `p_siglist' and `p_sigmask' are updated using atomic operations. As a result setsigvec() becomes local to kern/kern_sig.c. Note that other places in the kernel use sigexit(p, SIGABRT) for the same purpose and

curproc vs MP vs locking

2020-09-15 Thread Martin Pieuchot
Many functions in the kernel take a "struct proc *" as argument. When reviewing diffs or reading the signature of such functions it is not clear if this pointer can be any thread or if it is, like in many cases, pointing to `curproc'. This distinction matters when it comes to reading/writing

Re: go/rust vs uvm_map_inentry()

2020-09-13 Thread Martin Pieuchot
On 13/09/20(Sun) 16:54, Mark Kettenis wrote: > > Date: Sun, 13 Sep 2020 16:49:48 +0200 > > From: Sebastien Marie > > > > On Sun, Sep 13, 2020 at 03:29:57PM +0200, Martin Pieuchot wrote: > > > I'm no longer able to reproduce the corruption while building

Re: pppoe: move softc list out of NET_LOCK() into new pppoe lock

2020-09-13 Thread Martin Pieuchot
On 13/09/20(Sun) 15:12, Klemens Nanni wrote: > This is my first try trading global locks for interface specific ones. > > pppoe(4) keeps a list of all its interfaces which is then obviously > traversed during create and destroy. > > Currently, the net lock is grabbed for this, but there seems to

go/rust vs uvm_map_inentry()

2020-09-13 Thread Martin Pieuchot
I'm no longer able to reproduce the corruption while building lang/go with the diff below. Something relevant to threading change in go since march? Can someone try this diff and tell me if go and/or rust still fail? Index: uvm/uvm_map.c

Re: pppoe: start documenting locks

2020-09-13 Thread Martin Pieuchot
On 13/09/20(Sun) 10:05, Klemens Nanni wrote: > > Here's a start at struct pppoe_softc; for every member I went through > code paths looking for *_LOCK() or *_ASSERT_LOCKED(). > > Pretty much all members are under the net lock, some are proctected by > both net and kernel lock, e.g. the start

Re: sppp: add free() sizes

2020-09-12 Thread Martin Pieuchot
On 12/09/20(Sat) 14:49, Klemens Nanni wrote: > These are the last free(buf, 0) occurences in if_pppoe.c and > if_spppsubr.c changing to non-zero sizes. > > I've been running with this the last week without any issues. > > Feedback? OK? Maybe store `pwdlen' and `idlen' in "struct sppp" instead

UVM tracepoints for dt(4)

2020-09-11 Thread Martin Pieuchot
To investigate the race exposed by the last locking change in uvm_map_inentry() [0], I'd like to add the following tracepoints. The idea is to compare page fault addresses and permissions with the insertion/removal of entries in a given map. Diff below is the first part of the puzzle, ok? [0]

Re: issignal() w/o KERNEL_LOCK()

2020-09-09 Thread Martin Pieuchot
On 09/09/20(Wed) 10:02, Claudio Jeker wrote: > On Wed, Sep 09, 2020 at 08:33:30AM +0200, Martin Pieuchot wrote: > > Per-process data structures needed to suspend the execution of threads > > are since recently protected by the SCHED_LOCK(). So the KERNEL_LOCK() > >

KASSERT() for VOP_*

2020-09-09 Thread Martin Pieuchot
This is mostly the same diff that has been backed out months ago with the VOP_CLOSE() case fixed. VOP_CLOSE() can accept a NULL argument instead of `curproc' when garbage collecting passed FDs. The intent is to stop passing a "struct proc *" when a function applies only to `curproc'.

sigismasked()

2020-09-09 Thread Martin Pieuchot
Simple helper function to centralize the manipulation of `ps_sigignore' and `p_sigmask' in kern/kern_sig.c and later on add the corresponding asserts, ok? Index: kern/kern_sig.c === RCS file: /cvs/src/sys/kern/kern_sig.c,v retrieving

issignal() w/o KERNEL_LOCK()

2020-09-09 Thread Martin Pieuchot
Per-process data structures needed to suspend the execution of threads are since recently protected by the SCHED_LOCK(). So the KERNEL_LOCK() dance inside issignal() is no longer necessary and can be removed, ok? Note that CURSIG() is currently always called with the KERNEL_LOCK() held so the

m_defrag(9) leak

2020-08-25 Thread Martin Pieuchot
Maxime Villard mentioned a leak due to a missing m_freem() in wg(4): https://marc.info/?l=netbsd-tech-net=159827988018641=2 It seems to be that such leak is present in other uses of m_defrag() in the tree. I won't take the time to go though all of them, but an audit would be welcome.

Enable EVFILT_EXCEPT

2020-08-21 Thread Martin Pieuchot
The kqueue-based poll(2) backend is still a WIP due to regressions in the kqueue layer. In the meantime should we expose EVFILT_EXCEPT to userland? The diff below should be enough to allow userland apps to use the new code paths. ok? Index: sys/event.h

Re: Fewer pool_get() in kqueue_register()

2020-08-19 Thread Martin Pieuchot
On 18/08/20(Tue) 15:30, Visa Hankala wrote: > On Tue, Aug 18, 2020 at 11:04:47AM +0200, Martin Pieuchot wrote: > > Diff below changes the order of operations in kqueue_register() to get > > rid of an unnecessary pool_get(). When an event is already present on > > the list tr

Re: Fewer pool_get() in kqueue_register()

2020-08-18 Thread Martin Pieuchot
On 18/08/20(Tue) 11:22, Mark Kettenis wrote: > > Date: Tue, 18 Aug 2020 11:04:47 +0200 > > From: Martin Pieuchot > > > > Diff below changes the order of operations in kqueue_register() to get > > rid of an unnecessary pool_get(). When an event is already present

Push KERNEL_LOCK/UNLOCK in trapsignal()

2020-08-18 Thread Martin Pieuchot
Taken from a larger diff from claudio@, this reduces the lock dances in MD code and put it where we should focus our effort in kern/kern_sig.c. ok? Index: kern/kern_sig.c === RCS file: /cvs/src/sys/kern/kern_sig.c,v retrieving

Fewer pool_get() in kqueue_register()

2020-08-18 Thread Martin Pieuchot
Diff below changes the order of operations in kqueue_register() to get rid of an unnecessary pool_get(). When an event is already present on the list try to acquire it first. Note that knote_acquire() may sleep in which case the list might have changed so the lookup has to always begin from the

kqueue_scan_setup/finish

2020-08-14 Thread Martin Pieuchot
The previous change introducing the kqueue_scan_setup()/finish() API required to switch poll(2) internals to use the kqueue mechanism has been backed out. The reason for the regression is still unknown, so let's take a baby step approach. Diff below introduces the new API with only minimal

Re: TCP congestion control progression

2020-08-14 Thread Martin Pieuchot
On 13/08/20(Thu) 10:14, Brian Brombacher wrote: > > > >> On Aug 9, 2020, at 6:29 PM, Chris Cappuccio wrote: > > Brian Brombacher [br...@planetunix.net] wrote: > >> > >> I am wondering what approach the project is planning to use to modernize > >> the congestion control algorithms. I'm

Re: pppx(4): move ifnet out of KERNEL_LOCK()

2020-08-06 Thread Martin Pieuchot
On 05/08/20(Wed) 12:50, Vitaliy Makkoveev wrote: > pipex(4) and pppx(4) are ready to became a little bit more MP capable. > Diff below moves pppx(4) related `ifnet' out of KERNEL_LOCK(). Nice, one comment below. > Index: sys/net/if_pppx.c >

Re: pipex(4): kill pipexintr()

2020-07-31 Thread Martin Pieuchot
On 31/07/20(Fri) 21:58, Vitaliy Makkoveev wrote: > [...] > What denies us to move pipex(4) under it's own lock? Such question won't lead us anywhere. It assumes it makes sense to move pipex under its own lock. This assumption has many drawback which clearly haven't been studied and more

Re: pipex(4): kill pipexintr()

2020-07-31 Thread Martin Pieuchot
On 31/07/20(Fri) 12:15, Vitaliy Makkoveev wrote: > On Fri, Jul 31, 2020 at 09:36:32AM +0900, YASUOKA Masahiko wrote: > > On Thu, 30 Jul 2020 22:43:10 +0300 > > Vitaliy Makkoveev wrote: > > > On Thu, Jul 30, 2020 at 10:05:13PM +0900, YASUOKA Masahiko wrote: > > >> On Thu, 30 Jul 2020 15:34:09

Re: usbd_abort_pipe(); usbd_close_pipe; dance

2020-07-31 Thread Martin Pieuchot
On 31/07/20(Fri) 11:22, Marcus Glocker wrote: > Maybe I'm missing something here. You aren't. Historically usbd_close_pipe() wasn't aborting transfers. We changed it to do so as it happened to be the easiest fix to some issues that had been copy/pasted. It's just that nobody took the time to do

Re: pipex(4): kill pipexintr()

2020-07-31 Thread Martin Pieuchot
On 30/07/20(Thu) 21:13, YASUOKA Masahiko wrote: > sys/net/if_ethersubr.c: > 372 void > 373 ether_input(struct ifnet *ifp, struct mbuf *m) > (snip) > 519 #if NPPPOE > 0 || defined(PIPEX) > 520 case ETHERTYPE_PPPOEDISC: > 521 case ETHERTYPE_PPPOE: > 522 if (m->m_flags

Re: pipex(4): kill pipexintr()

2020-07-30 Thread Martin Pieuchot
On 29/07/20(Wed) 23:04, Vitaliy Makkoveev wrote: > Now pipex(4) is fully covered by NET_LOCK() and this is documented. But > we still have an issue with pipex(4) session itself and I guess it's > time to fix it. > > We have `pipexinq' and `pipexoutq' mbuf(9) queues to store mbufs. Each > mbuf(9)

Re: xhci(4) isoc: fix bogus handling of chained TRBs

2020-07-28 Thread Martin Pieuchot
On 26/07/20(Sun) 16:23, Marcus Glocker wrote: > On Sun, 26 Jul 2020 13:27:34 + > sc.dy...@gmail.com wrote: > > > On 2020/07/26 10:54, Marcus Glocker wrote: > > > On Sat, 25 Jul 2020 20:31:44 + > > > sc.dy...@gmail.com wrote: > > > > > >> On 2020/07/25 18:10, Marcus Glocker wrote: > >

Re: net80211: skip input block ack window gaps faster

2020-07-28 Thread Martin Pieuchot
On 17/07/20(Fri) 18:15, Stefan Sperling wrote: > On Fri, Jul 17, 2020 at 03:59:38PM +0200, Stefan Sperling wrote: > > While measuring Tx performance at a fixed Tx rate with iwm(4) I observed > > unexpected dips in throughput measured by tcpbench. These dips coincided > > with one or more gap

Re: pipex(4): document global data locks

2020-07-28 Thread Martin Pieuchot
On 17/07/20(Fri) 17:04, Vitaliy Makkoveev wrote: > Subj. Also add NET_ASSERT_LOCKED() to pipex_{link,unlink,rele}_session() > to be sure they called under NET_LOCK(). pipex_rele_session() is freeing memory. When this function is called those chunks of memory shouldn't be referenced by any other

Re: pipex_iface_fini() release multicast session under NET_LOCK()

2020-07-28 Thread Martin Pieuchot
On 17/07/20(Fri) 16:29, Vitaliy Makkoveev wrote: > We are going to lock the whole pipex(4) by NET_LOCK(). So move > `multicast_session' freeing undet NET_LOCK() too. pipex_iface_fini() should be called on the last reference of the descriptor. So this shouldn't be necessary. If there's an issue

Re: fix races in if_clone_create()

2020-07-13 Thread Martin Pieuchot
On 06/07/20(Mon) 15:44, Vitaliy Makkoveev wrote: > > On 6 Jul 2020, at 12:17, Martin Pieuchot wrote: > > Assertions and documentation are more important than preventing races > > because they allow to build awareness and elegant solutions instead of > > hacking diffs

Re: pppac(4): fix races in pppacopen()

2020-07-13 Thread Martin Pieuchot
On 11/07/20(Sat) 23:51, Vitaliy Makkoveev wrote: > [...] > The way you suggest to go is to introduce rwlock and serialize > pppacopen() and pppacclose(). This is bad idea because we will sleep > while we are holding rwlock. That's the whole point of a rwlock to be able to sleep while holding the

Re: pppac(4): fix races in pppacopen()

2020-07-11 Thread Martin Pieuchot
On 10/07/20(Fri) 14:38, Vitaliy Makkoveev wrote: > On Fri, Jul 10, 2020 at 01:22:40PM +0200, Martin Pieuchot wrote: > > On 10/07/20(Fri) 14:07, Vitaliy Makkoveev wrote: > > > We have some races in pppac(4) > > > 1. malloc(9) can sleep so we must check `sc' presence aft

Re: pppac(4): fix races in pppacopen()

2020-07-10 Thread Martin Pieuchot
On 10/07/20(Fri) 14:07, Vitaliy Makkoveev wrote: > We have some races in pppac(4) > 1. malloc(9) can sleep so we must check `sc' presence after malloc(9) Makes sense. > 2. we can sleep between `sc' insertion to `sc_entry' list and > `sc_pipex_iface' initialization. Concurrent pppacioctl() can

Re: pipex(4): kill pipexintr()

2020-07-10 Thread Martin Pieuchot
On 07/07/20(Tue) 01:01, Vitaliy Makkoveev wrote: > On Mon, Jul 06, 2020 at 08:47:23PM +0200, Martin Pieuchot wrote: > > On 06/07/20(Mon) 19:23, Vitaliy Makkoveev wrote: > > > > On 6 Jul 2020, at 17:36, Martin Pieuchot wrote: > > > [...] > > > Unfortu

Re: pppx_if_output() don't lock `pppx_devs_lk'

2020-07-10 Thread Martin Pieuchot
On 08/07/20(Wed) 12:05, Vitaliy Makkoveev wrote: > `pppx_devs_lk' used to protect `pxd_entry' list. We lock `pppx_devs_lk' > in pppx_if_output() to be sure `pxd' is not destroyed by concurrent > pppxclose() but it's useless. We destroy all corresponding `pxi' before > `pxd' and `ifnet's are

Re: USB3 stack with async. transfers support

2020-07-08 Thread Martin Pieuchot
On 07/07/20(Tue) 11:13, Martin wrote: > Hi tech@, > > Not so long ago I've ported UHD driver to support Ettus USRP devices which > uses libusb and asynchronous USB3 data transfers. > Is USB3 async. data stack implemented or planned to have some devices like > USRP working? I'm not aware of

Re: pipex(4): kill pipexintr()

2020-07-06 Thread Martin Pieuchot
On 06/07/20(Mon) 19:23, Vitaliy Makkoveev wrote: > > On 6 Jul 2020, at 17:36, Martin Pieuchot wrote: > [...] > Unfortunately you can’t be sure about NET_LOCK() status while you are > in pppac_start(). It was described at this thread [1]. > > We have two cases: > 1. p

Re: pipex(4): kill pipexintr()

2020-07-06 Thread Martin Pieuchot
On 06/07/20(Mon) 16:42, Vitaliy Makkoveev wrote: > [...] > pipex(4) is simultaneously locked by NET_LOCK() and KERNEL_LOCK() but > with two exceptions: > > 1. As you pointed pipex_pppoe_input() called without KERNEL_LOCK() held. > 2. pppac_start() called without NET_LOCK() held. Or with

Re: fix races in if_clone_create()

2020-07-06 Thread Martin Pieuchot
On 01/07/20(Wed) 00:02, Vitaliy Makkoveev wrote: > On Tue, Jun 30, 2020 at 03:48:22PM +0300, Vitaliy Makkoveev wrote: > > On Tue, Jun 30, 2020 at 12:08:03PM +0200, Martin Pieuchot wrote: > > > On 29/06/20(Mon) 11:59, Vitaliy Makkoveev wrote: > > > > [...] > &g

Re: pipex(4): kill pipexintr()

2020-07-06 Thread Martin Pieuchot
On 01/07/20(Wed) 22:42, Vitaliy Makkoveev wrote: > pipex(4) has 2 mbuf queues: `pipexinq' and `pipexoutq'. When mbuf passed > to pipex it goes to one of these queues and pipexintr() will be > scheduled to process them. pipexintr() called from `netisr' context. > > It's true for pppac(4) but for

Re: fix races in if_clone_create()

2020-06-30 Thread Martin Pieuchot
On 29/06/20(Mon) 11:59, Vitaliy Makkoveev wrote: > [...] > I reworked tool for reproduce. Now I avoided fork()/exec() route and it > takes couple of minutes to take panic on 4 cores. Also some screenshots > attached. Setting kern.pool_debug=2 makes the race reproducible in seconds. Could you

Re: route add ::/0 ...

2020-06-29 Thread Martin Pieuchot
On 28/06/20(Sun) 20:41, YASUOKA Masahiko wrote: > Hi, > > When "::/0" is used as "default", > > # route add ::/0 fe80::1%em0 > add net ::/0: gateway fe80::1%em0: Invalid argument > > route command trims the sockaddr to { .len = 2, .family = AF_INET6 } > for "::/0", but rtable_satoplen()

Re: pipex(4): use reference counters for `ifnet'

2020-06-28 Thread Martin Pieuchot
On 27/06/20(Sat) 17:58, Vitaliy Makkoveev wrote: > > [...] > > Look at r1.329 of net/if.c. Prior to this change if_detach_queues() was > > used to free all mbufs when an interface was removed. Now lazy freeing > > is used everytime if_get(9) rerturns NULL. > > > > This is possible because we

<    1   2   3   4   5   6   7   8   9   10   >