Re: Unexpected out of memory kills when running parallel find instances over millions of files

2023-10-19 Thread Mateusz Guzik
On Thu, Oct 19, 2023 at 10:49:37AM +, Michael van Elst wrote: > mjgu...@gmail.com (Mateusz Guzik) writes: > > >Running 20 find(1) instances, where each has a "private" tree with > >million of files runs into trouble with the kernel killing them (and > >ot

Unexpected out of memory kills when running parallel find instances over millions of files

2023-10-19 Thread Mateusz Guzik
the expected outcome is that this finishes (extra points for reasonable time) instead of having userspace getting killed. I don't know what kind of diagnostic info would be best here, but given repro steps above I don't think I need to look for something. Have fun. :) -- Mateusz Guzik

Re: tmpfs vs VV_LOCKSWORK

2020-08-20 Thread Mateusz Guzik
Now that I sent the e-mail I had a look at access(). The benchmark at hand is custom code added to will-it-scale, pasted at the bottom. On 8/20/20, Mateusz Guzik wrote: > tmpfs does *NOT* set the flag. While I can't be arsed to verify > semantics, I suspect it is more than el

tmpfs vs VV_LOCKSWORK

2020-08-20 Thread Mateusz Guzik
de_t *node)
 		/* FALLTHROUGH */
 	case VLNK:
 	case VREG:
+		vp->v_vflag |= VV_LOCKSWORK;
+		/* FALLTHROUGH */
 	case VSOCK:
 		vp->v_op = tmpfs_vnodeop_p;
 		break;
-- Mateusz Guzik

Re: notes from running will-it-scale

2020-07-19 Thread Mateusz Guzik
> > On Sun, 19 Jul 2020 at 13:21, Mateusz Guzik wrote: >> >> Hello, >> >> I recently took an opportunity to run cross-systems microbenchmarks >> with will-it-scale and included NetBSD (amd64). >> >> https://people.freebsd.org/~mjg/freebsd-drag

notes from running will-it-scale

2020-07-19 Thread Mateusz Guzik
the equivalent on FreeBSD (ifunc'ed):

ENTRY(pagezero_std)
	PUSH_FRAME_POINTER
	movl	$PAGE_SIZE/8,%ecx
	xorl	%eax,%eax
	rep stosq
	POP_FRAME_POINTER
	ret
END(pagezero_std)

ENTRY(pagezero_erms)
	PUSH_FRAME_POINTER
	movl	$PAGE_SIZE,%ecx
	xorl	%eax,%eax
	rep stosb
	POP_FRAME_POINTER
	ret
END(pagezero_erms)

-- Mateusz Guzik

Re: Please review: lookup changes

2020-03-11 Thread Mateusz Guzik
win and certainly better than rbtree. > Well in my tests this is all heavily dominated by SMP-effects, which I expect to be exacerbated by just one lock. Side note is that I had a look at your vput. The pre-read + VOP_UNLOCK + actual loop to drop the ref definitely slow things down if only a little bit as this can force a shared cacheline transition from under someone cmpxching. That said, can you generate a flamegraph from a fully patched kernel? Curious where the time is spent now, my bet is spinning on vnode locks. -- Mateusz Guzik

Re: Please review: lookup changes

2020-03-11 Thread Mateusz Guzik
Something ate a huge chunk of my e-mail, resending On 3/8/20, Andrew Doran wrote: > On Sat, Mar 07, 2020 at 12:14:05AM +0100, Mateusz Guzik wrote: >> I believe it is always legal to just upgrade the lock in getattr if >> necessary. Since itimes updates are typically not needed,

Re: Adaptive RW locks

2020-03-11 Thread Mateusz Guzik
urnstile lock held. Then everyone waiting can look it up based on the lock. Of course one can also "give up" and add another word for the pointer, then this mostly degrades to regular read-write lock problems. -- Mateusz Guzik

Re: Blocking vcache_tryvget() across VOP_INACTIVE() - unneeded

2020-01-23 Thread Mateusz Guzik
s well, get the basis in place. > But a hash which has collisions taken care of with linked lists is the easiest data structure to scale in this regard. Additions and removals in the face of concurrent lookups are easy to handle (and in fact the current code already does it to some extent). With no sa
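
The point about chained hashes can be sketched in portable C11. This is a userspace illustration under stated assumptions, not the NetBSD vnode cache: `struct table`, `struct entry`, `NBUCKETS`, `table_insert` and `table_lookup` are all made-up names, writers are assumed to be serialized by some lock that is not shown, and removal is left out because it needs a safe memory reclamation scheme.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define NBUCKETS 64	/* hypothetical table size */

struct entry {
	unsigned long key;
	int value;
	struct entry *_Atomic next;	/* next entry in the same bucket */
};

struct table {
	struct entry *_Atomic bucket[NBUCKETS];
};

static size_t
bucket_of(unsigned long key)
{
	return key % NBUCKETS;
}

/*
 * Insert at the head of the chain.
 * Assumption: callers serialize writers (e.g. per-bucket lock, not shown).
 */
static void
table_insert(struct table *t, struct entry *e)
{
	struct entry *_Atomic *head;

	assert(e != NULL);
	head = &t->bucket[bucket_of(e->key)];

	/* Link the new entry to the existing chain before publishing it. */
	atomic_store_explicit(&e->next,
	    atomic_load_explicit(head, memory_order_relaxed),
	    memory_order_relaxed);
	/* Release: a reader that sees e also sees its initialized fields. */
	atomic_store_explicit(head, e, memory_order_release);
}

/* Lookup takes no lock; it may run concurrently with table_insert(). */
static struct entry *
table_lookup(struct table *t, unsigned long key)
{
	struct entry *e;

	e = atomic_load_explicit(&t->bucket[bucket_of(key)],
	    memory_order_acquire);
	while (e != NULL) {
		if (e->key == key)
			return e;
		e = atomic_load_explicit(&e->next, memory_order_acquire);
	}
	return NULL;
}
```

A reader walking a chain either sees the new entry or it does not; it never observes a half-linked one, which is what makes additions cheap to handle.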

Re: Blocking vcache_tryvget() across VOP_INACTIVE() - unneeded

2020-01-21 Thread Mateusz Guzik
On 1/21/20, Andrew Doran wrote: > On Thu, Jan 16, 2020 at 04:51:44AM +0100, Mateusz Guzik wrote: >> >> I'm assuming the goal for the foreseeable future is to achieve path >> lookup >> with shared vnode locks. > > Good guess. There is a prototype of LK_SHARED lo

Re: Blocking vcache_tryvget() across VOP_INACTIVE() - unneeded

2020-01-15 Thread Mateusz Guzik
is to move some of the current body into smaller helpers and then let filesystems piece together their own VOP_VRELE. -- Mateusz Guzik

Re: fcntl(F_GETPATH) support

2019-09-16 Thread Mateusz Guzik
reated files to the namecache and perhaps NetBSD is not doing it either. In which case F_GETPATH on newly created files is guaranteed to fail unless someone looked them up separately. Whether this should be modified is imho a separate discussion, but is relevant to points made earlier about memory use and DoSability. Imho it should be fine to change the kernel to always add these. -- Mateusz Guzik

Re: fcntl(F_GETPATH) support

2019-09-15 Thread Mateusz Guzik
On 9/15/19, Mateusz Guzik wrote: > On 9/14/19, Christos Zoulas wrote: >> >> Comments? >> >> +error = vnode_to_path(kpath, MAXPATHLEN, fp->f_vnode, l, l->l_proc); > > What motivates this change? > > I think it is a little problematic in that nam

Re: NCHNAMLEN vnode cache limitation removal

2019-09-15 Thread Mateusz Guzik
ing anything but the last component you win big time. (Of course you can't "just" do it, the leapfrogging is there for a reason but it can be worked out with tracking the state and having safe memory reclamation.) TL;DR it's minor quality of life improvement for users and a de facto mandat

Re: fcntl(F_GETPATH) support

2019-09-15 Thread Mateusz Guzik
this should be added unless namecache becomes reliable. The above reasoning is why I did not add it to FreeBSD. -- Mateusz Guzik

Re: mutex vs turnstile

2018-02-14 Thread Mateusz Guzik
On Thu, Jan 18, 2018 at 10:38:02AM +, Nick Hudson wrote: > On 01/09/18 03:30, Mateusz Guzik wrote: > > Some time ago I wrote about performance problems when doing high -j > > build.sh and made few remarks about mutex implementation. > > > > TL;DR for that one

Re: Race condition between an LWP migration and curlwp_bind

2018-02-14 Thread Mateusz Guzik
536,7 @@ curlwp_bind(void)
 	bound = curlwp->l_pflag & LP_BOUND;
 	curlwp->l_pflag |= LP_BOUND;
+	__insn_barrier();
 	return bound;
 }
@@ -545,6 +546,7 @@ curlwp_bindx(int bound)
 {
 	KASSERT(curlwp->l_pflag & LP_BOUND);
+	__insn_barrier();
 	curlwp->l_pflag ^= bound ^ LP_BOUND;
 }
-- Mateusz Guzik
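
For readers unfamiliar with the construct, the effect of the patch can be sketched in userspace C. This assumes GCC or Clang; `compiler_barrier()` is a stand-in for the kernel's `__insn_barrier()`, and `struct lwp_sketch`, `bind_sketch`, `bindx_sketch` and the `LP_BOUND` value are made up for illustration. A compiler barrier only stops the compiler from moving memory accesses across it; it emits no CPU fence.

```c
#include <assert.h>

#define LP_BOUND 0x1u	/* hypothetical flag value */

/* Hypothetical stand-in for the per-LWP structure. */
struct lwp_sketch {
	unsigned int l_pflag;
};

/* Compiler-only barrier (GCC/Clang syntax): no instruction is emitted. */
static inline void
compiler_barrier(void)
{
	__asm__ __volatile__("" ::: "memory");
}

/* Mark the LWP as bound; return the previous state of the flag. */
static unsigned int
bind_sketch(struct lwp_sketch *l)
{
	unsigned int bound = l->l_pflag & LP_BOUND;

	l->l_pflag |= LP_BOUND;
	/* Keep the flag store ahead of whatever the caller does next. */
	compiler_barrier();
	return bound;
}

/* Restore the state saved by bind_sketch(). */
static void
bindx_sketch(struct lwp_sketch *l, unsigned int bound)
{
	assert(l->l_pflag & LP_BOUND);
	/* Keep the caller's earlier accesses ahead of the flag update. */
	compiler_barrier();
	l->l_pflag ^= bound ^ LP_BOUND;
}
```

The XOR clears the flag only when it was not set before the matching bind, so nested bind/unbind pairs leave an outer binding intact.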

mutex vs turnstile

2018-01-09 Thread Mateusz Guzik
egardless *who* owns the lock (or whether they are running), but then only goes to sleep if the *original* owner has the lock. -- Mateusz Guzik

Re: performance issues during build.sh -j 40 kernel

2017-09-11 Thread Mateusz Guzik
On Sat, Sep 09, 2017 at 08:48:19PM +0200, Mateusz Guzik wrote: > > Here is a bunch of "./build.sh -j 40 kernel=MYCONF > /dev/null" on stock > kernel: > 618.65s user 1097.80s system 2502% cpu 1:08.60 total [..] > > And on kernel with total hacks: > 594.08s us

Re: performance issues during build.sh -j 40 kernel

2017-09-11 Thread Mateusz Guzik
On Sun, Sep 10, 2017 at 06:51:31PM +0100, Mindaugas Rasiukevicius wrote: > Mateusz Guzik <mjgu...@gmail.com> wrote: > > 1. exclusive vnode locking (genfs_lock) > > > > ... > > > > 2. uvm_fault_internal > > > > ... > > > > 4. vm l

Re: performance issues during build.sh -j 40 kernel

2017-09-11 Thread Mateusz Guzik
On Sun, Sep 10, 2017 at 07:29:11PM +0200, Maxime Villard wrote: > On 09/09/2017 at 20:48, Mateusz Guzik wrote: > > [...] > > I installed the 7.1 release, downloaded recent git snapshot and built the > > trunk kernel while

performance issues during build.sh -j 40 kernel

2017-09-10 Thread Mateusz Guzik
to grab the lock and that caused problems for everyone else waiting on the vm obj lock. The spin loop itself is weird in the sense that instead of just having the pause instruction embedded it calls a function. This is probably unnecessarily less power/other thread friendly than it needs to be. Cheers, -- Mateusz Guzik
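
To illustrate what "having the pause instruction embedded" could look like, here is a minimal userspace sketch of a test-and-test-and-set spin lock with the hint inlined. It assumes GCC or Clang inline assembly; `cpu_relax`, `struct spinlock` and the `spin_*` functions are hypothetical names and this is not the NetBSD mutex code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Spin-wait hint, inlined at the call site instead of a function call. */
static inline void
cpu_relax(void)
{
#if defined(__x86_64__) || defined(__i386__)
	__asm__ __volatile__("pause" ::: "memory");
#elif defined(__aarch64__)
	__asm__ __volatile__("yield" ::: "memory");
#else
	__asm__ __volatile__("" ::: "memory");	/* compiler barrier only */
#endif
}

struct spinlock {
	atomic_bool locked;
};

/* Try once; returns true if the lock was acquired. */
static bool
spin_trylock(struct spinlock *l)
{
	return !atomic_exchange_explicit(&l->locked, true,
	    memory_order_acquire);
}

static void
spin_lock(struct spinlock *l)
{
	while (!spin_trylock(l)) {
		/* Wait with plain loads so the cacheline stays shared. */
		while (atomic_load_explicit(&l->locked, memory_order_relaxed))
			cpu_relax();
	}
}

static void
spin_unlock(struct spinlock *l)
{
	assert(atomic_load_explicit(&l->locked, memory_order_relaxed));
	atomic_store_explicit(&l->locked, false, memory_order_release);
}
```

The inner loop only reads the lock word and issues the hint, so waiters do not keep pulling the cacheline exclusive while the owner is still working.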

unsafe ->p_cwdi access in mount_checkdirs

2017-09-08 Thread Mateusz Guzik
== mount_checkdirs would then atomic_inc_not_zero (or however you have this named). With the above, if cwdi was spotted, it is guaranteed not to get freed until after proc_lock is dropped. A successful non-zero -> non-zero + 1 refcount bump guarantees it won't get freed and the content will remain valid. -- Mateusz Guzik
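
The "non-zero -> non-zero + 1" bump can be sketched in portable C11 as follows. This is a userspace illustration, not kernel code: `struct obj` stands in for the cwdi structure, and `ref_inc_not_zero` and `ref_release` are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical refcounted object standing in for the cwdi structure. */
struct obj {
	atomic_uint refcnt;
};

/*
 * Take a reference only if the count is still non-zero.
 * Returns false when the last reference is already gone, in which
 * case the caller must not touch the object.
 */
static bool
ref_inc_not_zero(struct obj *o)
{
	unsigned int old;

	old = atomic_load_explicit(&o->refcnt, memory_order_relaxed);
	while (old != 0) {
		/* On failure, old is refreshed with the current value. */
		if (atomic_compare_exchange_weak_explicit(&o->refcnt, &old,
		    old + 1, memory_order_acquire, memory_order_relaxed))
			return true;
	}
	return false;
}

/* Drop a reference; returns true if the caller dropped the last one. */
static bool
ref_release(struct obj *o)
{
	unsigned int old;

	old = atomic_fetch_sub_explicit(&o->refcnt, 1, memory_order_acq_rel);
	assert(old != 0);
	return old == 1;
}
```

Once the count has reached zero it never goes back up through this path, so whoever dropped the last reference can free the object without racing against a late reference grab.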

fork1 use-after-free of the child process

2017-09-08 Thread Mateusz Guzik
extra lock/unlock of the global process list lock. -- Mateusz Guzik

Re: In-kernel process exit hooks?

2016-01-09 Thread Mateusz Guzik
Nothing was done to keep the parent around, which means it could have exited and be freed by now. Which in turn makes the second loop iteration a use-after-free. The first iteration is safe if the argument is curproc (which it is), as it clearly cannot disappear. Turns out this traversal is even more wrong since p_pptr does not have to be the process you are looking for - ptrace is reparenting tracees.
>}
>}
>return (NULL);
>}
-- Mateusz Guzik

Re: In-kernel process exit hooks?

2016-01-09 Thread Mateusz Guzik
On Sat, Jan 09, 2016 at 08:25:05AM +0100, Mateusz Guzik wrote: > On Sat, Jan 09, 2016 at 02:25:02PM +0800, Paul Goyette wrote: > > On Sat, 9 Jan 2016, Mateusz Guzik wrote: > > > > >While I don't know all the details, it is clear that the purpose would > > >be m

Re: In-kernel process exit hooks?

2016-01-09 Thread Mateusz Guzik
On Sat, Jan 09, 2016 at 02:25:02PM +0800, Paul Goyette wrote: > On Sat, 9 Jan 2016, Mateusz Guzik wrote: > > >While I don't know all the details, it is clear that the purpose would > >be much better served by ktrace and I would argue efforts should be > >spent there