Re: malloc: less unlock/lock dancing
On Wed, May 10, 2023 at 10:08:09AM +0200, Theo Buehler wrote:

> > Thanks! It has been committed. I doubt, though, that the Go runtime
> > uses libc malloc.
>
> I don't know if the pure Go runtime uses libc malloc. However, some
> of the test code involves cgo and calls into various C libraries
> including libcrypto. So it definitely exercised malloc in a threaded
> environment.

Good to know, thanks again,

	-Otto
Re: malloc: less unlock/lock dancing
> Thanks! It has been committed. I doubt, though, that the Go runtime
> uses libc malloc.

I don't know if the pure Go runtime uses libc malloc. However, some
of the test code involves cgo and calls into various C libraries
including libcrypto. So it definitely exercised malloc in a threaded
environment.
Re: malloc: less unlock/lock dancing
On Tue, May 09, 2023 at 09:55:32PM +0200, Theo Buehler wrote:

> On Thu, May 04, 2023 at 03:40:35PM +0200, Otto Moerbeek wrote:
> > On Thu, Apr 27, 2023 at 02:17:10PM +0200, Otto Moerbeek wrote:
> > >
> > > This was introduced to not stall other threads while mmap is called by
> > > a thread. But now that mmap is unlocked, I believe it is no longer
> > > useful.
> > >
> > > A full build is slightly faster with this. But this also needs testing
> > > with your favorite multithreaded program.
> >
> > I'd really like some feedback/performance tests on this.
>
> I see the same: slight reduction of 'make build' time and it also works
> fine with heavily threaded things like some Go that I play with. I have
> put this through a full bulk build with no fallout. It seems to be
> slightly faster for this as well.
>
> I have been running this on several machines including my main laptop
> and could not spot a single downside.
>
> The diff makes sense, so
>
> ok tb

Thanks! It has been committed. I doubt, though, that the Go runtime
uses libc malloc.

	-Otto
Re: malloc: less unlock/lock dancing
On Thu, May 04, 2023 at 03:40:35PM +0200, Otto Moerbeek wrote:

> On Thu, Apr 27, 2023 at 02:17:10PM +0200, Otto Moerbeek wrote:
> >
> > This was introduced to not stall other threads while mmap is called by
> > a thread. But now that mmap is unlocked, I believe it is no longer
> > useful.
> >
> > A full build is slightly faster with this. But this also needs testing
> > with your favorite multithreaded program.
>
> I'd really like some feedback/performance tests on this.

I see the same: slight reduction of 'make build' time and it also works
fine with heavily threaded things like some Go that I play with. I have
put this through a full bulk build with no fallout. It seems to be
slightly faster for this as well.

I have been running this on several machines including my main laptop
and could not spot a single downside.

The diff makes sense, so

ok tb
Re: malloc: less unlock/lock dancing
On Thu, Apr 27, 2023 at 02:17:10PM +0200, Otto Moerbeek wrote:

> This was introduced to not stall other threads while mmap is called by
> a thread. But now that mmap is unlocked, I believe it is no longer
> useful.
>
> A full build is slightly faster with this. But this also needs testing
> with your favorite multithreaded program.

I'd really like some feedback/performance tests on this.

	-Otto

> Index: stdlib/malloc.c
> ===================================================================
> RCS file: /home/cvs/src/lib/libc/stdlib/malloc.c,v
> retrieving revision 1.282
> diff -u -p -r1.282 malloc.c
> --- stdlib/malloc.c	21 Apr 2023 06:19:40 -	1.282
> +++ stdlib/malloc.c	27 Apr 2023 05:40:49 -
> @@ -264,24 +264,6 @@ static void malloc_exit(void);
>  	(sz) = (uintptr_t)(r)->p & MALLOC_PAGEMASK,	\
>  	(sz) = ((sz) == 0 ? (r)->size : B2SIZE((sz) - 1))
>  
> -static inline void
> -_MALLOC_LEAVE(struct dir_info *d)
> -{
> -	if (d->malloc_mt) {
> -		d->active--;
> -		_MALLOC_UNLOCK(d->mutex);
> -	}
> -}
> -
> -static inline void
> -_MALLOC_ENTER(struct dir_info *d)
> -{
> -	if (d->malloc_mt) {
> -		_MALLOC_LOCK(d->mutex);
> -		d->active++;
> -	}
> -}
> -
>  static inline size_t
>  hash(void *p)
>  {
> @@ -879,9 +861,7 @@ map(struct dir_info *d, size_t sz, int z
>  		return p;
>  	}
>  	if (psz <= 1) {
> -		_MALLOC_LEAVE(d);
>  		p = MMAP(cache->max * sz, d->mmap_flag);
> -		_MALLOC_ENTER(d);
>  		if (p != MAP_FAILED) {
>  			STATS_ADD(d->malloc_used, cache->max * sz);
>  			cache->length = cache->max - 1;
> @@ -901,9 +881,7 @@ map(struct dir_info *d, size_t sz, int z
>  		}
>  
>  	}
> -	_MALLOC_LEAVE(d);
>  	p = MMAP(sz, d->mmap_flag);
> -	_MALLOC_ENTER(d);
>  	if (p != MAP_FAILED)
>  		STATS_ADD(d->malloc_used, sz);
>  	/* zero fill not needed */
malloc: less unlock/lock dancing
This was introduced to not stall other threads while mmap is called by
a thread. But now that mmap is unlocked, I believe it is no longer
useful.

A full build is slightly faster with this. But this also needs testing
with your favorite multithreaded program.

	-Otto

Index: stdlib/malloc.c
===================================================================
RCS file: /home/cvs/src/lib/libc/stdlib/malloc.c,v
retrieving revision 1.282
diff -u -p -r1.282 malloc.c
--- stdlib/malloc.c	21 Apr 2023 06:19:40 -	1.282
+++ stdlib/malloc.c	27 Apr 2023 05:40:49 -
@@ -264,24 +264,6 @@ static void malloc_exit(void);
 	(sz) = (uintptr_t)(r)->p & MALLOC_PAGEMASK,	\
 	(sz) = ((sz) == 0 ? (r)->size : B2SIZE((sz) - 1))
 
-static inline void
-_MALLOC_LEAVE(struct dir_info *d)
-{
-	if (d->malloc_mt) {
-		d->active--;
-		_MALLOC_UNLOCK(d->mutex);
-	}
-}
-
-static inline void
-_MALLOC_ENTER(struct dir_info *d)
-{
-	if (d->malloc_mt) {
-		_MALLOC_LOCK(d->mutex);
-		d->active++;
-	}
-}
-
 static inline size_t
 hash(void *p)
 {
@@ -879,9 +861,7 @@ map(struct dir_info *d, size_t sz, int z
 		return p;
 	}
 	if (psz <= 1) {
-		_MALLOC_LEAVE(d);
 		p = MMAP(cache->max * sz, d->mmap_flag);
-		_MALLOC_ENTER(d);
 		if (p != MAP_FAILED) {
 			STATS_ADD(d->malloc_used, cache->max * sz);
 			cache->length = cache->max - 1;
@@ -901,9 +881,7 @@ map(struct dir_info *d, size_t sz, int z
 		}
 
 	}
-	_MALLOC_LEAVE(d);
 	p = MMAP(sz, d->mmap_flag);
-	_MALLOC_ENTER(d);
 	if (p != MAP_FAILED)
 		STATS_ADD(d->malloc_used, sz);
 	/* zero fill not needed */