Re: uvm_fault: amap & anon locking

2020-12-31 Thread Mark Kettenis
> Date: Wed, 30 Dec 2020 11:19:41 -0300
> From: Martin Pieuchot 
> 
> Diff below adds some locking to UVM's amap & anon data structures that
> should be enough to get the upper part of the fault handler out of the
> KERNEL_LOCK().
> 
> This diff doesn't unlock the fault handler, I'd suggest to do this in a
> later step on an arch by arch basis.
> 
> This is a port of what exists in NetBSD.  A rwlock is attached to every
> amap and is shared with all its anon.  The same lock will be used by
> multiple amaps if they have anons in common.  This diff includes the new
> rw_obj_* API required to have reference-counted rwlocks.
> 
> Other than that a global rwlock is used to protect the list of amap and
> many pool have been converted to use a rwlock internally to not create
> lock ordering problem when allocations are made while holding a rwlock.

Can you explain what those lock odering problems are?  To me it seems
that a pool mutex should always be taken last, and that the pool
system doesn't need to enter the amap or anon code.

Removing the check in pool_get that PR_WAITOK is set for pools created
with PR_RWLOCK is a bit problematic from my perspective.  At the very
least we should adjust the man page.

> The style of the diff is sometimes questionable.  This is done to reduce
> differences with NetBSD sources in order to help porting more locking goos.
> 
> This has been extensively tested as part of the unlocking diff I sent to
> many developers.  However, I'd appreciate if you could test again because
> this diff doesn't include WITNESS and do not unlock the fault handler.
> 
> Index: kern/init_main.c
> ===
> RCS file: /cvs/src/sys/kern/init_main.c,v
> retrieving revision 1.303
> diff -u -p -r1.303 init_main.c
> --- kern/init_main.c  28 Dec 2020 14:01:23 -  1.303
> +++ kern/init_main.c  29 Dec 2020 14:13:52 -
> @@ -232,6 +232,7 @@ main(void *framep)
>   KERNEL_LOCK_INIT();
>   SCHED_LOCK_INIT();
>  
> + rw_obj_init();
>   uvm_init();
>   disk_init();/* must come before autoconfiguration */
>   tty_init(); /* initialise tty's */
> Index: kern/kern_rwlock.c
> ===
> RCS file: /cvs/src/sys/kern/kern_rwlock.c,v
> retrieving revision 1.45
> diff -u -p -r1.45 kern_rwlock.c
> --- kern/kern_rwlock.c2 Mar 2020 17:07:49 -   1.45
> +++ kern/kern_rwlock.c30 Dec 2020 14:03:00 -
> @@ -19,6 +19,7 @@
>  
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -487,4 +488,124 @@ int
>  rrw_status(struct rrwlock *rrwl)
>  {
>   return (rw_status(&rrwl->rrwl_lock));
> +}
> +
> +/*-
> + * Copyright (c) 2008 The NetBSD Foundation, Inc.
> + * All rights reserved.
> + *
> + * This code is derived from software contributed to The NetBSD Foundation
> + * by Andrew Doran.
> + *
> + * Redistribution and use in source and binary forms, with or without
> + * modification, are permitted provided that the following conditions
> + * are met:
> + * 1. Redistributions of source code must retain the above copyright
> + *notice, this list of conditions and the following disclaimer.
> + * 2. Redistributions in binary form must reproduce the above copyright
> + *notice, this list of conditions and the following disclaimer in the
> + *documentation and/or other materials provided with the distribution.
> + *
> + * THIS SOFTWARE IS PROVIDED BY THE NETBSD FOUNDATION, INC. AND CONTRIBUTORS
> + * ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT 
> LIMITED
> + * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
> + * PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE FOUNDATION OR CONTRIBUTORS
> + * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
> + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
> + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
> + * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
> + * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
> + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
> + * POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#define  RWLOCK_OBJ_MAGIC0x5aa3c85d
> +struct rwlock_obj {
> + struct rwlock   ro_lock;
> + u_int   ro_magic;
> + u_int   ro_refcnt;
> +};
> +
> +
> +struct pool rwlock_obj_pool;
> +
> +/*
> + * rw_obj_init:
> + *
> + *   Initialize the mutex object store.
> + */
> +void
> +rw_obj_init(void)
> +{
> + pool_init(&rwlock_obj_pool, sizeof(struct rwlock_obj), 0, IPL_NONE,
> + PR_WAITOK | PR_RWLOCK, "rwobjpl", NULL);
> +}
> +
> +/*
> + * rw_obj_alloc:
> + *
> + *   Allocate a single lock object.
> + */
> +void
> +_rw_obj_alloc_flags(struct rwlock **lock, const char *name, int flags,
> +struct lock_type *type)
> 

drm(4) memory allocation diff

2020-12-31 Thread Mark Kettenis
The diff below changes the emulated Linux memory allocation functions
a bit such that they only use malloc(9) for allocations smaller than a
page.  This reduces pressure on the "interrupt safe" map and hopefully
will avoid the

uvm_mapent_alloc: out of static map entries

messages that some people have seen more often in the last few months.
It also could help with some memory allocation issues seem by
amdgpu(4) users.

The downside of this approach is that memory leaks will be harder to
spot as the larger memory allocations are no longer included in the
DRM type as reported by vmstat -m.  Another downside is that this no
longer caps the amount of kernel memory that can be allocated by
drm(4).  If that turns out to be a problem, we can impose a limit on
the amount of memory that can be allocated this way.

The implementation needs to keep track of the size of each allocated
memory block.  This is done using a red-black tree.  Our kernel uses
red-black trees in similar situations but I wouldn't call myself an
expert in the area of data structures so a there may be a better
approach.

Some real-life testing would be appreciated.


Index: dev/pci/drm/drm_linux.c
===
RCS file: /cvs/src/sys/dev/pci/drm/drm_linux.c,v
retrieving revision 1.74
diff -u -p -r1.74 drm_linux.c
--- dev/pci/drm/drm_linux.c 31 Dec 2020 06:31:55 -  1.74
+++ dev/pci/drm/drm_linux.c 31 Dec 2020 20:44:38 -
@@ -430,6 +430,116 @@ dmi_check_system(const struct dmi_system
return (num);
 }
 
+struct vmalloc_entry {
+   const void  *addr;
+   size_t  size;
+   RBT_ENTRY(vmalloc_entry) vmalloc_node;
+};
+
+struct pool vmalloc_pool;
+RBT_HEAD(vmalloc_tree, vmalloc_entry) vmalloc_tree;
+
+RBT_PROTOTYPE(vmalloc_tree, vmalloc_entry, vmalloc_node, vmalloc_compare);
+
+static inline int
+vmalloc_compare(const struct vmalloc_entry *a, const struct vmalloc_entry *b)
+{
+   vaddr_t va = (vaddr_t)a->addr;
+   vaddr_t vb = (vaddr_t)b->addr;
+
+   return va < vb ? -1 : va > vb;
+}
+
+RBT_GENERATE(vmalloc_tree, vmalloc_entry, vmalloc_node, vmalloc_compare);
+
+bool
+is_vmalloc_addr(const void *addr)
+{
+   struct vmalloc_entry key;
+   struct vmalloc_entry *entry;
+
+   key.addr = addr;
+   entry = RBT_FIND(vmalloc_tree, &vmalloc_tree, &key);
+   return (entry != NULL);
+}
+
+void *
+vmalloc(unsigned long size)
+{
+   struct vmalloc_entry *entry;
+   void *addr;
+
+   size = roundup(size, PAGE_SIZE);
+   addr = km_alloc(size, &kv_any, &kp_dirty, &kd_waitok);
+   if (addr) {
+   entry = pool_get(&vmalloc_pool, PR_WAITOK);
+   entry->addr = addr;
+   entry->size = size;
+   RBT_INSERT(vmalloc_tree, &vmalloc_tree, entry);
+   }
+
+   return addr;
+}
+
+void *
+vzalloc(unsigned long size)
+{
+   struct vmalloc_entry *entry;
+   void *addr;
+
+   size = roundup(size, PAGE_SIZE);
+   addr = km_alloc(size, &kv_any, &kp_zero, &kd_waitok);
+   if (addr) {
+   entry = pool_get(&vmalloc_pool, PR_WAITOK);
+   entry->addr = addr;
+   entry->size = size;
+   RBT_INSERT(vmalloc_tree, &vmalloc_tree, entry);
+   }
+
+   return addr;
+}
+
+void
+vfree(const void *addr)
+{
+   struct vmalloc_entry key;
+   struct vmalloc_entry *entry;
+
+   key.addr = addr;
+   entry = RBT_FIND(vmalloc_tree, &vmalloc_tree, &key);
+   if (entry == NULL)
+   panic("%s: non vmalloced addr %p", __func__, addr);
+
+   RBT_REMOVE(vmalloc_tree, &vmalloc_tree, entry);
+   km_free((void *)addr, entry->size, &kv_any, &kp_dirty);
+   pool_put(&vmalloc_pool, entry);
+}
+
+void *
+kvmalloc(size_t size, gfp_t flags)
+{
+   if (flags != GFP_KERNEL || size < PAGE_SIZE)
+   return malloc(size, M_DRM, flags);
+   return vmalloc(size);
+}
+
+void *
+kvzalloc(size_t size, gfp_t flags)
+{
+   if (flags != GFP_KERNEL || size < PAGE_SIZE)
+   return malloc(size, M_DRM, flags | M_ZERO);
+   return vzalloc(size);
+}
+
+void
+kvfree(const void *addr)
+{
+   if (is_vmalloc_addr(addr))
+   vfree(addr);
+   else
+   free((void *)addr, M_DRM, 0);
+}
+
 struct vm_page *
 alloc_pages(unsigned int gfp_mask, unsigned int order)
 {
@@ -1939,6 +2049,10 @@ drm_linux_init(void)
 
pool_init(&idr_pool, sizeof(struct idr_entry), 0, IPL_TTY, 0,
"idrpl", NULL);
+
+   pool_init(&vmalloc_pool, sizeof(struct vmalloc_entry), 0, IPL_NONE, 0,
+   "vmallocpl", NULL);
+   RBT_INIT(vmalloc_tree, &vmalloc_tree);
 }
 
 void
Index: dev/pci/drm/include/linux/mm.h
===
RCS file: /cvs/src/sys/dev/pci/drm/include/linux/mm.h,v
retrieving revision 1.6
diff -u -p -r1.6 mm.h
--- dev/pci/drm/include/linux/mm.h  7 Dec 2020 09:10:42 -   1.6

Re: ftp(1): handle HTTP 308

2020-12-31 Thread Lucas
Weekly bump

Index: fetch.c
===
RCS file: /home/cvs/src/usr.bin/ftp/fetch.c,v
retrieving revision 1.198
diff -u -p -r1.198 fetch.c
--- fetch.c 18 Oct 2020 20:35:18 -  1.198
+++ fetch.c 24 Dec 2020 14:03:03 -
@@ -843,6 +843,7 @@ noslash:
case 302:   /* Found */
case 303:   /* See Other */
case 307:   /* Temporary Redirect */
+   case 308:   /* Permanent Redirect */
isredirect++;
if (redirect_loop++ > 10) {
warnx("Too many redirections requested");



Re: libc/regex: const'r'us

2020-12-31 Thread Todd C . Miller
All three committed, thanks.

 - todd



Re: libc/regex: more regular error handling

2020-12-31 Thread Todd C . Miller
On Thu, 31 Dec 2020 07:29:51 +, Miod Vallat wrote:

> The REQUIRE macro is used to check for a condition, and set an error in
> the parse struct if it is not satisfied.
>
> This changes it from ((condition) || function call) to a, IMHO more
> readable, if() test.

OK millert@

 - todd



Re: libc/regex: drop more unused data

2020-12-31 Thread Todd C . Miller
On Thu, 31 Dec 2020 07:25:19 +, Miod Vallat wrote:

> re_guts catspace[] is only written to (via categories[]), and never used
> for anything, so don't bother keeping that.

OK millert@

 - todd



Re: libc/regex: const'r'us

2020-12-31 Thread Todd C . Miller
On Thu, 31 Dec 2020 07:27:49 +, Miod Vallat wrote:

> Spencer's code was written before const was a thing, but we can do
> better. Neither regcomp(3) nor regex(3) modify the strings they are
> being passed, so we can keep internal pointers as const as well and
> avoid {dub,spur}ious casts.
>
> While there, the temporary array in nonnewline() can be made static
> const as well rather than recreated every time.

OK millert@

 - todd



Re: Thread local data setup and destruct

2020-12-31 Thread Alexander Bluhm
On Tue, Dec 29, 2020 at 04:07:19PM +0100, Otto Moerbeek wrote:
> This workds better, checking the flags does not work if the thread is
> already on the road to desctruction.

This diff survived a full regress run on amd64.

bluhm

> Index: asr/asr.c
> ===
> RCS file: /cvs/src/lib/libc/asr/asr.c,v
> retrieving revision 1.64
> diff -u -p -r1.64 asr.c
> --- asr/asr.c 6 Jul 2020 13:33:05 -   1.64
> +++ asr/asr.c 29 Dec 2020 15:05:45 -
> @@ -117,7 +117,7 @@ _asr_resolver_done(void *arg)
>   _asr_ctx_unref(ac);
>   return;
>   } else {
> - priv = _THREAD_PRIVATE(_asr, _asr, &_asr);
> + priv = _THREAD_PRIVATE_DT(_asr, _asr, NULL, &_asr);
>   if (*priv == NULL)
>   return;
>   asr = *priv;
> @@ -128,6 +128,23 @@ _asr_resolver_done(void *arg)
>   free(asr);
>  }
>  
> +static void
> +_asr_resolver_done_tp(void *arg)
> +{
> + char buf[100];
> + int len;
> + struct asr **priv = arg;
> + struct asr *asr;
> +
> + if (*priv == NULL)
> + return;
> + asr = *priv;
> +
> + _asr_ctx_unref(asr->a_ctx);
> + free(asr);
> + free(priv);
> +}
> +
>  void *
>  asr_resolver_from_string(const char *str)
>  {
> @@ -349,7 +366,8 @@ _asr_use_resolver(void *arg)
>   }
>   else {
>   DPRINT("using thread-local resolver\n");
> - priv = _THREAD_PRIVATE(_asr, _asr, &_asr);
> + priv = _THREAD_PRIVATE_DT(_asr, _asr, _asr_resolver_done_tp,
> + &_asr);
>   if (*priv == NULL) {
>   DPRINT("setting up thread-local resolver\n");
>   *priv = _asr_resolver();
> Index: include/thread_private.h
> ===
> RCS file: /cvs/src/lib/libc/include/thread_private.h,v
> retrieving revision 1.35
> diff -u -p -r1.35 thread_private.h
> --- include/thread_private.h  13 Feb 2019 13:22:14 -  1.35
> +++ include/thread_private.h  29 Dec 2020 15:05:45 -
> @@ -98,7 +98,8 @@ struct thread_callbacks {
>   void(*tc_mutex_destroy)(void **);
>   void(*tc_tag_lock)(void **);
>   void(*tc_tag_unlock)(void **);
> - void*(*tc_tag_storage)(void **, void *, size_t, void *);
> + void*(*tc_tag_storage)(void **, void *, size_t, void (*)(void *),
> +void *);
>   __pid_t (*tc_fork)(void);
>   __pid_t (*tc_vfork)(void);
>   void(*tc_thread_release)(struct pthread *);
> @@ -142,6 +143,7 @@ __END_HIDDEN_DECLS
>  #define _THREAD_PRIVATE_MUTEX_LOCK(name) do {} while (0)
>  #define _THREAD_PRIVATE_MUTEX_UNLOCK(name)   do {} while (0)
>  #define _THREAD_PRIVATE(keyname, storage, error) &(storage)
> +#define _THREAD_PRIVATE_DT(keyname, storage, dt, error)  &(storage)
>  #define _MUTEX_LOCK(mutex)   do {} while (0)
>  #define _MUTEX_UNLOCK(mutex) do {} while (0)
>  #define _MUTEX_DESTROY(mutex)do {} while (0)
> @@ -168,7 +170,12 @@ __END_HIDDEN_DECLS
>  #define _THREAD_PRIVATE(keyname, storage, error) \
>   (_thread_cb.tc_tag_storage == NULL ? &(storage) :   \
>   _thread_cb.tc_tag_storage(&(__THREAD_NAME(keyname)),\
> - &(storage), sizeof(storage), error))
> + &(storage), sizeof(storage), NULL, (error)))
> +
> +#define _THREAD_PRIVATE_DT(keyname, storage, dt, error)  
> \
> + (_thread_cb.tc_tag_storage == NULL ? &(storage) :   \
> + _thread_cb.tc_tag_storage(&(__THREAD_NAME(keyname)),\
> + &(storage), sizeof(storage), (dt), (error)))
>  
>  /*
>   * Macros used in libc to access mutexes.
> Index: thread/rthread_cb.h
> ===
> RCS file: /cvs/src/lib/libc/thread/rthread_cb.h,v
> retrieving revision 1.2
> diff -u -p -r1.2 rthread_cb.h
> --- thread/rthread_cb.h   5 Sep 2017 02:40:54 -   1.2
> +++ thread/rthread_cb.h   29 Dec 2020 15:05:45 -
> @@ -35,5 +35,5 @@ void_thread_mutex_unlock(void **);
>  void _thread_mutex_destroy(void **);
>  void _thread_tag_lock(void **);
>  void _thread_tag_unlock(void **);
> -void *_thread_tag_storage(void **, void *, size_t, void *);
> +void *_thread_tag_storage(void **, void *, size_t, void (*)(void*), void *);
>  __END_HIDDEN_DECLS
> Index: thread/rthread_libc.c
> ===
> RCS file: /cvs/src/lib/libc/thread/rthread_libc.c,v
> retrieving revision 1.3
> diff -u -p -r1.3 rthread_libc.c
> --- thread/rthread_libc.c 10 Jan 2019 18:45:33 -  1.3
> +++ thread/rthread_libc.c 29 Dec 2020 15:05:45 -
> @@ -5,6 +5,8 @@
>  #include 
>  #include 
>  #include 
> +#include 
> +#include 
>  
>  #include "rth

Re: login_passwd.c (etc.) and auth_mkvalue(3) returning NULL

2020-12-31 Thread Todd C . Miller
On Thu, 31 Dec 2020 15:27:02 +1100, Ross L Richardson wrote:

> It could, of course, just use a fixed string rather than the "%s" format,
> although the latter is certainly clear(er) and consistent.

I originally had a fixed string but decided that using the "%s"
format was clearer.

> With auth_mkvalue() not being used, I don't think it needs to include
>  any more.

Good catch.

 - todd



Re: httpd: another log related leak

2020-12-31 Thread Claudio Jeker
On Thu, Dec 31, 2020 at 11:21:44AM +0100, Theo Buehler wrote:
> msg is allocated by vasprintf, and is leaked on return of server_sendlog.
> vasprintf calculates the length of the string, so we can zap a needless
> call to strlen while there.
> 
> Index: server.c
> ===
> RCS file: /cvs/src/usr.sbin/httpd/server.c,v
> retrieving revision 1.121
> diff -u -p -r1.121 server.c
> --- server.c  11 Oct 2020 03:21:44 -  1.121
> +++ server.c  31 Dec 2020 10:06:28 -
> @@ -1251,12 +1251,14 @@ server_sendlog(struct server_config *srv
>   iov[0].iov_base = &srv_conf->id;
>   iov[0].iov_len = sizeof(srv_conf->id);
>   iov[1].iov_base = msg;
> - iov[1].iov_len = strlen(msg) + 1;
> + iov[1].iov_len = ret + 1;
>  
>   if (proc_composev(httpd_env->sc_ps, PROC_LOGGER, cmd, iov, 2) != 0) {
>   log_warn("%s: failed to compose imsg", __func__);
> + free(msg);
>   return;
>   }
> + free(msg);
>  }
>  
>  void
> 

OK claudio@

-- 
:wq Claudio



httpd: another log related leak

2020-12-31 Thread Theo Buehler
msg is allocated by vasprintf, and is leaked on return of server_sendlog.
vasprintf calculates the length of the string, so we can zap a needless
call to strlen while there.

Index: server.c
===
RCS file: /cvs/src/usr.sbin/httpd/server.c,v
retrieving revision 1.121
diff -u -p -r1.121 server.c
--- server.c11 Oct 2020 03:21:44 -  1.121
+++ server.c31 Dec 2020 10:06:28 -
@@ -1251,12 +1251,14 @@ server_sendlog(struct server_config *srv
iov[0].iov_base = &srv_conf->id;
iov[0].iov_len = sizeof(srv_conf->id);
iov[1].iov_base = msg;
-   iov[1].iov_len = strlen(msg) + 1;
+   iov[1].iov_len = ret + 1;
 
if (proc_composev(httpd_env->sc_ps, PROC_LOGGER, cmd, iov, 2) != 0) {
log_warn("%s: failed to compose imsg", __func__);
+   free(msg);
return;
}
+   free(msg);
 }
 
 void