Re: veriexec(4) maintenance
Jan, I apologise for taking so long to get back to you. I do appreciate you taking the time to help with this. > > first it is difficult to read a diff that also introduces a "notyet" operation > on file pages. > OK, I will remove those sections to make things clearer. I still have a dream of getting per-page signatures in to the kernel some day... > Looks like you replace global and item rw-locks with refcounts/signals. > I tried to understand the rationale behind and failed. Your implementation > seems at least vulnerable to livelocks as the writer side is missing priority. > My aim was to improve multi-thread performance and remove some of the more horrible hacks I made to make things work - like looping on the acquisition of a rw lock. I also was lead to believe that rw locks are very heavy weight so I was trying to replace them with something less onerous. > In general, what problem(s) are you trying to solve here? > The general process for veriexec is when the kernel comes up the tables are blank. Early in the boot process the signatures are loaded into the kernel. After the signatures are loaded and a file is executed the signature for the file is evaluated and the result of the comparison between the loaded and calculated signatures is cached which improves the performance. The things that I think need to be protected are: 1) If the tables are being loaded then any evaluations should wait until the table is loaded. 2) If the signature for a file is being evaluated then any other accesses to that entry should wait until the evaulation is done. 3) At some securelevel settings, the signatures can be updated on the fly. The signature entry must not be updated while it is being accessed. 4) If the file system associated with a particular table of signatures is being unmounted then that table must be cleared. > What is wrong with a quite simple hashtable to hold fileid<->fingerprint > mapping accessed as: > > mutex_enter table lock > lookup fileid -> item > mutex_enter item > mutex_exit table lock > > while item state is busy > cv_wait item cv > > if item state is unknown > set item state busy > mutex_exit item > compute fingerprint > mutex_enter item > set item state valid/invalid > cv_broadcast item cv > > Here the item state is either valid or invalid > > mutex_exit item > well, that was pretty much what my intention was but clearly I failed in the implementation. > > lock_state is sometimes VERIEXEC_UNLOCKED, sometimes IO_NODELOCKED. > > 1629veriexec_dump(struct lwp *l, prop_array_t rarray) > 1630{ > 1631struct mount *mp; > 1632mount_iterator_t *mpiter; > 1633 > 1634mountlist_iterator_init(&mpiter); > 1635 > 1636while ((mp = mountlist_iterator_trynext(mpiter)) != > NULL) { > 1637/* If it fails, the file-system is [being] > unmounted. */ > 1638if (vfs_busy(mp) != 0) > 1639continue; > > This is completely wrong. Operation mountlist_iterator_trynext() already > returns the mount busy and skips mounts currently unmounting or remounting. > Operation mountlist_iterator_next() was right or does this function get > called from inside unmount? > No, not this one, when I first did this code there was no mountlist_iterator framework. This dump call should ignore file systems being unmounted. > 1653veriexec_flush(struct lwp *l) > > This is completely wrong, see above. > This one should be called on unmount. I will fix up and simplify the locking and make sure I use the mountlist_iterator stuff properly Thanks again for the feedback. -- Brett Lymn -- Sent from my NetBSD device. "We are were wolves", "You mean werewolves?", "No we were wolves, now we are something else entirely", "Oh"
Re: veriexec(4) maintenance
On Tue, Dec 19, 2023 at 07:50:25AM +1030, Brett Lymn wrote: > On Sat, Sep 02, 2023 at 08:57:03AM +0930, Brett Lymn wrote: > > On Thu, Aug 31, 2023 at 10:24:07AM +0200, J. Hannken-Illjes wrote: > > > > > > I'm short on time, some remarks: > > > > > > > Thanks, I will have a closer look at these issues. This could be where the > > deadlock I was > > seeing is coming from. > > > > OK, I have, finally, updated the diff in line with the comments (thanks > again for the good feedback). Brett, first it is difficult to read a diff that also introduces a "notyet" operation on file pages. Looks like you replace global and item rw-locks with refcounts/signals. I tried to understand the rationale behind and failed. Your implementation seems at least vulnerable to livelocks as the writer side is missing priority. In general, what problem(s) are you trying to solve here? What is wrong with a quite simple hashtable to hold fileid<->fingerprint mapping accessed as: mutex_enter table lock lookup fileid -> item mutex_enter item mutex_exit table lock while item state is busy cv_wait item cv if item state is unknown set item state busy mutex_exit item compute fingerprint mutex_enter item set item state valid/invalid cv_broadcast item cv Here the item state is either valid or invalid mutex_exit item Some more problems (line numbers with your diff applied): 476 veriexec_fp_calc(struct lwp *l, struct vnode *vp, int lock_state, 477 struct veriexec_file_entry *vfe, u_char *fp) 478 { 524 error = vn_rdwr(UIO_READ, vp, buf, len, offset, 525 UIO_SYSSPACE, lock_state, lock_state is sometimes VERIEXEC_UNLOCKED, sometimes IO_NODELOCKED. 1629 veriexec_dump(struct lwp *l, prop_array_t rarray) 1630 { 1631 struct mount *mp; 1632 mount_iterator_t *mpiter; 1633 1634 mountlist_iterator_init(&mpiter); 1635 1636 while ((mp = mountlist_iterator_trynext(mpiter)) != NULL) { 1637 /* If it fails, the file-system is [being] unmounted. */ 1638 if (vfs_busy(mp) != 0) 1639 continue; This is completely wrong. Operation mountlist_iterator_trynext() already returns the mount busy and skips mounts currently unmounting or remounting. Operation mountlist_iterator_next() was right or does this function get called from inside unmount? 1653 veriexec_flush(struct lwp *l) 1654 { 1655 struct mount *mp; 1656 mount_iterator_t *mpiter; 1657 int error = 0; 1658 1659 mountlist_iterator_init(&mpiter); 1660 1661 while ((mp = mountlist_iterator_trynext(mpiter)) != NULL) { 1662 int lerror; 1663 1664 /* If it fails, the file-system is [being] unmounted. */ 1665 if (vfs_busy(mp) != 0) 1666 continue; This is completely wrong, see above. -- J. Hannken-Illjes
Re: veriexec(4) maintenance
On Sat, Sep 02, 2023 at 08:57:03AM +0930, Brett Lymn wrote: > On Thu, Aug 31, 2023 at 10:24:07AM +0200, J. Hannken-Illjes wrote: > > > > I'm short on time, some remarks: > > > > Thanks, I will have a closer look at these issues. This could be where the > deadlock I was > seeing is coming from. > OK, I have, finally, updated the diff in line with the comments (thanks again for the good feedback). -- Brett Lymn -- Sent from my NetBSD device. "We are were wolves", "You mean werewolves?", "No we were wolves, now we are something else entirely", "Oh" Index: kern_veriexec.c === RCS file: /cvsroot/src/sys/kern/kern_veriexec.c,v retrieving revision 1.27 diff -u -r1.27 kern_veriexec.c --- kern_veriexec.c 9 Apr 2023 09:18:09 - 1.27 +++ kern_veriexec.c 18 Dec 2023 21:12:46 - @@ -1,4 +1,4 @@ -/* $NetBSD: kern_veriexec.c,v 1.27 2023/04/09 09:18:09 riastradh Exp $ */ +/* $NetBSD$*/ /*- * Copyright (c) 2005, 2006 Elad Efrat @@ -29,7 +29,7 @@ */ #include -__KERNEL_RCSID(0, "$NetBSD: kern_veriexec.c,v 1.27 2023/04/09 09:18:09 riastradh Exp $"); +__KERNEL_RCSID(0, "$NetBSD$"); #include "opt_veriexec.h" @@ -38,6 +38,7 @@ #include #include #include +#include #include #include #include @@ -49,6 +50,7 @@ #include #include #include +#include #include #include #include @@ -68,11 +70,6 @@ #define VERIEXEC_UNLOCKED 0x00/* Nothing locked, callee does it */ #define VERIEXEC_LOCKED0x01/* Global op lock held */ -/* state of file locking for veriexec_file_verify */ -#define VERIEXEC_FILE_UNLOCKED 0x02/* Nothing locked, callee does it */ -#define VERIEXEC_FILE_LOCKED 0x04/* File locked */ - -#define VERIEXEC_RW_UPGRADE(lock) while((rw_tryupgrade(lock)) == 0){}; struct veriexec_fpops { const char *type; @@ -86,11 +83,17 @@ /* Veriexec per-file entry data. */ struct veriexec_file_entry { - krwlock_t lock; /* r/w lock */ + kmutex_t vfe_lock; /* operations lock */ + int vfe_refcnt; /* number of refs */ + kcondvar_t vfe_cv; /* wait for no refs */ u_char *filename; /* File name. */ u_char type;/* Entry type. */ u_char status; /* Evaluation status. */ + u_char page_fp_status; /* Per-page FP status. */ u_char *fp; /* Fingerprint. */ + void *page_fp; /* Per-page fingerprints */ + size_t npages; /* Number of pages. */ + size_t last_page_size; /* To support < PAGE_SIZE */ struct veriexec_fpops *ops; /* Fingerprint ops vector*/ size_t filename_len;/* Length of filename. */ }; @@ -120,15 +123,69 @@ void *, void *, void *, void *); static struct veriexec_fpops *veriexec_fpops_lookup(const char *); static void veriexec_file_free(struct veriexec_file_entry *); +static void veriexec_busy(void); +static void veriexec_unbusy(void); +static void vfe_busy(struct veriexec_file_entry *vfe); +static void vfe_unbusy(struct veriexec_file_entry *vfe); static unsigned int veriexec_tablecount = 0; /* - * Veriexec operations global lock - most ops hold this as a read - * lock, it is upgraded to a write lock when destroying veriexec file - * table entries. + * Veriexec operations global lock. + */ +static kmutex_t veriexec_op_lock; +static kcondvar_t veriexec_cv; +static int veriexec_refcnt; + +/* + * Busy a file table entry. + */ +static void +vfe_busy(struct veriexec_file_entry *vfe) +{ + mutex_enter(&vfe->vfe_lock); + vfe->vfe_refcnt++; + mutex_exit(&vfe->vfe_lock); +} + +/* + * Drop a reference to a file table entry, if count zero then wake + * up any potential waiters. + */ +static void +vfe_unbusy(struct veriexec_file_entry *vfe) +{ + mutex_enter(&vfe->vfe_lock); + KASSERT(vfe->vfe_refcnt > 0); + if (--(vfe->vfe_refcnt) == 0) + cv_broadcast(&vfe->vfe_cv); + mutex_exit(&vfe->vfe_lock); +} + +/* + * Grab the oplock and bump the reference count. */ -static krwlock_t veriexec_op_lock; +static void +veriexec_busy(void) +{ + mutex_enter(&veriexec_op_lock); + veriexec_refcnt++; + mutex_exit(&veriexec_op_lock); +} + +/* + * Decrement the reference count, if we hit count zero then wake up + * any possible waiters. + */ +static void +veriexec_unbusy(void) +{ + mutex_enter(&veriexec_op_lock); + KASSERT(veriexec_refcnt > 0); + if (--veriexec_refcnt == 0) + cv_broadcast(&veriexec_cv); + mutex_exit(&veriexec_op_lock); +} /* * Sysctl helper routine for Veriexec. @@ -221,7 +278,7 @@ } /* - * Add ops to the fingerprint ops vect
Re: veriexec(4) maintenance
On Thu, Aug 31, 2023 at 10:24:07AM +0200, J. Hannken-Illjes wrote: > > I'm short on time, some remarks: > Thanks, I will have a closer look at these issues. This could be where the deadlock I was seeing is coming from. -- Brett Lymn -- Sent from my NetBSD device. "We are were wolves", "You mean werewolves?", "No we were wolves, now we are something else entirely", "Oh"
Re: veriexec(4) maintenance
> On 30. Aug 2023, at 00:16, Brett Lymn wrote: > > On Tue, Aug 15, 2023 at 07:52:46AM +0930, Brett Lymn wrote: >> >> I tried sending this with the diff but I think it was a bit big, >> here is the text, the diff is at ftp.netbsd.org under >> /pub/NetBSD/misc/blymn/veriexec_update >> > > OK, it has been a couple of weeks since I posted this and I have > had zero responses. I have tested the changes and they seem ok > on my daily usage pattern. > > Does anyone have _any_ comments or should I just commit the > changes? > I'm short on time, some remarks: - lockstate as an ioflag to vn_rdwr() cant be right. - waiting for condition is usually mutex_enter() while (!cond) cv_wait() ... mutex_exit() doing it as if (!cond) cv_wait() looks wrong. - if the reference counting is used to free an unreferenced object ist is obviously racy -- J. Hannken-Illjes - hann...@mailbox.org signature.asc Description: Message signed with OpenPGP
Re: veriexec(4) maintenance
On Tue, Aug 15, 2023 at 07:52:46AM +0930, Brett Lymn wrote: > > I tried sending this with the diff but I think it was a bit big, > here is the text, the diff is at ftp.netbsd.org under > /pub/NetBSD/misc/blymn/veriexec_update > OK, it has been a couple of weeks since I posted this and I have had zero responses. I have tested the changes and they seem ok on my daily usage pattern. Does anyone have _any_ comments or should I just commit the changes? -- Brett Lymn -- Sent from my NetBSD device. "We are were wolves", "You mean werewolves?", "No we were wolves, now we are something else entirely", "Oh"
Re: veriexec(4) maintenance
I tried sending this with the diff but I think it was a bit big, here is the text, the diff is at ftp.netbsd.org under /pub/NetBSD/misc/blymn/veriexec_update On Mon, Aug 14, 2023 at 07:51:50AM +0930, Brett Lymn wrote: > On Fri, Aug 04, 2023 at 11:24:22PM +, David Holland wrote: > > > > I was looking at this last weekend and for veriexec I have one fix and > > some comments/notes, which I should probably commit. > > > > OK, that sounds good. > > > fileassoc I would recommend flushing and replacing with a proper > > specificdata implementation for vnodes, which there should be no > > particular barrier to. fileassoc is trying to be that but it's both > > bust and doing it wrong. :-| > > > > I think fileassoc was added with the view for using it for other things > but in the end only veriexec is using it. We can revert it back to > it being veriexec specific. > > I have attached a diff that is a version of veriexec that uses condvars > and ref counters to handle the access control. It compiles but > hasn't been run tested on a current kernel (it worked when i used it > before but it has been a long time). > -- Brett Lymn -- Sent from my NetBSD device. "We are were wolves", "You mean werewolves?", "No we were wolves, now we are something else entirely", "Oh"
Re: veriexec(4) maintenance
On Fri, Aug 04, 2023 at 11:24:22PM +, David Holland wrote: > > I was looking at this last weekend and for veriexec I have one fix and > some comments/notes, which I should probably commit. > OK, that sounds good. > fileassoc I would recommend flushing and replacing with a proper > specificdata implementation for vnodes, which there should be no > particular barrier to. fileassoc is trying to be that but it's both > bust and doing it wrong. :-| > I think fileassoc was added with the view for using it for other things but in the end only veriexec is using it. We can revert it back to it being veriexec specific. I have attached a diff that is a version of veriexec that uses condvars and ref counters to handle the access control. It compiles but hasn't been run tested on a current kernel (it worked when i used it before but it has been a long time). -- Brett Lymn -- Sent from my NetBSD device. "We are were wolves", "You mean werewolves?", "No we were wolves, now we are something else entirely", "Oh" Index: kern_veriexec.c === RCS file: /cvsroot/src/sys/kern/kern_veriexec.c,v retrieving revision 1.27 diff -u -r1.27 kern_veriexec.c --- kern_veriexec.c 9 Apr 2023 09:18:09 - 1.27 +++ kern_veriexec.c 11 Aug 2023 07:17:50 - @@ -1,4 +1,4 @@ -/* $NetBSD: kern_veriexec.c,v 1.27 2023/04/09 09:18:09 riastradh Exp $ */ +/* $NetBSD$*/ /*- * Copyright (c) 2005, 2006 Elad Efrat @@ -29,7 +29,7 @@ */ #include -__KERNEL_RCSID(0, "$NetBSD: kern_veriexec.c,v 1.27 2023/04/09 09:18:09 riastradh Exp $"); +__KERNEL_RCSID(0, "$NetBSD$"); #include "opt_veriexec.h" @@ -38,6 +38,7 @@ #include #include #include +#include #include #include #include @@ -49,6 +50,7 @@ #include #include #include +#include #include #include #include @@ -68,11 +70,6 @@ #define VERIEXEC_UNLOCKED 0x00/* Nothing locked, callee does it */ #define VERIEXEC_LOCKED0x01/* Global op lock held */ -/* state of file locking for veriexec_file_verify */ -#define VERIEXEC_FILE_UNLOCKED 0x02/* Nothing locked, callee does it */ -#define VERIEXEC_FILE_LOCKED 0x04/* File locked */ - -#define VERIEXEC_RW_UPGRADE(lock) while((rw_tryupgrade(lock)) == 0){}; struct veriexec_fpops { const char *type; @@ -86,11 +83,17 @@ /* Veriexec per-file entry data. */ struct veriexec_file_entry { - krwlock_t lock; /* r/w lock */ + kmutex_t vfe_lock; /* operations lock */ + int vfe_refcnt; /* number of refs */ + kcondvar_t vfe_cv; /* wait for no refs */ u_char *filename; /* File name. */ u_char type;/* Entry type. */ u_char status; /* Evaluation status. */ + u_char page_fp_status; /* Per-page FP status. */ u_char *fp; /* Fingerprint. */ + void *page_fp; /* Per-page fingerprints */ + size_t npages; /* Number of pages. */ + size_t last_page_size; /* To support < PAGE_SIZE */ struct veriexec_fpops *ops; /* Fingerprint ops vector*/ size_t filename_len;/* Length of filename. */ }; @@ -120,15 +123,69 @@ void *, void *, void *, void *); static struct veriexec_fpops *veriexec_fpops_lookup(const char *); static void veriexec_file_free(struct veriexec_file_entry *); +static void veriexec_busy(void); +static void veriexec_unbusy(void); +static void vfe_busy(struct veriexec_file_entry *vfe); +static void vfe_unbusy(struct veriexec_file_entry *vfe); static unsigned int veriexec_tablecount = 0; /* - * Veriexec operations global lock - most ops hold this as a read - * lock, it is upgraded to a write lock when destroying veriexec file - * table entries. + * Veriexec operations global lock. + */ +static kmutex_t veriexec_op_lock; +static kcondvar_t veriexec_cv; +static int veriexec_refcnt; + +/* + * Busy a file table entry. + */ +static void +vfe_busy(struct veriexec_file_entry *vfe) +{ + mutex_enter(&vfe->vfe_lock); + vfe->vfe_refcnt++; + mutex_exit(&vfe->vfe_lock); +} + +/* + * Drop a reference to a file table entry, if count zero then wake + * up any potential waiters. */ -static krwlock_t veriexec_op_lock; +static void +vfe_unbusy(struct veriexec_file_entry *vfe) +{ + mutex_enter(&vfe->vfe_lock); + KASSERT(vfe->vfe_refcnt > 0); + if (--(vfe->vfe_refcnt) == 0) + cv_broadcast(&vfe->vfe_cv); + mutex_exit(&vfe->vfe_lock); +} + +/* + * Grab the oplock and bump the reference count. + */ +static void +veriexec_busy(void) +{ + mutex_enter(&veriexec_op_lock); + veriexec_refcnt++; + mutex_exit(&veriexec_op_lock
Re: veriexec(4) maintenance
On Thu, Aug 03, 2023 at 03:03:06PM +0930, Brett Lymn wrote: > > If anyone cares about veriexec(4) and would like to volunteer to take > > this on, we can discuss what needs to be done and how to proceed. > > I will take this on since I was originally responsible for the mess. It > does play fast and loose with synchronisation and should be fixed. > > I have some fixes in my private tree that went some way to addressing > the issues but they have been languishing for many many years because I > had an issue of a deadlock that would occaisionally occur. At the time > I couldn't work out where the deadlock was coming from but maybe with > some expert help and the improved tools I can nut it out. I was looking at this last weekend and for veriexec I have one fix and some comments/notes, which I should probably commit. fileassoc I would recommend flushing and replacing with a proper specificdata implementation for vnodes, which there should be no particular barrier to. fileassoc is trying to be that but it's both bust and doing it wrong. :-| -- David A. Holland dholl...@netbsd.org
Re: veriexec(4) maintenance
On Wed, Aug 02, 2023 at 07:47:48AM +, Taylor R Campbell wrote: > veriexec(4), and the fileassoc(9) API it uses internally, needs > maintenance and probably some serious rework to fix synchronization > problems on any MP and/or preemptible kernels (i.e., all x86 of the > past couple decades) with implications for any security properties it > is supposed to provide. > > If anyone cares about veriexec(4) and would like to volunteer to take > this on, we can discuss what needs to be done and how to proceed. > I will take this on since I was originally responsible for the mess. It does play fast and loose with synchronisation and should be fixed. I have some fixes in my private tree that went some way to addressing the issues but they have been languishing for many many years because I had an issue of a deadlock that would occaisionally occur. At the time I couldn't work out where the deadlock was coming from but maybe with some expert help and the improved tools I can nut it out. -- Brett Lymn -- Sent from my NetBSD device. "We are were wolves", "You mean werewolves?", "No we were wolves, now we are something else entirely", "Oh"
veriexec(4) maintenance
veriexec(4), and the fileassoc(9) API it uses internally, needs maintenance and probably some serious rework to fix synchronization problems on any MP and/or preemptible kernels (i.e., all x86 of the past couple decades) with implications for any security properties it is supposed to provide. If anyone cares about veriexec(4) and would like to volunteer to take this on, we can discuss what needs to be done and how to proceed. (If not, we should maybe just delete it.)