Hi there!
   I've been looking through the commit logs for the files in src/lwp to see if 
there is any indication as to what introduced this issue, but there's not very 
much.  The old Linux server I have participating in the old kauth cell is 
running OpenAFS 1.6.13, so that's circa July 2015.  Therefore the changes 
would have to be between then and now.  While I'm building this on AIX, another 
person I know had the same problem on Linux, so it is not a platform-specific 
issue.

   Any idea how to track it down?  This is made more difficult by the limited 
debugger output.

Thank you!

-Ben

________________________________
From: Benjamin Kaduk <ka...@mit.edu>
Sent: Monday, November 11, 2024 8:21 PM
To: Ben Huntsman <b...@huntsmans.net>
Cc: openafs-devel@openafs.org <openafs-devel@openafs.org>
Subject: Re: [OpenAFS-devel] Re: kauth broken - problem with IOMGR in lwp?

That makes it sound like someone put some code with side effects inside an
assertion statement, so that it gets compiled out in non-debug builds.
In the, um, more maintained parts of the tree we are using opr_Assert()
vs opr_Verify() to distinguish checks that need not always be executed
from expressions that always must be, but it looks like kauth has not
gotten such treatment.
Which does make one wonder just how long it's been broken in this way...
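
To illustrate the failure mode I mean, here is a minimal sketch.  The
demo_assert()/demo_verify() macros, the DEMO_DEBUG switch, and
init_subsystem() are all hypothetical stand-ins made up for this example;
they are not the real opr_Assert()/opr_Verify() definitions and not code
from kauth or lwp.

```c
#include <assert.h>   /* standard assert() is compiled out the same way under NDEBUG */
#include <stdlib.h>

/* Hypothetical macros for illustration only. */
#ifdef DEMO_DEBUG
# define demo_assert(expr) do { if (!(expr)) abort(); } while (0)
#else
# define demo_assert(expr) ((void)0)   /* expr is never evaluated */
#endif

/* Always evaluates expr, in every build. */
#define demo_verify(expr) do { if (!(expr)) abort(); } while (0)

static int initialized = 0;

/* Hypothetical setup routine with a side effect; returns nonzero on success. */
static int init_subsystem(void)
{
    initialized = 1;
    return 1;
}

/* Bug pattern: the side effect lives inside the assertion. */
int setup_with_assert(void)
{
    initialized = 0;
    demo_assert(init_subsystem());
    return initialized;
}

/* Safe pattern: the call is always executed. */
int setup_with_verify(void)
{
    initialized = 0;
    demo_verify(init_subsystem());
    return initialized;
}
```

Built without DEMO_DEBUG, setup_with_assert() returns 0 because
init_subsystem() never runs, while setup_with_verify() returns 1.  Built
with -DDEMO_DEBUG, both return 1, which would match "defining DEBUG makes
it work".  Whether that is really what is happening here still needs to be
confirmed against the actual code.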

-Ben

On Tue, Nov 12, 2024 at 01:58:51AM +0000, Ben Huntsman wrote:
> In continuing to research this, I see there's a lot of interesting code in 
> lwp that can be enabled by defining DEBUG.  So just to give it a whirl, I 
> added a line to lwp.h:
>
> #define DEBUG 1
>
> And recompiled just lwp and kauth.  Now the resulting klog works.  Very 
> bizarre...  I'm not sure what to make of it yet.
>
> -Ben
>
> ________________________________
> From: openafs-devel-ad...@openafs.org <openafs-devel-ad...@openafs.org> on 
> behalf of Ben Huntsman <b...@huntsmans.net>
> Sent: Monday, November 11, 2024 12:26 PM
> To: openafs-devel@openafs.org <openafs-devel@openafs.org>
> Subject: [OpenAFS-devel] kauth broken - problem with IOMGR in lwp?
>
> Hi everyone-
>    First of all, please don't laugh, but I do have an older test cell that 
> runs kauth instead of krb5.  This is at home, not for anything production, so 
> don't worry.
>
>    That being said, is kauth currently broken?  A colleague of mine tried it 
> on Linux and gets a segfault when running klog, and I get the same behavior 
> on AIX:
>
> $ dbx /usr/afs/bin/klog core
> Type 'help' for help.
> [using memory image in core]
> reading symbolic information ...
>
> Segmentation fault in unnamed block in IOMGR at line 362 in file "iomgr.c"
>   362           FD_ZERO(&IOMGR_writefds);
> (dbx) where
> libdebug assertion "(framep->getGpr(STKP, &addr) == DB_SUCCESS && *nextStkpp 
> == addr)" failed at line 1418 in file 
> ../../../../../../../../../../../src/bos/usr/ccs/lib/libdbx/libdebug/modules/stackdebug/POWER/stackdb_FrameProgress.C
> unnamed block in IOMGR(dummy = (nil)), line 362 in "iomgr.c"
> (dbx)
>
> And here's a blurb from around that line in src/lwp/iomgr.c:
> ...
> /* These are not declared in IOMGR so that they don't use up 6K of stack. */
> static fd_set IOMGR_readfds, IOMGR_writefds, IOMGR_exceptfds;
> static int IOMGR_nfds = 0;
>
> static void *IOMGR(void *dummy)
> {
>     for (;;) {
>         int code;
>         struct TM_Elem *earliest;
>         struct timeval timeout, junk;
>         bool woke_someone;
>
>         FD_ZERO(&IOMGR_readfds);
>         FD_ZERO(&IOMGR_writefds);
>         FD_ZERO(&IOMGR_exceptfds);
>         IOMGR_nfds = 0;
> ...
>
>
>    I did the compile with the ./configure --enable-debug and 
> --enable-debug-lwp options specified.  Can someone help explain how this 
> code works, and what might be done to fix it?  I'm a little fuzzy on the 
> *IOMGR piece and I don't see that anyone calls IOMGR() directly in the code...
>
> Thank you!
>
> -Ben
>
