Re: panic: reducing sbsize: lost count, uid = 1001

2000-08-28 Thread John Polstra

In article <[EMAIL PROTECTED]>,
Brian Fundakowski Feldman  <[EMAIL PROTECTED]> wrote:
> 
> Woops, I have the KASSERT bungled up.  Please change
> KASSERT(to < *hiwat && uip != NULL,
> to
> KASSERT(to >= *hiwat || uip != NULL,

It seems to be fixed now.  I've had a script pounding on it all
afternoon -- 843 runs so far -- and haven't been able to make it
misbehave.  Before, it only took a few tries to make it panic.

John
-- 
  John Polstra   [EMAIL PROTECTED]
  John D. Polstra & Co., Inc.        Seattle, Washington USA
  "Disappointment is a good sign of basic intelligence."  -- Chögyam Trungpa



To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-current" in the body of the message



Re: panic: reducing sbsize: lost count, uid = 1001

2000-08-28 Thread John Polstra

In article <[EMAIL PROTECTED]>,
Brian Fundakowski Feldman  <[EMAIL PROTECTED]> wrote:
> 
> Woops, I have the KASSERT bungled up.  Please change
> KASSERT(to < *hiwat && uip != NULL,
> to
> KASSERT(to >= *hiwat || uip != NULL,

Thanks.  The system comes up OK now.  I'll try to provoke the lost
count panic some more today, and I'll let you know what happens.

John





Re: panic: reducing sbsize: lost count, uid = 1001

2000-08-27 Thread Brian Fundakowski Feldman

On Sun, 27 Aug 2000, John Polstra wrote:

> In article <[EMAIL PROTECTED]>,
> Brian Fundakowski Feldman  <[EMAIL PROTECTED]> wrote:
> > If this is a problem with sbsize, this should take care of any possibility
> > ever of there being a problem...
> 
> I tried your patch, but it panics reliably on start-up:

Woops, I have the KASSERT bungled up.  Please change
KASSERT(to < *hiwat && uip != NULL,
to
KASSERT(to >= *hiwat || uip != NULL,
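
A minimal sketch of why the second form is the intended one, assuming the
assertion is meant to say "if the buffer is being reduced, the uidinfo must
already exist".  The two helper names below are hypothetical and only model
the KASSERT predicates in userland; they are not kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical userland models of the two KASSERT predicates.
 * "to" is the requested high-water mark, "hiwat" the current one,
 * and "uip" the result of the uidinfo lookup (NULL if none exists).
 */

/* First version: only holds when the buffer shrinks AND uip exists. */
static bool
kassert_bungled(unsigned long to, unsigned long hiwat, const void *uip)
{
	return (to < hiwat && uip != NULL);
}

/* Corrected version: "to < hiwat implies uip != NULL". */
static bool
kassert_fixed(unsigned long to, unsigned long hiwat, const void *uip)
{
	return (to >= hiwat || uip != NULL);
}
```

With the values from the start-up panic (to = 0x2400, *hiwat = 0) the first
predicate is false whatever uip is, so every buffer growth trips the
assertion; the corrected predicate only fails when a reduction is requested
and no uidinfo is found.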

--
 Brian Fundakowski Feldman   \  FreeBSD: The Power to Serve!  /
 [EMAIL PROTECTED]`--'






Re: panic: reducing sbsize: lost count, uid = 1001

2000-08-27 Thread John Polstra

In article <[EMAIL PROTECTED]>,
Brian Fundakowski Feldman  <[EMAIL PROTECTED]> wrote:
> If this is a problem with sbsize, this should take care of any possibility
> ever of there being a problem...

I tried your patch, but it panics reliably on start-up:

Automatic boot in progress...
/dev/da0s1a: FILESYSTEM CLEAN; SKIPPING CHECKS
/dev/da0s1a: clean, 335363 free (8667 frags, 40837 blocks, 1.1% fragmentation)
/dev/da0s1e: FILESYSTEM CLEAN; SKIPPING CHECKS
/dev/da0s1e: clean, 1966150 free (46222 frags, 239991 blocks, 1.0% fragmentation)
Doing initial network setup: hostname.
panic: reducing sbsize: lost count, uid = 0
Debugger("panic")
Stopped at      Debugger+0x34:  movb    $0,in_Debugger.390
db> trace
Debugger(c0280363) at Debugger+0x34
panic(c027fc80,0,2400,0,c7c20f74) at panic+0x70
chgsbsize(0,c7c20f78,2400,,7fff) at chgsbsize+0x33
sbreserve(c7c20f74,2400,c7c20f00,c77ee440) at sbreserve+0x6a
soreserve(c7c20f00,2400,a280,c7c20f00,c02bb368) at soreserve+0x1c
udp_attach(c7c20f00,0,c77ee440,0,c86f6f80) at udp_attach+0x2a
socreate(2,c86f6f20,2,0,c77ee440) at socreate+0xe8
socket(c77ee440,c86f6f80,8085098,bfbffda0,3) at socket+0x3e
syscall2(2f,2f,2f,3,bfbffda0) at syscall2+0x1f1
Xint0x80_syscall() at Xint0x80_syscall+0x25

The value of *hiwat in chgsbsize() is 0:

db> x/ul 0xc7c20f78
0xc7c20f78: 0

I can't get it to generate a core dump that it will recognize on
reboot, and I'm not set up for remote gdb on this machine.  But I can
check anything you'd like with ddb.

John
-- 
  John Polstra   [EMAIL PROTECTED]
  John D. Polstra & Co., Inc.Seattle, Washington USA
  "Disappointment is a good sign of basic intelligence."  -- Chögyam Trungpa






Re: panic: reducing sbsize: lost count, uid = 1001

2000-08-26 Thread Brian Fundakowski Feldman

If this is a problem with sbsize, this should take care of any possibility
ever of there being a problem...

Index: kern/kern_proc.c
===
RCS file: /usr2/ncvs/src/sys/kern/kern_proc.c,v
retrieving revision 1.69
diff -u -r1.69 kern_proc.c
--- kern/kern_proc.c	2000/07/04 11:25:22	1.69
+++ kern/kern_proc.c	2000/08/26 23:50:40
@@ -190,25 +190,33 @@
  * Change the total socket buffer size a user has used.
  */
 int
-chgsbsize(uid, diff, max)
+chgsbsize(uid, hiwat, to, max)
uid_t   uid;
-   rlim_t  diff;
+   u_long *hiwat;
+   u_long  to;
rlim_t  max;
 {
struct uidinfo *uip;
+   rlim_t diff;
+   int s;
 
uip = uifind(uid);
-   if (diff < 0)
-   KASSERT(uip != NULL, ("reducing sbsize: lost count, uid = %d", uid));
+   KASSERT(to < *hiwat && uip != NULL,
+   ("reducing sbsize: lost count, uid = %d", uid));
if (uip == NULL)
uip = uicreate(uid);
+   s = splnet();
+   diff = to - *hiwat;
/* don't allow them to exceed max, but allow subtraction */
if (diff > 0 && uip->ui_sbsize + diff > max) {
(void)uifree(uip);
+   splx(s);
return (0);
}
uip->ui_sbsize += diff;
+   *hiwat = to;
(void)uifree(uip);
+   splx(s);
return (1);
 }
 
Index: kern/uipc_socket2.c
===
RCS file: /usr2/ncvs/src/sys/kern/uipc_socket2.c,v
retrieving revision 1.61
diff -u -r1.61 uipc_socket2.c
--- kern/uipc_socket2.c 2000/07/31 08:23:43 1.61
+++ kern/uipc_socket2.c 2000/08/26 23:36:25
@@ -420,7 +420,6 @@
struct socket *so;
struct proc *p;
 {
-   rlim_t delta;
 
/*
 * p will only be NULL when we're in an interrupt
@@ -428,8 +427,7 @@
 */
if ((u_quad_t)cc > (u_quad_t)sb_max * MCLBYTES / (MSIZE + MCLBYTES))
return (0);
-   delta = (rlim_t)cc - sb->sb_hiwat;
-   if (p && !chgsbsize(so->so_cred->cr_uid, delta,
+   if (p && !chgsbsize(so->so_cred->cr_uid, &sb->sb_hiwat, cc,
p->p_rlimit[RLIMIT_SBSIZE].rlim_cur)) {
return (0);
}
@@ -450,8 +448,8 @@
 {
 
sbflush(sb);
-   (void)chgsbsize(so->so_cred->cr_uid, -(rlim_t)sb->sb_hiwat, RLIM_INFINITY);
-   sb->sb_hiwat = sb->sb_mbmax = 0;
+   (void)chgsbsize(so->so_cred->cr_uid, &sb->sb_hiwat, 0, RLIM_INFINITY);
+   sb->sb_mbmax = 0;
 }
 
 /*
Index: kern/uipc_socket.c
===
RCS file: /usr2/ncvs/src/sys/kern/uipc_socket.c,v
retrieving revision 1.80
diff -u -r1.80 uipc_socket.c
--- kern/uipc_socket.c  2000/08/07 17:52:08 1.80
+++ kern/uipc_socket.c  2000/08/26 23:37:00
@@ -191,10 +191,10 @@
so->so_gencnt = ++so_gencnt;
if (so->so_rcv.sb_hiwat)
(void)chgsbsize(so->so_cred->cr_uid,
-   -(rlim_t)so->so_rcv.sb_hiwat, RLIM_INFINITY);
+   &so->so_rcv.sb_hiwat, 0, RLIM_INFINITY);
if (so->so_snd.sb_hiwat)
(void)chgsbsize(so->so_cred->cr_uid,
-   -(rlim_t)so->so_snd.sb_hiwat, RLIM_INFINITY);
+   &so->so_snd.sb_hiwat, 0, RLIM_INFINITY);
if (so->so_accf != NULL) {
if (so->so_accf->so_accept_filter != NULL && 
so->so_accf->so_accept_filter->accf_destroy != NULL) {
Index: kern/uipc_usrreq.c
===
RCS file: /usr2/ncvs/src/sys/kern/uipc_usrreq.c,v
retrieving revision 1.58
diff -u -r1.58 uipc_usrreq.c
--- kern/uipc_usrreq.c  2000/07/11 22:07:43 1.58
+++ kern/uipc_usrreq.c  2000/08/26 23:52:24
@@ -217,6 +217,7 @@
 {
struct unpcb *unp = sotounpcb(so);
struct socket *so2;
+   u_long newhiwat;
 
if (unp == 0)
return EINVAL;
@@ -235,9 +236,10 @@
 */
so2->so_snd.sb_mbmax += unp->unp_mbcnt - so->so_rcv.sb_mbcnt;
unp->unp_mbcnt = so->so_rcv.sb_mbcnt;
-   so2->so_snd.sb_hiwat += unp->unp_cc - so->so_rcv.sb_cc;
-   (void)chgsbsize(so2->so_cred->cr_uid,
-   (rlim_t)unp->unp_cc - so->so_rcv.sb_cc, RLIM_INFINITY);
+   newhiwat = so2->so_snd.sb_hiwat + unp->unp_cc -
+   so->so_rcv.sb_cc;
+   (void)chgsbsize(so2->so_cred->cr_uid, &so2->so_snd.sb_hiwat,
+   newhiwat, RLIM_INFINITY);
unp->unp_cc = so->so_rcv.sb_cc;
sowwakeup(so2);
break;
@@ -257,6 +259,7 @@
int error = 0;
struct unpcb *unp = sotounpcb(so);
struct socket *so2;
+   u_long newhiwat;
 
if (unp == 0) {
error = EINVAL;
@@ -342,10 +345,10 @@
so->so_snd.sb_mbmax -=
so2->so_rcv.sb_mbcnt -

Re: panic: reducing sbsize: lost count, uid = 1001

2000-08-24 Thread Alfred Perlstein

* Archie Cobbs <[EMAIL PROTECTED]> [000824 14:52] wrote:
> I don't know if this is related to the problems you guys are looking at,
> but I have a box that every so often (every couple of months) panics
> with a "panic: receive 1" panic. This panic happens when the socket
> character count is bogus during a recv(2), etc. system call.
> 
> So several months ago I came up with a patch to try and track this
> down, and with the patch it panics immediately.. but I couldn't
> figure out why at the time and haven't pursued it since then.
> 
> Anyway, for what it's worth, the patch I was using is here:
> 
>   ftp://ftp.whistle.com/pub/archie/misc/sbcheck.patch
> 
> Some variant of it may be useful for tracking down this problem too.

It seems that the socket counters aren't sufficiently protected by spl.
Brian has suggested moving the protection into the chgsbsize function,
which would encapsulate the spl+uidinfo+sbsize issues.  I'm hoping he'll
look into it, and I will as well as soon as I find the time.

-Alfred





Re: panic: reducing sbsize: lost count, uid = 1001

2000-08-24 Thread Archie Cobbs

I don't know if this is related to the problems you guys are looking at,
but I have a box that every so often (every couple of months) panics
with a "panic: receive 1" panic. This panic happens when the socket
character count is bogus during a recv(2), etc. system call.

So several months ago I came up with a patch to try and track this
down, and with the patch it panics immediately.. but I couldn't
figure out why at the time and haven't pursued it since then.

Anyway, for what it's worth, the patch I was using is here:

  ftp://ftp.whistle.com/pub/archie/misc/sbcheck.patch

Some variant of it may be useful for tracking down this problem too.

-Archie

___
Archie Cobbs   *   Whistle Communications, Inc.  *   http://www.whistle.com





Re: panic: reducing sbsize: lost count, uid = 1001

2000-08-24 Thread Brian Fundakowski Feldman

Try making them small critical sections.  If it makes it easier,
which it probably will, try this: pass the pointer to sb_hiwat as an
argument to chgsbsize and make that the only way to modify it (sockbuf
creation would have to be a place where it's initialized manually to
0 ;) I'd say stick the hiwat increment of delta at the end, after
malloc, since that would place it in the same context as the setting.

Luckily, doing this right would be making the code clearer in several
of the (few) places sb_hiwat is used.  We just have to ensure that
sb_hiwat is always consistent with the ui_sbsize which can be done with
a critical section that "knows" the delta to apply and where to apply
it.
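
The idea above can be sketched in userland as follows.  This is only a
model under stated assumptions: struct uidinfo_model and chgsbsize_model
are hypothetical names, the uidinfo lookup is replaced by passing the
record in directly, and the comments mark where the kernel version would
raise and restore the spl.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-user record; the kernel keeps this in struct uidinfo. */
struct uidinfo_model {
	int64_t ui_sbsize;	/* total socket buffer space charged to the user */
};

/*
 * Set *hiwat to "to" and charge the difference to the user in one place,
 * so the two values cannot drift apart.
 * Returns 1 on success, 0 if growing would push the user past "max".
 */
static int
chgsbsize_model(struct uidinfo_model *uip, unsigned long *hiwat,
    unsigned long to, int64_t max)
{
	int64_t diff;

	/* Kernel version: the critical section would start here (splnet). */
	diff = (int64_t)to - (int64_t)*hiwat;
	/* Refuse growth beyond the limit, but always allow shrinking. */
	if (diff > 0 && uip->ui_sbsize + diff > max)
		return (0);
	uip->ui_sbsize += diff;
	*hiwat = to;
	/* Kernel version: the critical section would end here (splx). */
	return (1);
}
```

Because the function computes the delta from the current *hiwat itself,
releasing a buffer twice is harmless in this model: the second call sees
*hiwat == 0 and applies a delta of zero.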

Using splimp() should not be necessary as that is used for mbuf
protection, which is why network card drivers' interrupts must be
called at splimp() (an aggregate mask which includes splnet()): they
need to not corrupt the mbuf subsystem.  Plus, it makes a convenient
critical section for the network drivers in this way :)

At least, this is how I learned it to be.  I'm not sure if it's
absolutely correct, but it should be.

--
 Brian Fundakowski Feldman   \  FreeBSD: The Power to Serve!  /
 [EMAIL PROTECTED]`--'






Re: panic: reducing sbsize: lost count, uid = 1001

2000-08-23 Thread Alfred Perlstein

* Brian Fundakowski Feldman <[EMAIL PROTECTED]> [000823 22:05] wrote:
> On Wed, 23 Aug 2000, Alfred Perlstein wrote:
> 
> > * Alfred Perlstein <[EMAIL PROTECTED]> [000823 14:29] wrote:
> > > 
> > > I have a feeling that this is related to missing spl protection around
> > > the chgsbsize subsystem, this was probably an issue before I touched it
> > > but since I touched it last I'll have a look-see.
> > > 
> > > Brian, does that make sense?
> > [...]
> > 
> > Does it make sense to wrap chgsbsize with spl so callers don't have
> > to worry about it?
> > 
> 
> Yeah, I say to go for it.  I was /certain/ that these functions had
> the right spl()s before; if the patch fixes jdp's problem, I can't see
> a good reason not to change it, other than it would hide what may be
> quite problematic for other reasons even if not for that one...

Actually with my patches he still has problems, but I just realized
that I'm using splnet in my patches, should I be using splimp?

patch is here:

http://people.freebsd.org/~alfred/sbsize_spl.diff

Note that I'm quite sure it's not just sbsize that needs spl protection;
it's also the code that modifies the socket buffer's size fields around
the chgsbsize() calls.  Otherwise user context may be preempted by a
packet closing the connection after sbsize has been adjusted but before
the buffer sizes have been fixed in the socket structure, causing the
interrupt context to try to chgsbsize again.

Or at least that's what I think may be going on.
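
A rough userland model of that suspected sequence, assuming the delta-based
chgsbsize interface.  All names ending in _model are hypothetical, and the
"interrupt" is simulated with a callback invoked inside the window; this
illustrates the hypothesis and is not a confirmed diagnosis.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant field of struct sockbuf. */
struct sockbuf_model {
	unsigned long sb_hiwat;
};

/* Hypothetical per-user total, standing in for ui_sbsize. */
static int64_t ui_sbsize_model;

/* Delta-based interface: the caller computes the change itself. */
static void
chgsbsize_old_model(int64_t diff)
{
	ui_sbsize_model += diff;
}

/*
 * Release in the old style: uncharge the user first, then zero sb_hiwat.
 * "preempt", if non-NULL, simulates an interrupt arriving in between.
 */
static void
sbrelease_old_model(struct sockbuf_model *sb,
    void (*preempt)(struct sockbuf_model *))
{
	chgsbsize_old_model(-(int64_t)sb->sb_hiwat);
	if (preempt != NULL)
		preempt(sb);	/* window: total adjusted, sb_hiwat still set */
	sb->sb_hiwat = 0;
}

/* Simulated interrupt path that also releases the same buffer. */
static void
interrupt_release_model(struct sockbuf_model *sb)
{
	sbrelease_old_model(sb, NULL);
}
```

Starting from a total of 17520 with sb_hiwat = 17520, calling
sbrelease_old_model(&sb, interrupt_release_model) subtracts the same amount
twice, leaving the per-user total inconsistent with the buffers actually
held, which is the kind of state the "lost count" assertion is there to
catch.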

-- 
-Alfred Perlstein - [[EMAIL PROTECTED]|[EMAIL PROTECTED]]
"I have the heart of a child; I keep it in a jar on my desk."





Re: panic: reducing sbsize: lost count, uid = 1001

2000-08-23 Thread Brian Fundakowski Feldman

On Wed, 23 Aug 2000, Alfred Perlstein wrote:

> * Alfred Perlstein <[EMAIL PROTECTED]> [000823 14:29] wrote:
> > 
> > I have a feeling that this is related to missing spl protection around
> > the chgsbsize subsystem, this was probably an issue before I touched it
> > but since I touched it last I'll have a look-see.
> > 
> > Brian, does that make sense?
> [...]
> 
> Does it make sense to wrap chgsbsize with spl so callers don't have
> to worry about it?
> 

Yeah, I say to go for it.  I was /certain/ that these functions had
the right spl()s before; if the patch fixes jdp's problem, I can't see
a good reason not to change it, other than it would hide what may be
quite problematic for other reasons even if not for that one...

--
 Brian Fundakowski Feldman   \  FreeBSD: The Power to Serve!  /
 [EMAIL PROTECTED]`--'






Re: panic: reducing sbsize: lost count, uid = 1001

2000-08-23 Thread John Polstra

In article <[EMAIL PROTECTED]>,
Alfred Perlstein  <[EMAIL PROTECTED]> wrote:
> 
> more paranoia:
> 
> 
> Index: uipc_socket.c
> ===
> RCS file: /home/ncvs/src/sys/kern/uipc_socket.c,v
> retrieving revision 1.80
> diff -u -u -r1.80 uipc_socket.c
> --- uipc_socket.c 2000/08/07 17:52:08 1.80
> +++ uipc_socket.c 2000/08/23 23:06:13
> @@ -187,8 +187,10 @@
>  sodealloc(so)

Nope, still no go.  Here's the stack trace:

#9  0xc0166894 in panic (
fmt=0xc027fca0 "reducing sbsize: lost count, uid = %d")
at /local0/src/sys/kern/kern_shutdown.c:551
#10 0xc0163b67 in chgsbsize (uid=1001, diff=-17520, max=9223372036854775807)
at /local0/src/sys/kern/kern_proc.c:202
#11 0xc0186ffa in sbrelease (sb=0xc7c227f4, so=0xc7c22780)
at /local0/src/sys/kern/uipc_socket2.c:459
#12 0xc0184343 in sofree (so=0xc7c22780)
at /local0/src/sys/kern/uipc_socket.c:264
#13 0xc01b89ae in in_pcbdetach (inp=0xc7c86440)
at /local0/src/sys/netinet/in_pcb.c:542
#14 0xc01c2145 in tcp_close (tp=0xc7c86500)
at /local0/src/sys/netinet/tcp_subr.c:711
#15 0xc01c010a in tcp_input (m=0xc075cb00, off0=20, proto=6)
at /local0/src/sys/netinet/tcp_input.c:2012
#16 0xc01bb0ba in ip_input (m=0xc075cb00)
at /local0/src/sys/netinet/ip_input.c:756
#17 0xc01bb117 in ipintr () at /local0/src/sys/netinet/ip_input.c:784

I see that tcp_close() is in the call stack, but that's surprising.
It didn't seem like the transfer had gone on nearly long enough for it
to be finishing already.  Also, from the peer's point of view it was
not finished.

John
-- 
  John Polstra   [EMAIL PROTECTED]
  John D. Polstra & Co., Inc.        Seattle, Washington USA
  "Disappointment is a good sign of basic intelligence."  -- Chögyam Trungpa






Re: panic: reducing sbsize: lost count, uid = 1001

2000-08-23 Thread Alfred Perlstein

* John Polstra <[EMAIL PROTECTED]> [000823 15:55] wrote:
> In article <[EMAIL PROTECTED]>,
> Alfred Perlstein  <[EMAIL PROTECTED]> wrote:
> > > Nope, that doesn't fix it.  I got the same panic on the very first
> > > try.
> > 
> > hmm, when does it happen?  During the transfer or at the end of the
> > transfer?
> 
> The first time I reported the problem it had happened during the
> transfer.  This time (with your patch) the transfer actually completed
> from the peer's point of view before the panic occurred.  That's not
> many data points, so it could be coincidence.  I'll keep you posted
> if I can make it happen again.

more paranoia:


Index: uipc_socket.c
===
RCS file: /home/ncvs/src/sys/kern/uipc_socket.c,v
retrieving revision 1.80
diff -u -u -r1.80 uipc_socket.c
--- uipc_socket.c   2000/08/07 17:52:08 1.80
+++ uipc_socket.c   2000/08/23 23:06:13
@@ -187,8 +187,10 @@
 sodealloc(so)
struct socket *so;
 {
+   int s;
 
so->so_gencnt = ++so_gencnt;
+   s = splnet();   /* protect against interrupts messing with sbsize */
if (so->so_rcv.sb_hiwat)
(void)chgsbsize(so->so_cred->cr_uid,
-(rlim_t)so->so_rcv.sb_hiwat, RLIM_INFINITY);
@@ -204,6 +206,7 @@
FREE(so->so_accf->so_accept_filter_str, M_ACCF);
FREE(so->so_accf, M_ACCF);
}
+   splx(s);
crfree(so->so_cred);
zfreei(so->so_zone, so);
 }

-- 
-Alfred Perlstein - [[EMAIL PROTECTED]|[EMAIL PROTECTED]]
"I have the heart of a child; I keep it in a jar on my desk."





Re: panic: reducing sbsize: lost count, uid = 1001

2000-08-23 Thread John Polstra

In article <[EMAIL PROTECTED]>,
Alfred Perlstein  <[EMAIL PROTECTED]> wrote:
> > Nope, that doesn't fix it.  I got the same panic on the very first
> > try.
> 
> hmm, when does it happen?  During the transfer or at the end of the
> transfer?

The first time I reported the problem it had happened during the
transfer.  This time (with your patch) the transfer actually completed
from the peer's point of view before the panic occurred.  That's not
many data points, so it could be coincidence.  I'll keep you posted
if I can make it happen again.

John
-- 
  John Polstra   [EMAIL PROTECTED]
  John D. Polstra & Co., Inc.Seattle, Washington USA
  "Disappointment is a good sign of basic intelligence."  -- Chögyam Trungpa






Re: panic: reducing sbsize: lost count, uid = 1001

2000-08-23 Thread Alfred Perlstein

* John Polstra <[EMAIL PROTECTED]> [000823 15:39] wrote:
> In article <[EMAIL PROTECTED]>,
> Alfred Perlstein  <[EMAIL PROTECTED]> wrote:
> > 
> > Let's take a more paranoid approach (back out my spl in chgsbsize):
> > 
> > 
> > Index: uipc_socket2.c
> 
> Nope, that doesn't fix it.  I got the same panic on the very first
> try.

hmm, when does it happen?  During the transfer or at the end of the
transfer?

-Alfred





Re: panic: reducing sbsize: lost count, uid = 1001

2000-08-23 Thread John Polstra

In article <[EMAIL PROTECTED]>,
Alfred Perlstein  <[EMAIL PROTECTED]> wrote:
> 
> Let's take a more paranoid approach (back out my spl in chgsbsize):
> 
> 
> Index: uipc_socket2.c

Nope, that doesn't fix it.  I got the same panic on the very first
try.

John
-- 
  John Polstra   [EMAIL PROTECTED]
  John D. Polstra & Co., Inc.Seattle, Washington USA
  "Disappointment is a good sign of basic intelligence."  -- Chögyam Trungpa






Re: panic: reducing sbsize: lost count, uid = 1001

2000-08-23 Thread Alfred Perlstein

* John Polstra <[EMAIL PROTECTED]> [000823 15:03] wrote:
> In article <[EMAIL PROTECTED]>,
> Alfred Perlstein  <[EMAIL PROTECTED]> wrote:
> 
> > John, can you try this patch and let us know if you still experience
> > crashes?
> 
> Will do.  I'll let you know what happens.

Let's take a more paranoid approach (back out my spl in chgsbsize):


Index: uipc_socket2.c
===
RCS file: /home/ncvs/src/sys/kern/uipc_socket2.c,v
retrieving revision 1.61
diff -u -u -r1.61 uipc_socket2.c
--- uipc_socket2.c  2000/07/31 08:23:43 1.61
+++ uipc_socket2.c  2000/08/23 22:23:47
@@ -448,10 +448,17 @@
struct sockbuf *sb;
struct socket *so;
 {
+   int s;
 
sbflush(sb);
+   /*
+* if we don't spl an interrupt can recurse into us and call chgsbsize
+* before we zero sb->sb_hiwat
+*/
+   s = splnet();
(void)chgsbsize(so->so_cred->cr_uid, -(rlim_t)sb->sb_hiwat, RLIM_INFINITY);
sb->sb_hiwat = sb->sb_mbmax = 0;
+   splx(s);
 }
 
 /*

-- 
-Alfred Perlstein - [[EMAIL PROTECTED]|[EMAIL PROTECTED]]
"I have the heart of a child; I keep it in a jar on my desk."





Re: panic: reducing sbsize: lost count, uid = 1001

2000-08-23 Thread John Polstra

In article <[EMAIL PROTECTED]>,
Alfred Perlstein  <[EMAIL PROTECTED]> wrote:

> John, can you try this patch and let us know if you still experience
> crashes?

Will do.  I'll let you know what happens.

John
-- 
  John Polstra   [EMAIL PROTECTED]
  John D. Polstra & Co., Inc.Seattle, Washington USA
  "Disappointment is a good sign of basic intelligence."  -- Chögyam Trungpa






Re: panic: reducing sbsize: lost count, uid = 1001

2000-08-23 Thread Alfred Perlstein

* Alfred Perlstein <[EMAIL PROTECTED]> [000823 14:29] wrote:
> 
> I have a feeling that this is related to missing spl protection around
> the chgsbsize subsystem, this was probably an issue before I touched it
> but since I touched it last I'll have a look-see.
> 
> Brian, does that make sense?

So far, here are the functions that look like they call chgsbsize
without splnet:

socreate (called from socket() and socketpair(); on error it calls
sofree(), which then calls sodealloc() without splnet)

sonewconn3 (called from sonewconn; I'm unsure of the spl at
this point)

I'm sure there are more.

Does it make sense to wrap chgsbsize with spl so callers don't have
to worry about it?

John, can you try this patch and let us know if you still experience
crashes?

Index: kern_proc.c
===
RCS file: /home/ncvs/src/sys/kern/kern_proc.c,v
retrieving revision 1.69
diff -u -u -r1.69 kern_proc.c
--- kern_proc.c 2000/07/04 11:25:22 1.69
+++ kern_proc.c 2000/08/23 21:49:49
@@ -196,6 +196,7 @@
rlim_t  max;
 {
struct uidinfo *uip;
+   int s = splnet();
 
uip = uifind(uid);
if (diff < 0)
@@ -205,10 +206,12 @@
/* don't allow them to exceed max, but allow subtraction */
if (diff > 0 && uip->ui_sbsize + diff > max) {
(void)uifree(uip);
+   splx(s);
return (0);
}
uip->ui_sbsize += diff;
(void)uifree(uip);
+   splx(s);
return (1);
 }


If this doesn't work then it may be necessary to spl around examining
the socketbuffer's size.
 
thanks,
-- 
-Alfred Perlstein - [[EMAIL PROTECTED]|[EMAIL PROTECTED]]
"I have the heart of a child; I keep it in a jar on my desk."





Re: panic: reducing sbsize: lost count, uid = 1001

2000-08-23 Thread Alfred Perlstein

* John Polstra <[EMAIL PROTECTED]> [000823 13:46] wrote:
> I got the above panic in a -current kernel from August 19 with
> INVARIANTS and INVARIANT_SUPPORT compiled in.  I also saw it once
> before on a kernel from a few weeks ago.  In both cases the panic
> occurred when receiving a 25 MB file with FTP over a gigabit Ethernet
> link (wx driver).  Here is the relevant portion of the stack trace:
> 
> #16 0xc016689d in panic (
> fmt=0xc027fc80 "reducing sbsize: lost count, uid = %d")
> at /local0/src/sys/kern/kern_shutdown.c:553
> #17 0xc0163b67 in chgsbsize (uid=1001, diff=-17520, max=9223372036854775807)
> at /local0/src/sys/kern/kern_proc.c:202
> #18 0xc0186fe2 in sbrelease (sb=0xc7c22674, so=0xc7c22600)
> at /local0/src/sys/kern/uipc_socket2.c:453
> #19 0xc0184333 in sofree (so=0xc7c22600)
> at /local0/src/sys/kern/uipc_socket.c:261
> #20 0xc01b898e in in_pcbdetach (inp=0xc7c86880)
> at /local0/src/sys/netinet/in_pcb.c:542
> #21 0xc01c2125 in tcp_close (tp=0xc7c86940)
> at /local0/src/sys/netinet/tcp_subr.c:711
> #22 0xc01c00ea in tcp_input (m=0xc075ca00, off0=20, proto=6)
> at /local0/src/sys/netinet/tcp_input.c:2012
> #23 0xc01bb09a in ip_input (m=0xc075ca00)
> at /local0/src/sys/netinet/ip_input.c:756
> #24 0xc01bb0f7 in ipintr () at /local0/src/sys/netinet/ip_input.c:784
> 
> Unfortunately, I don't have time to dig into it further any time soon.
> I'll append my kernel config file to this mail.  The system is a
> uniprocessor PII/400.

I have a feeling that this is related to missing spl protection around
the chgsbsize subsystem, this was probably an issue before I touched it
but since I touched it last I'll have a look-see.

Brian, does that make sense?

-- 
-Alfred Perlstein - [[EMAIL PROTECTED]|[EMAIL PROTECTED]]
"I have the heart of a child; I keep it in a jar on my desk."

