Re: MajorDomo Problems

1999-10-11 Thread Robert Watson

Love to know why my freebsd-arch subscription disappeared, although the
rest appeared to stick around.  I just resubscribed, but was subscribed
before (but not sure about until when) as

  [EMAIL PROTECTED]

What is strange is that if that was bouncing, I would have expected, say,
my -current subscription to bounce just as much and be unsubscribed. 

From the files in the directory, maybe it is the case that a reboot/crash
hit hub as a subscription request was being processed?  (does it rewrite
the file in place, or does it copy the file out with changes, then do a
double rename to put it back in place over the old version?)
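
For reference, the copy-out-and-rename approach asked about above can be sketched as follows. This is a minimal Python illustration, not a description of what Majordomo actually does: the helper name and the one-address-per-line file format are assumptions, and it uses a single rename over the old file rather than the double rename mentioned.

```python
import os
import tempfile


def remove_subscriber(list_path, address):
    """Rewrite list_path without `address`, never leaving a partial file.

    Hypothetical helper: assumes one address per line.
    """
    directory = os.path.dirname(os.path.abspath(list_path))
    with open(list_path) as src:
        kept = [line for line in src if line.strip() != address]

    # Write the new contents to a temporary file in the same directory,
    # so the final rename stays within one file system.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w") as dst:
            dst.writelines(kept)
            dst.flush()
            os.fsync(dst.fileno())  # push the data to disk before renaming
        os.replace(tmp_path, list_path)  # atomic rename over the old file
    except BaseException:
        os.unlink(tmp_path)
        raise
```

A crash before the rename leaves the old list untouched, and a crash after it leaves the new one; an in-place rewrite has no such guarantee. Note that the new file takes the temporary file's permissions, so a real tool would also copy the mode and ownership.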

Robert

On Mon, 11 Oct 1999, Jonathan M. Bresler wrote:

 
 yes. you or your provider was bouncing your email.
 
 i unsubscribed you on Oct 07 05:40:22.
 
 jmb
 
 
 To Unsubscribe: send mail to [EMAIL PROTECTED]
 with "unsubscribe freebsd-current" in the body of the message
 


  Robert N M Watson 

[EMAIL PROTECTED]  http://www.watson.org/~robert/
PGP key fingerprint: AF B5 5F FF A6 4A 79 37  ED 5F 55 E9 58 04 6A B1
TIS Labs at Network Associates, Safeport Network Services



To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-current" in the body of the message



Re: man reads /etc/rc.conf?

1999-11-12 Thread Robert Watson

On Wed, 10 Nov 1999, Alexander Leidinger wrote:

 On 11 Nov, Daniel C. Sobral wrote:
   (102) netchild@ttyp2  grep cat /etc/rc.conf.local
   spppconfig_isp0="`cat /etc/isdn/connect.parameters`"
 ^^^
  Calling programs from any of the rc.conf files is considered evil
  and it's looked down on.
 
 It's there to hide login/passwd information for i4b.

But it seems like they end up as arguments to ifconfig at a later date,
where a user can pull them out of ps, /proc, etc.  The window there is
clearly shorter than keeping it in /etc/rc.conf, but still not "secure"
per se.  The same goes for the use of environmental variables, which can
also be listed using ps.  Probably spppconfig should accept a filename
with the contents as an argument, or the information via a pipe.  

  Robert N M Watson 

[EMAIL PROTECTED]  http://www.watson.org/~robert/
PGP key fingerprint: AF B5 5F FF A6 4A 79 37  ED 5F 55 E9 58 04 6A B1
TIS Labs at Network Associates, Safeport Network Services






Re: man reads /etc/rc.conf?

1999-11-13 Thread Robert Watson

On Sat, 13 Nov 1999, Alexander Leidinger wrote:

 On 12 Nov, Robert Watson wrote:
 
(102) netchild@ttyp2  grep cat /etc/rc.conf.local
spppconfig_isp0="`cat /etc/isdn/connect.parameters`"
  ^^^
   Calling programs from any of the rc.conf files is considered evil
   and it's looked down on.
 
  It's there to hide login/passwd information for i4b.
  
  But it seems like the end up as arguments to ifconfig at a later date,
^^ s/if/spp/

  where a user can pull them out of ps, /proc, etc.  The window there
  is clearly shorter than keeping it in /etc/rc.conf, but still not
 
 It will only be in /proc (ps, etc.) at execution-/boot-time or am I
 missing something?

Yes -- the window of exposure is while a program is running that either a)
has the password as a command line argument, or b) has the variable as an
environmental variable.  Opportunities for using ps to pull this
information out happen after the sppp* portion of rc.network, but begin as
early as sendmail (.forward and deferred delivery), cron (crontab), httpd
(cgi), etc.  And it's important to keep in mind that every time rc.conf is
executed, it will pull in the password using the `...` clause, and store
it in the execution environment of the caller.  Not the same as being in
the exposed environmental variables, but it's more exposure in the sense
that if the program coredumps (i.e., the sh running the script that
invoked /etc/rc.conf) the contents will be in the dump.  Later invocations
of spppcontrol in userland will expose their arguments to the world also.

The generally preferred way to pass in passwords to a program is either to
provide the program with an argument that is the filename storing the
password, or to pass it in via stdin.  E.g., 

% program -p /etc/private/my_password

% cat /etc/private/my_password | program -p - 
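
On the program side, the two invocations above might be handled along these lines. This is a minimal Python sketch: `read_secret` and the convention of treating "-" as standard input are illustrative assumptions, not an existing spppcontrol interface.

```python
import sys


def read_secret(source, stdin=None):
    """Return the secret named by a -p style argument.

    `source` is a file name, or "-" to read from standard input.
    `stdin` lets a caller substitute another stream for sys.stdin.
    """
    if source == "-":
        stream = stdin if stdin is not None else sys.stdin
        return stream.readline().rstrip("\n")
    with open(source) as handle:
        return handle.readline().rstrip("\n")
```

Either way the password never appears in the argument list or the environment, so it is not visible through ps.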

  Robert N M Watson 

[EMAIL PROTECTED]  http://www.watson.org/~robert/
PGP key fingerprint: AF B5 5F FF A6 4A 79 37  ED 5F 55 E9 58 04 6A B1
TIS Labs at Network Associates, Safeport Network Services






Re: FreeBSD security auditing project.

1999-11-29 Thread Robert Watson


(Damn, go away for Thanksgiving and fall behind on -CURRENT, and miss out
on large interesting and fast-paced discussions!  I am now subscribed to
the new -audit, and probably missed some messages.  I've Bcc'd this to
-current, but CC'd to -audit under the assumption that that is where it
belongs)

On Wed, 24 Nov 1999 [EMAIL PROTECTED] wrote:

 On Wed, 24 Nov 1999, Doug Rabson wrote:
 
  We need to put audit tags into the source tree when a file is audited.
  That allows the diffs to be audited later which should be a smaller job
  and then the audit tag slides forward.
 
   Not to interrupt in the middle of this discussion but you might
 want to check with robert watson before you guys get too deep here since
 he is working on a FUNDED Posix.1e implementation for FreeBSD. And has
 already posted some EARLY MAC code. It might be useful to work with him
 as well. Just a thought.

Chris,

That would be the "other" audit -- this auditing is source code
auditing for bugs in the code.  The POSIX.1e auditing would be event
logging/etc.  Sadly, they have the same name, and I'm not sure which
is the more appropriate activity to have the name :-).  

That said, there have been past projects to audit the FreeBSD source
code, but this seems to have the most steam behind it so far, and I
hope it goes forwards.

It's important to note that source code auditing is not the only
thing that makes OpenBSD more secure -- strong crypto, careful
thinking through of information leaking, etc, are also very important.
There are many bugs in the security design, not just in the
implementation, important as the implementation may be.

Strings in C seem to be a huge source of security problems, but needless
to say even if we had a better string library, there would still be
security problems :-) -- poorly thought out suid programs, incorrect use
of setuid to create sandboxes (man, uucp, etc, etc), bad permissions,
reliance on privacy of environmental variables, poor random number
seeding, misunderstandings about euids/uids/reuids/etc in the context of
debugging and tracing, weak defense of /dev/kmem, etc, etc, and there are
dozens and dozens of such issues. 

POSIX.1e extensions attempt to address some of these issues by providing
least privilege capabilities, finer grained access control, as well as
trusted system behavior such as mandatory access control and auditing. 

This all also requires serious thought.  Source auditing is a great
step forwards, however, as it clears up the most commonly exploited
security holes that make for bad press and lots of bugtraq
announcements.  :-)


  Robert N M Watson 

[EMAIL PROTECTED]  http://www.watson.org/~robert/
PGP key fingerprint: AF B5 5F FF A6 4A 79 37  ED 5F 55 E9 58 04 6A B1
TIS Labs at Network Associates, Safeport Network Services






Re: fsck / and remount failure

2002-10-27 Thread Robert Watson
Are you using UFS1 extended attributes on that box?  I suspect there might
be a bug involving the open flags passed to extended attribute backing
vnodes such that a remount is refused because there are existing vnodes
opened writable.  I.e., the extended attribute backing files are opened
FREAD|FWRITE, and since the file system is mounted read-only, remount
refuses to upgrade to a rw mount until they are closed.  My guess is that,
in fact, this should be permitted, or we should re-open the backing files
on a remount.  I'd like to get this bug fixed, but it is another reason to
recommend UFS2 over UFS1.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories

On Sun, 27 Oct 2002, Sean Kelly wrote:

 I just suffered a kernel panic and upon reboot, I noticed that the root
 filesystem isn't able to be remounted read/write after the fsck:
 
 Mounting root from ufs:/dev/ad1s1a
 WARNING: / was not properly dismounted
 ...
 Starting file system checks:
 /dev/ad1s1a: INCORRECT BLOCK COUNT I=42806 (4 should be 0) (CORRECTED)
 /dev/ad1s1a: UNREF FILE I=42804  OWNER=smkelly MODE=100600
 /dev/ad1s1a: SIZE=8756 MTIME=Oct 27 13:48 2002  (CLEARED)
 /dev/ad1s1a: UNREF FILE I=42805  OWNER=smkelly MODE=100600
 /dev/ad1s1a: SIZE=8630 MTIME=Oct 27 13:48 2002  (CLEARED)
 /dev/ad1s1a: UNREF FILE I=42806  OWNER=root MODE=100444
 /dev/ad1s1a: SIZE=0 MTIME=Oct 27 15:50 2002  (CLEARED)
 /dev/ad1s1a: FREE BLK COUNT(S) WRONG IN SUPERBLK (SALVAGED)
 /dev/ad1s1a: SUMMARY INFORMATION BAD (SALVAGED)
 /dev/ad1s1a: BLK(S) MISSING IN BIT MAPS (SALVAGED)
 /dev/ad1s1a: 2231 files, 83743 used, 168240 free (424 frags, 20977 blocks, 0.2% fragmentation)
 /dev/ad1s1e: DEFER FOR BACKGROUND CHECKING
 /dev/ad1s1f: DEFER FOR BACKGROUND CHECKING
 /dev/ad0s1c: DEFER FOR BACKGROUND CHECKING
 mount: /dev/ad1s1a: Device busy
 mount: /dev/ad1s1a: Device busy
 
 Is this a known problem? It is rather annoying to have to come up for
 fscks, then reboot again to get a read/write root filesystem. Am I doing
 something wrong? And yes, I know my root filesystem is excessively large.
 
 -- 
 Sean Kelly | PGP KeyID: 77042C7B
 [EMAIL PROTECTED] | http://www.zombie.org
 
 





Re: Kernel breakage?

2002-10-27 Thread Robert Watson
I think UPDATING hasn't been updated on this, but there was a change in
the format printing for printf that conflicts with the ddb format
printing.  You need to rebuild your gcc.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories

On Sun, 27 Oct 2002, Josef Karthauser wrote:

 Is this me?
 
 cc1: warnings being treated as errors
 /usr/src/sys/ddb/db_examine.c: In function `db_examine':
 /usr/src/sys/ddb/db_examine.c:132: warning: unknown conversion type
 character `y' in format
 /usr/src/sys/ddb/db_examine.c:132: warning: too many arguments for
 format
 /usr/src/sys/ddb/db_examine.c: In function `db_print_cmd':
 /usr/src/sys/ddb/db_examine.c:216: warning: unknown conversion type
 character `y' in format
 /usr/src/sys/ddb/db_examine.c:216: warning: too many arguments for
 format
 *** Error code 1
 
 Joe
 -- 
 As far as the laws of mathematics refer to reality, they are not certain;
 and as far as they are certain, they do not refer to reality. - Albert
 Einstein, 1921
 





Re: Poor 5.0/nfs performance

2002-10-29 Thread Robert Watson
Hmm.  I haven't experienced this with my 5.0 boxes not running
WITNESS/INVARIANTS/etc, but I'm updating a box to give it a try.



Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories

On Tue, 29 Oct 2002, John De Boskey wrote:

 Hi,
 
I have a 5.0 system from 10/27. In an attempt to improve
 performance I commented out the INVARIANTS/WITNESS options:
 
 #options INVARIANTS          #Enable calls of extra sanity checking
 #options INVARIANT_SUPPORT   #Extra sanity checks of internal structures, required by INVARIANTS
 #options WITNESS             #Enable checks to detect deadlocks and cycles
 #options WITNESS_SKIPSPIN    #Don't run witness on spinlocks for speed
 
and then started a make release.
 
Since doing this, the machine has become almost totally
 unresponsive. Command execution is measured in hours. A page
 from top which finally came up shows some very high load
 averages:
 
 last pid:  1892;  load averages:  7.14,  6.00,  5.67    up 1+10:49:19  23:50:26
 34 processes:  1 running, 33 sleeping
 CPU states:  0.0% user,  0.0% nice, 78.5% system,  0.8% interrupt, 20.7% idle
 Mem: 69M Active, 909M Inact, 214M Wired, 51M Cache, 112M Buf, 255M Free
 Swap: 4096M Total, 36K Used, 4096M Free
 
   PID USERNAME PRI NICE   SIZE    RES STATE    TIME   WCPU    CPU COMMAND
 99689 root  -85  2928K  2236K biowr    0:59  0.00%  0.00% cvs
 
The cvs is being executed by 'make release' updating the chroot area.
 The repo lives in /home/ncvs which is an nfs mount of a 4.7 system. A
 kernel with the above options does not exhibit this behaviour.
 
When I killed the cvs process, the machine returns to normal.
 
I guess my basic question is: Are the INVARIANTS and WITNESS
 options required at this point? 
 
 Thanks,
 John
 
 





Re: fsck / and remount failure

2002-11-01 Thread Robert Watson

On Sun, 27 Oct 2002, Sean Kelly wrote:

 On Sun, Oct 27, 2002 at 05:17:44PM -0500, Robert Watson wrote:
  Are you using UFS1 extended attributes on that box?
 
 Yes.
 (290) smkelly@edgemaster:~$ grep UFS /usr/src/sys/i386/conf/EDGEMASTER
 options UFS_DIRHASH
 options UFS_EXTATTR
 options UFS_EXTATTR_AUTOSTART
 options UFS_ACL
 
  I suspect there might
  be a bug involving the open flags passed to extended attribute backing
  vnodes such that a remount is refused because there are existing vnodes
  opened writable.  I.e., the extended attribute backing files are opened
  FREAD|FWRITE, and since the file system is mounted read-only, remount
  refuses to upgrade to a rw mount until they are closed.  My guess is that,
  in fact, this should be permitted, or we should re-open the backing files
  on a remount.  I'd like to get this bug fixed, but it is another reason to
  recommend UFS2 over UFS1.
 
 That would make sense. I had never considered the backing files being
 the cause of it. If necessary, I can rebuild my kernel without ACLs and
 EXTATTRs to see if that cures the problem. 

I suspect it will cure the problem, and I suspect the real solution is
that we need to open the backing files with flags based on the mount flags
(read-only or not), and restart EAs on a remount, allowing the files to be
re-opened with the write flag if needed.
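
The idea can be sketched in user space as follows. This is a Python analogy under stated assumptions, not the kernel code: `BackingFile` and `backing_file_open_flags` are hypothetical names, and `os.O_RDONLY`/`os.O_RDWR` stand in for the kernel's FREAD and FREAD|FWRITE.

```python
import os


def backing_file_open_flags(mounted_read_only):
    """Pick open flags for an attribute backing file from the mount state."""
    return os.O_RDONLY if mounted_read_only else os.O_RDWR


class BackingFile:
    """Toy stand-in for an extended attribute backing file."""

    def __init__(self, path, mounted_read_only):
        self.path = path
        self.fd = os.open(path, backing_file_open_flags(mounted_read_only))

    def remount(self, mounted_read_only):
        # Restart: close and re-open with flags matching the new mount state.
        os.close(self.fd)
        self.fd = os.open(self.path, backing_file_open_flags(mounted_read_only))
```

With this arrangement nothing is held open for writing while the file system is read-only, so an upgrade to read/write has no writable vnodes to object to.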

 I haven't switched to UFS2 for a few reasons:
 * I don't know what state it is in (can I boot from it on x86?)

This is probably a question for phk -- my understanding is that we're
either there, or we're very close on the boot issue for i386.  I'm using
it in several places on i386 as non-boot, and on sparc64 as the boot file
system pretty successfully.

 * I don't know how stable it is now

It seems to be pretty stable, and if it's not, we'd really like to find
out.  Because it largely relies on existing UFS1 logic, the chances of it
being stable are a lot higher than if it was a from scratch
implementation.

 * I don't really want to have to go through the hassle of backing up and
   restoring all my data right now. Oh what I'd give for a conversion tool.
   Hello? Partition Magic people? *shakes wallet*

We talked about implementing a conversion tool from UFS1 to UFS2, and
decided it would take more resources than we had easily available, and
that the risks associated with a tool would be high.  The backup/restore
solution is a pain, but the one that is most likely to be successful.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories





Re: ypbind doesn't work right on freshly installed machines

2002-11-01 Thread Robert Watson
Per our discussion out-of-band, and just for the reference of others who
might have the same question: forced dependencies on rpcbind from ypserv
and ypbind aren't present right now, so you can work around the problem by
explicitly enabling rpcbind in rc.conf.  You might actually see rpcbind
running later in boot, but that comes from another forced dependency that
is present.  The symptom is that ypbind silently fails if the RPC port
mapper isn't present :-(.
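
A toy model of the start-up behaviour described above may help. It is a Python sketch only: the `boot` function, the start-up order, and the `nfsclient` service that force-starts rpcbind are all hypothetical and are not taken from the actual rc scripts.

```python
def boot(order, enabled, forced_deps):
    """Return the services running, in start order, after one pass.

    order       -- start-up sequence (hypothetical)
    enabled     -- services switched on in rc.conf
    forced_deps -- service -> services it force-starts before itself
    """
    running = []
    for service in order:
        if service not in enabled:
            continue
        for dep in forced_deps.get(service, ()):
            if dep not in running:
                running.append(dep)  # started even though not enabled
        if service not in running:
            running.append(service)
    return running
```

With only ypbind enabled and no forced dependency, ypbind starts with no port mapper running. Adding rpcbind to `enabled` models the workaround. Giving a later service a forced dependency on rpcbind shows how rpcbind can still turn up after ypbind has already failed.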


Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories

On Fri, 1 Nov 2002, John Baldwin wrote:

 I installed two machines with fresh current snapshots last night
 and this morning.  One was an i386 box the other a sparc64 box.
 Both machines are NIS clients from the same server.  I do have
 other 5.x and 4.x boxes on the same LAN at home that also are NIS
 clients of the same server (the server is 4.7 box).  All my other
 machines work fine.  However, for the two freshly installed
 test boxes, ypbind doesn't find a server the first time it is run
 during /etc/rc startup.  If I login as root and run 'ypbind' again
 then it works fine.  All my other 5.x boxes which are not fresh
 installs do not have this problem.  Anyone have any ideas?
 
 -- 
 
 John Baldwin [EMAIL PROTECTED]  http://www.FreeBSD.org/~jhb/
 Power Users Use the Power to Serve!  -  http://www.FreeBSD.org/
 
 





Re: libc size

2002-11-03 Thread Robert Watson

On Sun, 3 Nov 2002, Miguel Mendez wrote:

  2) Security.  Can LD_LIBRARY_PATH (or other mechanisms)
  be used to deliberately subvert any of these programs?
  (especially the handful of suid/sgid programs here)
 ..
 
 I can't come up right now with an idea of how exploiting LD_LIBRARY_PATH
 could be useful with any of these, but the possibility exists. OTOH, the
 recently added privilege elevation feature should make it possible to
 have *no* setuid programs on a system, and have the kernel elevate
 privileges for certain syscalls, based on the policy created by
 systrace. 

LD_LIBRARY_PATH is disabled for setuid binaries -- the kernel sets the
P_ISSETUGID flag, which is exported to userspace by issetugid(), which is
in turn checked by the rtld, which will refuse to observe that
environmental variable (and a number of others) as a result.  We have
plenty of dynamically linked setuid binaries in the system already, and
it's not a problem. 

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories





Re: libc size

2002-11-03 Thread Robert Watson

On Sun, 3 Nov 2002, Robert Watson wrote:

 On Sun, 3 Nov 2002, Miguel Mendez wrote:
 
   2) Security.  Can LD_LIBRARY_PATH (or other mechanisms)
   be used to deliberately subvert any of these programs?
   (especially the handful of suid/sgid programs here)
  ..
  
  I can't come up right now with an idea of how exploiting LD_LIBRARY_PATH
  could be useful with any of these, but the possibility exists. OTOH, the
  recently added privilege elevation feature should make it possible to
  have *no* setuid programs on a system, and have the kernel elevate
  privileges for certain syscalls, based on the policy created by
  systrace. 
 
 LD_LIBRARY_PATH is disabled for setuid binaries -- the kernel sets the
 P_ISSETUGID flag, which is exported to userspace by issetugid(), which is
 in turn checked by the rtld, which will refuse to observe that
 environmental variable (and a number of others) as a result.  We have
 plenty of dynamically linked setuid binaries in the system already, and
 it's not a problem. 

Due to sucky latency, I didn't realize my fingers had typed the constant
there incorrectly, that should read P_SUGID.  That same protection also
prevents debugging of processes following privilege downgrade, amongst
other things.
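
The check described in the quoted paragraph can be modelled as below. This is a Python sketch of the decision only: `loader_environment` is a hypothetical name, `is_setugid` stands in for the result of issetugid(), and the variable list is an illustrative subset rather than the full set rtld ignores.

```python
# Variables a run-time linker typically refuses to honor in a tainted
# process. Illustrative subset, not the full list used by FreeBSD's rtld.
UNSAFE_LOADER_VARS = ("LD_LIBRARY_PATH", "LD_PRELOAD")


def loader_environment(env, is_setugid):
    """Return the environment the loader would act on.

    `is_setugid` models the result of issetugid(2).
    """
    if not is_setugid:
        return dict(env)
    return {k: v for k, v in env.items() if k not in UNSAFE_LOADER_VARS}
```

For a setuid binary the caller-supplied library path is dropped before any library lookup happens, which is why dynamically linked setuid programs are not exposed through that variable.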

On the systrace issue -- I have lasting concerns about the race conditions
present in fine-grained SMP and threaded systems, such as FreeBSD 5, or
present in systems providing Linux clone() emulation.  Niels has addressed
some but not all of these concerns; until they are fully addressed, the
race conditions there will remain a serious problem.  When the scheduler
activation work hits the main NetBSD tree, I would expect similar
problems.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories








Re: setfacl requirements?

2002-11-08 Thread Robert Watson

On Fri, 8 Nov 2002, Dan Pelleg wrote:

 I'm trying to use setfacl - just the example that's in the manpage. All
 I ever get is:  setfacl: acl_get_file() failed: Operation not supported

This error generally results from three cases:

(1) UFS_ACL isn't enabled
(2) Extended attributes aren't available on the file system (shouldn't
happen for UFS2, but might happen for UFS1 if you don't have
UFS_EXTATTR and appropriate configuration of EAs) 
(3) The file system isn't mounted with the ACL option: either -o acls (or
acls in the fstab file), or more reliably, setting the tunefs -a
enable flag in the file system configuration.

  getfacl seems to work fwiw.

For better or for worse, POSIX.1e defines that getfacl will print the
current file permissions as an ACL if ACLs aren't available on the file
system.  As such, you're probably just seeing the results of stat()
printed in an ACL form.
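
To make that fallback concrete, here is a minimal Python sketch of turning plain permission bits into the three base ACL entries. The function names are hypothetical, and this is not the getfacl source.

```python
import os
import stat


def mode_as_acl(mode):
    """Render the permission bits of `mode` as minimal ACL entries."""
    def triplet(r, w, x):
        return (("r" if mode & r else "-")
                + ("w" if mode & w else "-")
                + ("x" if mode & x else "-"))

    return [
        "user::" + triplet(stat.S_IRUSR, stat.S_IWUSR, stat.S_IXUSR),
        "group::" + triplet(stat.S_IRGRP, stat.S_IWGRP, stat.S_IXGRP),
        "other::" + triplet(stat.S_IROTH, stat.S_IWOTH, stat.S_IXOTH),
    ]


def path_as_acl(path):
    """Build the fallback ACL for `path` from stat() alone."""
    return mode_as_acl(os.stat(path).st_mode)
```

A file with mode 0640 comes out as user::rw-, group::r--, other::---, which looks like a working ACL even though no ACL support is involved.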

 Same results on UFS and UFS2 filesystems. I have UFS_ACL, also tried
 UFS_EXTATTR. -current as of about a week ago. 

With UFS2, it should be sufficient to run the following command on the
unmounted device:

tunefs -a enable /dev/storagedevicehere

and then mount the file system, which will result in ACLs being
automatically enabled.  As mentioned above, it is possible to set the flag
using the mount -o options invocation, or via an fstab entry, but that's a
lot less reliable if some sort of failure occurs, and also doesn't work
well for the root file system.  tunefs is the most reliable way to enable
ACLs.

Let us know if that doesn't work.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories





Re: /dev/acd*t* no longer available in -current?

2002-11-15 Thread Robert Watson

On Fri, 15 Nov 2002, Sheldon Hearn wrote:

 On (2002/11/15 09:48), Soeren Schmidt wrote:
 
   Don't you think it makes more sense for the kernel to start off with
   more restrictive permissions, and have the administrator determine
   whether less restrictive permissions are appropriate?
  
  Actually no I don't.
  The security aware admin will know (or should that be should know :) )
  what to do to make a system secure.
  The average user that uses FreeBSD doesn't, and will get confused if the CDROM
  device doesn't appear to work (ie writeprotected).
 
 Well I think this goes against the grain of much of the work that's
 happened recently. 
 
 Look at how sysinstall now defaults to installing an inetd.conf with no
 services enabled.  Look at how sshd doesn't allow root login or empty
 passwords by default.  Look at how IPFW defaults to deny all.  Look at
 how the floppy drive is inaccessible to anyone but root by default.  And
 so on and so forth. 

So one thing we could start doing is have sysinstall's adduser stuff offer
to place new users in the operator group, and set up the default
permissions on removable devices such that the operator group has
read/write access to them (or even just read-access).  This would be
logically equivalent to placing users in an admin group at install on
Windows or Mac OS X.  Operator access connotes the ability to shut down
the system in FreeBSD, as well as the ability to dump file systems, etc.
Another possibility would be to evolve our notion of console user based on
fbtab some for workstation configurations.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories






Re: DISABLE_PSE DISABLE_PG_G still needed?

2002-11-15 Thread Robert Watson

On Fri, 15 Nov 2002, John Baldwin wrote:

 On 15-Nov-2002 Wesley Morgan wrote:
  On Fri, 15 Nov 2002, Vallo Kallaste wrote:
  
  Just finished '-j2 buildworld' and it did well with kernel which had
  the options enabled. Therefore I suppose that those options are
  still absolutely necessary to make use of -current system. These
  
  This may be a bit overstated. I removed those options from my kernel a few
  weeks ago and have no problems at all. Are you certain the problem is not
  specific to a particular CPU?
 
 It only happens with P4's.  I haven't seen it locally on a p4 test
 machine at work that I have built test releases on.  Also, it would be
 nice to see if just adding one of the options fixed the problems.  As
 for NOTES, those options should not be enabled in NOTES as they would
 defeat the purpose of LINT since they disable code. 

Does this apply generally to all P4's, or just a subset?  If all, it may
be we want to add a P4-workaround to GENERIC so that P4's work better out
of the box.  If it's a select few, I wonder if there's some way to test
for the problem early in the boot...

One of the recurring themes here has (a) been P4 processors, and (b) been
a fear that because of timing changes introduced by the DISABLE_FOO flags,
the real bug is still there, but less visible in the tests people are
running.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories






Re: SMP stability ? [was Re: more info from panic from running dnet on SMP kernel]

2002-11-17 Thread Robert Watson

On Sun, 17 Nov 2002, Thierry Herbelot wrote:

 Even make -j1 buildworld with the SMP kernel ends with a complete freeze
 of the machine (the kernel does not go to a panic where I could try a
 backtrace) 

I've seen several reports that using a serial break to get into ddb is now
quite a bit more reliable than a keyboard break.  If you're not already
using a serial console, you might want to give it a try (make sure to turn
on BREAK_TO_DEBUGGER and/or ALT_BREAK_TO_DEBUGGER). 

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories





Re: Processes hanging in thrd_sleep

2002-11-17 Thread Robert Watson
I ran into that during heavy builds on one of my boxes a few months ago --
I never really got around to properly debugging it because the UFS file
systems promptly ate themselves.  Oddly, I had two boxes in particular
that this happened on, and none of my others, and it wasn't clear to me if
there was a useful pattern.  I will try reproducing it sometime this
weekend.  Basically, the system seemed fairly live, but any attempt to
execve() would hang the process in that sleep state.  It looked a bit like
a VM lock leak followed by piling up on locks into a deadlock state.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories

On Sun, 17 Nov 2002, Kris Kennaway wrote:

 Since upgrading my kernel to today's current (from a couple of weeks
 ago) I have had a number of hangs where processes block in the kernel,
 usually in the thrd_sleep state (but once one hung in the ufs state).
 
 e.g:
 
  load: 0.01  cmd: cc 708 [ufs] 0.00u 0.00s 0% 56k
 
  load: 0.01  cmd: tcsh 709 [thrd_sleep] 0.00u 0.00s 0% 1220k
 
 I've seen this on my sparc64 box as well as an i386 box.
 
 Any bright ideas?  Anyone feeling guilty? :)
 
 Kris
 
 





Re: more info from panic from running dnet on SMP kernel ( lock order reversal, recursed on non-recursive lock )

2002-11-17 Thread Robert Watson
Hmm.  It looks like there is indeed a lock leak in the RFTHREAD code.
Maybe a change like the following might help:

PROC_LOCK(p2);
psignal(p2, SIGKILL);
PROC_UNLOCK(p2);
}

Change the } to:
} else
PROC_UNLOCK(p1->p_leader);

And see if that gets rid of the problem.  Any chance this is highly
reproducible, btw? :-)  And what app are you running that's using
RFTHREAD -- linux thread stuff?
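
The shape of the suggested fix can be modelled in user space. This is a Python sketch under an assumption about the surrounding code: that the leader's process lock is taken just before the test and, before the patch, released only inside the branch shown. The names are hypothetical and `threading.Lock` stands in for the process mutex.

```python
import threading


def kill_if_leader_exiting(leader_lock, leader_exiting, kill_child):
    """Model of the patched branch: the leader lock is dropped on both paths."""
    leader_lock.acquire()
    if leader_exiting:
        leader_lock.release()
        kill_child()
    else:
        # Without this release the lock stays held, and the next acquire
        # of the same non-recursive lock by this thread recurses.
        leader_lock.release()
```

Removing the `else` release reproduces the leak: the function returns with the lock still held, which matches the "recursed on non-recursive lock" report when the same lock is taken again a few lines later.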

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories

On Sun, 17 Nov 2002, Joel M. Baldwin wrote:

 
 running dnet on a SMP kernel causes the kernel to panic.
 
 
 lock order reversal
  1st 0xc2c803e8 process lock (process lock) @ 
 ../../../kern/kern_fork.c:571
  2nd 0xc03cfce0 proctree (proctree) @ ../../../kern/kern_fork.c:596
 recursed on non-recursive lock (sleep mutex) process lock @ 
 ../../../kern/kern_fork.c:599
 first acquired @ ../../../kern/kern_fork.c:571
 panic: recurse
 cpuid = 1; lapic.id = 0100
 Debugger(panic)
 Stopped at  Debugger+0x55:  xchgl   %ebx,in_Debugger.0
 db> t
 Debugger(c03926fa,100,c0395ada,d26f5c08,1) at Debugger+0x55
 panic(c0395ada,c038feab,23b,c038feab,257) at panic+0x11f
 witness_lock(c2c803e8,8,c038feab,257,0) at witness_lock+0x3e6
 _mtx_lock_flags(c2c803e8,0,c038feab,257,d26f5cb8) at 
 _mtx_lock_flags+0xb2
 fork1(c2773d00,6050,0,d26f5cd4,c2c803e8) at fork1+0xbfc
 rfork(c2773d00,d26f5d10,c03b07a2,407,1) at rfork+0x65
 syscall(2f,2f,2f,0,80ddf10) at syscall+0x28e
 Xint0x80_syscall() at Xint0x80_syscall+0x1d
 --- syscall (251, FreeBSD ELF32, rfork), eip = 0x8087d14, esp = 
 0xbfbff4a8, ebp = 0xbfbff524 ---
 db> ps
   pid   proc addruid  ppid  pgrp  flag   stat  wmesgwchan 
 cmd
  6217 c2b98e00 d28a70000  6215  6216 000 newpanic: unknown 
 thread state
 cpuid = 1; lapic.id = 0100
 boot() called on cpu#1
 Uptime: 1h43m39s
 pfs_vncache_unload(): 1 entries remaining
 Dumping 255 MB
  16 32 48 64 80 96 112 128 144 160 176 192 208 224 240
 Dump complete
 Automatic reboot in 15 seconds - press a key on the console to abort
 Rebooting...
 cpu_reset called on cpu#1
 cpu_reset: Restarting BSP
 cpu_reset_proxy: Stopped CPU 1
 
 
 
 And then when the system came back up and I took a closer
 look at the core dump.
 
 
 (kgdb) where
 #0  doadump () at ../../../kern/kern_shutdown.c:232
 #1  0xc02114ad in boot (howto=260) at ../../../kern/kern_shutdown.c:364
 #2  0xc0211767 in panic () at ../../../kern/kern_shutdown.c:517
 #3  0xc014f2bc in db_ps (dummy1=-1070342907, dummy2=0, dummy3=-1, 
 dummy4=0xd26f5a24 )
 at ../../../ddb/db_ps.c:169
 #4  0xc014d142 in db_command (last_cmdp=0xc03be920, cmd_table=0x0, 
 aux_cmd_tablep=0xc03b5540,
 aux_cmd_tablep_end=0xc03b5558) at ../../../ddb/db_command.c:346
 #5  0xc014d256 in db_command_loop () at ../../../ddb/db_command.c:472
 #6  0xc014feea in db_trap (type=3, code=0) at ../../../ddb/db_trap.c:72
 #7  0xc033da10 in kdb_trap (type=3, code=0, regs=0xd26f5b80)
 at ../../../i386/i386/db_interface.c:166
 #8  0xc0356a3f in trap (frame=
   {tf_fs = -1069481960, tf_es = 16, tf_ds = 16, tf_edi = 
 -1032372992, tf_esi = 256, tf_ebp = -764453940, tf_isp = -764453972, 
 tf_ebx = 0, tf_edx = 0, tf_ecx = 1, tf_eax = 18, tf_trapno = 3, tf_err 
 = 0, tf_eip = -1070342907, tf_cs = 8, tf_eflags = 662, tf_esp = 
 -1069883258, tf_ss = -1069996294}) at ../../../i386/i386/trap.c:603
 #9  0xc033f238 in calltrap () at {standard input}:99
 #10 0xc021174f in panic (fmt=0x0) at ../../../kern/kern_shutdown.c:503
 #11 0xc02333d6 in witness_lock (lock=0xc2c803e8, flags=8,
 file=0xc038feab ../../../kern/kern_fork.c, line=599) at 
 ../../../kern/subr_witness.c:609
 #12 0xc02079c2 in _mtx_lock_flags (m=0xc03cf4c0, opts=0, 
 file=0xc042cfd4 è\003È«þ8À;\002,
 line=-1027079192) at ../../../kern/kern_mutex.c:328
 #13 0xc01fd3ec in fork1 (td=0xc2773d00, flags=24656, pages=0, 
 procp=0xd26f5cd4)
 at ../../../kern/kern_fork.c:599
 #14 0xc01fc6c5 in rfork (td=0xc2773d00, uap=0xd26f5d10) at 
 ../../../kern/kern_fork.c:168
 #15 0xc035739e in syscall (frame=
   {tf_fs = 47, tf_es = 47, tf_ds = 47, tf_edi = 0, tf_esi = 
 135126800, tf_ebp = -1077938908, tf_isp = -764453516, tf_ebx = 2, 
 tf_edx = 135381248, tf_ecx = 135381248, tf_eax = 251, tf_trapno = 0, 
 tf_err = 2, tf_eip = 134774036, tf_cs = 31, tf_eflags = 659, tf_esp = 
 -1077939032, tf_ss = 47})
 at ../../../i386/i386/trap.c:1033
 #16 0xc033f28d in Xint0x80_syscall () at {standard input}:141
 ---Can't read userspace from dump, or kernel process---
 
 
 
 
 
 



Re: Device permissions with DEVFS

2002-11-19 Thread Robert Watson

On Tue, 19 Nov 2002, Bruce Evans wrote:

  No, the default permissions are specified in the driver source code
  via make_dev().
 
 The drivers only get the magic numbers for uids and gids from a central
 file.  This is bad enough.  I think all devices should have ownership
 root:wheel and mode 0600, but that would increase the problems with
 non-persistent attributes.  devfs(8) may be able to handle this now. 

I have to say that the ownership issue has been a pet peeve of mine for
some time: I would really like the kernel to know about exactly two magic
id values: uid 0 (suser uid, default uid, default devfs owner), and gid 0
(default gid, default devfs owner).  Hard-coding of other non-0 values in
the kernel leads to many potential (and real) problems. 

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories





Re: Device permissions with DEVFS

2002-11-19 Thread Robert Watson

On Tue, 19 Nov 2002, Poul-Henning Kamp wrote:

 In message [EMAIL PROTECTED], Robert Watson writes:
 
   No, the default permissions are specified in the driver source code
   via make_dev().
  
  The drivers only get the magic numbers for uids and gids from a central
  file.  This is bad enough.  I think all devices should have ownership
  root:wheel and mode 0600, but that would increase the problems with
  non-persistent attributes.  devfs(8) may be able to handle this now. 
 
 I have to say that the ownership issue has been a pet peeve of mine for
 some time: I would really like the kernel to know about exactly two magic
 id values: uid 0 (suser uid, default uid, default devfs owner), and gid 0
 (default gid, default devfs owner).  Hard-coding of other non-0 values in
 the kernel leads to many potential (and real) problems. 
 
 I think we should stick to the current slightly hackish way, possibly
 with the modification that the security-officer gang gets to rule what
 exact m/o/g devices in the FreeBSD cvs tree should have. 

I'm not suggesting we change to this model at this point, or at any
particular point in the future, it's just a pet peeve that someday I'd
like to address :-).

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories





Re: panic: sleeping thread owns a mutex - with debug traceback

2002-11-20 Thread Robert Watson
Hmm.  Another thread has decided to sleep while holding an inpcb mutex.
Any chance this can be reproduced while running WITNESS?  If so, you
should get a panic earlier, at the point where the other thread goes to
sleep in the first place.  If you can't reproduce the panic, you may be
able to extract this from your system core using gdb -- you want to figure
out what the thread that owns the mutex is doing -- in the context of the
KASSERT() below, td is the pointer to the thread that owns the mutex.  I'm
not sure how to extract a stack trace from that information,
unfortunately; perhaps someone can give us pointers.  (Note: td from the
propagate_priority() argument is shadowed, which is annoying.)

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories

On Wed, 20 Nov 2002, Joel M. Baldwin wrote:

 
 Under heavy system load and heavy swapping I had the following
 panic occur.
 
 -
 
 Panic message from the serial console:
 
 panic: sleeping thread owns a mutex
 cpuid = 1; lapic.id = 0100
 Debugger(panic)
 Stopped at  Debugger+0x55:  xchgl   %ebx,in_Debugger.0
 db panic     t
 Debugger(c0392fc9,100,c0392171,cd29ac18,1) at Debugger+0x55
 panic(c0392171,1,c03920e8,6b,c039614a) at panic+0x11f
 propagate_priority(c0eadc40,2,c03920e8,23b,c03cffa0) at 
 propagate_priority+0x104
 _mtx_lock_sleep(c3520a20,0,c039f4dd,182,c040af98) at 
 _mtx_lock_sleep+0x219
 _mtx_lock_flags(c3520a20,0,c039f4dd,182,c0393d1d) at 
 _mtx_lock_flags+0x98
 syncache_timer(1,0,c0393d1d,bf,220649) at syncache_timer+0xaf
 softclock(0,0,c0390a04,230,c0eac700) at softclock+0x19c
 ithread_loop(c0ea0600,cd29ad48,c0390767,355,0) at ithread_loop+0x182
 fork_exit(c01fef40,c0ea0600,cd29ad48) at fork_exit+0xa5
 fork_trampoline() at fork_trampoline+0x1a
 --- trap 0x1, eip = 0, esp = 0xcd29ad7c, ebp = 0 ---
 db panic
 panic: from debugger
 cpuid = 1; lapic.id = 0100
 boot() called on cpu#1
 Uptime: 6h11m40s
 pfs_vncache_unload(): 1 entries remaining
 Dumping 255 MB
  16 32 48 64 80 96 112 128 144 160 176 192 208 224 240
 Dump complete
 Automatic reboot in 15 seconds - press a key on the console to abort
 Rebooting...
 cpu_reset called on cpu#1
 cpu_reset: Restarting BSP
 cpu_reset_proxy: Stopped CPU 1
 
 
 
 kgdb traceback information:
 
 su-2.05b# gdb -k
 GNU gdb 5.2.1 (FreeBSD)
 Copyright 2002 Free Software Foundation, Inc.
 GDB is free software, covered by the GNU General Public License, and 
 you are
 welcome to change it and/or distribute copies of it under certain 
 conditions.
 Type show copying to see the conditions.
 There is absolutely no warranty for GDB.  Type show warranty for 
 details.
 This GDB was configured as i386-undermydesk-freebsd.
 (kgdb) symbol-file 
 /usr/src/sys/i386/compile/testGeneric.smp/kernel.debug
 Reading symbols from 
 /usr/src/sys/i386/compile/testGeneric.smp/kernel.debug...done.
 (kgdb) exec-file /boot/kernel.testgeneric.smp/kernel
 (kgdb) core-file /var/crash/vmcore.30
 panic: from debugger
 panic messages:
 ---
 panic: sleeping thread owns a mutex
 cpuid = 1; lapic.id = 0100
 panic: from debugger
 cpuid = 1; lapic.id = 0100
 boot() called on cpu#1
 Uptime: 6h11m40s
 pfs_vncache_unload(): 1 entries remaining
 Dumping 255 MB
  16 32 48 64 80 96 112 128 144 160 176 192 208 224 240
 ---
 #0  doadump () at ../../../kern/kern_shutdown.c:232
 232 dumping++;
 (kgdb) where
 #0  doadump () at ../../../kern/kern_shutdown.c:232
 #1  0xc02118bd in boot (howto=260) at ../../../kern/kern_shutdown.c:364
 #2  0xc0211b77 in panic () at ../../../kern/kern_shutdown.c:517
 #3  0xc014d252 in db_panic () at ../../../ddb/db_command.c:450
 #4  0xc014d1d2 in db_command (last_cmdp=0xc03bf2a0, cmd_table=0x0, 
 aux_cmd_tablep=0xc03b5ec8,
 aux_cmd_tablep_end=0xc03b5ee0) at ../../../ddb/db_command.c:346
 #5  0xc014d2e6 in db_command_loop () at ../../../ddb/db_command.c:472
 #6  0xc014ff7a in db_trap (type=3, code=0) at ../../../ddb/db_trap.c:72
 #7  0xc033e2a0 in kdb_trap (type=3, code=0, regs=0xcd29ab90) at 
 ../../../i386/i386/db_interface.c:166
 #8  0xc03572cf in trap (frame=
   {tf_fs = -1069547496, tf_es = 16, tf_ds = -1069744112, tf_edi = 
 -1058350016, tf_esi = 256, tf_ebp = -852907044, tf_isp = -852907076, 
 tf_ebx = 0, tf_edx = 0, tf_ecx = 1, tf_eax = 18, tf_trapno = 3, tf_err 
 = 0, tf_eip = -1070340715, tf_cs = 8, tf_eflags = 662, tf_esp = 
 -1069880826, tf_ss = -1069994039}) at ../../../i386/i386/trap.c:603
 #9  0xc033fac8 in calltrap () at {standard input}:99
 #10 0xc0211b5f in panic (fmt=0x0) at ../../../kern/kern_shutdown.c:503
 #11 0xc0207b54 in propagate_priority (td=0x0) at 
 ../../../kern/kern_mutex.c:125
 #12 0xc0208329 in _mtx_lock_sleep (m=0xc3520a20, opts=0,
 file=0xc039f4dd ../../../netinet/tcp_syncache.c, line=386) at 

Re: panic: sleeping thread owns a mutex - with debug traceback

2002-11-20 Thread Robert Watson
On Wed, 20 Nov 2002, Robert Watson wrote:

 Hmm.  Another thread has decided to sleep while holding an inpcb mutex.
 Any chance this can be reproduced while running WITNESS?  If so, you
 should get a panic earlier, at the point where the other thread goes to
 sleep in the first place.  If you can't reproduce the panic, you may be
 able to extract this from your system core using gdb -- you want to figure
 out what the thread that owns the mutex is doing -- in the context of the
 KASSERT() below, td is the pointer to the thread that owns the mutex.  I'm
 not sure how to extract a stack trace from that information,
 unfortunately; perhaps someone can give us pointers.  (Note: td from the
 propagate_priority() argument is shadowed, which is annoying.)

Ack.  I mis-read.  You want the stack from thread td1 (the mutex owner),
not thread td.
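
For anyone poking at the core by hand, here is a rough sketch of how an
owner pointer can be recovered from a mutex lock word.  It assumes the
lock word holds the owning thread's address with a couple of flag bits
ORed into the low bits; the LK_* names and values are invented for this
example, so check sys/mutex.h in your tree for the real ones.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical flag bits stored in the low bits of a mutex lock word.
 * The real kernel's names and values may differ; check sys/mutex.h.
 */
#define LK_RECURSED	((uintptr_t)0x1)
#define LK_CONTESTED	((uintptr_t)0x2)
#define LK_FLAGMASK	(LK_RECURSED | LK_CONTESTED)

/*
 * Build a lock word from an owner address and flag bits.  Thread
 * structures are assumed to be aligned so that the low bits are free.
 */
static uintptr_t
make_lock_word(uintptr_t owner, uintptr_t flags)
{
	assert((owner & LK_FLAGMASK) == 0);
	return (owner | (flags & LK_FLAGMASK));
}

/*
 * Strip the flag bits from a lock word, leaving the address of the
 * owning thread structure (0 if the word carries no owner address).
 */
static uintptr_t
lock_owner_addr(uintptr_t lock_word)
{
	return (lock_word & ~LK_FLAGMASK);
}
```

The same masking can be done by hand on the value gdb prints for the lock
word; the result is the address of the thread whose stack you want.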

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories





Re: DP2 (I think!) crash booting from floppies

2002-11-20 Thread Robert Watson
This is not actually DP2, it's about a week earlier.  That said, I'm not
sure that bug was fixed in the missing week.  If you can, try booting off
of the 5.0-DP2 ISOs found at:

  ftp://ftp.FreeBSD.org/pub/FreeBSD/ISO-IMAGES-i386/5.0-DP2

Or using the floppies:

  ftp://ftp.FreeBSD.org/pub/FreeBSD/releases/i386/5.0-DP2/floppies

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories

On Wed, 20 Nov 2002, local.freebsd.current wrote:

 I got a pair of floppies from:
 
 ftp://ftp.freebsd.org/pub/FreeBSD/snapshots/i386/5.0-20021103-SNAP/floppies/
 
 and booted them on a Dell Dimension XPS D300 which is currently
 running 4.7. It's a PII/300 with an Adaptec 2940 SCSI and an 
 STB Riva graphics card.
 
 When booting the kernel off the second floppy I get:
 
 Booting [/kernel]...
 /
 
 Fatal trap 12: page fault while in vm86 mode
 fault virtual address  = 0x9f800
 fault code = user read, page not present
 instruction pointer= 0xf000:0x8c3e
 stack pointer  = 0x0:0xfcc
 frame pointer  = 0x0:0xfd4
 code segment   = base 0x0, limit 0x0, type 0x0
                = DPL 0, pres 0, def32 0, gran 0
 processor eflags   = interrupt enabled, resume, vm86, IOPL = 0
 current process= 0 ()
 trap number= 12
 panic: page fault
 Uptime: 1s
 
 The same floppies work fine on another machine, up to the point
 of launching sysinstall.
 
 



Re: your mail

2002-11-20 Thread Robert Watson

On Wed, 20 Nov 2002, Chris Howells wrote:

 Hi, 
  
 On Wednesday 20 November 2002 5:08 pm, Robert Watson wrote: 
  dmesg is a command that dumps the kernel message buffer.  You can 
 redirect 
  the output to a file: 
  
    dmesg > fileofchoice
  
 Sure. This bit is sufficiently similar to Linux for me to know it :) 
  
 Problem is, I haven't got 4.7 installed. I want to install 5.0DP2. I
 can't install 5.0DP2 due to the reboot, and therefore I can't get to a
 command prompt in order to run dmesg
  
 Bit of a chicken & egg situation -- if I could install 5.0DP2
 successfully, then I wouldn't need to be sending you the output of dmesg
 here ;)

If you have a second box and a null modem cable, you can set up FreeBSD to
use a serial console by unplugging the keyboard.  You can then capture the
boot output on the second machine and e-mail that out.  If it doesn't
automatically use the serial console w/o a keyboard, you can do:

set console=comconsole

in the loader.  By default, it uses 9600bps on com1.  This is typically
how we capture console debug output such as stack traces, debugger stuff,
etc, since it doesn't rely on the system remaining up, and doesn't require
you to type a lot of stuff in :-).
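
On the capturing machine, any terminal program will do, but here is a
minimal sketch of setting up the receiving port in C with POSIX termios.
It assumes the usual 9600 bps, 8 data bits, no parity, 1 stop bit; the
function names are made up for this example, and the device path is
whatever your second machine calls its serial port.

```c
#include <assert.h>
#include <fcntl.h>
#include <termios.h>
#include <unistd.h>

/*
 * Fill in termios settings for a raw 9600 bps, 8 data bits, no parity,
 * 1 stop bit line, which is assumed to match the console's settings.
 */
static int
set_9600_8n1(struct termios *t)
{
	assert(t != NULL);
	/* No input/output translation, no echo, no line editing. */
	t->c_iflag = 0;
	t->c_oflag = 0;
	t->c_lflag = 0;
	/* 8 data bits, no parity, 1 stop bit; ignore modem control lines. */
	t->c_cflag &= ~(tcflag_t)(CSIZE | PARENB | CSTOPB);
	t->c_cflag |= CS8 | CREAD | CLOCAL;
	/* read(2) blocks until at least one byte arrives. */
	t->c_cc[VMIN] = 1;
	t->c_cc[VTIME] = 0;
	if (cfsetispeed(t, B9600) == -1 || cfsetospeed(t, B9600) == -1)
		return (-1);
	return (0);
}

/*
 * Open the serial device at `path` (caller-supplied) and apply the
 * settings above.  Returns a file descriptor, or -1 on failure.
 */
static int
open_capture_port(const char *path)
{
	struct termios t;
	int fd;

	fd = open(path, O_RDONLY | O_NOCTTY);
	if (fd == -1)
		return (-1);
	if (tcgetattr(fd, &t) == -1 || set_9600_8n1(&t) == -1 ||
	    tcsetattr(fd, TCSANOW, &t) == -1) {
		close(fd);
		return (-1);
	}
	return (fd);
}
```

From there, a read(2)/write(2) loop that copies the descriptor to a file
is enough to keep the boot messages and any panic output.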

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories





Re: panic: sleeping thread owns a mutex - with debug traceback

2002-11-20 Thread Robert Watson

On Wed, 20 Nov 2002, John Baldwin wrote:

 Erm.  Did you manage to look at dmesg then?  If so, you would have seen
 warnings from WITNESS earlier about the locks messing up.  If you can
 reproduce this and are letting it sit unattended, a better plan might be
 to turn on witness_ddb (it's a kernel option, loader tunable, and sysctl
 (debug.witness_ddb)) and then when the original error occurs it will
 drop into the debugger with a very useful error message.  You can also
 get a useful trace at that point from ddb. 

Word of warning though: either use a serial console, or don't run X,
because you'll want to be able to see the debugger, and you probably won't
get much warning when it's about to drop in.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories






Searching for users of netncp and nwfs to help debug 5.0 problems

2002-11-21 Thread Robert Watson

The build of netncp is currently broken on 5.0-CURRENT, and I'd like to
see this fixed before 5.0-RELEASE.  Unfortunately, we're having a lot of
trouble finding a test environment, which is the natural and immediate
follow-on to the compile fixes :-).  Was wondering if anyone with FreeBSD
kernel debugging experience and some time on their hands was interested in
helping resolve this issue over the next week or two. 

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories





Re: Searching for users of netncp and nwfs to help debug 5.0 problems

2002-11-21 Thread Robert Watson

On Thu, 21 Nov 2002, Robert Watson wrote:

 The build of netncp is currently broken on 5.0-CURRENT, and I'd like to
 see this fixed before 5.0-RELEASE.  Unfortunately, we're having a lot of
 trouble finding a test environment, which is the natural and immediate
 follow-on to the compile fixes :-).  Was wondering if anyone with
 FreeBSD kernel debugging experience and some time on their hands was
 interested in helping resolve this issue over the next week or two.

(And, you have to bring your own test environment, as the second sentence
suggests, but doesn't actually state).

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories





Re: Searching for users of netncp and nwfs to help debug 5.0 problems

2002-11-21 Thread Robert Watson

On Fri, 22 Nov 2002, Brad Knowles wrote:

 At 5:23 PM -0500 2002/11/21, Robert Watson wrote:
 
   (And, you have to bring your own test environment, as the second sentence
   suggests, but doesn't actually state).
 
   Over on -chat, we're in the process of putting together a list
 of volunteers, hardware, organizational talent, etc... to help test out
 -DP2.  Mark Murray is involved, but I personally would like to see at
 least one or two more core team members committed to making this happen. 
 
   If we can get a suitable group of people together, with suitable
 hardware, and get the coordination effort done correctly, I believe that
 we can help make this a much more successful project.
 
   Your assistance in this effort would be greatly appreciated. 

I appreciate the effort, and am interested in the idea, but in this case
it was as much a solicitation for a developer as for the testing
environment itself.  This won't just be testing of netncp and nwfs, this
will probably require a developer to have local access to a netware
configuration that they can do nasty things to in order to exercise the
code properly.  Unfortunately, those seem to be in short supply.

If I might suggest: there's a freebsd-qa mailing list.  It's a great place
to organize QA efforts, whereas freebsd-chat is notorious for its lack of
signal (it's where dead signals go to rot).  That's why I read it about
once a month.  If you move the conversation there and get a bunch of
people subscribed and interested, they'll be able to look there for the
stream of bug fixes associated with the install process, and get easy
access to the testing guide as it evolves, since we usually pass drafts
through there, etc.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories





Re: smbfs install option questions

2002-11-21 Thread Robert Watson
In terms of where to take this: there are many reported problems with
smbfs on 5.0-CURRENT.  It's not clear whether this is left over from the
KSE imports, the Apple-derived fixes that might not have fixed things,
etc.  In any case, before we can look at smbfs install, we really need
smbfs working.  People who feel moved to debug this should feel free :-).

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories

On Tue, 5 Nov 2002, David Yeske wrote:

 I got a smbfs install option working a while ago before drivers.flp came
 around, but there was no space on the floppies.  Since drivers.flp came
 out, there is more space.  I was wondering how I should go about making
 this usable, and which files should be on kern.flp, mfsroot.flp,
 drivers.flp, or somewhere else.  Also I am generally looking for
 feedback...
 
 The following patch is NOT up to date though.  The diff to GENERIC is NOT
 implying I think GENERIC should be modified.  I did that just to have
 those things added to GENERIC so they would make it onto BOOTMFS.  The
 smbfs install option is based off the nfs install option, but it does not
 use dns.
  
 http://pigseye.kennesaw.edu/~dyeske/freebsd/smbfs_current.patch
 
 http://pigseye.kennesaw.edu/~dyeske/freebsd/smbfs.c
 
  
 Regards,
 David Yeske
 



Re: VM locking problem... And doscmd(8)

2002-11-21 Thread Robert Watson
On Thu, 21 Nov 2002, Juli Mallett wrote:

 I'm getting a giant owned assertion failure in the vm_map code, simply
 by running doscmd something.exe where something.exe is a
 self-extracting ZIP file (of BIOS upgrade stuff, FWIW), which leads
 trivially to tripping over it.  I still don't have a good way to get the
 trace output from the box in question to here, but I've been able to
 reproduce it every time, so it shouldn't be hard for someone else. 
 
 I rebuilt my kernel today from CVSup, but hadn't tried before that. 

For those of us that don't frequently (ever) use doscmd -- can you provide
a tarball of the necessary configuration files, executable, etc,
somewhere? 

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories






Re: DP2 root partition size

2002-11-21 Thread Robert Watson

On Fri, 22 Nov 2002, Richard Tobin wrote:

 Somewhere there should be a warning that the root partition needs to be
 *much* bigger in 5.0 than in 4.x.  It's gone from 40-something MB to 92
 MB for a default install.  It's really frustrating to install a system
 and find that / is 104% full. 

Part of that is that the DP2 install sticks a debug kernel on your disk as
well as the normal kernel.  So the installed size will shrink some.

 It looks as if even with 128 MB you're not going to have enough room to
 install a custom kernel+modules without deleting the generic one. 

Yeah, I've had that problem a lot re-using 4.x dev boxes as 5.x dev boxes.
I'm generally installing machines with 128MB or 256MB roots now.  Assuming
that at some point we begin to support dynamically linked binaries in /,
that should help some.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories






Re: Searching for users of netncp and nwfs to help debug 5.0 problems

2002-11-22 Thread Robert Watson

On Fri, 22 Nov 2002, Brad Knowles wrote:

   If I might suggest: there's a freebsd-qa mailing list.  It's a great place
   to organize QA efforts, whereas freebsd-chat is notorious for its lack of
   signal (it's where dead signals go to rot).
 
   There's been some talk of freebsd-qa, but so far the only thing
 I recall being said is that the sort of thing we're talking about doing
 is not what this list has been used for in the past.  We were kind of
 wondering where we could take this particular effort, and if we could
 commandeer the freebsd-qa list for this purpose. 

The purpose of the QA mailing list is to provide a means to coordinate the
release and general QA process for FreeBSD.  Thus far, the traffic on qa@
has been relatively low bandwidth, but if a bunch of people turn up
wanting to perform thorough testing for the release, freebsd-qa sounds to
me like the right place.  Certainly, freebsd-qa would be where you want to
have moderate parts of the discussion take place.

   I believe that our biggest problem at the moment is finding the
 necessary development types to help us debug the problems and get them
 sorted out -- we've got people who have hardware, and would be more than
 willing to help test things out, but in the past they haven't gotten
 much help from the developers.

It sounds like there are a couple of problems here: that we need a
debugging guide ("How to prepare a useful bug report for a kernel panic",
"How to prepare a useful bug report for a sysinstall failure", etc.), and
that we need a better way to find developers on a particular topic who are
willing to pick up more debugging burden.  I would have guessed that, in
general, problems with finding a responsible developer would lie more in
the areas of the system that don't have an active maintainer (viz. owner),
which is a harder problem to address.  If that's not a correct
impression, then it's something that's probably easier to fix :-).

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories







Re: No entries in /proc :: feature or problem ??

2002-11-22 Thread Robert Watson

On 22 Nov 2002, Dhee Reddy wrote:

 Just tried to look up some info and saw that the /proc filesystem
 doesn't contain any files.
 Shouldn't they contain entries corresponding to all the processes?
 truely -- dhee

In fresh 5.0 installs, procfs is not enabled by default.  Right now there
appear to be two tools in the system that pay a price for this:

(1) ps -e relies on grubbing through /proc/pid/mem to find
environment variables -- everything else, it can get through
sysctl().

(2) truss currently relies on procfs, albeit not working very well.  There
were a set of patches floating around to make truss use ptrace(),
which is the direction we probably do want to take this.  If someone
could finish up that work, it would be great.

The reasons to deprecate procfs are many-fold -- not least that there are
existing interfaces in the kernel that provide most or all of its features
at a substantially lower risk.  You just have to see the kernel-related
security advisories for FreeBSD, Linux, Solaris, etc, over the last five
years to understand why we want to turn it off if we can.  :-)  There has
also been a concerted effort to move userland system monitoring tools away
from using /dev/kmem (direct kernel memory access) and towards using the
sysctl() MIB interface, reducing the level of privilege required to run
the monitoring tools. 

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories






Re: Searching for users of netncp and nwfs to help debug 5.0 problems

2002-11-22 Thread Robert Watson
On Fri, 22 Nov 2002, Martijn Pronk wrote:

 The build of netncp is currently broken on 5.0-CURRENT, and I'd like to
 see this fixed before 5.0-RELEASE.  Unfortunately, we're having a lot of
 trouble finding a test environment, which is the natural and immediate
 follow-on to the compile fixes :-).  Was wondering if anyone with FreeBSD
 kernel debugging experience and some time on their hands was interested in
 helping resolve this issue over the next week or two. 
 
 I can test this next week at work, however, I don't normally use netncp
 & nwfs, so it may take me a while. 
 
 I'll get back on this next week. 

That would be great.  My suggestion would be to first set up netncp and
nwfs on FreeBSD 4.7-RELEASE or 4.7-STABLE, since that is believed to work,
and then we can start to attempt to replicate that condition on
5.0-CURRENT.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories






Re: libpthread question

2002-11-22 Thread Robert Watson
On Fri, 22 Nov 2002, walt wrote:

 I noticed David Xu's changes to libpthread this morning, so I did a
 make libraries and noticed with surprise that libpthread.so.5 was
 still dated Sep 16. 
 
 I then did 'cd /usr/src/lib' and a 'make' and noticed that libpthread
 did not show up during the make. 
 
 At that point I looked at /usr/src/lib/Makefile and noticed that
 libpthread is not mentioned there at all. 
 
 Then I 'cd /usr/src/lib/libpthread' and 'make' and everything recompiled
 in my /usr/src tree, not my /usr/obj tree. 
 
 So, am I screwed up somehow, or is this expected behavior? 

This is expected behavior -- libpthread is currently disconnected from the
build.  I'd actually like to see it connected to the build, with an
appropriate "WARNING: DRAGONS INCLUDED" man page also hooked up to
discourage accidental use.  At least, assuming David Xu, Jon Mini, etc,
are ready for the resulting bug reports they'll get. 

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories






Re: mbuf header bloat ?

2002-11-22 Thread Robert Watson

On Thu, 21 Nov 2002, Luigi Rizzo wrote:

 [Bcc to -net because it is relevant there. This email has been triggered
 by a private discussion i was having with other committers (who will
 easily recognise themselves :) which suggested the possibility of adding
 more fields to mbuf headers]
 
 Just recently came up to my attention that we have the following code in
 sys/_label.h
 
 #define MAC_MAX_POLICIES        4
 struct label {
         int     l_flags;
         union {
                 void    *l_ptr;
                 long     l_long;
         }       l_perpolicy[MAC_MAX_POLICIES];
 };
 
 (what are l_perpolicy[], ints ? Could this be written a bit better ?)
 and then in sys/mbuf.h

MAC labels provide general purpose security label storage across a variety
of kernel objects.  Each MAC label is made up of a number of slots which
are allocated to statically linked or dynamically loaded policies.  The
union is used to allow policies to use their slot for either the purposes
of an integer store, or as a pointer with the semantics they define.  Some
policies simply use the long to represent their policy information,
perhaps by itself (a partition number), or to key into an existing array. 
Other policies use the pointer to point to shared reference-counted,
static, or per-label data.  Probably the long should be changed to some
other integer type that better matches the notion of pointer length.
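
To make the slot idea concrete, here is a userland sketch, not the kernel
code: one policy keeps a small integer in its slot and another keeps a
pointer to its own data.  All sketch_* names and the slot_* helpers are
invented for this example, and intptr_t stands in for the "integer type
that better matches pointer length" mentioned above.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX_POLICIES	4	/* mirrors MAC_MAX_POLICIES */

/* Userland stand-in for struct label; intptr_t replaces long. */
struct sketch_label {
	int	l_flags;
	union {
		void		*l_ptr;
		intptr_t	 l_int;
	} l_perpolicy[SKETCH_MAX_POLICIES];
};

/* Integer use of a slot, e.g. a policy storing a partition number. */
static void
slot_set_int(struct sketch_label *l, int slot, intptr_t value)
{
	assert(slot >= 0 && slot < SKETCH_MAX_POLICIES);
	l->l_perpolicy[slot].l_int = value;
}

static intptr_t
slot_get_int(const struct sketch_label *l, int slot)
{
	assert(slot >= 0 && slot < SKETCH_MAX_POLICIES);
	return (l->l_perpolicy[slot].l_int);
}

/* Pointer use of a slot, e.g. shared or per-label policy data. */
static void
slot_set_ptr(struct sketch_label *l, int slot, void *data)
{
	assert(slot >= 0 && slot < SKETCH_MAX_POLICIES);
	l->l_perpolicy[slot].l_ptr = data;
}

static void *
slot_get_ptr(const struct sketch_label *l, int slot)
{
	assert(slot >= 0 && slot < SKETCH_MAX_POLICIES);
	return (l->l_perpolicy[slot].l_ptr);
}
```

Each policy only ever touches its own slot, and only through the union
member whose meaning it defined, so the two uses never collide.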

 struct pkthdr {
         struct  ifnet *rcvif;   /* rcv interface */
         int     len;            /* total packet length */
         /* variables for ip and tcp reassembly */
         void    *header;        /* pointer to packet header */
         /* variables for hardware checksum */
         int     csum_flags;     /* flags regarding checksum */
         int     csum_data;      /* data field used by csum routines */
         SLIST_HEAD(packet_tags, m_tag) tags; /* list of packet tags */
         struct  label label;    /* MAC label of data in packet */
 };
  
 The label is 5 ints, the pkthdr a total of 11 ints (and m_hdr takes
 another 6, for a total of 136 bytes of header info on 64-bit
 architectures). 
 
 Of the pkthdr, only 3 fields (rcvif, len, tags) are of really general
 use, the rest being used only in certain cases and for very specific
 purposes (e.g. reassembly of fragments, or hw capabilities, or MAC). 
 
 Now that Sam has done the excellent work of integrating packet tags to
 carry annotations around, i really believe that we should try to move
 out of the pkthdr all non-general fields, and move them to m_tags so we
 only pay the cost when needed and not in all cases.  Also this pays a
 lot in terms of ABI compatibility and extensibility.  I understand that
 for 5.0 it is a bit late to act, but i do hope that we can reconsider
 this issue for 5.1 and pull out of the pkthdr at least the MAC label,
 and possibly also the csum_* fields, much in the same way it has been
 done for VLAN labels. 

In the original MAC label design for mbufs, and up until very recently,
m_tag wasn't available, so that design didn't use it.  We traded off
various things, including measured packet lengths, and decided the 20
bytes was an acceptable cost given the available extension services.  I'm
certainly willing to re-consider that notion now that we have general
extensibility, and now that we have a good, SMP-safe slab allocator in
5.0.  However, one thing to keep in mind is that in a MAC environment,
every packet header mbuf does have a valid label, and as a result,
inserting additional memory allocations for every packet handled can have
substantial cost.  I've had a number of requests to make options MAC a
default-shipped option: it's not ready yet (as an experimental feature),
but it may well be by 6.0 we are ready for that, assuming the performance
numbers are right.  In that situation, it could well be that it does make
sense to maintain the label data in the mbuf pkthdr, since it really will
be used for all pkthdr's.
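
To illustrate the cost being weighed here, a simplified userland sketch of
the tag approach follows.  The sketch_* types and tag_* functions are
invented for this example and are not the real m_tag interface; the point
is only that a tagged field costs one allocation and a list walk per
packet that carries it, and nothing for packets that don't.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified packet tag: an id plus a link to the next tag. */
struct sketch_tag {
	struct sketch_tag	*next;
	int			 id;
	long			 value;	/* stand-in for the tag payload */
};

/* Hypothetical packet header that carries only a tag list head. */
struct sketch_pkthdr {
	int			 len;
	struct sketch_tag	*tags;
};

/* Allocate a tag and push it onto the header's list; NULL on failure. */
static struct sketch_tag *
tag_attach(struct sketch_pkthdr *ph, int id, long value)
{
	struct sketch_tag *t;

	assert(ph != NULL);
	t = malloc(sizeof(*t));
	if (t == NULL)
		return (NULL);
	t->id = id;
	t->value = value;
	t->next = ph->tags;
	ph->tags = t;
	return (t);
}

/* Linear search for a tag by id; NULL if the packet has no such tag. */
static struct sketch_tag *
tag_find(const struct sketch_pkthdr *ph, int id)
{
	struct sketch_tag *t;

	for (t = ph->tags; t != NULL; t = t->next)
		if (t->id == id)
			return (t);
	return (NULL);
}

/* Free every tag on the list and reset the list head. */
static void
tag_free_all(struct sketch_pkthdr *ph)
{
	struct sketch_tag *t, *next;

	for (t = ph->tags; t != NULL; t = next) {
		next = t->next;
		free(t);
	}
	ph->tags = NULL;
}
```

With MAC compiled in, every packet header would go through the attach
path, which is the per-packet allocation cost described above.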

There's a hard tradeoff here, and it applies to all of the data in the
packet header.  On the one hand, we want to keep the mbuf packet header
data small: any data there cuts into the space available for fast packet
storage without a cluster.  We also want to keep it protocol-independent,
since we use mbufs for all protocols, as well (in a number of situations) 
as a general purpose memory allocator for the network stack.  On the other
hand, we also want to use the highest performance configuration for the
common case, and the reality is that our current common case is IPv4
networking.  I'm not a big fan of performance hacks, but if we're reaching
a time when 50% of network cards provide support for IP checksum handling
on the card, paying a few bytes cost per mbuf header may be a definite
win.  As you suggest, this is probably a question we need to revisit once
5.0 is out the door, because we really can't make changes like this 

Re: No entries in /proc :: feature or problem ??

2002-11-22 Thread Robert Watson

On Fri, 22 Nov 2002, Mike Barcroft wrote:

 Dhee Reddy [EMAIL PROTECTED] writes:
  Hello all.
 Just tried to look up some info and saw that the /proc filesystem doesn't
 contain any files.
 Shouldn't they contain entries corresponding to all the processes ?
  truly
 
 This question was just asked a few days ago (yesterday?).  By default,
 /proc is no longer mounted.  To reenable it (not recommended for
 production systems because of procfs' poor security record) add the
 following line to fstab:  proc /proc procfs rw 0 0

This sounds like this will be a common 5.0 FAQ.  We should probably put it
on the web page somewhere, with some useful discussion of the benefits and
risks.  It's not clear to me why the open office build is looking for
procfs -- probably so that it can get to /proc/pid/cmdline, which is a
bogusism if ever I saw one.  I talked with Martin Blapp at one point about
how to adapt the Open Office build to DTRT -- it really shouldn't be hard
to teach it to use argv, one way or the other, especially given that
Solaris (on which Star Office runs quite nicely) doesn't support cmdline. 
:-) 

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories






Keeping mdmfs in sync with newfs

2002-11-22 Thread Robert Watson

Dima,

I recently switched two of my diskless crash boxes over to rc.d from the
old rc scripts, and discovered that the new rc.diskless code uses mdmfs. 
Unfortunately, it doesn't appear to support UFS2, since the newfs -O 
flag isn't supported -- and it isn't quite so simple to add, since mdmfs
has renamed the newfs -o flags to -O, causing a namespace collision. 
Basically, we need to be able to generate the -O1 or -O2 argument in the
newfs call from mdmfs.  I can hack it locally in the mean time, but there
is a long term maintenance question about passing newfs options through
mdmfs due to command line argument collisions. 

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories





Re: libpthread question

2002-11-22 Thread Robert Watson

On Fri, 22 Nov 2002, Juli Mallett wrote:

  This is expected behavior -- libpthread is currently disconnected from the
  build.  I'd actually like to see it connected to the build, with an
  appropriate WARNING: DRAGONS INCLUDED man page also hooked up to
  discourage accidental use.  At least, assuming David Xu, Jon Mini, etc,
  are ready for the resulting bug reports they'll get. 
 
 No.  You really don't want to do this.  A lot of ports will see
 -lpthread and try to use it instead of letting gcc use its best
 judgement about what threads system to use (-pthread).  And in the
 current state, that means a _lot_ of broken stuff. 

I thought we discussed installing it as -lkse at one point to avoid that
scenario.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories






RE: DP2: nfsiod

2002-11-22 Thread Robert Watson

On Fri, 22 Nov 2002, John Baldwin wrote:

 On 22-Nov-2002 local.freebsd.current wrote:
  Having installed DP2 and said NO to NFS client and
  server in sysinstall (and there's nothing about them
  in /etc/rc.conf) I see four nfsiod daemons running
  after the first boot. Are they supposed to be there?
 
 Yes.  If you have NFS client support in your kernel they will be there. 
 GENERIC has NFS client support enabled by default.  Hmm this is kind of
 a policy change as it now means that nfs_client_enable is basically
 useless unless you compile a custom kernel w/o NFS client support in
 which case the startup scripts will attempt to load it as a module for
 you. 

Per our phone conversation this morning, the way to think about it is
this:

The only function of nfsiod is to provide asynchronous write-behind
functionality.  On -STABLE, even if nfsiod isn't actually running, if the
NFS client code is present in the kernel, NFS will still work.  What
nfs_client_enable should do is:

(1) Load the kernel module if it's not already loaded
(2) Tune the kernel module if appropriate
(3) Possibly start rpcbind as a dependency
(4) Possibly start rpc.statd as a dependency
(5) Possibly start rpc.lockd as a dependency

Note that the rpc.lockd support is still experimental in 5.0 for
client-side locking, and as such, might not be good to enable by default.
I notice that  in the original message for this thread, there's reference
to release documentation indicating that client side locking isn't
implemented: this is actually not the case.  We do implement it now.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories






Re: Panic, possibly MAC related

2002-11-23 Thread Robert Watson

On Sat, 23 Nov 2002, Christian Brueffer wrote:

 just got this panic on my notebook. Had to manually shut it down after an
 acpiconf -s 4. At the next bootup, the panic occurred.  At the moment I'm
 trying to boot into my system again to reproduce it. 

In general, this panic occurs in the following situation: each mbuf packet
header has a label structure in it, and the Biba policy stores a per-label
allocated label blob using malloc'd memory.  If the packet header is
copied using the copypacket abstractions, then the reference will be
duplicated as will the label data, so each resulting reference will be
separately handled and free'd.  However, if the packet header data is
blindly copied without invoking the normal header replication code, then
the same pointer will sit in the new packet header as the old one.  When
the two mbufs are free'd, the first one goes fine, but the second one is a
duplicate free.  So somehow you managed to trigger a code path that
improperly copies packet headers.  Do you have any information on what
process it was that was performing the recvfrom()?  One of the code paths
that may present a problem in the base system implementation is a packet
copy for alignment purposes in the KAME code.  Another possibility is that
we've seen a regression in existing handling of something like the
fragmentation code, broadcast/multicast, etc.  Knowing what's in the
packet and what process it is would help a lot in debugging this.  Also,
it would be useful to know what interface this mbuf originated from, if
possible.
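
To make the failure mode concrete, here is a minimal userland sketch in C.
The structure and function names (struct label, struct pkthdr,
pkthdr_copy_blind, pkthdr_copy_dup, pkthdr_destroy) are simplified stand-ins
invented for illustration and are not the kernel's actual definitions.  It
only shows why a blind structure copy leaves two headers pointing at one
allocation, while a duplicating copy gives each header its own:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the real kernel structures. */
struct label {
	void	*l_blob;	/* per-policy storage, heap allocated */
	size_t	 l_size;	/* size of l_blob in bytes */
};

struct pkthdr {
	int		len;
	struct label	label;
};

/* Blind copy: afterwards both headers point at the same blob. */
static void
pkthdr_copy_blind(struct pkthdr *dst, const struct pkthdr *src)
{
	assert(dst != NULL && src != NULL);
	*dst = *src;
}

/*
 * Duplicating copy: dst gets its own blob.
 * Returns 0 on success, -1 if the allocation fails.
 */
static int
pkthdr_copy_dup(struct pkthdr *dst, const struct pkthdr *src)
{
	void *blob = NULL;

	assert(dst != NULL && src != NULL);
	if (src->label.l_blob != NULL) {
		blob = malloc(src->label.l_size);
		if (blob == NULL)
			return (-1);
		memcpy(blob, src->label.l_blob, src->label.l_size);
	}
	*dst = *src;
	dst->label.l_blob = blob;
	return (0);
}

/* Release the blob owned by a header. */
static void
pkthdr_destroy(struct pkthdr *ph)
{
	free(ph->label.l_blob);
	ph->label.l_blob = NULL;
}
```

After pkthdr_copy_blind(), destroying both headers would free the same blob
twice, which is the pattern behind the panic; after pkthdr_copy_dup(), each
header can be destroyed independently.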

So if it's reproducible, you want to show the pcpu data to find the pid,
then use ps to identify the process.  If you have gdb set up (either live
or a dump), seeing the mbuf ifnet pointer value would be valuable, as well
as knowing the domain, type, etc of the socket in question.

Thanks,

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories

 
 
 Slab at 0xc26fffc8, freei 19 = 0.
 panic: Dublicate free of item 0xc26ff980 from zone 0xc0e8d000(128)
 
 Debugger(panic)
 Stopped at  Debugger+0x54:  xchgl   %ebx,in_Debugger.0
 db tr
 Debugger(c03e24ca,c0475220,c03fb550,d1c9eae4,1) at Debugger+0x54
 panic(c03fb550,c26ff980,c0e8d000,c03e0944,6a8) at panic+0xab
 uma_dbg_free(c0e8d000,c26fffc8.c26ff980,6a8,0) at uma_dbg_free+0x122
 uma_zfree_arg(c0e8d000,c26ff980,c26fffc8,c043b620,80) at uma_zfree_arg+0x124
 free(c26ff980.c044ff60,d1c9eb80,c032a64d,c26ff980) at free+0xe1
 biba_free(c26ff980,c0450260,d1c9eba4,c024244f,c0ee0e30) at biba_free+0x1d
 mac_biba_destroy_label(c0ee0e30,0,c03e06f7,39a,c0ee0e00) at 
mac_biba_destroy_label+0x1d
 mac_destroy_mbuf(c0ee0e00,0,c261b000,0,d1c9ebc0) at mac_destroy_mbuf+0x7f
 m_free(c0ee0e00,0,d1c9ec6c,20e,0) at m_free+0x41
 soreceive(c278ba00,d1c9ec44,d1c9ec6c,0,0) at soreceive+0x666
 recvit(c261b000,6,d1c9ecb8,bfbff934,8090d10) at recvit+0x1ac
 recvfrom(c261b000,d1c9ed10,c0402882.407,6) at recvfrom+0xa9
 syscall(2f,2f,2f,8090c20,6) at syscall+0x28e
 Xint0x80_syscall() at Xint0x80_syscall+0x1d
 --- syscall (29, FreeBSD ELF32, recvfrom), eip = 0x280fdcc3, esp = 0xbfbff8dc, ebp = 
0xbfbffbe8
 db
 
 - Christian
 
 -- 
 http://www.unixpages.org  [EMAIL PROTECTED]
 GPG Pub-Key: www.unixpages.org/cbrueffer.asc
 GPG Fingerprint: A5C8 2099 19FF AACA F41B  B29B 6C76 178C A0ED 982D
 GPG Key ID : 0xA0ED982D
 





Re: ACPI problem

2002-11-23 Thread Robert Watson

On Sat, 23 Nov 2002, Ertan Kucukoglu wrote:

 First of all, I do not know much about backtracing, debugging etc. 

First advice: we shipped DP2 with two different kernels, the normal kernel
without high debugging features, and then a special debugging kernel
called DEBUG.  My first advice when starting to debug problems of this
sort is to boot the debug kernel rather than the regular kernel, since it
will allow you to exercise debugging features more easily.

 I want to use the power key to shutdown the system. It is compaq evo
 300, P4 1.6ghz, 368MB RAM
 
 Yesterday the OS was 5.0DP2, and I could not power it off. At the point
 where it should cut the power, a message like 'System is shutting down
 using ACPI' is displayed, and after a while it just panics at free().

In general, if you get a panic and drop to ddb, you can type in trace to
get a backtrace -- this is usually the single most useful piece of
information for debugging a panic.  The print-out can often be 20 lines
long, and is full of hex numbers.  If you have access to a second machine,
you can use a serial console to make it easier to copy and paste the ddb
output.  If you don't, and are concerned about typographical errors, you
can skip the function arguments, but make sure you do get the
function+0xoffset bits right.  Serial console is really best because we
can look at the argument pointers and make sure they make sense.  Also,
when reporting panics, please do include the exact language of the panic
message.

 This morning I cvsuped and ran buildworld on the machine. This time it does
 not panic, but a 'Timeout' error message appears and the system resets itself,
 leading to a new boot. Is this problem because of my hardware or something else?

Not clear.  We really need the exact messages here, if you can arrange it.
Given the problems that you're experiencing, if a serial console is
available, it would probably make a big difference in debugging the
problems, and save you a lot of hand-typing.  You can set up a serial
console by linking it and an adjacent machine using a null modem cable on
the first serial port, and then using a terminal program on the adjacent
box at 9600bps -- all console I/O can be sent to the box by typing:

set console=comconsole

in the boot loader.  When you're using the debug kernel, you'll also be
able to interact with the debugger using the serial console.

 
 My dmesg output is below. My hand written uname -a is:
 test# uname -a
 FreeBSD test.ozlerplastik.com 5.0-CURRENT FreeBSD
 5.0-CURRENT #3: Sat
 Nov 23 10:57:36 EET 2002
 [EMAIL PROTECTED]:/usr/src/sys/i386/compile/OPTIMIZED_KERNEL
 i386
 
 Regards,
 
 --Ertan
 
 P.S. The improper dismount messages are because I wanted to
 test the system's background filesystem check. I just unplugged
 the cable while X was running. There seems to be no
 problem at all.
 
 dmesg:
 Copyright (c) 1992-2002 The FreeBSD Project.
 Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991,
 1992, 1993, 1994
 The Regents of the University of California. All
 rights reserved.
 FreeBSD 5.0-CURRENT #3: Sat Nov 23 10:57:36 EET 2002
 [EMAIL PROTECTED]:/usr/src/sys/i386/compile/OPTIMIZED_KERNEL
 Preloaded elf kernel /boot/kernel/kernel at 0xc0415000.
 Timecounter i8254  frequency 1193182 Hz
 Timecounter TSC  frequency 1594833756 Hz
 CPU: Pentium 4 (1594.83-MHz 686-class CPU)
   Origin = GenuineIntel  Id = 0xf12  Stepping = 2
   
Features=0x3febfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM>
 real memory  = 402653184 (384 MB)
 avail memory = 386703360 (368 MB)
 Initializing GEOMetry subsystem
 Pentium Pro MTRR support enabled
 acpi0: COMPAQ CPQ003E  on motherboard
 Using $PIR table, 9 entries at 0xc00ebfd0
 acpi0: power button is handled as a fixed feature
 programming model.
 Timecounter ACPI-fast  frequency 3579545 Hz
 acpi_timer0: 24-bit timer at 3.579545MHz port
 0xf808-0xf80b on acpi0
 acpi_cpu0: CPU on acpi0
 pcib0: ACPI Host-PCI bridge port 0xcf8-0xcff on acpi0
  initial configuration 
 \\_SB_.LNKA irq   0: [  3  4  5  6  7 10 11 14 15]
 low,level,sharable 0.31.0
 \\_SB_.LNKB irq  10: [  3  4  5  6  7 10 11 14 15]
 low,level,sharable 0.31.1
 \\_SB_.LNKH irq   0: [  3  4  5  6  7 10 11 14 15]
 low,level,sharable 0.31.2
 \\_SB_.LNKD irq  11: [  3  4  5  6  7 10 11 14 15]
 low,level,sharable 0.31.3
 \\_SB_.LNKC irq  10: [  3  4  5  6  7 10 11 14 15]
 low,level,sharable 0.1.0
 \\_SB_.LNKD irq  11: [  3  4  5  6  7 10 11 14 15]
 low,level,sharable 0.1.1
  before setting priority for links 
 \\_SB_.LNKA:
 interrupts:     3     4     5     6     7    10    11    14    15
 penalty:     1220  1220   220  1220  1220   420   420 10220 10220
 references: 1
 priority:   0
 \\_SB_.LNKH:
 interrupts:     3     4     5     6     7    10    11    14    15
 penalty:     1220  1220   220  1220  1220   420

Re: -current unusable after a crash

2002-11-25 Thread Robert Watson

On Mon, 25 Nov 2002, Mikhail Teterin wrote:

 The only way to get my -current system back to normal after a crash is
 to boot into single user and do an explicit ``fsck -p''. 
 
 Otherwise the system will, seemingly, boot fine, but none of the ttyvs
 will accept any input, although tty-switching works fine. Remote
 connections (ssh, telnet) don't bring up the login prompt. 
 
 I thought, this might be due to the priority of the background fsck and
 have once left it alone for several hours -- with no effect. The usual
 fsck takes a few minutes. 
 
 There are three drives in the system -- a 4G SCSI (on ahc0) with /,
 /usr, /opt, and /home on it, and two 30Gb IDEs coupled into one big ccd. 

Any chance we can get you to break into ddb on the console, do a ps, and
see what the processes are waiting for?  Also, if they're waiting on
something like ufs or inode, generated ddb traces of the processes
would be interesting.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories






Re: [Where] is OpenOffice 1.0.1_4 package available?

2002-11-25 Thread Robert Watson

On Mon, 25 Nov 2002, Munish Chopra wrote:

 On 2002-11-25 08:30 +, [EMAIL PROTECTED] wrote:
  
  http://projects.imp.ch/openoffice/
  
  But, i tried to install that package on my FreeBSD 5.0-CURRENT, well it
  went fine, but when i try to run openoffice, i will get a Segmentation
  fault. 
  Hope you will have more luck.
 
 As it says in the README, you need procfs to run it. This dependency
 will supposedly be removed in the (near?) future. 

Presumably openoffice should be trivially changeable to use something
other than procfs for the cmdline data -- Solaris doesn't appear to
support /proc/pid/cmdline, so it must have support for argc/argv, we just
need to twiddle the right configure bit for openoffice...?
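
As a rough idea of what "use argv" could look like, here is a small C sketch
that rebuilds a command line string from the argc/argv a program already
receives in main(), so nothing needs to be read back from procfs.  The helper
name cmdline_from_argv is invented for this example, and I have not checked
how the Open Office code actually consumes the cmdline data, so treat it as
an illustration of the approach only:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Join argv[0..argc-1] with single spaces into buf.
 * Returns the number of bytes written (excluding the terminating NUL),
 * or -1 if buf is too small.  The helper name is illustrative only.
 */
static int
cmdline_from_argv(int argc, char *const argv[], char *buf, size_t buflen)
{
	size_t used = 0;

	assert(buf != NULL && buflen > 0);
	buf[0] = '\0';
	for (int i = 0; i < argc; i++) {
		size_t len = strlen(argv[i]);
		size_t need = len + (i > 0 ? 1 : 0);	/* plus separator */

		if (used + need + 1 > buflen)
			return (-1);
		if (i > 0)
			buf[used++] = ' ';
		memcpy(buf + used, argv[i], len);
		used += len;
		buf[used] = '\0';
	}
	return ((int)used);
}
```

A program would call this once at the top of main(argc, argv) and keep the
result around for whatever later code wants the command line.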

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories






Re: mbuf header bloat ?

2002-11-25 Thread Robert Watson

On Sat, 23 Nov 2002, Andrew Gallatin wrote:

 I propose that we make struct label portion of the pkthdr compile-time
 conditional on MAC.  The assumption is that you will move the MAC label
 to an m_tag sometime after 5.0-RELEASE. 

This weekend I spent about six hours looking at what it would take to move
MAC label data into m_tags.  While in theory it is a workable idea, it
turns out our m_tag implementation is fairly far from being ready to
handle something like this.  I ran into the following immediate problems:

(1) When packet headers are copied using m_copy_pkthdr(), different
consumers have different expectations for what the resulting semantics
are for m_tag data -- some want it duplicated, others want it moved.
In practice, it is only ever moved, so consumers that expect
duplication are in for a surprise.  We need to re-implement the packet
header copying code so that it can generate a failure (because it
involves allocation), and separate the duplicate and move abstractions
to get clean semantics.  I exchanged some e-mail with Sam Leffler on
the topic, and apparently OpenBSD has already made these changes, or
similar ones, and we should do the same for 5.1.

(2) m_tag's don't have a notion that the data carried in a tag is
multi-dimensional and may require special destructor behavior.  While
it does centralize copying and free'ing of data, it handles this
purely with bcopy, malloc, and free, which is not appropriate for use
with MAC labels, since they may contain a variety of per-policy data
that may require special handling (reference count management, etc).
I tried putting tag-specific release/free/... code in the m_tag
central routines, and it looks like that would work, although it
eventually would lead to a lot of junk in the m_tag code.  We might
want to consider m_tag entry free/copy pointers of some sort, but I'm
not sure if we want to go there.  Adding the MAC stuff to the
m_tag_{free,copy,...} calls won't break the ABI, whereas adding free
and copy pointers to the tags themselves would.

(3) Not all code generating packets properly initializes m_tag field.  The
one example I've come across so far is the use of m_getcl() to grab
mbufs with an attached cluster -- it never initializes the SLIST
properly, as far as I can tell.  Right now that's used in device
drivers, and also in the BPF packet generation code.  If the header is
zero'd, this may be OK due to an implicit proper initialization, but
this is concerning.  We need to do more work to normalize packet
handling.

(4) Code still exists that improperly hand-copies mbuf packet header data.
if_loop.c contains some particular bogus code that also triggers a
panic in the MAC code for the same reason.  If m_tag data ever passes
through if_loop and hits the re-alignment case introduced by KAME, the
system will panic when the tag data is free'd.  This code all needs to
be normalized, and proper semantics must be enforced. 
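
The distinction in (1) between "move" and "duplicate" can be sketched in
userland C as follows.  The types and functions here (struct tag, struct hdr,
tags_prepend, tags_move, tags_dup, tags_free) are hypothetical and much
simpler than the real m_tag code; the point is only that a move cannot fail
and leaves the source empty, while a duplicate allocates and therefore needs
a way to report failure:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified tag and header types. */
struct tag {
	struct tag	*next;
	int		 id;
};

struct hdr {
	struct tag	*tags;	/* singly linked chain, may be NULL */
};

/* Free every tag on the chain and leave the header empty. */
static void
tags_free(struct hdr *h)
{
	struct tag *t, *next;

	for (t = h->tags; t != NULL; t = next) {
		next = t->next;
		free(t);
	}
	h->tags = NULL;
}

/* Add a tag at the head of the chain.  Returns 0, or -1 on failure. */
static int
tags_prepend(struct hdr *h, int id)
{
	struct tag *t = calloc(1, sizeof(*t));

	if (t == NULL)
		return (-1);
	t->id = id;
	t->next = h->tags;
	h->tags = t;
	return (0);
}

/* Move: dst takes ownership, src is left empty.  Cannot fail. */
static void
tags_move(struct hdr *dst, struct hdr *src)
{
	tags_free(dst);
	dst->tags = src->tags;
	src->tags = NULL;
}

/*
 * Duplicate: dst gets its own copy of every tag, in the same order.
 * Returns 0, or -1 on allocation failure, in which case dst is left
 * empty and src is untouched.
 */
static int
tags_dup(struct hdr *dst, const struct hdr *src)
{
	struct tag **tail;
	const struct tag *t;

	tags_free(dst);
	tail = &dst->tags;
	for (t = src->tags; t != NULL; t = t->next) {
		struct tag *n = malloc(sizeof(*n));

		if (n == NULL) {
			tags_free(dst);
			return (-1);
		}
		*n = *t;
		n->next = NULL;
		*tail = n;
		tail = &n->next;
	}
	return (0);
}
```

A consumer that expects duplication calls tags_dup() and must check the
return value; one that expects a hand-off calls tags_move() and must not
touch the source chain afterwards.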

 This will immediately reduce the size of mbufs for the vast majority of
 users, and will prevent a 4.1.1 like flag-day for 3rd party network
 driver vendors.  The only downside is that the few MAC users will not be
 able to use 3rd party binary network drivers until the MAC label is put
 into an m_tag.  This seems fair, as the only people inconvenienced are the
 people who want the labels and they are motivated to move them to an
 m_tag.  But that's easy for me to say, since I don't run MAC, and I may
 be missing something big. 

In the past I have looked at adding conditionally-defined components to
struct mbuf and other key kernel data structures.  While the condition of
the tree is improving from this perspective due to better isolation of
user and kernel data structures, the result is still incredibly messy,
especially if you key the conditionally defined sections on a kernel
option.  mbuf.h is included in a number of userland applications -- some
expected, such as the ipfilter test framework, but others less expected --
such as BIND.  I'm very wary of the notion of adding conditionally defined
portions of struct mbuf on this (and other) bases.  I'll take a look at
whether many of the obvious foot-shooting scenarios still exist since I
last tried it.  Moving to m_tag looks like a reasonable long-term
strategy, but until the m_tag code is substantially more mature, it isn't
realistic.  Otherwise, I might have attempted to push through a change to
it now before RC1.

BTW, do you have any recent large-scale measurements of packet size
distribution?  In local tests and measurements, the additional 20 bytes on
i386 didn't bump the remaining mbuf data space sufficiently low to
substantially change the behavior of the stack.  However, I haven't done
measurements against the 64-bit variation.  In practice, a number of
network interfaces now seem to use clustered mbufs and not attempt to use

Re: mbuf header bloat ?

2002-11-25 Thread Robert Watson

On Mon, 25 Nov 2002, Bosko Milekic wrote:

 On Mon, Nov 25, 2002 at 11:31:39AM -0500, Robert Watson wrote:
  BTW, do you have any recent large-scale measurements of packet size
  distribution?  In local tests and measurements, the additional 20 bytes on
  i386 didn't bump the remaining mbuf data space sufficiently low to
  substantially change the behavior of the stack.  However, I haven't done
  measurements against the 64-bit variation.  In practice, a number of
  network interfaces now seem to use clustered mbufs and not attempt to use
  the in-mbuf storage space...  All my packet distribution measurements come
  from a typical ISP environment, but may not match what is seen in
  large-scale backbone environments.
 
   I am equally curious about this.  One of the design assumptions for
   mbufs and clusters, according to McKusick et al. (and I believe
   another text which currently escapes me) is that packets are typically
   either very small or fairly large.  Given the MAC label additions
   (yes it would be nice if this was done using the m_tag interface but
   at the very least one can say that they are implemented fairly
   'consistently' despite the fact that they appear imposing to the
   general mbuf structure), and the currently available data region in
   the mbuf, it is absolutely necessary to know whether the assumption of
   packet size distribution still holds before a decision is made on how
   to modify the MAC label implementation - if at all.
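
The space question can be put as simple arithmetic.  The C sketch below
treats 256 bytes as the total mbuf size and 44 bytes as the header overhead
before any label; both values, and the names SKETCH_MSIZE, SKETCH_BASE_HDR,
inline_space and fits_inline, are assumptions chosen for illustration and not
measurements of any real kernel.  The 20 and 40 byte label costs are the i386
and 64-bit figures quoted in this thread:

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MSIZE	256	/* assumed total mbuf size in bytes */
#define SKETCH_BASE_HDR	44	/* placeholder header overhead, no label */

/* Bytes left for in-mbuf packet data after header and label overhead. */
static size_t
inline_space(size_t hdr_bytes, size_t label_bytes)
{
	assert(hdr_bytes + label_bytes <= SKETCH_MSIZE);
	return (SKETCH_MSIZE - hdr_bytes - label_bytes);
}

/* Returns 1 if a packet of pktlen bytes fits without a cluster, else 0. */
static int
fits_inline(size_t pktlen, size_t hdr_bytes, size_t label_bytes)
{
	return (pktlen <= inline_space(hdr_bytes, label_bytes));
}
```

With these placeholder numbers, a 200-byte packet fits in the mbuf without
the label but spills to a cluster once 20 bytes of label are added, so
whether the label matters depends on how many real packets fall in that
narrow band.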

It's worth pointing out for those listening, and as I'm sure you're
already aware, m_tag was not available for use by the MAC Framework when
we did any of the design and implementation, and m_tags were committed to
the tree about three months after the MAC work.  I'm happy to look at
switching the mechanism used for MAC to m_tag, especially once m_tag is
mature enough to be used, but it wasn't a design consideration in our
first pass simply because it didn't exist :-).

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories





Re: mbuf header bloat ?

2002-11-25 Thread Robert Watson

On Mon, 25 Nov 2002, Sam Leffler wrote:

 As I explained to you; the handling of mtags mimics what was there for
 the aux mbufs.  I did this intentionally to avoid changes that might
 introduce subtle problems.  My intent was to cleanup this stuff after
 5.0 releases by replacing the pkthdr copy macros with separate DUP+MOVE
 macros ala openbsd.  I did this in my original implementation but
 discarded it when I did the -current integration. 

And I agree this is the right direction to take this in once we are out of
the freeze.

 I don't believe it's necessary to overload the basic mtag structure but
 instead introduce a specific cookie that enables a more general
 mechanism that would be suitable for your needs. 

That sounds like a reasonable approach.

  (3) Not all code generating packets properly initializes m_tag field.  The
  one example I've come across so far is the use of m_getcl() to grab
  mbufs with an attached cluster -- it never initializes the SLIST
  properly, as far as I can tell.  Right now that's used in device
  drivers, and also in the BPF packet generation code.  If the header is
  zero'd, this may be OK due to an implicit proper initialization, but
  this is concerning.  We need to do more work to normalize packet
  handling.
 
 I don't see this problem; m_getcl appears to do the right thing.
 
  (4) Code still exists that improperly hand-copies mbuf packet header data.
  if_loop.c contains some particular bogus code that also triggers a
  panic in the MAC code for the same reason.  If m_tag data ever passes
  through if_loop and hits the re-alignment case introduced by KAME, the
  system will panic when the tag data is free'd.  This code all needs to
  be normalized, and proper semantics must be enforced.
 
 
 I don't see this problem; looutput looks to do the right thing.  FWIW I've
 passed mbufs w/ mtags through the loopback interface.

This refers specifically to the following code snippet:

if (m && m->m_next != NULL && m->m_pkthdr.len < MCLBYTES) {
    struct mbuf *n;

    MGETHDR(n, M_DONTWAIT, MT_HEADER);
    if (!n)
        goto contiguousfail;
    MCLGET(n, M_DONTWAIT);
    if (! (n->m_flags & M_EXT)) {
        m_freem(n);
        goto contiguousfail;
    }

    m_copydata(m, 0, m->m_pkthdr.len, mtod(n, caddr_t));
    n->m_pkthdr = m->m_pkthdr;
    n->m_len = m->m_pkthdr.len;
    n->m_pkthdr.aux = m->m_pkthdr.aux;
    m->m_pkthdr.aux = (struct mbuf *)NULL;
    m_freem(m);
    m = n;
}

In this scenario, the mbuf header for (n) is initialized to an empty m_tag
chain.  The direct assignment of (n)'s pkthdr data from (m) copies the
pointers from (m).  (m) is then freed, which causes the mbuf allocator to
go through and delete the m_tag chain on (m), freeing the individual
entries in the chain, which are now still referenced by (n).  You only
bump into this case if you trigger the conditional clause above.  In the
MAC code, this case results in a duplicate free() when (n) is later
released -- unless I'm mis-reading things (quite possible), the same
failure mode should exist for the m_tag code.  In my local tree, I have
this case disabled, and I've been trying to figure out what the right
solution is -- probably to move to using M_COPY_PKTHDR() and then doing
the fixup.
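
For illustration, the kind of fixup I have in mind looks roughly like the
following userland C sketch.  Everything in it (struct tagnode, struct buf,
hdr_transfer, buf_release, the tag_frees counter) is a made-up stand-in
rather than the real mbuf code, and it is not a claim about what
M_COPY_PKTHDR() does internally.  It only shows the ownership hand-off: after
the header is assigned to the new buffer, the old buffer's chain pointer is
cleared so that releasing the old buffer does not free what the new one now
references:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the tag chain and packet header. */
struct tagnode {
	struct tagnode	*next;
};

struct pkthdr {
	int		 len;
	struct tagnode	*tags;	/* chain owned by this header */
};

struct buf {
	struct pkthdr	hdr;
};

static int tag_frees;	/* counts released tag nodes, for demonstration */

/* Release everything the buffer's header owns. */
static void
buf_release(struct buf *b)
{
	struct tagnode *t, *next;

	for (t = b->hdr.tags; t != NULL; t = next) {
		next = t->next;
		free(t);
		tag_frees++;
	}
	b->hdr.tags = NULL;
}

/*
 * Hand the header over from m to n.  Assumes n starts with an empty
 * chain, as a freshly allocated header would.
 */
static void
hdr_transfer(struct buf *n, struct buf *m)
{
	assert(n->hdr.tags == NULL);
	n->hdr = m->hdr;	/* copies scalars and the chain pointer */
	m->hdr.tags = NULL;	/* fixup: m no longer owns the chain */
}
```

Without the second assignment in hdr_transfer(), releasing the old buffer
and later the new one would walk and free the same chain twice.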

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories






Re: mbuf header bloat ?

2002-11-25 Thread Robert Watson

On Mon, 25 Nov 2002, Sam Leffler wrote:

 I don't see this problem; m_getcl appears to do the right thing. 

Hmm.  I see the SLIST initialization there also.  Maybe I'm thinking of
another function, I'll have to go check.  Sorry about that.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories





Re: ACLs on the boot partition?

2002-11-26 Thread Robert Watson
On Tue, 26 Nov 2002 [EMAIL PROTECTED] wrote:

 On Tue, 26 Nov 2002, Hiten Pandya wrote:
 
  On Tue, Nov 26, 2002 at 11:21:28AM -0700, [EMAIL PROTECTED] wrote the words in 
effect of:
   On Tue, 26 Nov 2002, Bruno Miguel wrote:
  
On 25 Nov 2002 at 23:34, [EMAIL PROTECTED] wrote...
   
 How do I enable ACLs on the boot partition? tunefs -a enable /dev/ad0s1a
 indicates it got set (in single user mode with / mounted readonly). But I
 still can't set anything with setfacl(1). I tried booting to the fixit
 floppy, hoping to set acls flag from there to my partition, but it doesn't
 have tunefs. Is my only choice now to take the drive out and put it in
 another FreeBSD machine and set it from there?
   
If you are using UFS1, did you follow the procedures in 
/sys/ufs/ufs/README.acls ?
  
   No, not using UFS1. / was formatted UFS2.
 
  tunefs -a /your/filesystem
 
  I think that's the one.
  Cheers.
 
 Tried that already on / in single user mode with it mounted readonly. 
 tunefs said it changed the flag, but didn't really. I also tried adding
 acls to fstab for /, but no effect. Were you successful in doing this
 for / ? 

tunefs changes the flag for the next mount, so doesn't take immediate
effect.  Once you've tunefs'd a read-only file system, you need to unmount
and remount it -- for the file system root, this generally means
rebooting.  Just to confirm: you're running with GENERIC, or with a kernel
that includes UFS_ACL, right?  (Normally the kernel will complain if you
try to mount a file system with ACL support when ACLs aren't enabled).

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories






Re: 5.0-DP2 ACLs on UFS2

2002-11-26 Thread Robert Watson

On Mon, 25 Nov 2002 [EMAIL PROTECTED] wrote:

 I've recently installed FreeBSD 5.0-DP2 to get myself familiar with the
 upcoming ACLs present in -CURRENT before the release itself. I've setup
 a test machine with one 45gb ide drive with one slice and two partitions
 (/ and swap) and installed FreeBSD on it. 
 
 dumpfs / shows that root is UFS2, and from reading
 /usr/src/sys/ufs/ufs/README.acls, I don't need to do the extattrctl
 initattr commands since ufs2 supports EA/ACLs natively. Additionally, I
 booted to single user mode and enabled ACLS on / by doing a tunefs -a
 enable /dev/ad0s1a. I proceeded to try getfacl and setfacl. 
 
 getfacl returned the default settings (just stat() in ACL form according
 to Robert Watson), however, no matter what I tried all I could get with
 setfacl -m g:mail:rwx testfile was: 
 
 setfacl: acl_get_file() failed: Operation not supported
 
 I thought perhaps the tunefs on the ro mount of / did not take. So
 instead I used the mount time flag in fstab: 
 
 /dev/ad0s1a / ufs rw,acls 1 1
 
 I rebooted, and tried again. Yet I still get the same error message with
 setfacl. At this point I'm stuck. Is it because I only have / and not /
 and /usr? Does UFS2 with EA/ACLs not work on boot partitions? Or did I
 misunderstand something when trying to setup ACLs in -CURRENT? Any
 advice right now would be welcomed. Thanks. 

ACLs should work fine on any UFS2 partition where ACLs are enabled.  I'm
wondering if it's actually UFS2, or if dumpfs is lying to you.  Could you
try the following command:

touch /foo
setextattr system foo foo /foo
getextattr system foo /foo

And tell me what results you get?  That will tell us if extended
attributes are available or not.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories






Re: mbuf header bloat ?

2002-11-27 Thread Robert Watson

Andrew,

Thanks for your patience as I finished some research and experimentation
regarding the options there.  Some more details below.

On Sat, 23 Nov 2002, Andrew Gallatin wrote:

 On the contrary, I think that if anything is going to be done, it must
 be done now, so as to not break binary network driver compatibility like
 we did in 4.1.1 when the size of mbufs changed.  Otherwise, we're stuck
 with it until 6.0.

Per an on-going discussion on -arch, it seems there's a reasonable
consensus that the kernel driver ABI will not be frozen until 5.1, since
we need continued flexibility to mature the fine-grained locking, KSE, and
MAC technologies.  This will allow us some wiggle room in resolving these
sorts of issues. 

 As you eloquently state, there are a number of tradeoffs involved.  On a
 64-bit platform, 99% of users are paying 40 bytes/pkt for something that
 they will never use.  On x86, 99.99% of users are paying 20 bytes/pkt
 for a feature they will never use.  At least a significant fraction of
 NICs make use of csum offloading (xl, ti, bge, em, myri). 
 
 I propose that we make struct label portion of the pkthdr compile-time
 conditional on MAC.  The assumption is that you will move the MAC label
 to an m_tag sometime after 5.0-RELEASE. 

For a variety of reasons, I'm averse to the notion of compile-time
components in the struct mbuf (and other) vital kernel structures.  One of
the design requirements for the MAC Framework was that it be possible for
third party vendors to distribute security modules that plug in without
necessarily being part of the FreeBSD build infrastructure.  While it is
true we currently require options MAC to be compiled into the kernel, we
don't require that you manually integrate module source into the kernel
source so that it builds as part of a kernel.  Due to the way that
separately shipped modules build out of the context of a kernel
configuration, this would introduce substantial problems.  However, since
we believe that the kernel ABI will not be frozen until 5.1, if we have an
alternative place to put the label that doesn't expand the pkthdr, then we
can change it once we think the solution is ready. 
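
To make the concern about compile-time conditional fields concrete, here
is a minimal sketch.  The structures are hypothetical stand-ins (they are
not the real struct mbuf, pkthdr or label); they only show how a field
that exists in one build and not in another shifts the offsets that a
separately built module would compute for everything after it.

```c
#include <stddef.h>

/* Hypothetical stand-in for a MAC label; not the real struct label. */
struct demo_label {
	void	*l_slots[4];
};

/* Packet header as laid out by a kernel built with the option. */
struct demo_pkthdr_mac {
	void			*rcvif;
	int			 len;
	struct demo_label	 label;
	int			 csum_flags;
};

/* The same header as laid out by a module built without the option. */
struct demo_pkthdr_nomac {
	void	*rcvif;
	int	 len;
	int	 csum_flags;
};

/* Returns non-zero if both builds agree on where csum_flags lives. */
int
demo_layouts_agree(void)
{
	return (offsetof(struct demo_pkthdr_mac, csum_flags) ==
	    offsetof(struct demo_pkthdr_nomac, csum_flags));
}

/* Number of bytes by which the two layouts differ in size. */
size_t
demo_size_difference(void)
{
	return (sizeof(struct demo_pkthdr_mac) -
	    sizeof(struct demo_pkthdr_nomac));
}
```

In this sketch demo_layouts_agree() returns 0: a module compiled against
the second layout would read csum_flags from the wrong place in a header
produced by the first.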

On the topic of m_tag: I've spent a few days working with m_tag now to see
if it can meet the needs of the MAC Framework.  My conclusion is that, in
the form currently in the tree, it cannot meet the requirements.
However, I believe that with a relatively straightforward set of
modifications, it can.  As such, the proposed 5.1 time frame for moving
the MAC Framework to using m_tag is realistic.  I'm currently exchanging
patches with Sam Leffler looking at how to tweak the various protocol
stacks to properly maintain m_tag chains on mbufs when mbufs are copied,
etc.  These problems largely stem from a failure to maintain the tag
chains on mbufs over some of the copy/... operations that occur.  The
result is that the MAC labels stored in mbufs are often discarded or lost,
and many packets float around the system without proper protection.  For
policies that rely on ubiquitous labeling, this results in rapid assertion
failures (yes, we fail very closed :-).  I hope to post patches for these
changes in the next few days once I've had a chance to perform more extensive
testing.  Sam and I are having an on-going conversation about whether it
would be safe to introduce some of these changes before 5.0.
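
The tag-loss problem described above can be sketched in a few lines.
This is an illustration only: demo_pkt and demo_tag are hypothetical,
much simplified stand-ins and do not reflect the real mbuf or m_tag
layout or API.  It contrasts a header copy that forgets the tag chain
with one that duplicates it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified tag; the real m_tag layout differs. */
struct demo_tag {
	struct demo_tag	*next;
	int		 type;
	int		 value;		/* stand-in for a label payload */
};

struct demo_pkt {
	struct demo_tag	*tags;
	int		 len;
};

/* Prepend a tag to the packet's chain; returns 0 on success. */
int
demo_tag_add(struct demo_pkt *p, int type, int value)
{
	struct demo_tag *t;

	assert(p != NULL);
	if ((t = malloc(sizeof(*t))) == NULL)
		return (-1);
	t->type = type;
	t->value = value;
	t->next = p->tags;
	p->tags = t;
	return (0);
}

/* Return the first tag of the given type, or NULL if there is none. */
struct demo_tag *
demo_tag_find(const struct demo_pkt *p, int type)
{
	struct demo_tag *t;

	for (t = p->tags; t != NULL; t = t->next)
		if (t->type == type)
			return (t);
	return (NULL);
}

/* Release every tag on the packet. */
void
demo_tags_free(struct demo_pkt *p)
{
	struct demo_tag *t;

	while ((t = p->tags) != NULL) {
		p->tags = t->next;
		free(t);
	}
}

/* Naive header copy: the tag chain is silently dropped. */
void
demo_copy_naive(struct demo_pkt *dst, const struct demo_pkt *src)
{
	dst->len = src->len;
	dst->tags = NULL;
}

/* Tag-preserving copy: duplicates every tag; returns 0 on success. */
int
demo_copy_with_tags(struct demo_pkt *dst, const struct demo_pkt *src)
{
	const struct demo_tag *t;

	demo_copy_naive(dst, src);
	for (t = src->tags; t != NULL; t = t->next) {
		if (demo_tag_add(dst, t->type, t->value) != 0) {
			demo_tags_free(dst);
			return (-1);
		}
	}
	return (0);
}
```

Note that the tag-preserving copy costs one extra allocation per tag, and
that because tags are prepended the copy ends up with its chain in the
reverse order of the source.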

There are some downsides to moving to m_tag for MAC labels.  One is that
it effectively doubles the number of memory allocations in the system for
every packet delivered through the system when running with MAC if we
maintain the current semantic that all packets are labeled.  This means
users will pay a higher cost for MAC even if they don't label packets,
which is unfortunate.  I'm currently exploring the impact -- my hope is
that changes to the memory allocators since 4.x, such as the new mbuf
allocator and introduction of UMA, will largely mitigate that effect.  A
fair amount of interest has been expressed in supporting MAC in the
GENERIC kernel eventually: if and when that becomes the case, we may find
that the rationale for moving the label out of the mbuf is reversed.

 This will immediately reduce the size of mbufs for the vast majority of
 users, and will prevent a 4.1.1 like flag-day for 3rd party network
 driver vendors.  The only downside is that the few MAC users will not be
 able to use 3rd party binary network drivers until the MAC label is put
 into an m_tag.  This seems fair, as the only people inconvenienced are the
 people who want the labels and they are motivated to move them to an
 m_tag.  But that's easy for me to say, since I don't run MAC, and I may
 be missing something big. 

I think you under-estimate the complexity of variably sized key kernel
data structures.  mbuf.h is included all over the kernel, as well as in
many user applications (although often for bogus reasons).  My proposed
strategy is the following:

(1) For 5.0, we either maintain the 

Re: ACLs on the boot partition?

2002-11-27 Thread Robert Watson

On Wed, 27 Nov 2002, Bruce Evans wrote:

 On Tue, 26 Nov 2002, Robert Watson wrote:
 
  tunefs changes the flag for the next mount, so doesn't take immediate
  effect.  Once you've tunefs'd a read-only file system, you need to unmount
  and remount it -- for the file system root, this generally means
  rebooting.  Just to confirm: you're running with GENERIC, or with a kernel
 
 Er, what is the mount(..., MNT_RELOAD ...) in tunefs for then? 
 Unmounting and remounting should not be necessary for any read-only file
 system including /.  You can do the MNT_RELOAD from the command line
 using mount -u if tunefs doesn't do it. 
 
 I have some old fixes for tunefs which fix missing remounts as a side
 effect.  In -current, tunefs only detects mounted filesystems if they
 are in fstab.  It clobbers read-write mounted filesystems and fails to
 remount read-only mounted file systems if they are not detected.

The problem is that some flags can't be changed via MNT_RELOAD and require
a from-scratch mount.  I'm hoping that with nmount(), we can get a little
more expressive regarding what changes are (and aren't) allowed to flags.
Right now there's some uncomfortable masking.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories






Re: ACLs on the boot partition?

2002-11-27 Thread Robert Watson

On Thu, 28 Nov 2002, Bruce Evans wrote:

 On Wed, 27 Nov 2002, Robert Watson wrote:
 
  On Wed, 27 Nov 2002, Bruce Evans wrote:
 
   On Tue, 26 Nov 2002, Robert Watson wrote:
  
tunefs changes the flag for the next mount, so doesn't take immediate
effect.  Once you've tunefs'd a read-only file system, you need to unmount
and remount it -- for the file system root, this generally means
rebooting.  Just to confirm: you're running with GENERIC, or with a kernel
  
   Er, what is the mount(..., MNT_RELOAD ...) in tunefs for then?
 
  The problem is that some flags can't be changed via MNT_RELOAD and require
  a from-scratch mount.  I'm hoping that with nmount(), we can get a little
  more expressive regarding what changes are (and aren't) allowed to flags.
  Right now there's some uncomfortable masking.
 
 Why can't they be changed?  All the other tunefs flags except FS_ACLS
 and FS_MULTILABEL are related to writing, so ffs_reload() has to support
 them changing as a side effect of supporting transitions from read-only
 to read-write mode. 

Switching ACLs to support a change shouldn't be a problem, although I'd
generally discourage changing the ACLs flag very much, since you don't
want inconsistent access control and other side effects of using ACLs
inconsistently (they get out of sync, etc).  Multilabel can't be changed
because of cache coherency issues: we cache label data in the vnode, and
changing the origin of the label data (what MNT_MULTILABEL effectively
does) would invalidate the contents of the cache.  To correct that, we'd
have to support immediately (and atomically) walking the entire vnode list
and re-loading and validating the labels, something that we don't
currently do.

There are some bugs in the UFS1 extended attribute implementation relating
to the remount issue, actually -- in particular, the EA backing files for
UFS1 are opened read-write, and UFS blocks an upgrade from read-only to
read-write if they are read rather than read-only.  We need to force a
re-open of the backing files and make the flags passed to open/close match
that.  I suspect the quota code must already have that behavior.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories






Re: mbuf header bloat ?

2002-11-27 Thread Robert Watson

On Wed, 27 Nov 2002, Julian Elischer wrote:

 On Wed, 27 Nov 2002, Robert Watson wrote:
 
  I'd like to continue to explore options for reducing the number of memory
  allocations to extend storage on mbufs.  One idea I've been tossing around
  is adopting Jeff Roberson's extension model used in struct proc and
  related structures. 
 
 I've been wondering about a couple of things..  1/ sometimes I wonder if
 ALL mbufs should not be external mbufs. 
 
 In other words, if the mbuf were always just a header and data was
 always stored on an external buffer it might actually simplify some
 code. It would then become possible that some tag space be allocated
 along with the mbuf header.. if MAC was in the system, then every mbuf
 would be allocated with a MAC tag by default.  Maybe as a single
 allocation. The UMA allocator's init()  capability gives us a lot of
 latitude in doing things like that.

These are all interesting possibilities.  One of the upshots of this
conversation is that it sounds like, while 5.x is stabilizing, we might
want to consider some side experimental work to evaluate the continuing
effectiveness of mbufs, and to experiment with alternatives.  Between
Bosko's new allocator and UMA, I think we're pretty set on optimizing for
the current mbuf model for 5.x.  Finishing up and cleaning up the
fine-grained locking and measuring the impact of current changes should
keep us busy on the implementation side for a bit though.

There seem to be a number of parts of this problem -- how changes in
traffic, common interfaces, usage patterns, and memory allocators have
changed our requirements for a network subsystem. 

I have the suspicion that the traffic patterns leading to an mbuf model
will probably remain: most connections will remain asymmetric in nature,
with most of the large frames in one direction representing a bulk data
transfer, and small frames in the other direction, representing the
acknowledgement and control stream.  TCP hasn't shown any signs of going
to a model where we send large selective acknowledgement frames covering
wide windows, which I suppose it might do at some point given the increase
in minimum frame size.  However, we have seen work to pack lots of small
packets into large frames for bulk delivery in routers to avoid the loss
of efficiency over media with large minimum frame sizes.

Maybe we can put together a working group to do some discussion and
experimentation.  This is an area where we might be able to approach
potential sponsors using FreeBSD for joint investment in network
performance improvement. 

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories






Re: system locks with vnode backed md(4)

2002-11-30 Thread Robert Watson

On Sat, 30 Nov 2002, Michal Mertl wrote:

 I'm now unable to make it dead-lock again. Yet it happened quite easily. 
 I had more md backing files in the same directory at the beginning (to
 test Terry's suspicion mentioned in thread 'jail' on hackers@).

I've noticed that chroot() environments tend to make existing deadlock
opportunities more likely.  I'm not quite sure why that is.  :-)

 I'll try to get more problems and get more info (show lockedvnods, show
 locks). 
 
 show locks with first dead-lock showed only Giant AFAIR. 

Yeah, vnode locks and other lockmgr locks don't show up in 'show locks',
since only SMPng locking primitives are tracked by WITNESS.

There are a fair number of vnode locking deadlock scenarios that are
unavoidable where we rely on grabbing vnode locks out of the directory
structure lock order.  This occurs for vnode-backed md devices, quotas,
and UFS1 extended attributes, and probably some other situations.  I
suspect that Terry is correct that operations on the vnode backing file
storage directory are triggering the problem, since that increases the
chances that a vnode lock race to root will occur from both the file
system backed into the md device, and for the md backing vnodes during
blocking I/O.  If you can avoid directory operations on the md backing
directory, that would probably be one way to avoid triggering the bug.
Seeing it reproduced would probably confirm that this is the case.  On the
other hand, there may be other deadlocks in the vnode/ufs/md code that can
be more easily corrected than this general VFS problem, so details there
would be very useful.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories






Re: [REPORT] Upgrade from 4.0-RELEASE to 5.0-CURRENT

2002-12-01 Thread Robert Watson

On Sun, 1 Dec 2002, Jake Burkholder wrote:

 Apparently, On Sun, Dec 01, 2002 at 01:15:00PM -0200,
   Daniel C. Sobral said words to the effect of;
 
  There I go reply to all... sigh
  
  IIRC, we never supported upgrade to 4.0 or 4.1 from anything but the
  *latest* version in the 3.x series. I sure hope we adopt the same policy
  here.
 
 Agree, I don't see any use in supporting upgrades without going through
 4.x-STABLE first. 

It's nice that you guys are in agreement on that fact, but perhaps you'd
like to do something constructive, like looking at and reviewing the patches?
Avoiding the install of a new make during a build is highly desirable,
even if you don't believe in updating from old RELENG_4.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories







Re: The great perl script rewrite - progress report

2002-12-01 Thread Robert Watson
Base system perl-based tools added to the TODO list.  We need to deal with
these ASAP. 

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories

On Sun, 1 Dec 2002, Kris Kennaway wrote:

 On Sat, Nov 30, 2002 at 10:41:28PM -0500, Garance A Drosihn wrote:
 
/usr/bin/mmroff
/usr/bin/afmtodit
  
/usr/sbin/adduser
/usr/sbin/rmuser
 
 These must be converted before 5.0-R.
 
/usr/share/examples/cvs/contrib/clmerge
/usr/share/examples/cvs/contrib/cln_hist
/usr/share/examples/cvs/contrib/commit_prep
/usr/share/examples/cvs/contrib/cvs_acls
/usr/share/examples/cvs/contrib/log
/usr/share/examples/cvs/contrib/log_accum
/usr/share/examples/cvs/contrib/mfpipe
/usr/share/examples/cvs/contrib/rcslock
/usr/share/examples/cvs/contrib/easy-import
 
 These probably aren't very important.
 
/usr/X11R6/bin/mkhtmlindex
/usr/X11R6/bin/bdftruncate.pl
/usr/X11R6/bin/ucs2any.pl
  
/usr/compat/linux/usr/bin/mtrace
 
 These are all installed by ports that should have pulled in a
 dependency on perl (if not then it's a bug).
 
 Kris
 





Re: setfacl requirements?

2002-12-05 Thread Robert Watson

On Thu, 5 Dec 2002, kai ouyang wrote:

 Hi, everybody,
 From Robert N M Watson
 (1) UFS_ACL isn't enabled
 Yes, I am sure that in my kernel config:
 options UFS_ACL
 options UFS_EXTATTR
 options UFS_EXTATTR_AUTOSTART

Ok, looks good.

 (2) Extended attributes aren't available on the file system (shouldn't
 happen for UFS2, but might happen for UFS1 if you don't have
 UFS_EXTATTR and appropriate configuration of EAs) 
 I do as the README.acls
 mkdir -p /usr/.attribute/system
 cd /.attribute/system
 extattrctl initattr -p /usr/ 388 posix1e.acl_access
 extattrctl initattr -p /usr/ 388 posix1e.acl_default

Followed by a reboot, right?

 (3) The file system isn't mounted with the ACL option: either -o acls (or
 acls in the fstab file), or more reliably, setting the tunefs -a
 enable flag in the file system configuration.
 For better or for worse, POSIX.1e defines that getfacl() will print the
 current file permissions as an ACL if ACLs aren't available on the file
 system.  As such, you're probably just seeing the results of stat()
 printed in an ACL form.
 I use UFS1. In DP1, the ACL works nice. But in DP2, I have never succeeded.
 in DP1, there is no need to add the 'acls' to 'fstab'. Anyway, I also add 
 the 'acls' flag to 'fstab', but it fails, too.
 The system always say:
 Current#cd /usr/
 Current#setfacl -m u:oyk:r src
 setfacl: acl_get_file() failed: Operation not supported

If you run mount, does it show the acls flag for /usr?  I generally
recommend that the tunefs flag be used to start ACLs on a file system
rather than the fstab flag, since that will prevent races and issues with
re-mounting for /.  Try the following:

tunefs -a enable /dev/{deviceof/usr}

then unmount and remount the file system (probably by rebooting).  Here's
the procedure I use to test ACLs locally when I don't want to mess with a
live file system:

alsvid# mdconfig -a -t malloc -s4m
md0
alsvid# newfs /dev/md0
/dev/md0: 4.0MB (8192 sectors) block size 16384, fragment size 2048
using 4 cylinder groups of 1.02MB, 65 blks, 256 inodes.
super-block backups (for fsck -b #) at:
 32, 2112, 4192, 6272
alsvid# mount /dev/md0 /mnt
alsvid# cd /mnt
alsvid# mkdir -p .attribute/system
alsvid# cd .attribute/system/
alsvid# extattrctl initattr -p . 388 posix1e.acl_access 
alsvid# extattrctl initattr -p . 388 posix1e.acl_default
alsvid# cd /mnt
alsvid# setfacl -m u:robert:r .
setfacl: acl_get_file() failed: Operation not supported
alsvid# cd /
alsvid# umount /mnt
alsvid# tunefs -a enable /dev/md0
tunefs: ACLs set
alsvid# mount /dev/md0 /mnt
alsvid# mount
/dev/ad0s1a on / (ufs, NFS exported, local, soft-updates)
devfs on /dev (devfs, local, multilabel)
/dev/md0 on /mnt (ufs, local, acls)
alsvid# setfacl -m u:robert:r .
alsvid# ls -la
total 5
drwxr-xr-x+  3 root  wheel   512 Dec  5 10:40 .
drwxr-xr-x  20 root  wheel  1024 Nov 29 16:00 ..
drwxr-xr-x   3 root  wheel   512 Dec  5 10:40 .attribute
alsvid# getfacl .
#file:.
#owner:0
#group:0
user::rwx
user:robert:r--
group::r-x
mask::r-x
other::r-x

Try using tunefs and see how things go.  BTW, we're generally recommending
the use of UFS2 with ACLs because the extended attribute support is
substantially faster and more reliable.

It looks like there may be a problem with the processing of the acls 
flag as an argument of the mount command.  I'll investigate locally and
see if that's the cause -- if so, that would result in what you're seeing. 

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories






Re: LOR: filedesc structure - pipe mutex

2002-12-06 Thread Robert Watson

On Fri, 6 Dec 2002, Kris Kennaway wrote:

 I'm getting this too:
 
 Local package initialization:lock order reversal
  1st 0xc449ad34 filedesc structure (filedesc structure) @ /local0/src-client/sys/kern/sys_generic.c:901
  2nd 0xc4146780 pipe mutex (pipe mutex) @ /local0/src-client/sys/kern/sys_pipe.c:1239
 Debugger(witness_lock)
 Stopped at  Debugger+0x54:  xchgl   %ebx,in_Debugger.0
 db trace
 Debugger(c03efc85,c4146780,c04192dc,c04192dc,c0419d33) at Debugger+0x54
 witness_lock(c4146780,8,c0419d33,4d7,c0291b89) at witness_lock+0x667
 _mtx_lock_flags(c4146780,0,c0419d33,4d7,c418b4ec) at _mtx_lock_flags+0xb1
 pipe_poll(c418b4ec,40,c420ad00,c3f5cb60,c420ad00) at pipe_poll+0x40
 selscan(c3f5cb60,d7a29b98,d7a29b88,7,4) at selscan+0x12e
 kern_select(c3f5cb60,7,8076190,0,0) at kern_select+0x36f
 select(c3f5cb60,d7a29d10,c043094c,407,5) at select+0x66
 syscall(2f,2f,2f,8076190,1e) at syscall+0x28e
 Xint0x80_syscall() at Xint0x80_syscall+0x1d
 --- syscall (93, FreeBSD ELF32, select), eip = 0x2829a433, esp = 0xbfbff4dc, ebp = 0xbfbffda0 ---
 db

Given that selscan() seems to rely on holding the file descriptor lock for
the duration of the scan, it seems that the file descriptor lock is
intended to be grabbed before the pipe mutex in the lock order.  The
reversal you're seeing is probably therefore the right order rather than
the wrong order.  To properly diagnose this, you probably want to hard
code that order in Witness's lock order list in subr_witness so that a
warning is generated earlier.  This reversal is presumably the second
order found by the kernel, which (if the first order isn't hard-coded) is
the point at which a violation is first reported.  Using show witness
you can inspect how the lock order was constructed -- I've always found
the output a bit confusing, so if it's a simple order reversal (involving
direct lock relationships between two locks), hard-coding it is easier.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories






Re: LOR: filedesc structure - pipe mutex

2002-12-07 Thread Robert Watson
On Fri, 6 Dec 2002, Kris Kennaway wrote:

 On Fri, Dec 06, 2002 at 07:18:03PM -0800, Lars Eggert wrote:
 
  I'm getting this too:
 
 After discussing this with various people on IRC, it was determined that
 this is not the place where the reversal is occurring, but since witness
 doesn't have the lock order defined it has to guess, and in this
 instance it is guessing the wrong way around.  After adding the lock
 order to subr_witness.c I now get this: 

Yeah, we're exchanging some out-of-band e-mail on this: the basic problem
is that:

filedesc -> pipe
pipe -> sigio
sigio -> proc
proc -> filedesc

We're talking about some possible solutions, including deferred
signalling, etc, etc. 
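
Reading the four pairs listed above as "the first lock is taken before
the second", the problem is a cycle in the lock order graph.  A minimal
sketch of how such a cycle can be detected follows; the enum and function
names are made up for this illustration and have nothing to do with how
WITNESS is actually implemented.

```c
#include <assert.h>
#include <string.h>

#define	DEMO_NLOCKS	4

enum demo_lock { L_FILEDESC, L_PIPE, L_SIGIO, L_PROC };

/* before[a][b] != 0 means lock a is taken before lock b somewhere. */
struct demo_graph {
	unsigned char	before[DEMO_NLOCKS][DEMO_NLOCKS];
};

void
demo_graph_init(struct demo_graph *g)
{
	memset(g, 0, sizeof(*g));
}

/* Record that "first" is acquired before "second". */
void
demo_graph_add(struct demo_graph *g, enum demo_lock first,
    enum demo_lock second)
{
	assert(first < DEMO_NLOCKS && second < DEMO_NLOCKS);
	g->before[first][second] = 1;
}

/* Depth-first search; state: 0 = unvisited, 1 = on stack, 2 = done. */
static int
demo_visit(const struct demo_graph *g, int n, int *state)
{
	int m;

	state[n] = 1;
	for (m = 0; m < DEMO_NLOCKS; m++) {
		if (!g->before[n][m])
			continue;
		if (state[m] == 1)
			return (1);	/* back edge: the orders form a cycle */
		if (state[m] == 0 && demo_visit(g, m, state))
			return (1);
	}
	state[n] = 2;
	return (0);
}

/* Returns non-zero if the recorded orders contain a cycle. */
int
demo_graph_has_cycle(const struct demo_graph *g)
{
	int state[DEMO_NLOCKS] = { 0 };
	int n;

	for (n = 0; n < DEMO_NLOCKS; n++)
		if (state[n] == 0 && demo_visit(g, n, state))
			return (1);
	return (0);
}
```

With only the first three orders recorded the graph is acyclic; adding
the fourth (proc before filedesc) closes the loop, which is why no single
pair looks wrong when examined on its own.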

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories






Re: backgroud fsck is still locking up system (fwd)

2002-12-07 Thread Robert Watson

On Sun, 8 Dec 2002, Bruce Evans wrote:

 On Fri, 6 Dec 2002, Archie Cobbs wrote:
 
  So in summary my recommendation is to add a big warning to the
  growfs(1) man page that it should not be run on the root partition,
  even if you have booted single-user mode and haven't mounted / yet.
  I.e., to grow a root partition, you must boot from a different partition.
 
 Er, it should be obvious that growfs can't reasonably work on the mounted
 partitions.  growfs.1 doesn't exist, but growfs.8 already has the warning
 in a general form:
 
    Currently growfs can only enlarge unmounted file systems.  Do not
  try enlarging a mounted file system, your system may panic and you will
  not be able to use the file system any longer...

Hmm.  I guess one of the interesting questions is: what happened to the
safety belts?  I would have thought that GEOM would prevent opening the
partition writable while it was mounted...

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories






Re: LOR: filedesc structure - pipe mutex

2002-12-07 Thread Robert Watson
BTW, one upshot of this whole event is that we should probably be
hard-coding the lock order of all important locks rather than allowing it
to be automatically determined.  We'd uncover problems of this sort much
faster and much more easily, and it would provide better documentation of
the intended lock order.  Also, the authors of the following locking bits
should document them in the SMPng architecture document:

- File descriptor locking
- Pipe locking
- Select locking
- Process locking

I've gone ahead and documented the SIGIO locking, and I might get to the
rest sometime also, but it would be a lot faster if the people who did the
work documented it rather than having to rely on a lot of code reading to
figure out the subtleties.

Do we know if WITNESS supports the notion of a partial order, in which it
is simply asserted that there is no valid locking order between two locks? 
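
As a rough illustration of what a hard-coded order buys, here is a small
sketch of a table-driven check.  The table contents are taken from the
lock names in the patch quoted below, but the functions are hypothetical
and are not the WITNESS code: a lock's position in the table is its rank,
and acquiring a lower-ranked lock while holding a higher-ranked one is
reported at once, without waiting for both orders to be observed at run
time.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical hard-coded order, earliest-acquired lock first. */
static const char *const demo_order[] = {
	"filedesc structure",
	"pipe mutex",
	"sigio lock",
	"process group",
	"process lock",
};
#define	DEMO_NORDER	(sizeof(demo_order) / sizeof(demo_order[0]))

/* Returns the position of a lock name in the table, or -1 if unlisted. */
int
demo_rank(const char *name)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; i < DEMO_NORDER; i++)
		if (strcmp(demo_order[i], name) == 0)
			return ((int)i);
	return (-1);
}

/*
 * Returns non-zero if acquiring "wanted" while holding "held" goes
 * against the table.  Locks not in the table are never reported.
 */
int
demo_is_reversal(const char *held, const char *wanted)
{
	int h = demo_rank(held);
	int w = demo_rank(wanted);

	if (h < 0 || w < 0)
		return (0);
	return (w < h);
}
```

In this sketch, holding "process lock" and then asking for "filedesc
structure" is flagged, which matches the second reversal reported in the
quoted message.  A strict ranking like this cannot express "no valid
order between these two locks", which is the partial-order question
raised above.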

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories

On Fri, 6 Dec 2002, Kris Kennaway wrote:

 On Fri, Dec 06, 2002 at 07:18:03PM -0800, Lars Eggert wrote:
 
  I'm getting this too:
 
 After discussing this with various people on IRC, it was determined
 that this is not the place where the reversal is occurring, but since
 witness doesn't have the lock order defined it has to guess, and in
 this instance it is guessing the wrong way around.  After adding the
 lock order to subr_witness.c I now get this:
 
 lock order reversal
  1st 0xc41bc418 process lock (process lock) @ /local0/src-client/sys/kern/kern_descrip.c:2094
  2nd 0xc42de934 filedesc structure (filedesc structure) @ /local0/src-client/sys/kern/kern_descrip.c:2101
 Debugger(witness_lock)
 Stopped at  Debugger+0x45:  xchgl   %ebx,in_Debugger.0
 db trace
 Debugger(c037a085) at Debugger+0x45
 witness_lock(c42de934,8,c039e3cc,835,c41bc418) at witness_lock+0x532
 _mtx_lock_flags(c42de934,0,c039e3cc,835,0) at _mtx_lock_flags+0x7f
 sysctl_kern_file(c03d8a00,0,0,d7aaac08) at sysctl_kern_file+0x119
 sysctl_root(0,d7aaacb4,2,d7aaac08,c0416240) at sysctl_root+0x116
 userland_sysctl(c3f84a80,d7aaacb4,2,bfbfe588,bfbff3f8) at userland_sysctl+0xec
 __sysctl(c3f84a80,d7aaad14,6,1,297) at __sysctl+0x71
 syscall(2f,2f,2f,2,bfbfe588) at syscall+0x211
 Xint0x80_syscall() at Xint0x80_syscall+0x1d
 --- syscall (202, FreeBSD ELF32, __sysctl), eip = 0x805a717, esp = 0xbfbfe53c, ebp = 0xbfbfe568 ---
 db
 
 Index: sys/kern/subr_witness.c
 ===
 RCS file: /home/ncvs/src/sys/kern/subr_witness.c,v
 retrieving revision 1.130
 diff -u -r1.130 subr_witness.c
 --- sys/kern/subr_witness.c 11 Nov 2002 16:36:20 -  1.130
 +++ sys/kern/subr_witness.c 7 Dec 2002 04:18:29 -
 @@ -198,6 +198,8 @@
   { "Giant", &lock_class_mtx_sleep },
   { "proctree", &lock_class_sx },
   { "allproc", &lock_class_sx },
 + { "filedesc structure", &lock_class_mtx_sleep },
 + { "pipe mutex", &lock_class_mtx_sleep },
   { "sigio lock", &lock_class_mtx_sleep },
   { "process group", &lock_class_mtx_sleep },
   { "process lock", &lock_class_mtx_sleep },
 





if_fxp and pause packets (or, I didn't need the network anyway)

2002-12-11 Thread Robert Watson

I'm having a recurring problem on a number of machines wherein the fxp
interfaces on those machines will spew out pause packets in vast
quantities while the system is in ddb, or following a shutdown.  This
doesn't happen with other operating systems, and only started happening at
some point in the moderate past on FreeBSD.  Peter Wemm suggested this
might be a result of support introduced for flow control (which I didn't
know existed for ethernet), but no matter what the reason, it's a bit of a
disaster if you have any expectation of using your network segment or
low-end switch while this is going on.  The change is most likely 1.109,
although I haven't built a kernel to test this as yet.  Is there a fix for
this -- for example, disabling support for this feature when in ddb or
after shutting down, or in some other watchdog kind of situation?

Just for reference, the card in the machine I've been having this problem
with most recently is:

fxp0: Intel Pro 10/100B/100+ Ethernet port 0xdd80-0xddbf mem
0xff50-0xff5f,0xff8fe000-0xff8fefff irq 9 at device 1.0 on pci1

However, I'm having it with other related fxp cards. 

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories





Re: revoke(2) redux...

2002-12-24 Thread Robert Watson

On Tue, 24 Dec 2002, Poul-Henning Kamp wrote:

 Isn't there a pretty obvious race between the revoke() and the open() ? 
 
 Wouldn't it in fact make much more sense if revoke(2) was defined as
 
   int revoke(int fd); /* kick everybody else off */
 
 and the code above would look like: 

There are many races here, but one race is closed by this.  The way the
login process works is that it chowns the device, then revokes the device.
If the problem being addressed is that fd's remain open even after the
chown, then revoke works fine, since once you've chowned/chmodded the
file, the original process with a normal user uid can't re-open.  That
said, revoke() has terrible properties from a VFS perspective.  I'd be
interested in learning about the approaches taken in Linux, etc, to
address the same problem.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories






Re: i386 tinderbox failure

2002-12-27 Thread Robert Watson

On Sat, 28 Dec 2002, Bruce Evans wrote:

  /local0/scratch/des/src/sys/fs/devfs/devfs_vnops.c:932: (near initialization for `devfs_specop_entries[14]')
  *** Error code 1
 
 This was broken by removing an unused definition in:
 
 % RCS file: /home/ncvs/src/sys/kern/vnode_if.src,v
 % Working file: vnode_if.src
 % head: 1.59
 % ...
 % 
 % revision 1.59
 % date: 2002/12/24 19:47:13;  author: rwatson;  state: Exp;  lines: +0 -9
 % Flush vop_refreshlabel() definition, since it is no longer used.
 %
 % Obtained from:  TrustedBSD Project
 % Sponsored by:   DARPA, Network Associates Laboratories
 % 
 
 The use is just a stub so it should have been removed too. 

Yup, I missed it in an earlier GC pass.  I've now removed it -- sorry
about the breakage.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories






Re: Problem with /dev/stdout in a chroot environment.

2002-12-30 Thread Robert Watson

On Mon, 30 Dec 2002, Marc Butler wrote:

 I'm currently trying to build CURRENT (DEC 29 2002) within a chroot
 environment under CURRENT (DEC 17 2002).  Presently I am stuck on an
 error which appears to be related to /dev/stdout in a chroot environment
 (devfs?). 

Could you provide a bit more detail on what's actually in the chroot
directory?  Have you mounted devfs in chroot/dev, or did you manually
stick in the device nodes?  In -STABLE, /dev/std* were actual device
nodes, whereas in -CURRENT, devfs makes them symlinks to fd/{0,1,2}, so
the details here are important...
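
One quick way to answer that question from inside the chroot is to look
at what kind of object /dev/stdout is without following symlinks.  The
following is only a sketch using lstat(2); the enum and the function name
are made up for this example.

```c
#include <sys/stat.h>

enum demo_kind { DEMO_MISSING, DEMO_SYMLINK, DEMO_CHRDEV, DEMO_OTHER };

/* Classify a path without following a trailing symlink. */
enum demo_kind
demo_classify(const char *path)
{
	struct stat sb;

	if (lstat(path, &sb) != 0)
		return (DEMO_MISSING);	/* absent or not accessible */
	if (S_ISLNK(sb.st_mode))
		return (DEMO_SYMLINK);	/* e.g. a devfs-style link */
	if (S_ISCHR(sb.st_mode))
		return (DEMO_CHRDEV);	/* e.g. a hand-made device node */
	return (DEMO_OTHER);
}
```

Calling demo_classify("/dev/stdout") inside the chroot would then
distinguish a missing entry, a symlink and a character device node, which
are the cases the question above is trying to tell apart.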

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories






pthread ^T problem on recent -CURRENT: death in libc_r mutex

2003-01-04 Thread Robert Watson

Juli Mallett pointed me at the following reproducible problem on my
-current notebook with userland/kernel dated Dec 29:

paprika:~/freebsd/test/pthread ./test
1
2
1
2
1
2
load: 0.02  cmd: test 910 [running] 0.00u 0.01s 0% 824k
1
Bus error (core dumped)
paprika:~/freebsd/test/pthread ./test
1
2
load: 0.23  cmd: test 914 [running] 0.00u 0.01s 0% 824k
1
Bus error (core dumped)

Hitting ^T to get status information seems to break output following the
first printf after the information display.  Here's the stack trace from
the test program from the first execution above: 

(gdb) bt
#0  0x2807a559 in _pthread_mutex_trylock () from /usr/lib/libc_r.so.5
#1  0x2807a71c in _pthread_mutex_lock () from /usr/lib/libc_r.so.5
#2  0x2813598f in flockfile () from /usr/lib/libc.so.5
#3  0x2812bfd0 in vfprintf () from /usr/lib/libc.so.5
#4  0x2811a552 in printf () from /usr/lib/libc.so.5
#5  0x0804860d in thread2 (arg=0x0) at test.c:22
#6  0x280732ce in _thread_start () from /usr/lib/libc_r.so.5

The program source is attached below.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories


#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

void *
thread1(void *arg)
{

	while (1) {
		sleep(2);
		printf("1\n");
	}
}

void *
thread2(void *arg)
{

	sleep(1);
	while (1) {
		sleep(2);
		printf("2\n");
	}
}

int
main(int argc, char *argv[]) {
	pthread_t t1, t2;
	int error;

	error = pthread_create(&t1, NULL, thread1, NULL);
	error = pthread_create(&t2, NULL, thread2, NULL);

	error = pthread_join(t1, NULL);
	error = pthread_join(t2, NULL);

	return (0);
}






Re: pthread ^T problem on recent -CURRENT: death in libc_r mutex

2003-01-04 Thread Robert Watson

On Sat, 4 Jan 2003, Juli Mallett wrote:

 * From: Robert Watson [EMAIL PROTECTED] [ Date: 2003-01-04 ]
   [ Subject: pthread ^T problem on recent -CURRENT: death in libc_r mutex ]
  
  Juli Mallett pointed me at the following reproduceable problem on my
  -current notebook with userland/kernel dated Dec 29:
 
 Incidentally, this doesn't illustrate the problem I was actually trying
 to point out.  Try making the sleep's pthread_yield().  That will make
 the threads never run again.  sleep is the hack I've had to do.  In my
 appp, I have a 'my_yield' function which will sleep on FreeBSD, and
 yield on everywhere else :(

Hmm.  I'm not experiencing that problem -- if I replace the sleep() with
pthread_yield(), I get a long sequence of '1's until thread2 is started,
and then clean alternation between '1' and '2'.  I don't see any failure
to schedule a thread after it has yielded.
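
For reference, the kind of 'my_yield' wrapper described in the quoted
text might look roughly like the sketch below.  This is a guess at the
shape, not Juli's actual code: the 1 ms nap, the use of nanosleep(), and
the choice of sched_yield() as the portable yield call are all
assumptions, and yield_rounds is a made-up helper that just exercises
the wrapper.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sched.h>
#include <time.h>

/* Sleep briefly on FreeBSD, yield the processor everywhere else. */
int
my_yield(void)
{
#ifdef __FreeBSD__
	struct timespec nap = { 0, 1000000L };	/* 1 ms; arbitrary choice */

	return (nanosleep(&nap, NULL));
#else
	return (sched_yield());
#endif
}

/* Call my_yield() ROUNDS times; return how many calls reported success. */
int
yield_rounds(int rounds)
{
	int ok = 0;

	assert(rounds >= 0);
	for (int i = 0; i < rounds; i++) {
		if (my_yield() == 0)
			ok++;
	}
	return (ok);
}
```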

I'm updating my box from Dec 29 to today to see if that makes a difference
either way on either problem.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories






Re: pthread ^T problem on recent -CURRENT: death in libc_r mutex

2003-01-05 Thread Robert Watson

On Sat, 4 Jan 2003, Juli Mallett wrote:

 * From: Robert Watson [EMAIL PROTECTED] [ Date: 2003-01-04 ]
   [ Subject: pthread ^T problem on recent -CURRENT: death in libc_r mutex ]
  
  Juli Mallett pointed me at the following reproduceable problem on my
  -current notebook with userland/kernel dated Dec 29:
 
 Incidentally, this doesn't illustrate the problem I was actually trying
 to point out.  Try making the sleep's pthread_yield().  That will make
 the threads never run again.  sleep is the hack I've had to do.  In my
 appp, I have a 'my_yield' function which will sleep on FreeBSD, and
 yield on everywhere else :(

Updating to Jan 4 kernel generates the same failure mode for me: following
a ^T, I get a core dump.  If I run it outside of gdb and then run gdb on
the core dump, I get the following:

(gdb) bt
#0  0x2807aa63 in _mutex_cv_lock () from /usr/lib/libc_r.so.5
#1  0x2807a749 in pthread_mutex_unlock () from /usr/lib/libc_r.so.5
#2  0x28136164 in funlockfile () from /usr/lib/libc.so.5
#3  0x2812c6ab in vfprintf () from /usr/lib/libc.so.5
#4  0x2811ab82 in printf () from /usr/lib/libc.so.5
#5  0x08048611 in thread1 (arg=0x0) at test.c:12
#6  0x280732ce in _thread_start () from /usr/lib/libc_r.so.5

There's a bit more noise if I run it under gdb, since gdb picks up the
SIGINFO delivery (twice?) but the same result occurs in the end:

1load: 0.07  cmd: test 690 [running] 0.04u 0.20s 0% 816k

Program received signal SIGINFO, Information request.
0x280d4c83 in poll () from /usr/lib/libc.so.5
(gdb) cont
Continuing.

Program received signal SIGINFO, Information request.
0x280d4c83 in poll () from /usr/lib/libc.so.5
(gdb) cont
Continuing.


Program received signal SIGBUS, Bus error.
[Switching to Process 690, Thread 4]
0x2807aa63 in _mutex_cv_lock () from /usr/lib/libc_r.so.5
(gdb) trace
trace command requires an argument
(gdb) bt
#0  0x2807aa63 in _mutex_cv_lock () from /usr/lib/libc_r.so.5
#1  0x2807a749 in pthread_mutex_unlock () from /usr/lib/libc_r.so.5
#2  0x28136164 in funlockfile () from /usr/lib/libc.so.5
#3  0x2812c6ab in vfprintf () from /usr/lib/libc.so.5
#4  0x2811ab82 in printf () from /usr/lib/libc.so.5
#5  0x08048611 in thread1 (arg=0x0) at test.c:12
#6  0x280732ce in _thread_start () from /usr/lib/libc_r.so.5
(gdb) 

Either way, still not the symptoms you have, but equally fatal.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories





gdb: failed to set signal flags properly for ast()

2003-01-05 Thread Robert Watson

While debugging the recent pthreads problem, I've started running into
this:

pid 663 (test), uid 1000: exited on signal 10 (core dumped)
failed to set signal flags properly for ast()
failed to set signal flags properly for ast()
failed to set signal flags properly for ast()
failed to set signal flags properly for ast()
failed to set signal flags properly for ast()
failed to set signal flags properly for ast()
failed to set signal flags properly for ast()
failed to set signal flags properly for ast()
failed to set signal flags properly for ast()
failed to set signal flags properly for ast()
failed to set signal flags properly for ast()
pid 709 (test), uid 0: exited on signal 10 (core dumped)
pid 713 (test), uid 0: exited on signal 10 (core dumped)
failed to set signal flags properly for ast()
failed to set signal flags properly for ast()
failed to set signal flags properly for ast()
failed to set signal flags properly for ast()

It appears to happen frequently when running the previously posted test
source code for -pthread under gdb.  When running the test program outside
of gdb, this doesn't happen, suggesting a possible interaction with
ptrace.  To trigger it the first time under gdb, I have to hit Ctrl-T,
then type "continue" a few times.  Under gdb, Ctrl-T sometimes causes a
SIGBUS; the rest of the time, the warning starts appearing while the
program continues to run.  Once the warning has started, it is printed
about 12 times almost immediately, and then intermittently from then
onwards.

Source below.  Compiled using -g, -Wall, -pthread. (so not KSE)

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

void *
thread1(void *arg)
{

	while (1) {
		/* sleep(2); */
		pthread_yield();
		printf("1\n");
	}
}

void *
thread2(void *arg)
{

	sleep(1);
	while (1) {
		/* sleep(2); */
		pthread_yield();
		printf("2\n");
	}
}

int
main(int argc, char *argv[]) {
	pthread_t t1, t2;
	int error;

	error = pthread_create(&t1, NULL, thread1, NULL);
	error = pthread_create(&t2, NULL, thread2, NULL);

	error = pthread_join(t1, NULL);
	error = pthread_join(t2, NULL);

	return (0);
}






Re: pthread ^T problem on recent -CURRENT: death in libc_r mutex

2003-01-05 Thread Robert Watson

On Sun, 5 Jan 2003, Julian Elischer wrote:

 On Sun, 5 Jan 2003, Robert Watson wrote:
 
  
  Updating to Jan 4 kernel generates the same failure mode for me: following
 
 What makes you think it's the kernel?

Well, to be more precise, I upgraded the entire system to Jan 4.  I'm
assuming it's something about poor signal handling in libc_r, actually.

  a ^T, I get a core dump.  If I run it outside of gdb and then run gdb on
  the core dump, I get the following:

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories





Re: alpha tinderbox failure

2003-01-06 Thread Robert Watson

On Mon, 6 Jan 2003, Mike Barcroft wrote:

 These new truncated lines only make problems harder to solve. 
 
 Anyway, the problem is the 5th argument to vn_extattr_get() should be an
 int *, but it's passing a size_t *.  It looks like most consumers of
 vn_extattr_get() would prefer a size_t *, so maybe the interface should
 be changed. 

I think the problem originated because uio_resid is 'int', but iovec's
len is size_t.  I agree the right answer is to use size_t as the argument
to vn_extattr_{get,set}().  Will that cause type problems with the resid
field, however?
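
As a rough illustration of the concern about the resid field, a checked
narrowing from size_t to int could look like the sketch below.
len_to_resid is a hypothetical helper, not an existing kernel function,
and returning EINVAL for an oversized length is just one possible
convention.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>

/*
 * Narrow a size_t length into an int residual count, refusing values
 * that would not fit.  On failure *resid is left untouched.
 */
int
len_to_resid(size_t len, int *resid)
{

	assert(resid != NULL);
	if (len > (size_t)INT_MAX)
		return (EINVAL);
	*resid = (int)len;
	return (0);
}
```

The point of the check is that a size_t interface can accept lengths an
int residual cannot represent, so the conversion has to happen somewhere
explicit rather than through a pointer cast.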

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories





Re: bridging broken in -current AND -stable

2000-03-08 Thread Robert Watson

On Thu, 9 Mar 2000, Boris Staeblow wrote:

 On Wed, Mar 08, 2000 at 11:15:23PM +0100, Luigi Rizzo wrote:
 
   
   Is it possible that bridging is broken in -current and -stable?
  
  no, but the "de" driver on bridging is now unsupported and i could not
  find the time to make it work after recent fixes to the bridging code.
 
 Ooo n! :-((
 The best nic´s all around are unsupported? ;)
 
  can you switch to some supported card ("ed" or "fxp") ?
 
 Hmm, "ed" is lousy and "fxp" is too expensive for me now...
 
 How great is the chance that good old dec is coming back?
 I could help you testing your code... :)
 
 I would like to prevent to buy new hardware again.

I have what appear to be functional patches to provide support for if_dc,
used in the common and cheap PCI Linksys ethernet cards (LNE100TX?).  If
jkh approves the commit, I can stick it in before the release.  I don't
have access to an if_de card where I am currently, so won't be able to
look at that for at least a week or two.

The patch for if_dc is below.

  Robert N M Watson 

[EMAIL PROTECTED]  http://www.watson.org/~robert/
PGP key fingerprint: AF B5 5F FF A6 4A 79 37  ED 5F 55 E9 58 04 6A B1
TIS Labs at Network Associates, Safeport Network Services

Index: if_dc.c
===
RCS file: /home/ncvs/src/sys/pci/if_dc.c,v
retrieving revision 1.7
diff -u -r1.7 if_dc.c
--- if_dc.c 2000/01/24 17:19:37 1.7
+++ if_dc.c 2000/03/08 18:15:55
@@ -122,6 +122,11 @@
 
 #include <net/bpf.h>
 
+#include "opt_bdg.h"
+#ifdef BRIDGE
+#include <net/bridge.h>
+#endif
+
 #include <vm/vm.h>              /* for vtophys */
 #include <vm/pmap.h>            /* for vtophys */
 #include <machine/clock.h>      /* for DELAY */
@@ -2099,14 +2104,30 @@
 		ifp->if_ipackets++;
 		eh = mtod(m, struct ether_header *);
 
-		/*
-		 * Handle BPF listeners. Let the BPF user see the packet, but
-		 * don't pass it up to the ether_input() layer unless it's
+		/* Handle BPF listeners. Let the BPF user see the packet */
+		if (ifp->if_bpf)
+			bpf_mtap(ifp, m);
+
+#ifdef BRIDGE
+		if (do_bridge) {
+			struct ifnet *bdg_ifp ;
+			bdg_ifp = bridge_in(m);
+			if (bdg_ifp != BDG_LOCAL && bdg_ifp != BDG_DROP)
+				bdg_forward(&m, bdg_ifp);
+			if (((bdg_ifp != BDG_LOCAL) && (bdg_ifp != BDG_BCAST) &&
+			    (bdg_ifp != BDG_MCAST)) || bdg_ifp == BDG_DROP) {
+				m_freem(m);
+				continue;
+			}
+		}
+
+		eh = mtod(m, struct ether_header *);
+#endif
+
+		/* Don't pass it up to the ether_input() layer unless it's
 		 * a broadcast packet, multicast packet, matches our ethernet
 		 * address or the interface is in promiscuous mode.
 		 */
 		if (ifp->if_bpf) {
-			bpf_mtap(ifp, m);
 			if (ifp->if_flags & IFF_PROMISC &&
 				(bcmp(eh->ether_dhost, sc->arpcom.ac_enaddr,
 				ETHER_ADDR_LEN) &&
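
The receive-path decision in the BRIDGE block of the patch can be
restated as a small standalone sketch.  This is only a model for
reading the logic: in the patch the BDG_* values are compared against
the struct ifnet pointer returned by bridge_in(), whereas here they are
plain enum constants, and BDG_OTHER_IF, struct rx_action, and
bridge_rx_action are names invented for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for the possible bridge_in() results (illustrative only). */
enum bdg_result {
	BDG_LOCAL,	/* addressed to this host */
	BDG_BCAST,	/* broadcast */
	BDG_MCAST,	/* multicast */
	BDG_DROP,	/* bridge says discard */
	BDG_OTHER_IF	/* hypothetical: destined for another interface */
};

struct rx_action {
	bool forward;		/* hand the packet to bdg_forward() */
	bool keep_for_local;	/* continue with normal receive processing */
};

/* Mirror the two tests made in the patch's BRIDGE block. */
struct rx_action
bridge_rx_action(enum bdg_result r)
{
	struct rx_action a;

	assert(r <= BDG_OTHER_IF);
	a.forward = (r != BDG_LOCAL && r != BDG_DROP);
	/* The patch frees the mbuf and continues in the opposite case. */
	a.keep_for_local = (r == BDG_LOCAL || r == BDG_BCAST ||
	    r == BDG_MCAST);
	return (a);
}
```

Read this way, broadcast and multicast packets are both forwarded and
kept, local packets are only kept, and everything else is forwarded or
dropped without reaching the local stack.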







problems using pccard 3c589c with 4.0-snap install

2000-03-14 Thread Robert Watson


Yesterday I spent a fair amount of time attempting to get a 4.0 snapshot
to install on my notebook (Dell Latitude CPi), which until now has been
happily running 3.3-PAO.  Sadly, it seems not to like my ethernet card. 
When installing, sysinstall provides three IRQ exclude options before
initializing pccard support--the first option causes ep0 not to be probed;
the second two allow it to be probed, but not work correctly.  DHCP
succeeds, but following that the DNS lookup hangs.  It appears that the
IRQ is set wrong such that incoming packets are not being observed by the
IP stack, but this is just speculation.  Tcpdump running on the box next
to it shows that outgoing packets seem to be alright.  Sysinstall's debug
screen indicates that IRQ5 is being assigned to the card--PAO was
allocating IRQ 3 (and it worked :-).

Any pointers--especially ones that get the install of 4.0 working ``out of
the box'' on this notebook would be much appreciated.

  Robert N M Watson 

[EMAIL PROTECTED]  http://www.watson.org/~robert/
PGP key fingerprint: AF B5 5F FF A6 4A 79 37  ED 5F 55 E9 58 04 6A B1
TIS Labs at Network Associates, Safeport Network Services






current.freebsd.org snapshots and broken X11

2000-03-14 Thread Robert Watson


It looks like the X11 associated with the snapshots on current.freebsd.org
is still broken:

  /usr/libexec/ld-elf.so.1: Share object "libXThrStub.so.6" not found

Installing -current boxes for testing and development would be a lot
easier if this worked. :-)  Especially leading up to releases when testing
changes in the install is useful :-)

  Robert N M Watson 

[EMAIL PROTECTED]  http://www.watson.org/~robert/
PGP key fingerprint: AF B5 5F FF A6 4A 79 37  ED 5F 55 E9 58 04 6A B1
TIS Labs at Network Associates, Safeport Network Services






Re: current.freebsd.org snapshots and broken X11

2000-03-14 Thread Robert Watson


Sounds good to me, as long as it runs :-).

BTW, ran into another nit from the 02/13 snapshot.  I installed the
X-kern-developer distribution, discovered X11 didn't work, so went back
into sysinstall to install X11 stuff.  I selected some combination of X11
components, and chose releng3.freebsd.org as the source -- sysinstall
coredumped.  A second attempt at the same resulted in sysinstall hanging.

This seems to be reproducible -- what can I do to give you better
debugging information?

On Tue, 14 Mar 2000, Jordan K. Hubbard wrote:

 I think I'm just going to go to the 3.4 bits; I don't have a definitive
 XFree86 distribution for 4.0 and I don't think I'm going to get one in
 time to make a difference.
 
 - Jordan
 
  
  It looks like the X11 associated with the snapshots on current.freebsd.org
  is still broken:
  
/usr/libexec/ld-elf.so.1: Share object "libXThrStub.so.6" not found
  
  Installing -current boxes for testing and development would be a lot
  easier if this worked. :-)  Especially leading up to releases when testing
  changes in the install is useful :-)
  
Robert N M Watson 
  
  [EMAIL PROTECTED]  http://www.watson.org/~robert/
  PGP key fingerprint: AF B5 5F FF A6 4A 79 37  ED 5F 55 E9 58 04 6A B1
  TIS Labs at Network Associates, Safeport Network Services
  
  
  
  To Unsubscribe: send mail to [EMAIL PROTECTED]
  with "unsubscribe freebsd-current" in the body of the message
 
 


  Robert N M Watson 

[EMAIL PROTECTED]  http://www.watson.org/~robert/
PGP key fingerprint: AF B5 5F FF A6 4A 79 37  ED 5F 55 E9 58 04 6A B1
TIS Labs at Network Associates, Safeport Network Services






Re: problems using pccard 3c589c with 4.0-snap install

2000-03-14 Thread Robert Watson


As a followup email, I suppose I'm specifically asking if there's a way to
make sysinstall allocate IRQ3 to the card, as that seems to be the
differentiating factor in terms of hardware configuration allocated
between 3.3-PAO and 4.0-snapshot.  I.e., rather than a sysinstall field
saying, ``Which of these IRQs should I not use'', instead, ``Which should
I use''.  Or the like.


On Tue, 14 Mar 2000, Robert Watson wrote:

 
 Yesterday I spent a fair amount of time attempting to get a 4.0 snapshot
 to install on my notebook (Dell Latitude CPi), which until now has been
 happily running 3.3-PAO.  Sadly, it seems not to like my ethernet card. 
 When installing, sysinstall provides three IRQ exclude options before
 initializing pccard support--the first option causes ep0 not to be probed;
 the second two allow it to be probed, but not work correctly.  DHCP
 succeeds, but following that the DNS lookup hangs.  It appears that the
 IRQ is set wrong such that incoming packets are not being observed by the
 IP stack, but this is just speculation.  Tcpdump running on the box next
 to it shows that outgoing packets seem to be alright.  Sysinstall's debug
 screen indicates that IRQ5 is being assigned to the card--PAO was
 allocating IRQ 3 (and it worked :-).
 
 Any pointers--especially ones that get the install of 4.0 working ``out of
 the box'' on this notebook would be much appreciated.
 
   Robert N M Watson 
 
 [EMAIL PROTECTED]  http://www.watson.org/~robert/
 PGP key fingerprint: AF B5 5F FF A6 4A 79 37  ED 5F 55 E9 58 04 6A B1
 TIS Labs at Network Associates, Safeport Network Services
 
 
 
 To Unsubscribe: send mail to [EMAIL PROTECTED]
 with "unsubscribe freebsd-current" in the body of the message
 


  Robert N M Watson 

[EMAIL PROTECTED]  http://www.watson.org/~robert/
PGP key fingerprint: AF B5 5F FF A6 4A 79 37  ED 5F 55 E9 58 04 6A B1
TIS Labs at Network Associates, Safeport Network Services






Comments on 4.0-RELEASE install glitches

2000-03-18 Thread Robert Watson


I installed 4.0 on a notebook yesterday, using the docking station.  As
previously described, I had hardware probing problems when not using the
ethernet card in the docking station.  Sadly, X11 requires an extra
option or two to work with the docking station, but I figured that
out and that works too.  Here are some observations:

1) I installed from ftp.freebsd.org -- the RSAREF package install failed,
breaking SSH and friends.

2) I chose the gnome/afterstep desktop combo--some of the afterstep icons
were broken

3) I switched to the gnome/enlightenment combo--the file manager never
turned up, and the console spewed warnings from Perl about locale settings

I realize that these are all ports-related problems, but they're very
visible points of failure, that would turn up early in any review in a
magazine or the like :-).

  Robert N M Watson 

[EMAIL PROTECTED]  http://www.watson.org/~robert/
PGP key fingerprint: AF B5 5F FF A6 4A 79 37  ED 5F 55 E9 58 04 6A B1
TIS Labs at Network Associates, Safeport Network Services






panic: vm_map_entry_create: kernel resources exhausted

2000-04-10 Thread Robert Watson


5.0-CURRENT -- was doing a make buildworld -j 2.  Sadly, I don't know what
exact date the source was from, as I had just cvsup'd and started
building, but I expect in the last week and a half.  I was running with
capabilities patches going, but I wouldn't imagine that it would cause
this particular nasty.


(kgdb) where
#0  Debugger (msg=0xc02fa583 "panic") at ../../i386/i386/db_interface.c:319
#1  0xc01907fc in panic (
fmt=0xc0312820 "vm_map_entry_create: kernel resources exhausted")
at ../../kern/kern_shutdown.c:552
#2  0xc02838fa in vm_map_entry_create (map=0xc035b4ec) at ../../vm/vm_map.c:292
#3  0xc0283aed in vm_map_insert (map=0xc035b4ec, object=0xc035b580, 
offset=62492672, start=3282665472, end=3282669568, prot=7 '\a', 
max=7 '\a', cow=0) at ../../vm/vm_map.c:528
#4  0xc0282feb in kmem_alloc (map=0xc035b4ec, size=4096)
at ../../vm/vm_kern.c:175
#5  0xc028d09c in _zget (z=0xc035b660) at ../../vm/vm_zone.c:343
#6  0xc028789d in vm_object_allocate (type=0 '\000', size=14)
at ../../vm/vm_zone.h:85
#7  0xc028016b in swap_pager_alloc (handle=0x0, size=57344, prot=7, offset=0)
at ../../vm/swap_pager.c:387
#8  0xc028c1bc in vm_pager_allocate (type=1, handle=0x0, size=57344, prot=7, 
off=0) at ../../vm/vm_pager.c:246
#9  0xc02852c2 in vm_map_split (entry=0xc3a6a930) at ../../vm/vm_map.c:1939
#10 0xc02854ee in vm_map_copy_entry (src_map=0xc3a624c0, dst_map=0xc3a62400, 
src_entry=0xc3a6a930, dst_entry=0xc3a6a870) at ../../vm/vm_map.c:2043
#11 0xc0285753 in vmspace_fork (vm1=0xc3a624c0) at ../../vm/vm_map.c:2170
#12 0xc0282a86 in vm_fork (p1=0xc3a2a080, p2=0xc3a29ee0, flags=20)
at ../../vm/vm_glue.c:233
#13 0xc0188f0b in fork1 (p1=0xc3a2a080, flags=20, procp=0xc3a92f38)
at ../../kern/kern_fork.c:485
#14 0xc01886be in fork (p=0xc3a2a080, uap=0xc3a92f80)
at ../../kern/kern_fork.c:100
#15 0xc02bf2e2 in syscall (frame={tf_fs = 47, tf_es = 47, tf_ds = 47, 
  tf_edi = 135000192, tf_esi = 134975988, tf_ebp = -1077937892, 
  tf_isp = -1012322348, tf_ebx = 0, tf_edx = 134975988, tf_ecx = 0, 
  tf_eax = 2, tf_trapno = 12, tf_err = 2, tf_eip = 134657044, tf_cs = 31, 
  tf_eflags = 514, tf_esp = -1077937936, tf_ss = 47})
at ../../i386/i386/trap.c:1073
#16 0xc02b0526 in Xint0x80_syscall ()
#17 0x804b52e in ?? ()
#18 0x804a920 in ?? ()
#19 0x8051a92 in ?? ()
#20 0x80519fe in ?? ()
#21 0x80480f9 in ?? ()




  Robert N M Watson 

[EMAIL PROTECTED]  http://www.watson.org/~robert/
PGP key fingerprint: AF B5 5F FF A6 4A 79 37  ED 5F 55 E9 58 04 6A B1
TIS Labs at Network Associates, Safeport Network Services






middle mouse button emulation broken in 4.0-STABLE?

2000-04-14 Thread Robert Watson


Not sure if this should go to -current or -stable, since we seem to get a
lot of instant MFC's these days :-).  I upgraded a notebook from
4.0-RELEASE to -STABLE last night.  After doing so, I noticed that the
middle mouse button emulation in moused seems to be fairly broken -- i.e.,
once it's enabled, I can no longer drag with the left mouse button
(although I can with the emulated middle button).  If I turn off
emulation, the left and right buttons drag normally, but now I can no
longer paste in X :-).

I'd love to get this fixed--middle mouse button emulation is actually
pretty useful. :-)  Sadly I can no longer use my mouse keyboard switch
with my roller-ball mouse--for some reason it results in the psm synch
error and vast numbers of anomalies when I switch away from certain
machines, so I'm stuck with emulation.

  Robert N M Watson 

[EMAIL PROTECTED]  http://www.watson.org/~robert/
PGP key fingerprint: AF B5 5F FF A6 4A 79 37  ED 5F 55 E9 58 04 6A B1
TIS Labs at Network Associates, Safeport Network Services






Re: Failed compile of ext2_alloc.c

2000-04-15 Thread Robert Watson


Yup -- I neglected to update the ext2fs code (which uses UFS stuff) to
include the requisite include files.  Please try the attached patch
against src/sys/gnu/ext2fs, and let me know if it works, and I'll go ahead
and commit it.  I caught the weird Coda dependency, but guess I missed
this one. 
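
For anyone wondering why a missing include produces the "incomplete
type" error quoted below: a structure can hold a pointer to a type that
has only been declared, but embedding the type by value needs its full
definition first.  The sketch below shows the rule in isolation; all of
the structure and function names are invented for the example and are
not the real ufsmount or extattr definitions.

```c
#include <assert.h>
#include <stddef.h>

struct extattr_state;			/* declared, but still incomplete */

struct mount_by_pointer {
	struct extattr_state *ext;	/* fine: pointer to incomplete type */
};

struct extattr_state {			/* the full definition */
	int started;
	int nattrs;
};

struct mount_by_value {
	/* Compiles only because the definition above comes first. */
	struct extattr_state ext;
};

/* Report the size of the embedded member. */
size_t
embedded_state_size(const struct mount_by_value *mp)
{

	assert(mp != NULL);
	return (sizeof(mp->ext));
}
```

Moving struct mount_by_value above the full definition reproduces the
same class of error, which is what happens when a source file includes
the header that embeds a structure before the header that defines it.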

On Sat, 15 Apr 2000, George W. Dinolt wrote:

 Content-Type: text/plain; charset=us-ascii
 Content-Transfer-Encoding: 7bit
 
 Robert:
 
 I compiled  recent (cvsup around 9:00 A.M. PDT) sources for the 5.0
 kernel  with and without
 
 options FFS_EXTATTR
 
 and obtained the following error message in both cases.
 
 cc -c -O -pipe -march=k6 -Wall -Wredundant-decls -Wnested-externs
 -Wstrict-prototypes  -Wmissing-prototypes -Wpointer-arith -Winline
 -Wcast-qual  -fformat-extensions -ansi  -nostdinc -I- -I. -I../..
 -I../../../include  -D_KERNEL -include opt_global.h -elf
 -mpreferred-stack-boundary=2  ../../gnu/ext2fs/ext2_alloc.c
 In file included from ../../gnu/ext2fs/ext2_alloc.c:54:
 ../../ufs/ufs/ufsmount.h:90: field `um_extattr' has incomplete type
 *** Error code 1
 
 I have not investigated. I think this may have to do with your additions
 of extended attributes.
 
 Hope this  helps.
 George Dinolt
 [EMAIL PROTECTED]
 
 
 


  Robert N M Watson 

[EMAIL PROTECTED]  http://www.watson.org/~robert/
PGP key fingerprint: AF B5 5F FF A6 4A 79 37  ED 5F 55 E9 58 04 6A B1
TIS Labs at Network Associates, Safeport Network Services


Index: gnu/ext2fs/ext2_alloc.c
===
RCS file: /home/ncvs/src/sys/gnu/ext2fs/ext2_alloc.c,v
retrieving revision 1.28
diff -u -r1.28 ext2_alloc.c
--- gnu/ext2fs/ext2_alloc.c 1999/08/23 22:05:49 1.28
+++ gnu/ext2fs/ext2_alloc.c 2000/04/15 16:49:37
@@ -49,6 +49,7 @@
 #include <sys/mount.h>
 #include <sys/syslog.h>
 
+#include <ufs/ufs/extattr.h>
 #include <ufs/ufs/quota.h>
 #include <ufs/ufs/inode.h>
 #include <ufs/ufs/ufsmount.h>
Index: gnu/ext2fs/ext2_inode.c
===
RCS file: /home/ncvs/src/sys/gnu/ext2fs/ext2_inode.c,v
retrieving revision 1.25
diff -u -r1.25 ext2_inode.c
--- gnu/ext2fs/ext2_inode.c 2000/03/20 10:44:17 1.25
+++ gnu/ext2fs/ext2_inode.c 2000/04/15 16:49:37
@@ -53,6 +53,7 @@
 #include <vm/vm.h>
 #include <vm/vm_extern.h>
 
+#include <ufs/ufs/extattr.h>
 #include <ufs/ufs/quota.h>
 #include <ufs/ufs/inode.h>
 #include <ufs/ufs/ufsmount.h>
Index: gnu/ext2fs/ext2_linux_balloc.c
===
RCS file: /home/ncvs/src/sys/gnu/ext2fs/ext2_linux_balloc.c,v
retrieving revision 1.11
diff -u -r1.11 ext2_linux_balloc.c
--- gnu/ext2fs/ext2_linux_balloc.c  1999/02/25 15:54:05 1.11
+++ gnu/ext2fs/ext2_linux_balloc.c  2000/04/15 16:49:38
@@ -33,6 +33,7 @@
 #include <sys/mount.h>
 #include <sys/vnode.h>
 
+#include <ufs/ufs/extattr.h>
 #include <ufs/ufs/quota.h>
 #include <ufs/ufs/ufsmount.h>
 #include <gnu/ext2fs/ext2_extern.h>
Index: gnu/ext2fs/ext2_linux_ialloc.c
===
RCS file: /home/ncvs/src/sys/gnu/ext2fs/ext2_linux_ialloc.c,v
retrieving revision 1.13
diff -u -r1.13 ext2_linux_ialloc.c
--- gnu/ext2fs/ext2_linux_ialloc.c  1999/01/27 21:49:53 1.13
+++ gnu/ext2fs/ext2_linux_ialloc.c  2000/04/15 16:49:39
@@ -34,6 +34,7 @@
 #include <sys/mount.h>
 #include <sys/vnode.h>
 
+#include <ufs/ufs/extattr.h>
 #include <ufs/ufs/quota.h>
 #include <ufs/ufs/inode.h>
 #include <ufs/ufs/ufsmount.h>
Index: gnu/ext2fs/ext2_lookup.c
===
RCS file: /home/ncvs/src/sys/gnu/ext2fs/ext2_lookup.c,v
retrieving revision 1.22
diff -u -r1.22 ext2_lookup.c
--- gnu/ext2fs/ext2_lookup.c 2000/03/20 11:28:36 1.22
+++ gnu/ext2fs/ext2_lookup.c 2000/04/15 16:49:39
@@ -57,6 +57,7 @@
 #include <sys/malloc.h>
 #include <sys/dirent.h>
 
+#include <ufs/ufs/extattr.h>
 #include <ufs/ufs/quota.h>
 #include <ufs/ufs/inode.h>
 #include <ufs/ufs/dir.h>
Index: gnu/ext2fs/ext2_vfsops.c
===
RCS file: /home/ncvs/src/sys/gnu/ext2fs/ext2_vfsops.c,v
retrieving revision 1.63
diff -u -r1.63 ext2_vfsops.c
--- gnu/ext2fs/ext2_vfsops.c 2000/03/09 05:21:10 1.63
+++ gnu/ext2fs/ext2_vfsops.c 2000/04/15 16:49:41
@@ -56,6 +56,7 @@
 #include <sys/malloc.h>
 #include <sys/stat.h>
 
+#include <ufs/ufs/extattr.h>
 #include <ufs/ufs/quota.h>
 #include <ufs/ufs/ufsmount.h>
 #include <ufs/ufs/inode.h>
Index: gnu/ext2fs/ext2_vnops.c
===
RCS file: /home/ncvs/src/sys/gnu/ext2fs/ext2_vnops.c,v
retrieving revision 1.51
diff -u -r1.51 ext2_vnops.c
--- gnu/ext2fs/ext2_vnops.c 2000/03/03 08:00:27 1.51
+++ gnu/ext2fs/ext2_vnops.c 2000/04/15 16:49:42
@@ -68,6 +68,7 @@
 
 #include <sys/signalvar.h>
 #include <ufs/ufs/dir.h>

Re: Failed compile of ext2_alloc.c

2000-04-15 Thread Robert Watson


Since it appears to work for me, I'm going to go ahead and commit the
patch before too many other people run into this.  Please let me know if
you have further problems and I'll get them fixed up ASAP.

On Sat, 15 Apr 2000, George W. Dinolt wrote:

 Content-Type: text/plain; charset=us-ascii
 Content-Transfer-Encoding: 7bit
 
 Robert:
 
 I compiled  recent (cvsup around 9:00 A.M. PDT) sources for the 5.0
 kernel  with and without
 
 options FFS_EXTATTR
 
 and obtained the following error message in both cases.
 
 cc -c -O -pipe -march=k6 -Wall -Wredundant-decls -Wnested-externs
 -Wstrict-prototypes  -Wmissing-prototypes -Wpointer-arith -Winline
 -Wcast-qual  -fformat-extensions -ansi  -nostdinc -I- -I. -I../..
 -I../../../include  -D_KERNEL -include opt_global.h -elf
 -mpreferred-stack-boundary=2  ../../gnu/ext2fs/ext2_alloc.c
 In file included from ../../gnu/ext2fs/ext2_alloc.c:54:
 ../../ufs/ufs/ufsmount.h:90: field `um_extattr' has incomplete type
 *** Error code 1
 
 I have not investigated. I think this may have to do with your additions
 of extended attributes.
 
 Hope this  helps.
 George Dinolt
 [EMAIL PROTECTED]
 
 
 


  Robert N M Watson 

[EMAIL PROTECTED]  http://www.watson.org/~robert/
PGP key fingerprint: AF B5 5F FF A6 4A 79 37  ED 5F 55 E9 58 04 6A B1
TIS Labs at Network Associates, Safeport Network Services






Recent commit changes extattr backing file format, users beware

2000-04-19 Thread Robert Watson

I just committed a change to the extended attribute backing code that
modifies the per-attribute header.  The result is that backing files used
and created from now on have a different format, and weird and unfortunate
things will happen with backing files before this change.  I doubt anyone
is doing more than mild experimentation at this point, but I thought a
heads up was in order.

I hope not to change the format any further.  I've been considering
introducing a backing file header version number of some sort, but this is
only necessary if we think the backing file format will change much more.

Comments welcome.

  Robert N M Watson 

[EMAIL PROTECTED]  http://www.watson.org/~robert/
PGP key fingerprint: AF B5 5F FF A6 4A 79 37  ED 5F 55 E9 58 04 6A B1
TIS Labs at Network Associates, Safeport Network Services






Re: Recent commit changes extattr backing file format, users beware

2000-04-19 Thread Robert Watson

On Wed, 19 Apr 2000, Garance A Drosihn wrote:

 At 3:41 AM -0400 4/19/00, Robert Watson wrote:
 I hope not to change the format any further.  I've been considering
 introducing a backing file header version number of some sort, but
 this is only necessary if we think the backing file format will
 change much more.
 
 Comments welcome.
 
 If you're going to change the header right now, then I would think
 it is best to add a field for version number.  What you're saying
 is that "we will never ever have to change this format again", and
 that seems overly optimistic to me.  I don't even know what this
 *IS*, other than you report that this change will cause "weird and
 unfortunate things to happen" for backing files created with the
 previous format.  Any format where a change can cause "weird and
 unfortunate things to happen" should have a version number in it,
 in my opinion...
 
 What downside is there to adding a field for version number?

I came to the same conclusion and am about to commit the addition of a
magic number for sanity checking, and a version field in the file header.
I've spent the last twenty minutes trying to get the man page updates
right for some other parallel changes, but am still working on that. :-)

Should be committed shortly, as the changes have been working fine for me
for the past hour or two.

  Robert N M Watson 

[EMAIL PROTECTED]  http://www.watson.org/~robert/
PGP key fingerprint: AF B5 5F FF A6 4A 79 37  ED 5F 55 E9 58 04 6A B1
TIS Labs at Network Associates, Safeport Network Services






Re: Recent commit changes extattr backing file format, users beware

2000-04-19 Thread Robert Watson


FYI: I committed the addition of the magic number and version information
an hour or two ago.  It seems to work fine for me, but please let me know
if you have any problems.  A migration tool doesn't seem useful yet, but
is now feasible :-).
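
For readers following along, here is a minimal sketch of how a magic number and a version field in a file header can be checked before the rest of the file is trusted. It is an illustration under stated assumptions, not the committed code: the structure name, field names, magic value, and version value below are all hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical values, chosen only for this sketch. */
#define BACKING_MAGIC    0x45415452u
#define BACKING_VERSION  1u

/* Hypothetical header at the start of a backing file. */
struct backing_header {
    uint32_t magic;      /* sanity check: is this a backing file at all? */
    uint32_t version;    /* which layout follows the header */
    uint32_t attr_size;  /* per-inode attribute size in bytes */
};

enum header_status {
    HEADER_OK,
    HEADER_TOO_SHORT,
    HEADER_BAD_MAGIC,
    HEADER_BAD_VERSION
};

/* Validate the first bytes of a backing file before using the rest. */
enum header_status
check_backing_header(const void *buf, size_t len)
{
    struct backing_header h;

    if (len < sizeof(h))
        return HEADER_TOO_SHORT;
    memcpy(&h, buf, sizeof(h));   /* copy out to avoid alignment assumptions */
    if (h.magic != BACKING_MAGIC)
        return HEADER_BAD_MAGIC;
    if (h.version != BACKING_VERSION)
        return HEADER_BAD_VERSION;
    return HEADER_OK;
}
```

A migration tool could use the HEADER_BAD_VERSION case to decide that a file is a genuine backing file in an older layout, rather than arbitrary data.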

In a day or two, I'll send a post to freebsd-fs describing possible
upsides and downsides to moving to a more integrated attribute mechanism
instead of (and/or in addition to) backing files.

  Robert N M Watson 

[EMAIL PROTECTED]  http://www.watson.org/~robert/
PGP key fingerprint: AF B5 5F FF A6 4A 79 37  ED 5F 55 E9 58 04 6A B1
TIS Labs at Network Associates, Safeport Network Services






Re: What happens with SECURELVL? (init complains)

2000-06-07 Thread Robert Watson


At bde's request, I moved kern.suser_permitted to kern_prot.c and
accidentally also trimmed kern.securelevel.  I just committed it back into
kern_mib.c.  Please let me know if there are further problems.

That said, I'm a little puzzled as to where securelevel is being defined
-- a bunch of stuff depends on the variable and yet my test build
succeeded without it in there.  And you got that far also -- far enough to
boot rather than have the linking fail.

Robert

On Tue, 6 Jun 2000, Andrey A. Chernov wrote:

 Now init always complains:
 
 init: cannot get kernel security level: No such file or directory
 
 It is because KERN_SECURELVL define still present in /sys/sysctl.h but
 gone from kern_mib.c
 Moreover, even define is gone from kern_mib.c, sysctl_kern_securelvl()
 function is still there!
 
 Please clean up the mess.
 
 -- 
 Andrey A. Chernov
 [EMAIL PROTECTED]
 http://ache.pp.ru/
 
 
 


  Robert N M Watson 

[EMAIL PROTECTED]  http://www.watson.org/~robert/
PGP key fingerprint: AF B5 5F FF A6 4A 79 37  ED 5F 55 E9 58 04 6A B1
TIS Labs at Network Associates, Safeport Network Services






Re: What happens with SECURELVL? (init complains)

2000-06-07 Thread Robert Watson

On Wed, 7 Jun 2000, Robert Watson wrote:

 That said, I'm a little puzzled as to where securelevel is being defined
 -- a bunch of stuff depends on the variable and yet my test build
 succeeded without it in there.  And you got that far also -- far enough to
 boot rather than have the linking fail.

Never mind -- I just trimmed the SYSCTL, not the variable declaration.

Sorry for the inconvenience to all.

  Robert N M Watson 

[EMAIL PROTECTED]  http://www.watson.org/~robert/
PGP key fingerprint: AF B5 5F FF A6 4A 79 37  ED 5F 55 E9 58 04 6A B1
TIS Labs at Network Associates, Safeport Network Services






Re: KAME integration and plans

2000-07-05 Thread Robert Watson


This is great news -- one of the big hangups in our interop testing at NAI
Labs was the lack of IKE on FreeBSD.  I notice that right now racoon is a
port -- assuming this interpretation is correct, are there any plans to
integrate racoon as a base system component?  As you point out, without
IKE, FreeBSD's IPsec implementation is effectively useless for
cross-platform communication due to the number of frobs in SA
configuration.  I also look forward to the rapid MFC'ing, assuming that
the code works :-).

  Robert N M Watson 

[EMAIL PROTECTED]  http://www.watson.org/~robert/
PGP key fingerprint: AF B5 5F FF A6 4A 79 37  ED 5F 55 E9 58 04 6A B1
TIS Labs at Network Associates, Safeport Network Services






Re: /sys hierarchy

2000-07-05 Thread Robert Watson

On Wed, 5 Jul 2000, John Baldwin wrote:

 The headers will always be installed in the right place in
 /usr/include: Makefile's are editable.  As far as kernel
 compiles, symlinks can be created in the work directory as
 one possible solution.  For example,
 sys/compile/i386/GENERIC/netinet -> ../../../../net/inet.
 This would most likely result in netinet _not_ being split
 up.

As much as I'd love a complete cleanup of sys/, this cure seems to be
worse than the problem. :-)  Take this as another vote to leave net/ as
is, if only to keep the includes in kernel code in sync with includes in
userland code :-).

  Robert N M Watson 

[EMAIL PROTECTED]  http://www.watson.org/~robert/
PGP key fingerprint: AF B5 5F FF A6 4A 79 37  ED 5F 55 E9 58 04 6A B1
TIS Labs at Network Associates, Safeport Network Services






Re: 5.1-R acl problem (again)

2003-08-14 Thread Robert Watson

On Sun, 10 Aug 2003, Branko F. Gracnar wrote:

 Thanks for quick and very informative answer. 
 
 You're right about getfacl -d (i used linux + acl patch before, where
 default acls are displayed without any arguments and i didn't read
 getfacl man page). 

Yeah -- the Linux tool implementation is based more on Solaris than on
POSIX.1e.  That has some upsides, and some downsides.  I believe there's
an environment variable you can set on Linux to cause the
getfacl/setfacl to behave in strict accordance with the spec
(POSIXLY_CORRECT or the like). 

 Thanks alot again.

Sure.

 But there is one thing, i don't understand. 
 
 if i issue the following command:
 
 setfacl -dm u::rwx,g::rx,o::---,u:branko:rwx,m::rwx  directory
 
 and then create file under that directory, why getfacl reports:
 
 #file:a/c
 #owner:0
 #group:0
 user::rw-
 user:branko:rwx # effective: r--
 group::r-x  # effective: r--
 mask::r--
 other::r--
 
 why is mask just 'r' ?!

One of the contentious issues in the design of POSIX.1e was how to set the
protections on a new object.  There are three variables of interest: the
creation mode requested by a process, the umask of that process, and the
default ACL on the parent directory where the object is being created.  In
5.0-R and 5.1-R, we combine them as follows: we mask all elements of the
creation mode using the umask; we then combine the ACL and combined mode
by converting the default ACL to the access ACL on the new object and
overwriting the access ACL fields with the equivalent fields in the mode.
So in the above example, a mask of r-- is likely a result of the creation
mode and umask having a group mode of 4.

In 5.1-CURRENT, we recently switched these semantics to perform a further
intersection of rights in the ACL, rather than a replacement of rights. 
The result is that if the mask in your default ACL is --- and the
combination of creation mode and umask is r--, you get a mask of --- in
the final access ACL.  This implements the algorithm in the POSIX.1e spec
to the letter: at some point, these semantics got changed during a
retrofit of the ACL code, and it wasn't picked up (this might actually
have been after 5.0 but I haven't checked the logs).
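
To make the difference concrete, here is a small sketch of how the mask entry on a new object comes out under the two behaviours described above: replacement (5.0-R and 5.1-R) and intersection (5.1-CURRENT). It is a model for illustration only, not the UFS code. The function names are hypothetical, only the mask entry is modelled, and the creation mode of 0666 with a umask of 022 used in the comments is an assumption about the example, not something stated in the report.

```c
#include <assert.h>

/* Permission bits for one ACL entry: r = 4, w = 2, x = 1. */
typedef unsigned int perm_t;

/* Group-class bits of the requested mode after the umask is applied. */
static perm_t
group_bits(unsigned int cmode, unsigned int umask_bits)
{
    return ((cmode & ~umask_bits) >> 3) & 07;
}

/*
 * Replacement semantics: the group field of the masked mode
 * overwrites the mask entry taken from the default ACL.
 * Example: default mask rwx, cmode 0666, umask 022 gives r--.
 */
perm_t
mask_replace(perm_t default_mask, unsigned int cmode, unsigned int umask_bits)
{
    assert(default_mask <= 07);
    (void)default_mask;           /* deliberately ignored in this variant */
    return group_bits(cmode, umask_bits);
}

/*
 * Intersection semantics: the mask entry keeps only the rights
 * present in both the default ACL mask and the masked mode.
 * Example: default mask ---, masked mode r-- gives ---.
 */
perm_t
mask_intersect(perm_t default_mask, unsigned int cmode, unsigned int umask_bits)
{
    assert(default_mask <= 07);
    return default_mask & group_bits(cmode, umask_bits);
}
```

With a default mask of rwx the two variants agree; they diverge only when the default ACL mask is narrower than the masked creation mode.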

I'm currently in the throes of implementing a mode of operation which uses
the Solaris/Linux algorithm, which works in the following manner: if a
default ACL is being used to create a new object, the default ACL replaces
the umask, rather than combining with it.  This allows directory default
ACLs to override the umask locally, producing more liberal rights, which
may be what you're expecting.  This is a violation of the spec, but it's a
common violation due to its utility (POSIX.1e doesn't allow creating
more liberal protections because it was deemed unsafe).  I hope to finish
prototyping this and get a patch out to the current@ list in the next
couple of weeks.  The complication is that currently, the umask and
requested creation mode are combined at the system call layer, above VFS,
so we need to expose them separately on the entry to the file system.  The
result is that all file systems would now have to combine the two
elements, and it touches a lot of code. 
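
A rough sketch of the Solaris/Linux-style rule, where a default ACL takes the place of the umask, might look like the following. This is a simplified model under assumptions: the function name and parameters are hypothetical, only the group-class (mask) bits are computed, and a real file system would apply the same idea to the owner and other fields as well.

```c
/*
 * Compute the group-class (mask) bits for a new object.
 * If the parent directory has a default ACL, its mask limits the
 * requested mode and the umask is not consulted; otherwise the
 * classic umask rule applies.
 */
unsigned int
new_mask_override(int has_default_acl, unsigned int default_mask,
    unsigned int cmode, unsigned int umask_bits)
{
    unsigned int requested = (cmode >> 3) & 07;

    if (has_default_acl)
        return requested & default_mask;          /* umask ignored */
    return requested & ~(umask_bits >> 3) & 07;   /* classic umask path */
}
```

Note that the file system needs the requested mode and the umask as separate inputs here, which is why combining them above VFS gets in the way.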

Hope this information is useful, and gives you a good picture of where
we're going. 

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories

___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: Change in application of default ACLs in UFS

2003-08-14 Thread Robert Watson

On Wed, 6 Aug 2003, Daniel C. Sobral wrote:

Note: this change contains a semantic bugfix for new file creation:
we now intersect the ACL-generated mode and the cmode requested by
the user process.  This means permissions on newly created file
objects will now be more conservative.  In the future, we may want
to provide alternative semantics (similar to Solaris and Linux) in
which the ACL mask overrides the umask, permitting ACLs to broaden
the rights beyond the requested umask.
 
 FWIW, I don't like it. This means I'll have to change my umask to o+rw
 for my ACLs to work correctly, since I use ACLs to _give_ rights in ways
 that umask cannot. 

I'm in the throes of implementing changes that push umask processing down
into individual file systems, permitting UFS ACLs to override the umask
using the ACL mask, which would reproduce the Solaris/Linux model
(non-POSIX.1e).  However, there are some interesting implementation
questions there, so it will probably be a bit (perhaps a couple of weeks)
before I have a useful prototype worth reviewing.  I agree that those
semantics are useful, however :-).

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories



5.2-RELEASE TODO

2003-08-15 Thread Robert Watson
 |                     |          |                 | operating system    |
 |                     |          |                 | has fairly          |
 |                     |          |                 | extensive           |
 | Merge of Darwin     |          |                 | improvements to     |
 | msdosfs, other      | --       | --              | msdosfs and other   |
 | fixes               |          |                 | kernel services;    |
 |                     |          |                 | these fixes must be |
 |                     |          |                 | reviewed and merged |
 |                     |          |                 | to the FreeBSD      |
 |                     |          |                 | tree.               |
 |---------------------+----------+-----------------+---------------------|
 |                     |          |                 | Port syscons to     |
 |                     |          |                 | sparc64. Add device |
 |                     |          |                 | drivers for sun     |
 |                     |          |                 | mice and keyboards. |
 |                     |          |                 | Allow for more than |
 | sparc64 adaptation  | In       |                 | 3 bits of           |
 | of syscons          | progress | Jake Burkholder | background colour   |
 |                     |          |                 | in syscons. Creator |
 |                     |          |                 | frame buffer device |
 |                     |          |                 | driver. In the      |
 |                     |          |                 | process, generally  |
 |                     |          |                 | improve the MI-ness |
 |                     |          |                 | of syscons.         |
 |---------------------+----------+-----------------+---------------------|
 |                     |          |                 | Many systems        |
 |                     |          |                 | supporting POSIX.1e |
 |                     |          |                 | ACLs permit a minor |
 |                     |          |                 | violation to that   |
 |                     |          |                 | specification, in   |
 |                     |          |                 | which the ACL_MASK  |
 |                     |          |                 | entry overrides the |
 | ACL_MASK override   | In       |                 | umask, rather than  |
 | of umask support in | progress | Robert Watson   | being intersected   |
 | UFS                 |          |                 | with it. The        |
 |                     |          |                 | resulting semantics |
 |                     |          |                 | can be useful in    |
 |                     |          |                 | group-oriented      |
 |                     |          |                 | environments, and   |
 |                     |          |                 | as such would be    |
 |                     |          |                 | very helpful on     |
 |                     |          |                 | FreeBSD.            |
 |---------------------+----------+-----------------+---------------------|
 |                     |          |                 | Significant parts   |
 |                     |          |                 | of the network      |
 |                     |          |                 | stack (especially   |
 |                     |          |                 | IPv4 and IPv6) now  |
 |                     |          |                 | have fine-grained   |
 |                     |          |                 | locking of their    |
 |                     |          |                 | data structures.    |
 |                     |          |                 | However, it is not  |
 |                     |          |                 | yet possible for    |
 |                     |          |                 | the netisr threads  |
 |                     |          |                 | to run without      |
 |                     |          |                 | Giant, due to       |
 | Fine-grained        |          |                 | dependencies on     |
 | network stack       | In       | Jeffrey Hsu,    | sockets, routing,   |
 | locking without     | progress | Seigo Tanimura  | etc. A 5.2-RELEASE  |
 | Giant               |          |                 | goal is to have the |
 |                     |          |                 | network stack       |
 |                     |          |                 | running largely     |
 |                     |          |                 | without Giant,      |
 |                     |          |                 | which should        |
 |                     |          |                 | substantially       |
 |                     |          |                 | improve performance |
 |                     |          |                 | of the stack        |

Re: LOR with filedesc structure and Giant

2003-08-15 Thread Robert Watson

On Fri, 15 Aug 2003, Kris Kennaway wrote:

 The problem seems to be due to select() being called on the /dev/null
 device, and it is holding the filedesc lock when it reaches
 PICKUP_GIANT() in spec_poll.

Yeah, this is pretty much the same issue you've been bumping into for a
bit -- we hold filedesc lock over select(), which means every object we
poll can't grab a lock that either comes before the file descriptor lock in
the lock order, or that might sleep.
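
As a rough illustration of what a lock order reversal report means, here is a toy checker that records which lock was held when another was acquired and flags the opposite order if it shows up later. It is a sketch under assumptions, far simpler than a real checker: lock identities are small integers, only direct pairs are tracked (no transitive ordering), and all names are hypothetical.

```c
#include <stdbool.h>

#define MAX_LOCKS 8

/* seen_before[a][b] is true once lock a has been held while acquiring b. */
static bool seen_before[MAX_LOCKS][MAX_LOCKS];

/*
 * Record that lock `next` is being acquired while `held` is held.
 * Returns true if the opposite order was recorded earlier, i.e. a
 * reversal that could deadlock given unlucky timing.
 */
bool
lock_order_check(int held, int next)
{
    if (held < 0 || held >= MAX_LOCKS || next < 0 || next >= MAX_LOCKS)
        return false;
    if (seen_before[next][held])
        return true;
    seen_before[held][next] = true;
    return false;
}
```

In the case above, the established order has Giant before the filedesc lock, so picking up Giant inside a poll routine while the filedesc lock is held is the reversed pair.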

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories




Re: LOR with filedesc structure and Giant

2003-08-16 Thread Robert Watson

On Sat, 16 Aug 2003, Poul-Henning Kamp wrote:

  The problem seems to be due to select() being called on the /dev/null
  device, and it is holding the filedesc lock when it reaches
  PICKUP_GIANT() in spec_poll.
 
 Yeah, this is pretty much the same issue you've been bumping into for a
 bit -- we hold filedesc lock over select(), which means every object we
 poll can't grab a lock that either comes before the file descriptor lock in
 the lock order, or that might sleep.
 
 Doesn't this effectively doom any attempt at getting rid of Giant from
 below ? 

I have mixed feelings about our current strategy.  On the one hand, it's a
very simple strategy to understand and implement -- it's also a reasonable
argument that poll operations for status might return quickly -- i.e.,
be safe while holding a mutex to prevent the filedesc array from changing.
On the other hand, the lock order and sleep implications are pretty
alarming, and have already caused a substantial number of problems.  It
would be interesting to know what consistency guarantees are provided for
the user app on other platforms with fine-grained kernel locking.

Approaches that come to mind include making a copy of the filedesc array
to prevent it from changing (sounds expensive for a select() call) to
avoid holding the mutex for long; we could move to an sx lock which would
fix the sleep issue at a slightly increased locking cost (but not solve
the lock order problem); if we push Giant past the file descriptor code in
one big throw that would resolve the lock order issue (but not the sleep
problem).  In a recent pass, I identified some of the locks with order
relationships with the file descriptor lock, and many of those will be
non-trivial to resolve.  For example, we grab file descriptor locks during
execve() to clean up the file descriptor array, and kevent interacts with
file descriptor locks.  Pushing Giant further off execve() might have
a fair number of interactions with VFS and VM we'd want to watch out for
(on the other hand, we are probably close..)  Most of the changes to push
Giant behind the filedesc lock are not too bad, though.  I think it would
be worth a concerted effort by an interested and competent party to push
Giant behind the file descriptor lock.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories



Re: HEADS UP: dynamic root support now in the tree

2003-08-17 Thread Robert Watson

On Sun, 17 Aug 2003, Shin-ichi Yoshimoto wrote:

 make installworld broken.
 
 ==libexex/rtld-elf
 [snip]
 ln: /usr/libexec/ld-elf.so.1: Operation not permitted
 *** Error code 1
 
 any idea ? 

I'm guessing we need to remove the schg flag from the old ld-elf.so.1
before trying to replace it with a symlink:

paprika:/usr/src/kerberos5/usr.bin/kadmin ls -lo /usr/libexec/ld-elf.so.1
-r-xr-xr-x  1 root  wheel  schg 133292 Jul  3 21:07 /usr/libexec/ld-elf.so.1*

You can work around it locally, probably, by doing a chflags noschg
/usr/libexec/ld-elf.so.1, but the official makefiles probably need to do
something about it also.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories




Re: Broken kernel compile on 5.1-RELEASE / 5-CURRENT (SMP, PAE scsi)

2003-08-19 Thread Robert Watson

On Tue, 19 Aug 2003, Mark Sergeant wrote:

 There are no other errors apart from those listed so I may try compiling
 as a module that gets loaded on boot. Just one problem: I successfully
 built an SMP kernel without PAE and then rebooted and the server is no
 longer responding, it seems it crashed just after coming up as I was
 able to ping it 5 or 6 times and then it went away again. If I've got a
 crash dump I'll post it. 

Can't speak to the specifics of this, but you want to be very careful not
to use kernel modules with PAE: modules are currently built without the
context of the kernel configuration file, and PAE introduces possible binary
incompatibility with modules that dig into VM (which many drivers do). The
only supported configuration is to not use modules, but link the driver
directly into the kernel when running PAE.  This is why the PAE kernel has
no_modules defined in the sample PAE configuration file.  Various
conversations have happened regarding how to address this problem, and I'm
not sure we've come up with the right answer yet.  There seem to be two
conflicting directions: build modules in the context of a kernel configuration
(and get conditionally compiled type/structure/code/... pieces), and try
to make the module build entirely independent of a kernel configuration. 
As someone who uses conditionally compiled components in modules, I tend
to fall into the first camp, and no doubt we'll figure out the right
answer in due course.
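
To illustrate the kind of binary incompatibility meant here, consider a structure whose physical address field widens from 32 to 64 bits when an option like PAE is compiled in. The sketch below is purely illustrative: the structure and field names are hypothetical and do not correspond to real kernel types. It shows that a module built against one layout would read the wrong offsets in the other.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical structure as a build without the option would see it. */
struct seg_nopae {
    uint32_t phys_addr;   /* 32-bit physical address */
    uint32_t length;
};

/* The same structure as a build with the option would see it. */
struct seg_pae {
    uint64_t phys_addr;   /* physical address widened to 64 bits */
    uint32_t length;
};

/* Nonzero when code built against one layout would misread the other. */
int
layouts_differ(void)
{
    return sizeof(struct seg_nopae) != sizeof(struct seg_pae) ||
        offsetof(struct seg_nopae, length) != offsetof(struct seg_pae, length);
}
```

A module compiled without knowledge of the kernel configuration would pick one of these layouts at build time, whichever the running kernel actually uses.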

The above crash sounds unfortunate too, but quite possibly a separate
failure mode :-).

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories




Re: mksnap_ffs, snapshot issues, again

2003-08-19 Thread Robert Watson

On Tue, 19 Aug 2003, Branko F. Gracnar wrote:

 The behaviour of filesystem activity stalling during snapshot creation
 is intentional, but 30 minutes to snapshot an empty FS is not.  Is
 there disk activity during this time?  It's not clear from your mail
 whether bg fsck is in operation during this time.  If so, that's
 probably the cause, since bg fsck itself uses a snapshot to check the
 FS consistency.
 
 Background fsck was NOT running. I formatted fs and then tried to make
 snapshot. 

When reporting bgfsck/snapshot/... problems, you may want to CC Kirk
McKusick [EMAIL PROTECTED] -- I don't believe he closely tracks
current@, and he's the best person to track down and fix problems in this
area.  I forwarded your earlier message to him, but haven't heard back as
yet.  Just FYI.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories




Re: 5.1-R: zero byte core file.

2003-08-20 Thread Robert Watson

On Wed, 20 Aug 2003, Yogeshwar Shenoy wrote:

 While using 5.1-RELEASE, I find that if my application program seg
 faults, it produces programname.core; but it is 0 bytes.  I ran the
 exact same program on another machine that was running 4.4-RELEASE, and
 I do get a core file that I can use with gdb.  I'd really appreciate if
 someone could help me resolve this. 
 
 Additional details:  - It is not specific to the application program. I
 tried a 2 line program: 
 char p[8]; 
 memcpy(p, "1234567890123456789012345678901234567890", 40); 
  with same results on 5.1-R(0 byte core file) and 4.4-R(usable core
 file) 
 
 - ulimit -a on the 5.1-R machine gives
core file size (blocks, -c) unlimited
 
 - Just to be sure I used getrlimit() to find what the limit for
 RLIMIT_CORE is in my processes, and it is RLIM_INFINITY. 
 
 - I did the basic checks like write permission on current directory, it
 looks fine. 
 
 Can someone help me resolve this? 

With 5.1-CURRENT from July, I get:

paprika:~/tmp/core ./tmp 
Segmentation fault (core dumped)
paprika:~/tmp/core ls -l
total 348
drwxr-xr-x  2 rwatson  rwatson     512 Aug 20 17:23 ./
drwxr-xr-x  9 rwatson  rwatson     512 Aug 20 17:23 ../
-rwxr-xr-x  1 rwatson  rwatson    4677 Aug 20 17:23 tmp*
-rw-r--r--  1 rwatson  rwatson     131 Aug 20 17:23 tmp.c
-rw-------  1 rwatson  rwatson  323584 Aug 20 17:23 tmp.core

The corefile isn't very useful, since the stack is completely hosed by the
operations, but I do get a core that I can point gdb at.  Some elements of
core file generation are platform-specific: what architecture are you
running on?  And, just to confirm, df . indicates you have space, right? 

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories




Re: sysinstall spec_getpages panic (with VM overtones)

2003-08-20 Thread Robert Watson

On Wed, 20 Aug 2003, Gavin Atkinson wrote:

 On the 8th August [EMAIL PROTECTED] mentioned he was getting a panic
 with FreeBSD inside VMware where _mtx_lock is being called with a NULL
 mutex from spec_getpages. I'm also seeing this, 100% reproducible, on
 real hardware. (see message ID [EMAIL PROTECTED] for
 the original posters email and jhb's reply) For me, Sysinstall panics
 during the extraction of the base package: 
 
 (note that I do not get to see a register dump)  kernel: type 12 trap,
 code=0
 
 _mtx_lock_flags(0,0,c0529513,300,) at _mtx_lock_flags+0x43
 spec_getpages(cce33b3c,54,0,cce33b2c,0) at spec_getpages+0x26c
 ffs_getpages(cce33b80,0,c05459de,274,c05c63e0) at ffs_getpages+0x5f6
 vnode_pager_getpages(c0bebafc,cce33c70,1,0,cce33c20) at vnode_pager_getpages+0x73
 vm_fault(c1259900,819b000,1,0,c12534c0) at vm_fault+0x8e2
 trap_pfault(cce33d48,1,819b004,200,819b004) at trap_pfault+0x109
 trap(2f,2f,2f,82e533c,0) at trap+0x1fc
 calltrap() at calltrap+0x5

I've been getting similar reports locally from our trustedbsd_sebsd
branch.  We thought originally it was a local merge problem we introduced
due to some inconsistent merging of specfs changes, but I think we have
now eliminated that.  I suppose I'm relieved... (?)

 I first noticed this with the 20030811 JPSNAP, but have tried with the
 9th July 2003 JPSNAP, and yesterdays snapshot, and see the same result
 on both. I see the same panic whether installing over the net or from
 CD.  With 64 meg of ram, it panics half way through reading the chunks
 that make up the base package, upping the ram to 256 allows it to read
 all of the chunks before panicking. 

Sounds identical.

 *c0529513 = /usr/src/sys/fs/specfs/spec_vnops.c, line 0x300 is line 768:
 
 766 gotreqpage = 0;
 767 VM_OBJECT_LOCK(vp->v_object);
 768 vm_page_lock_queues();
 769 for (i = 0, toff = 0; i < pcount; i++, toff = nextoff) {
 
 so ap->a_vp is null. I'm afraid that's the limit of my ddb ability. 
 
 Any suggestions as to where I should go from here? I don't really have
 the facility at the moment to make release to test patches but will try
 to if necessary. 

Is it ap->a_vp that's NULL, or vp->v_object that's NULL?  vp is
dereferenced several times before that in the code, so if vp is really
NULL at line 767, we're probably talking about memory corruption.  But if
vp->v_object is NULL, then it could be we're not creating a VM object
along some code path.
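
The two cases can be told apart with a check along the following lines. This is a userland sketch with stand-in types, not kernel code: the stub structures and the function name are hypothetical, and in the kernel the same distinction would more likely be made from ddb or with assertions placed before the lock call.

```c
#include <stddef.h>

/* Stand-ins for the kernel structures, for illustration only. */
struct vm_object_stub { int placeholder; };
struct vnode_stub { struct vm_object_stub *v_object; };

enum null_cause { CAUSE_NONE, CAUSE_VNODE_NULL, CAUSE_OBJECT_NULL };

/* Report which pointer is NULL, since the two point at different bugs. */
enum null_cause
classify_null(const struct vnode_stub *vp)
{
    if (vp == NULL)
        return CAUSE_VNODE_NULL;    /* suggests memory corruption */
    if (vp->v_object == NULL)
        return CAUSE_OBJECT_NULL;   /* suggests a path that never created the object */
    return CAUSE_NONE;
}
```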

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories



Re: malloc message with nfs transfer

2003-08-21 Thread Robert Watson

On Thu, 21 Aug 2003, cosmin wrote:

 malloc() of 64 with the following non-sleepable locks held:  exclusive
 sleep mutex inpr = 0 (0xcef0) locked @
 /usr/src/sys/netinet/udp_usrreq.c:378 exclusive sleep mutex netisr lock
 r = 0 (0xc061be80) locked @ /usr/src/sys/net/netisr.c:215
 
 I'm getting those on the console, and it seems that they only happen
 when users start an nfs transfer to the nfs exported filesystem.  The
 exported filesystem is a vinum raid5 array but I don't know if that has
 anything to do with the messages.

Sorry, just to be clear -- is the message you're getting on the NFS
client, or the NFS server?  Could you turn on debug.witness_ddb and get a
stack trace for the warning?

 Before I upgraded from 4.8, I used to be able to send at about 8mb/s to
 the nfs exported raid5.  After upgrading to 5.1-CURRENT, the maximum
 speed has been only 4mb/s.  I'm wondering if the messages above have
 anything to do with the performance drop. 

You appear to have the kernel debugging features turned to high (which
will be useful for resolving this problem :-).  Turn off WITNESS and
INVARIANTS and you should see a substantial performance improvement.  It
may not be back up to 4.x levels -- we hope that with ongoing network
stack locking work we'll be back to 4.x levels (and exceed them) in the next few
months.

Thanks,

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories



Re: malloc message with nfs transfer

2003-08-21 Thread Robert Watson

On Thu, 21 Aug 2003, cosmin wrote:

  Sorry, just to be clear -- is the message you're getting on the NFS
  client, or the NFS server?  Could you turn on debug.witness_ddb and get a
  stack trace for the warning?
 
 This is on the NFS server.  I turned on debug.witness_ddb, but I'm not
 sure if this will help, because the system isn't locking up, or
 otherwise stopping.  I have tried setting a breakpoint in ddb for
 0xcef0, but it starts breaking right away.  The malloc() messages
 are many minutes apart.

With witness_ddb turned on, the kernel should drop into ddb whenever
there's a witness-related warning, which should include the warnings you
mentioned in your previous e-mail.

 I'm not sure if these messages indicate anything critical.  I was mainly
 concerned with the nfs performance.

Generally speaking, WITNESS warns about potential problems, as opposed to
actual problems: i.e., it warns when a deadlock would have occurred, if
the timing had been just right.  This was a warning that a potentially
blocking activity was performed while holding a mutex, which is generally
a bad idea.  A little bit more detail on the stack trace should be
sufficient to track it down.  Turning off WITNESS should dramatically
improve performance at the risk of lowering debugging output (maybe that's
not a risk :-).
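
For a feel of how such a warning can be produced, here is a toy version of the check: count the non-sleepable locks currently held, and complain whenever an operation that may block is attempted while the count is nonzero. It is a sketch under assumptions, not how the kernel implements it: the function names are hypothetical, there is a single global counter rather than per-thread state, and the check only reports rather than stopping anything.

```c
#include <stdbool.h>

/* Number of non-sleepable locks currently held (one global, for simplicity). */
static int nonsleepable_held;

void
mutex_acquired(void)
{
    nonsleepable_held++;
}

void
mutex_released(void)
{
    if (nonsleepable_held > 0)
        nonsleepable_held--;
}

/*
 * Call before an operation that may block, such as a waiting allocation.
 * Returns true if a warning is warranted: the operation has not
 * necessarily gone wrong, but it could with the right timing.
 */
bool
may_sleep_check(void)
{
    return nonsleepable_held > 0;
}
```

Bookkeeping like this on every lock operation is part of why such checking costs performance when enabled.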

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories


