from:"Robert Watson"

Re: split out patch

2003-02-01 Thread Robert Watson


On Sat, 1 Feb 2003, Mike Barcroft wrote:

> Brad Knowles <[EMAIL PROTECTED]> writes:
> > At 6:27 PM -0800 2003/02/01, Matthew Dillon wrote:
> > 
> > >  Well, it is an active conversation/thread.  Either people care enough
> > >  to stay involved or they don't.
> > 
> > But don't people have to sleep sometime?  Shouldn't we allow for that?
> 
> Real hackers don't sleep. :) 

Be that as it may, even real hackers can't poll every mailing list every
five minutes :-).  Otherwise, real hackers would have no time for coding,
and that might present a problem.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories



To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-current" in the body of the message

Re: Question about devd concept

2003-02-02 Thread Robert Watson

On Sun, 2 Feb 2003, Oliver Brandmueller wrote:

> Sorry for answering my own mail now.
> 
> On Sun, Feb 02, 2003 at 03:19:24PM +0100, Oliver Brandmueller wrote:
> > action "/usr/local/etc/netconf/bin/netconf $device-name start";
> 
> David Wolfskill gave a good pointer, that devd starts before all
> non-root disks are mounted. I'm just trying to move my script to the
> root partition (for the moment I move it to /etc/netconf/ just to test
> it). Although the script works when manually started after booting, I
> have still no evidence it is even called on boot, but for the moment I
> suspect my script maybe has some problems in that early stage and I will
> try to give it some debugging stuff. 
> 
> I'll keep you informed 'bout my findings. 

I ran into a similar problem, actually -- programs like dhclient rely on
being able to write to lease and pid files.  It's almost as though we'd
like an additional set of events when the system is "more booted".  I.e.,
a devd event for each device when the network is started, etc.

Actually, I suspect what we want is to have a separate network event
management daemon -- arrival of the device is not the same as arrival of
the interface.  dhclient events (and related things) should happen as a
result of the interface arriving, not the newbus device arrival.  I.e., a
netd that listens to routing socket and kqueue events relating to the
network stack. 

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories


Re: How to freeze up your FreeBSD 5.0 box.

2003-02-03 Thread Robert Watson

Jaye,

Thanks for the report.  Haven't seen it here, but I also don't have a
configuration that looks like that one.  In order to usefully debug this,
we'll need some more information -- off the top of my head, it sounds a
lot like a file system pseudo-deadlock of some sort.  Perhaps a race to
the root due to locking during a long operation.  If you could compile
your kernel with "options DDB", assuming it's not already, and set up a
serial console, that would be helpful.  When you reproduce the apparent
hang, break into ddb on the console, and send the output of the following
two commands:

show lockedvnods
ps

You don't strictly have to use a serial console, but it sure is a lot
easier to copy/paste the output than to copy by hand.  You can find
detailed instructions for setting this up in the Handbook and Developer's
Handbook.

A few things to tickle the thinking process: is the hang immediate, or
does it wait until something else lists one of those directories?  Does
the hang occur if you perform the same behavior locally, or only over NFS?
Do the disks churn during the deletion, or does it seem to be a
CPU-intensive activity?  Suppose you stick all these files a few levels
down in the directory hierarchy -- does it help?

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories

On Sun, 2 Feb 2003, Jaye Mathisen wrote:

> 
> 
> This is repeatable for me at will.
> 
> 5.0-current, supped as of 2/1.
> 
> 4 80GIG maxtors on 2 promise IDE ultra 66 cards, exported via NFS.
> 
> 
> newfs'd an 80G FS on each drive, created one big file filled with
> zeros from /dev/zero on each drive. (Softupdates/UFS1)
> 
> login to my other box, which has the 4 drives mounted via NFS
> (the other box being 4.7-stable as of 1/28/03).  options
> from fstab are bg,intr,rw, don't recall if it was v2 or v3 NFS,
> although mountd was started with defaults, so I'm assuming v3.
> 
> do (on the 4.7-stable box):
> 
> rm /mntpt1/bigfile & 
> rm /mntpt2/bigfile &
> rm /mntpt3/bigfile &
> rm /mntpt4/bigfile &
> 
> 
> And then switch back to the 5.0 current box, only to find out
> that it will not respond to any network traffic via ssh.  Will
> respond to pings.  
> 
> I can type my login/pw on the console, but hitting return
> after typing my pw just sits there, until I ran out of VTY's.
> 
> df on the 4.7 box hangs.
> 
> After about 30 minutes, when the files finally finished being deleted,
> control of 5.0 box was returned, and everything was back and functioning
> properly.

Re: What is the difference between p_ucred and td_ucred?

2003-02-03 Thread Robert Watson

On Mon, 3 Feb 2003, Ilmar S. Habibulin wrote:

> Why not use only the credentials for the proc and make td_ucred a macro
> like td_proc->p_ucred? Or does it have some meaning that I do not understand?

td_ucred is a cached copy of p_ucred.  The cached copy is potentially
updated on any entry to the kernel.  The reason for doing this is
multi-fold:

(1) Because of threading, access to p_ucred requires holding the process
lock.  Because we don't want to hold the process lock for every access
control check, using a reference accessed only by its own thread avoids the lock.

(2) Credential consistency.  We perform the update check when we enter the
kernel; if the comparison indicates they differ, we grab the process
lock and update td_ucred.  If they don't differ, we can continue.
My understanding is that this is because a pointer comparison to
determine if they are identical is permitted without a lock.  It might
be desirable to reason about the safety of this.  This guarantees a
single consistent "process credential" for each system call; that way
there aren't races between separate access control checks resulting in
inconsistent enforcement. 

So the end semantic is that there is, in effect, a single process
credential.  However, there may be divergence from that credential when
threads are blocked in kernel and another thread changes the process
credential, as the threads blocked in kernel won't pick up the new
credential until they exit and re-enter the kernel.  This approach, as I
understand it, is taken by several other MP/threaded UNIXes.

The strategy for selecting a credential to check against is generally to
use td_ucred, and to hold no locks.  You'll see that suser() does this,
for example.  Under some circumstances, specifically credential updates,
you need to hold the process lock and atomically check the process
credential before updating.  If the thread doesn't immediately leave the
kernel (i.e., more checks might be performed), you'll also need to
propagate the cred change to the thread from the process.
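
To make the caching scheme above concrete, here is a minimal user-space
sketch in C.  The struct layouts, the toy_mtx stand-in for the process lock,
and the names sketch_cred_refresh() and sketch_is_superuser() are all
assumptions invented for this illustration; they are not the kernel's
definitions, and the model ignores the reference counting and memory
ordering questions that real code has to address.

```c
#include <stddef.h>

/* Toy stand-ins for illustration only; NOT the kernel's definitions. */
struct toy_mtx { int held; int acquisitions; };
struct ucred   { unsigned cr_uid; };
struct proc    { struct toy_mtx p_mtx; struct ucred *p_ucred; };
struct thread  { struct proc *td_proc; struct ucred *td_ucred; };

static void toy_lock(struct toy_mtx *m)   { m->held = 1; m->acquisitions++; }
static void toy_unlock(struct toy_mtx *m) { m->held = 0; }

/* Hypothetical hook run on each entry to the kernel. */
static void
sketch_cred_refresh(struct thread *td)
{
    struct proc *p = td->td_proc;

    /* Fast path: unlocked pointer comparison, no process lock taken. */
    if (td->td_ucred == p->p_ucred)
        return;

    /* Slow path: the process credential changed; re-cache it under the lock. */
    toy_lock(&p->p_mtx);
    td->td_ucred = p->p_ucred;
    toy_unlock(&p->p_mtx);
}

/* Access control check: reads only the thread's cached copy, holds no locks. */
static int
sketch_is_superuser(const struct thread *td)
{
    return (td->td_ucred->cr_uid == 0);
}
```

In this model, a thread that is already "in the kernel" keeps answering
checks from its cached credential even after p_ucred has been replaced, and
only picks up the new one the next time sketch_cred_refresh() runs.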

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories


Re: What is the difference between p_ucred and td_ucred?

2003-02-04 Thread Robert Watson


On Tue, 4 Feb 2003, Ilmar S. Habibulin wrote:

> On Mon, 3 Feb 2003, Robert Watson wrote:
> 
> > The strategy for selecting a credential to check against is generally to
> > use td_ucred, and to hold no locks.  You'll see that suser() does this,
> > for example.  Under some circumstances, specifically credential updates,
> > you need to hold the process lock and atomically check the process
> > credential before updating.  If the thread doesn't immediately leave the
> > kernel (i.e., more checks might be performed), you'll also need to
> > propagate the cred change to the thread from the process.
> 
> Ok. Thank you for the explanation, I'll consider that.  Now I'm trying to
> reanimate Thomas Moestl's capability work. Is anybody interested in such
> integration? I have an almost bootable kernel and now will try to
> understand kernel structure locking and td_ucred/p_ucred interactions,
> to make the necessary changes. 
> 
> Or does SEBSD make capabilities completely unnecessary? 

We have tentative plans to support Capabilities-like models via a plug-in
module using the MAC Framework sometime over the next few months. 
Slotting the POSIX.1e capabilities work into that makes a lot of sense. 

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories



Re: sshd + NIS = bad mojo?

2003-02-06 Thread Robert Watson

On Thu, 6 Feb 2003, Barkley Vowk wrote:

> I'm upgrading my cluster to 5.0-R (fresh from CD), and everything works
> swimmingly except logging in as user on the YPmap.

I have about six boxes at work using 5.0-CURRENT and 5.0-RELEASE with NIS
accounts without any apparent problems.  I did shoot my toes several times
during the upgrade process, though.  I think I ended up manually forcing
rpcbind to be on as well as nisclient.  If finger and so on are working,
I'm not sure why you'd see this failure, though.  Does "id username"
return the correct grouplist?  Does this machine have multiple network
interfaces, or just one?

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories


Re: vnode locking question.

2003-02-07 Thread Robert Watson


On Fri, 7 Feb 2003, Hiten Pandya wrote:

> On Thu, Feb 06, 2003 at 10:53:08AM -0800, Julian Elischer wrote the words in effect of:
> > 
> > On Thu, 6 Feb 2003, John Baldwin wrote:
> > 
> > > On 05-Feb-2003 Julian Elischer wrote:
> > > > 
> > > > Is there ever a case when a vnode is locked for longer than the duration
> > > > of the syscall that locked it?
> > > 
> > > Shouldn't be.  That would be a bug I believe.  Userland threads should
> > > never hold any kernel locks.
> > 
> > That's what I think too but I just thought I'd ask..
> > (NFS worries me a bit)
> 
> If it did, wouldn't that give a panic() with something like: 
>   "panic: mutex held on exit to userland..." 
> 
> ... or something like that? 

Nope; lockmgr doesn't have that feature, although all the SMPng locking
primitives do, I believe.  In fact, I believe that's the source of
Julian's question, since I've had a conversation with him about adding
that sort of sanity checking.  In adding that sort of sanity checking, you
want to be very sure we don't break any existing assumptions.
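
For readers unfamiliar with this kind of check, here is a toy C model of the
general idea: count the locks a thread holds and complain if the count is
non-zero when it heads back to userland.  Every name here (sketch_thread,
sketch_lock, sketch_userret_leaked(), and so on) is hypothetical, and this is
not how lockmgr or the SMPng primitives are actually implemented.

```c
#include <stddef.h>

/* Hypothetical per-thread bookkeeping; not a real kernel structure. */
struct sketch_thread { int td_locks_held; };
struct sketch_lock   { struct sketch_thread *owner; };

static void
sketch_lock_acquire(struct sketch_lock *lk, struct sketch_thread *td)
{
    lk->owner = td;
    td->td_locks_held++;    /* remember that this thread owns one more lock */
}

static void
sketch_lock_release(struct sketch_lock *lk, struct sketch_thread *td)
{
    lk->owner = NULL;
    td->td_locks_held--;
}

/*
 * Called on the way back to userland.  Returns the number of locks still
 * held; a real check would presumably panic or log when this is non-zero.
 */
static int
sketch_userret_leaked(const struct sketch_thread *td)
{
    return (td->td_locks_held);
}
```

The caution in the paragraph above applies directly: if any existing code
path legitimately returns with a lock held, switching on a check like this
would turn a working system into one that panics.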

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories




Re: Witness This

2003-02-21 Thread Robert Watson

On Fri, 21 Feb 2003, Daniel C. Sobral wrote:

> backtrace(c032e7d9,c25af500,c25a98d4,c046236e,c04623ec) at backtrace+0x17
> witness_lock(c25af500,8,c04623ec,1b8,c) at witness_lock+0x660
> _mtx_lock_flags(c25af500,0,c04623ec,1b8,8095) at _mtx_lock_flags+0xb1
> chn_intr(c25a9880,c,1,208,c25af7c0) at chn_intr+0x2f
> cmi_intr(c25a9800,0,c0329618,217,c25ae9ec) at cmi_intr+0xa6
> ithread_loop(c25a9000,cd2ced48,c032948d,366,55ff44fd) at ithread_loop+0x182
> fork_exit(c01cd420,c25a9000,cd2ced48) at fork_exit+0xc4
> fork_trampoline() at fork_trampoline+0x1a
> --- trap 0x1, eip = 0, esp = 0xcd2ced7c, ebp = 0 ---

This one is probably not my fault, but may well be of interest to
Jeffrey Hsu.

> Now, witness biba:

Is there any chance you have the console output about three or four lines
above this?  It identifies the locks in the lock order reversal.  It
sounds like a lock might be held in getnewvnode() across
mac_destroy_vnode_label, which in the original design it wasn't intended
to be, and that might result in the reversal.  I'll have to take a closer
look at that.

> Finally, trace this:
...
> #8  0xc01eb0bb in panic (fmt=0x0) at /usr/src/sys/kern/kern_shutdown.c:528
>   td = (struct thread *) 0xc0ecba50
>   bootopt = 256
>   newpanic = 1
>   buf = "mac_mls_single_in_range: a not single", '\0' 
> #9  0xc0277274 in mac_mls_single_in_range (single=0x0, range=0xc2605e80)
>  at /usr/src/sys/security/mac_mls/mac_mls.c:225
> No locals.
> #10 0xc0278cb6 in mac_mls_check_ifnet_transmit (ifnet=0xc25ebc00,
>  ifnetlabel=0x0, m=0xc0eda000, mbuflabel=0x0)
>  at /usr/src/sys/security/mac_mls/mac_mls.c:1462
>   p = (struct mac_mls *) 0x0
>   i = (struct mac_mls *) 0x0
> #11 0xc01dad7a in mac_check_ifnet_transmit (ifnet=0xc25ebc00, mbuf=0xc0eda000)
>  at /usr/src/sys/kern/kern_mac.c:2269
>   mpc = (struct mac_policy_conf *) 0xc2605e80
>   error = 0

I'm a bit puzzled by this; it could be this relates to recent changes
regarding when socket state is discarded.  Especially odd are the
ifnetlabel and mbuflabel arguments being NULL, as well as the two mac_mls
pointers.  Really they should be non-NULL, or you would have panicked
earlier, so perhaps there's stack corruption.  If you still have this
dump, you might consider walking back up the stack to these two frames,
and printing the contents of *ifnetlabel and *mbuflabel, as well as
the two struct mac_mls values in mac_mls_check_ifnet_transmit.

BTW, the attached patch might also be useful, as it's possible there's now
a NULL pointer dereference here.  I don't think the trace you have
above will be fixed by this change, but you never know. :-)

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories

Index: tcp_subr.c
===
RCS file: /data/fbsd-cvs/ncvs/src/sys/netinet/tcp_subr.c,v
retrieving revision 1.155
diff -u -r1.155 tcp_subr.c
--- tcp_subr.c  21 Feb 2003 23:17:12 -  1.155
+++ tcp_subr.c  22 Feb 2003 02:34:07 -
@@ -484,7 +484,7 @@
m->m_pkthdr.len = tlen;
m->m_pkthdr.rcvif = (struct ifnet *) 0;
 #ifdef MAC
-   if (tp != NULL) {
+   if (tp != NULL && tp->t_inpcp != NULL) {
/*
 * Packet is associated with a socket, so allow the
 * label of the response to reflect the socket label.


Re: Witness This

2003-02-21 Thread Robert Watson


Revised patch without typo attached.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories

Index: tcp_subr.c
===
RCS file: /data/fbsd-cvs/ncvs/src/sys/netinet/tcp_subr.c,v
retrieving revision 1.155
diff -u -r1.155 tcp_subr.c
--- tcp_subr.c  21 Feb 2003 23:17:12 -  1.155
+++ tcp_subr.c  22 Feb 2003 02:44:42 -
@@ -484,7 +484,7 @@
m->m_pkthdr.len = tlen;
m->m_pkthdr.rcvif = (struct ifnet *) 0;
 #ifdef MAC
-   if (tp != NULL) {
+   if (tp != NULL && tp->t_inpcb != NULL) {
/*
 * Packet is associated with a socket, so allow the
 * label of the response to reflect the socket label.



TCP connections timing out "real fast"

2003-02-22 Thread Robert Watson


Don't yet have any quantitative evidence that this is the case, but I feel
like TCP sessions have been timing out on me a lot faster than they used
to.  For example, yesterday a machine got unplugged from the network for
about 15 seconds: in that time, the SSH sessions to the machine timed out
and disconnected.  This morning, a machine generated a lot of output to
the serial console keeping it substantially busy for about 20 seconds; in
that time, the SSH session to it timed out.  I'm going to see if I can't
generate some tcpdump traces later today to confirm my suspicions, but was
wondering if anyone else (anecdotally or not) has seen similar things? 

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories



Re: TCP connections timing out "real fast"

2003-02-22 Thread Robert Watson


On Sat, 22 Feb 2003, Bosko Milekic wrote:

> On Sat, Feb 22, 2003 at 10:57:05AM -0500, Robert Watson wrote:
> > 
> > Don't yet have any quantitative evidence that this is the case, but I feel
> > like TCP sessions have been timing out on me a lot faster than they used
> > to.  For example, yesterday a machine got unplugged from the network for
> > about 15 seconds: in that time, the SSH sessions to the machine timed out
> > and disconnected.  This morning, a machine generated a lot of output to
> > the serial console keeping it substantially busy for about 20 seconds; in
> > that time, the SSH session to it timed out.  I'm going to see if I can't
> > generate some tcpdump traces later today to confirm my suspicions, but was
> > wondering if anyone else (anecdotally or not) has seen similar things? 
> 
>   I have (anecdotally) but I believe I'm seeing it on -STABLE too...
>   it's tough to tell... how recent are your -CURRENT machines, though,
>   and is it something that you think just started happening or has it
>   been happening for a while now?  FWIW, I can't say for sure that this
>   is related to TCP connection timeouts.

The workstation the sessions originated from is 5.0-RELEASE from Jan 16; 
the build box running sshd is 5.x from Jan 30.  I.e., all before recent
TCP changes.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories




Re: TCP connections timing out "real fast"

2003-02-22 Thread Robert Watson


On Sat, 22 Feb 2003, Bosko Milekic wrote:

> On Sat, Feb 22, 2003 at 10:57:05AM -0500, Robert Watson wrote:
> > 
> > Don't yet have any quantitative evidence that this is the case, but I feel
> > like TCP sessions have been timing out on me a lot faster than they used
> > to.  For example, yesterday a machine got unplugged from the network for
> > about 15 seconds: in that time, the SSH sessions to the machine timed out
> > and disconnected.  This morning, a machine generated a lot of output to
> > the serial console keeping it substantially busy for about 20 seconds; in
> > that time, the SSH session to it timed out.  I'm going to see if I can't
> > generate some tcpdump traces later today to confirm my suspicions, but was
> > wondering if anyone else (anecdotally or not) has seen similar things? 
> 
>   I have (anecdotally) but I believe I'm seeing it on -STABLE too...
>   it's tough to tell... how recent are your -CURRENT machines, though,
>   and is it something that you think just started happening or has it
>   been happening for a while now?  FWIW, I can't say for sure that this
>   is related to TCP connection timeouts.

Here's a packet trace.  cboss.gw.tislabs.com is running the January 30
5.0-CURRENT.  crash2.gw.tislabs.com is running a -CURRENT from yesterday.
Here's the output from the ssh session:

crash2:~> sysctl -a | grep witnessRead from remote host
crash2.gw.tislabs.com: Operation timed out
Connection to crash2.gw.tislabs.com closed.
cboss:/data/stock/src/sys/kern> 

The sysctl -a takes a little while to run because it currently generates a
boatload of serial console output due to sleep warnings.  Running it on
the console takes about 35 seconds to complete.  The disconnect
appears to happen half way through that time.  Here's the trace, as
recorded on cboss.gw.tislabs.com, starting about when I hit enter at the
end of the sysctl command line; it looks like it takes about 20 seconds to
decide to disconnect after a series of rapid retransmissions:

cboss# tcpdump -r /tmp/packets
11:40:36.826529 cboss.gw.tislabs.com.49423 > crash2.gw.tislabs.com.ssh: P 3470241365:3470241385(20) ack 49959986 win 33304  (DF) [tos 0x10]
11:40:36.845660 crash2.gw.tislabs.com.ssh > cboss.gw.tislabs.com.49423: P 1:21(20) ack 20 win 33304  (DF) [tos 0x10]
11:40:36.940001 cboss.gw.tislabs.com.49423 > crash2.gw.tislabs.com.ssh: . ack 21 win 33304  (DF) [tos 0x10]
11:40:37.758432 cboss.gw.tislabs.com.49423 > crash2.gw.tislabs.com.ssh: P 20:40(20) ack 21 win 33304  (DF) [tos 0x10]
11:40:37.775625 crash2.gw.tislabs.com.ssh > cboss.gw.tislabs.com.49423: P 21:41(20) ack 40 win 33304  (DF) [tos 0x10]
11:40:37.868677 cboss.gw.tislabs.com.49423 > crash2.gw.tislabs.com.ssh: . ack 41 win 33304  (DF) [tos 0x10]
11:40:40.780735 cboss.gw.tislabs.com.49423 > crash2.gw.tislabs.com.ssh: P 40:60(20) ack 41 win 33304  (DF) [tos 0x10]
11:40:41.008779 cboss.gw.tislabs.com.49423 > crash2.gw.tislabs.com.ssh: P 40:60(20) ack 41 win 33304  (DF) [tos 0x10]
11:40:41.268786 cboss.gw.tislabs.com.49423 > crash2.gw.tislabs.com.ssh: P 40:60(20) ack 41 win 33304  (DF) [tos 0x10]
11:40:41.588797 cboss.gw.tislabs.com.49423 > crash2.gw.tislabs.com.ssh: P 40:60(20) ack 41 win 33304  (DF) [tos 0x10]
11:40:42.028822 cboss.gw.tislabs.com.49423 > crash2.gw.tislabs.com.ssh: P 40:60(20) ack 41 win 33304  (DF) [tos 0x10]
11:40:42.708951 cboss.gw.tislabs.com.49423 > crash2.gw.tislabs.com.ssh: P 40:60(20) ack 41 win 33304  (DF) [tos 0x10]
11:40:43.868880 cboss.gw.tislabs.com.49423 > crash2.gw.tislabs.com.ssh: P 40:60(20) ack 41 win 33304  (DF) [tos 0x10]
11:40:45.988960 cboss.gw.tislabs.com.49423 > crash2.gw.tislabs.com.ssh: P 40:60(20) ack 41 win 33304  (DF) [tos 0x10]
11:40:48.109027 cboss.gw.tislabs.com.49423 > crash2.gw.tislabs.com.ssh: P 40:60(20) ack 41 win 33304  (DF) [tos 0x10]
11:40:50.229094 cboss.gw.tislabs.com.49423 > crash2.gw.tislabs.com.ssh: P 40:60(20) ack 41 win 33304  (DF) [tos 0x10]
11:40:52.349177 cboss.gw.tislabs.com.49423 > crash2.gw.tislabs.com.ssh: P 40:60(20) ack 41 win 33304  (DF) [tos 0x10]
11:40:54.469236 cboss.gw.tislabs.com.49423 > crash2.gw.tislabs.com.ssh: P 40:60(20) ack 41 win 33304  (DF) [tos 0x10]
11:40:56.589311 cboss.gw.tislabs.com.49423 > crash2.gw.tislabs.com.ssh: P 40:60(20) ack 41 win 33304  (DF) [tos 0x10]
11:40:58.709370 cboss.gw.tislabs.com.49423 > crash2.gw.tislabs.com.ssh: R 60:60(0) ack 41 win 33304 (DF) [tos 0x10]
11:41:15.784279 crash2.gw.tislabs.com.ssh > cboss.gw.tislabs.com.49423: . ack 60 win 33304  (DF) [tos 0x10]
11:41:15.784337 cboss.gw.tislabs.com.49423 > crash2.gw.tislabs.com.ssh: R 3470241425:3470241425(0) win 0 (DF) [tos 0x10]
11:41:15.785617 crash2.gw.tislabs.com.ssh > cboss.gw.tislabs.com.49423: . ack 60 win 33304  (DF) [tos 0x10]
11:41:15.785659 cb
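
To put numbers on the retransmission timing in a trace like this one, a
small helper along the following lines can subtract two tcpdump timestamps.
This is only a sketch: the function names are made up for this note, and it
assumes timestamps of the exact form HH:MM:SS.uuuuuu (six fractional digits)
taken on the same day.

```c
#include <stdio.h>

/*
 * Convert "HH:MM:SS.uuuuuu" to microseconds since midnight.
 * Returns -1 if the string does not match that form.
 */
static long long
sketch_stamp_to_us(const char *stamp)
{
    int h, m, s, us;

    if (sscanf(stamp, "%d:%d:%d.%6d", &h, &m, &s, &us) != 4)
        return (-1);
    return (((h * 60LL + m) * 60LL + s) * 1000000LL + us);
}

/* Microseconds elapsed between two timestamps from the same day. */
static long long
sketch_elapsed_us(const char *from, const char *to)
{
    return (sketch_stamp_to_us(to) - sketch_stamp_to_us(from));
}
```

Applied to the first transmission of segment 40:60 (11:40:40.780735) and the
RST (11:40:58.709370), this gives 17928635 microseconds, i.e. just under 18
seconds, which matches the "about 20 seconds" estimate.  The gap to the first
retransmission (11:40:41.008779) is 228044 microseconds, roughly 0.23 seconds.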

Re: Witness This

2003-02-25 Thread Robert Watson

Ok, I've committed it.  Saw your other panic message, hope to look more
closely today.

BTW, you might consider running with the mac_test module, as it's intended
to help diagnose label problems through additional assertions.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories

On Tue, 25 Feb 2003, Daniel C. Sobral wrote:

> No bugs so far with this.
> 
> Robert Watson wrote:
> > Revised patch without typo attached.
> > 
> > Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
> > [EMAIL PROTECTED]  Network Associates Laboratories
> > 
> > Index: tcp_subr.c
> > ===
> > RCS file: /data/fbsd-cvs/ncvs/src/sys/netinet/tcp_subr.c,v
> > retrieving revision 1.155
> > diff -u -r1.155 tcp_subr.c
> > --- tcp_subr.c  21 Feb 2003 23:17:12 -  1.155
> > +++ tcp_subr.c  22 Feb 2003 02:44:42 -
> > @@ -484,7 +484,7 @@
> > m->m_pkthdr.len = tlen;
> > m->m_pkthdr.rcvif = (struct ifnet *) 0;
> >  #ifdef MAC
> > -   if (tp != NULL) {
> > +   if (tp != NULL && tp->t_inpcb != NULL) {
> > /*
> >  * Packet is associated with a socket, so allow the
> >  * label of the response to reflect the socket label.
> 
> 
> -- 
> Daniel C. Sobral   (8-DCS)
> Gerencia de Operacoes
> Divisao de Comunicacao de Dados
> Coordenacao de Seguranca
> TCO
> Fones: 55-61-313-7654/Cel: 55-61-9618-0904
> E-mail: [EMAIL PROTECTED]
>  [EMAIL PROTECTED]
>  [EMAIL PROTECTED]
> 
> Outros:
>   [EMAIL PROTECTED]
>   [EMAIL PROTECTED]
>   [EMAIL PROTECTED]
> 
> I disagree with what you say, but will defend
> to the death your right to tell such LIES!
> 
> 



Re: Witness This

2003-02-25 Thread Robert Watson

On Tue, 25 Feb 2003, Daniel C. Sobral wrote:

> Robert Watson wrote:
> > Ok, I've committed it.  Saw your other panic message, hope to look more
> > closely today.
> > 
> > BTW, you might consider running with the mac_test module, as it's intended
> > to help diagnose label problems through additional assertions.
> 
> Oh. That was not clear to me. I thought it was a module you used for
> tests, not something which _helped_ debug other mac policies. 

Well, it helps debug the MAC Framework's handling of labels by testing
lots of assertions about how labels are handled by the framework.  It
doesn't specifically exercise other policies, but it can generate useful
fail-stop behavior for some well-defined failure modes.  I.e., it tests
that uninitialized labels aren't passed to various entry points, that
labels aren't destroyed more than once, etc.
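
As a rough illustration of the kind of assertion involved, consider the
following toy C model, which tags each label with a magic value so that use
of an uninitialized label or a second destroy can be detected.  The names
and magic numbers are invented for this example and are not taken from
mac_test itself; the real module fails hard rather than returning an error.

```c
#include <stddef.h>

/* Hypothetical magic values marking a label's lifecycle state. */
#define SKETCH_LABEL_LIVE 0x4c495645u
#define SKETCH_LABEL_DEAD 0x44454144u

struct sketch_label { unsigned magic; };

static void
sketch_label_init(struct sketch_label *l)
{
    l->magic = SKETCH_LABEL_LIVE;
}

/* Returns 0 if the label is initialized and not yet destroyed, else -1. */
static int
sketch_label_check(const struct sketch_label *l)
{
    return ((l != NULL && l->magic == SKETCH_LABEL_LIVE) ? 0 : -1);
}

/* Returns 0 on success, -1 on an uninitialized or already-destroyed label. */
static int
sketch_label_destroy(struct sketch_label *l)
{
    if (sketch_label_check(l) != 0)
        return (-1);
    l->magic = SKETCH_LABEL_DEAD;
    return (0);
}
```

An entry point that is handed a label would call sketch_label_check() first;
in a fail-stop design, a non-zero result would stop the system at the point
where the bad label first shows up rather than somewhere downstream.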

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories


Re: mac_mls still panics

2003-02-25 Thread Robert Watson

Question on both of these: could you inspect the struct ifnet pointer on
the mbuf and see what interface they originated from?  Also, the dumps are
still showing NULL local variables where it should not be possible -- does
manual inspection of the variables in the debugger (ifnetlabel, mbuflabel,
etc.) reveal different values?  You might want to see if compiling with -O0
gives better results, as it will force memory to be allocated for all
local variables, I believe.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories

On Tue, 25 Feb 2003, Daniel C. Sobral wrote:

> No, it didn't fix the problem. I must have mixed kernels when I tested. 
> Two more panics attached (the first seems to have been a double panic).
> 
> -- 
> Daniel C. Sobral
> Gerência de Operações
> Divisão de Comunicação de Dados
> Coordenação de Segurança
> TCO
> Fones: 55-61-313-7654/Cel: 55-61-9618-0904
> E-mail:   [EMAIL PROTECTED]
>   [EMAIL PROTECTED]
>   [EMAIL PROTECTED]
> 
> 


Re: At r212060, but /usr/bin/drace and /usr/sbin/lockstat still depend on libz.so.5

2010-09-06 Thread Robert Watson


On Mon, 6 Sep 2010, jhell wrote:

> > After r210693, these utilities are built for i386 and amd64 only. Thereby 
> > you have stale binaries installed from older sources.
> 
> Lol this is the first I have read about this & comes as quite the surprise 
> that it's not being built on top of the platform/arch that it was designed to 
> work on. Given this is FreeBSD and !*Solaris it's understandable to a certain 
> point but still...


I for one eagerly look forward to the day where we can get DTrace working on 
MIPS, which is a widely-used architecture in the network device / appliance / 
etc community.


Robert
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"

Re: RFC: pefs - stacked cryptographic filesystem

2010-09-07 Thread Robert Watson


On Mon, 6 Sep 2010, Gleb Kurtsou wrote:

> I would like to ask for feedback on a kernel level stacked cryptographic 
> filesystem. It started as a Summer of Code 2009 project and has matured a lot 
> since then. I've recently added support for sparse files and switched to XTS 
> encryption mode.
> 
> I've been using it to encrypt my home directory for almost a year already, 
> and use fsx, dbench and blogbench for testing. So it should be fairly 
> stable.
> 
> Tested on top of ZFS, UFS and tmpfs on amd64 and i386; both 9-CURRENT and 
> 8-STABLE supported.
> 
> Please email me separately if you're willing to help test on a big-endian 
> machine; the XTS code doesn't look endian-correct.
> 
> At this point all of the project goals are complete and I'd like it to get 
> wider coverage in terms of tests and reviews, and hope to see it committed to 
> HEAD soon.


Hi Gleb:

This sounds like really exciting work!  Do you have much in the way of formal 
documentation of your crypto design at this point?  I'd like to point some of 
the local crypto gurus at Cambridge at it to do some analysis of your 
approach.  However, as they rightly point out, reverse engineering crypto from 
code is rather a high barrier to entry for a crypto review, so detailed 
documentation of the approach and a formal format description would be much 
preferred :-).


Thanks,

Robert





> Installation instructions:
> 
> 1a. Clone git repository:
> # git clone git://github.com/glk/pefs.git pefs
> # cd pefs
> 
> 1b. Or download latest snapshot from github:
> http://github.com/glk/pefs/archives/master
> 
> 2. Build and install:
> # make obj all
> # make install
> 
> 3. Mount pefs filesystem:
> # pefs mount ~/Private ~/Private
> 
> 4. Enter passphrase:
> # pefs addkey ~/Private
> 
> 5. Test it and report back. There is also a man page available.
> 
> 6. Example of how to save your key in the keychain database.
> 
> pefs has to be mounted and key specified to make fs writable, create
> keychain with single entry (keychain -Z option):
> # pefs addchain -Z ~/Private
> Don't encrypt .pefs.db:
> # mv ~/Private/.pefs.db /tmp
> # umount ~/Private
> # mv /tmp/.pefs.db ~/Private
> # pefs mount ~/Private ~/Private
> Use -c option to verify key is in database
> # pefs addkey -c ~/Private
> 
> 7. You can set up pam_pefs (not compiled by default) to add key to home
> directory and authenticate against keychain database on login, e.g. by
> adding the following line to /etc/pam.d/system before pam_unix.so:
> 
> auth sufficient pam_pefs.so try_first_pass
> 
> 
> The following is a list of its most important features:
> 
> *   Kernel level file system, no user level daemons needed.
>    Transparently runs on top of existing file systems.
> *   Random per file tweak value used for encryption, which guarantees
>    different cipher texts for the same encrypted files.
> *   Saves metadata only in encrypted file name, but not in file itself.
> *   Supports arbitrary number of keys per file system, default directory
>    key, mixing files encrypted with different keys in same directory.
> *   Allows defining key chains, can be used to add/delete several keys
>    by specifying only master key.
> *   Uses modern cryptographic algorithms: AES and Camellia in XTS mode,
>    PKCS#5v2 and HKDF for key generation.
> 
> 
> Github repository: http://github.com/glk/pefs
> 
> More details on my blog: http://glebkurtsou.blogspot.com/search/label/pefs
> 
> Thanks,
> Gleb.


Re: Bumping MAXCPU on amd64?

2010-09-23 Thread Robert Watson



On Wed, 22 Sep 2010, Maxim Sobolev wrote:


> On 9/22/2010 6:37 AM, John Baldwin wrote:
> > Unfortunately this can't be MFC'd to 7 as it would destroy the ABI for 
> > existing klds.
> 
> Ah, ok, sorry, I did only check RELENG_7. Can we make it a kernel option 
> then?


In principle, yes, but MAXCPU is used to size various kernel data structures 
inspected by userspace crash post-mortem tools, etc.  I've done a bit of work 
to teach some of those tools (in particular, vmstat -z and vmstat -m) to 
extract the version of maxcpu compiled into the kernel instead of just relying on 
the version of MAXCPU present when the command line tool was compiled. 
However, I think a better long-term approach here is to generally eliminate 
sizing based on MAXCPU and instead size based on the number of CPUs present. 
Certain kernel subsystems already do this (UMA, netisr, ...) but others don't 
(malloc(9), ...).  Additional hands on this project would probably help :-).
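
A minimal userland sketch of the difference, assuming a hypothetical per-CPU statistics table: instead of declaring an array of MAXCPU slots at compile time, the table is sized from the CPU count reported at run time. The names `percpu_stat`, `stat_table` and the helper functions are illustrative only and are not kernel interfaces; `sysconf(_SC_NPROCESSORS_CONF)` stands in for whatever CPU count the kernel subsystem would consult.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical per-CPU counter record; not a real kernel structure. */
struct percpu_stat {
	uint64_t allocs;
	uint64_t frees;
};

/* Hypothetical table sized at run time rather than by a MAXCPU constant. */
struct stat_table {
	long ncpu;                 /* number of slots actually allocated */
	struct percpu_stat *slots; /* ncpu zero-initialised entries */
};

/* Allocate one slot per configured CPU; returns 0 on success, -1 on failure. */
static int
stat_table_init(struct stat_table *t)
{
	long n = sysconf(_SC_NPROCESSORS_CONF);

	assert(t != NULL);
	if (n < 1)
		n = 1; /* fall back to one slot if the count is unavailable */
	t->slots = calloc((size_t)n, sizeof(*t->slots));
	if (t->slots == NULL)
		return (-1);
	t->ncpu = n;
	return (0);
}

/* Sum one counter across all slots, whatever the run-time count is. */
static uint64_t
stat_table_total_allocs(const struct stat_table *t)
{
	uint64_t total = 0;

	for (long i = 0; i < t->ncpu; i++)
		total += t->slots[i].allocs;
	return (total);
}

/* Release the slots and reset the table to an empty state. */
static void
stat_table_free(struct stat_table *t)
{
	free(t->slots);
	t->slots = NULL;
	t->ncpu = 0;
}
```

The consumer only ever iterates up to `ncpu`, so nothing in it needs to change when the compile-time limit does.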


As John mentioned, the other issue is the use of fixed-width types instead of 
variable-length CPU bitmasks to name cores for IPIs, etc.  There are people 
actively working on this, but it's a non-trivial project as kernel code likes 
to do things like cpumask & othermask.  My expectation is that this problem 
will be solved in 9.0 but I don't see any obvious MFC paths for 8.x due to KBI 
issues.  It could be that this forces our hand in terms of breaking the KBI at 
some point in the 8.x series, unclear...
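
To illustrate why the conversion is non-trivial, here is a rough sketch of a variable-length CPU mask, assuming a hypothetical `struct cpumask` (this is not the kernel's actual type). An expression such as `cpumask & othermask` on a fixed-width integer has to become a call that loops over words.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical variable-length CPU set; illustrative only. */
struct cpumask {
	size_t nwords;        /* number of words backing the mask */
	unsigned long *words; /* bit (cpu % WORD_BITS) of word (cpu / WORD_BITS) */
};

/* Allocate an empty mask able to name ncpus CPUs; 0 on success, -1 on failure. */
static int
cpumask_init(struct cpumask *m, size_t ncpus)
{
	assert(ncpus > 0);
	m->nwords = (ncpus + WORD_BITS - 1) / WORD_BITS;
	m->words = calloc(m->nwords, sizeof(*m->words));
	return (m->words == NULL ? -1 : 0);
}

static void
cpumask_set(struct cpumask *m, size_t cpu)
{
	assert(cpu / WORD_BITS < m->nwords);
	m->words[cpu / WORD_BITS] |= 1UL << (cpu % WORD_BITS);
}

static int
cpumask_isset(const struct cpumask *m, size_t cpu)
{
	assert(cpu / WORD_BITS < m->nwords);
	return (((m->words[cpu / WORD_BITS] >> (cpu % WORD_BITS)) & 1UL) != 0);
}

/* dst &= src: the multi-word replacement for "cpumask & othermask". */
static void
cpumask_and(struct cpumask *dst, const struct cpumask *src)
{
	assert(dst->nwords == src->nwords);
	for (size_t i = 0; i < dst->nwords; i++)
		dst->words[i] &= src->words[i];
}

static void
cpumask_free(struct cpumask *m)
{
	free(m->words);
	m->words = NULL;
	m->nwords = 0;
}
```

Every call site that used a plain integer operator needs rewriting in this style, which is where most of the conversion effort goes.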


Robert

Re: netisr software flowid

2010-09-27 Thread Robert Watson



On Mon, 27 Sep 2010, Artemiev Igor wrote:

What is the status for software flowid calculation? I found the old netisr2 
patch[1] from Robert Watson and took from there code for setting flowid in 
tcp_input with some changes[2]. It works for me very well (8.1-stable) - now 
the server can handle non-transit traffic without drops up to 118Kpps 60MB/s 
incoming and up to 107Kpps 50MB/s outgoing, netisr dispatch packets via 
three threads by round-robin:


Hi Artemiev:

I have a large outstanding patch set in Perforce that goes quite a long way 
further, implementing the RSS model found in many network cards and aligning 
OS hash tables for connection lookup with RSS.  Where the RSS hash is made 
available by the driver, the patches are also able to implement link-layer 
dispatch.  They largely eliminate the possibility of cache line contention in 
the TCP/IP input path (as long as the driver also avoids cache line 
contention) on multi-queue cards.


One reason I haven't merged the earlier patch is that many high-performance 
10gbps (and even 1gbps) cards now support multiple input queues in hardware, 
meaning that they have already done the work distribution by the time the 
packets get to the OS.  This makes the work distribution choice quite a bit 
harder: has a packet already been adequately balanced, or is further 
rebalancing required -- and if so, an equal distribution as selected in that 
patch might not generate well-balanced CPU load.


Using just the RSS hash to distribute work, and single-queue input, I am able 
to get doubled end-host TCP performance with highly concurrent connections at 
10gbps, which is a useful result.  I have high on my todo list to get the 
patch you referenced into the mix as well and see how much the software 
distribution hurts/helps...
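
As a rough illustration of software flow hashing, the sketch below maps a TCP 4-tuple to a worker index so that all packets of one connection land on the same thread. It is an assumption-laden stand-in: RSS hardware typically computes a Toeplitz hash, whereas this uses FNV-1a purely for brevity, and `flow_tuple`, `flow_hash` and `flow_to_worker` are hypothetical names, not code from either patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical connection identifier; fields are illustrative. */
struct flow_tuple {
	uint32_t src_ip;
	uint32_t dst_ip;
	uint16_t src_port;
	uint16_t dst_port;
};

/* Fold the low nbytes bytes of value into an FNV-1a hash state. */
static uint32_t
fnv1a_mix(uint32_t h, uint32_t value, int nbytes)
{
	for (int i = 0; i < nbytes; i++) {
		h ^= (value >> (8 * i)) & 0xffu;
		h *= 16777619u; /* 32-bit FNV prime */
	}
	return (h);
}

/* Hash the whole tuple; the same flow always yields the same value. */
static uint32_t
flow_hash(const struct flow_tuple *ft)
{
	uint32_t h = 2166136261u; /* 32-bit FNV offset basis */

	h = fnv1a_mix(h, ft->src_ip, 4);
	h = fnv1a_mix(h, ft->dst_ip, 4);
	h = fnv1a_mix(h, ft->src_port, 2);
	h = fnv1a_mix(h, ft->dst_port, 2);
	return (h);
}

/* Pick a worker (for example, a netisr thread index) for this flow. */
static unsigned int
flow_to_worker(const struct flow_tuple *ft, unsigned int nworkers)
{
	assert(nworkers > 0);
	return (flow_hash(ft) % nworkers);
}
```

Unlike round-robin dispatch, a per-flow hash keeps each connection's packets in order on one thread, at the cost of uneven load when a few flows dominate.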


Since you've done some measurement, what was the throughput on that system 
without the patch applied, and how many cores?


Robert




   12 root     -44    -     0K   336K CPU2    2  18:43 56.15% {swi1: netisr 2}
   12 root     -44    -     0K   336K RUN     3  18:41 54.49% {swi1: netisr 3}
   12 root     -44    -     0K   336K CPU0    0  18:39 50.39% {swi1: netisr 0}
   12 root     -68    -     0K   336K WAIT    1   8:01 18.07% {irq256: bge0}

So, what is the reason to exclude this code from the final version?

[1] http://www.watson.org/~robert/freebsd/netperf/20090523-netisr2.diff
[2] http://gate.kliksys.ru/~ai/software_flowid.diff

Re: MAXCPU preparations

2010-09-27 Thread Robert Watson



On Mon, 27 Sep 2010, Scott Long wrote:

There's no reason not to include .  I'm a little reluctant to 
have it depend on the static MAXCPU definition, though.  What happens when 
you mix and match userland and kernel and they no longer agree on the 
definition of MAXCPU?  I suggest creating a sysctl that exports the kernel's 
definition of MAXCPU, and having libmemstat look for that first, and fall back 
to using the static MAXCPU definition if the sysctl fails/doesn't exist.


I suppose, in a very worst case scenario, we can read the source code for 
libmemstat and see what it does.
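
For readers without the source to hand, the query-then-fall-back pattern being discussed might look roughly like the following. This is a sketch, not libmemstat's actual code: `query_maxcpu` and `FALLBACK_MAXCPU` are hypothetical names, the sysctl name `kern.smp.maxcpus` is the one mentioned elsewhere in this thread, and on non-FreeBSD systems the function simply returns the fallback.

```c
#include <assert.h>
#include <stddef.h>
#ifdef __FreeBSD__
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

/* Hypothetical compile-time fallback, standing in for a static MAXCPU. */
#define FALLBACK_MAXCPU 64

/*
 * Ask the running kernel for its CPU limit; use the static value if the
 * sysctl is missing, fails, or returns something implausible.
 */
static int
query_maxcpu(void)
{
	int result = FALLBACK_MAXCPU;
#ifdef __FreeBSD__
	int maxcpu;
	size_t len = sizeof(maxcpu);

	if (sysctlbyname("kern.smp.maxcpus", &maxcpu, &len, NULL, 0) == 0 &&
	    len == sizeof(maxcpu) && maxcpu > 0)
		result = maxcpu;
#endif
	assert(result > 0);
	return (result);
}
```

Because the value comes from the running kernel, a userland built against one MAXCPU keeps working against a kernel built with another.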


Robert



Scott



On Sep 27, 2010, at 9:26 AM, Sean Bruno wrote:


Does this look like an appropriate modification to libmemstat?

Sean


 //depot/yahoo/ybsd_7/src/lib/libmemstat/memstat.h#4
- /home/seanbru/ybsd_7/src/lib/libmemstat/memstat.h 
@@ -28,12 +28,13 @@

#ifndef _MEMSTAT_H_
#define	_MEMSTAT_H_
+#include 

/*
 * Number of CPU slots in library-internal data structures.  This should be
 * at least the value of MAXCPU from param.h.
 */
-#define	MEMSTAT_MAXCPU	64
+#define	MEMSTAT_MAXCPU	MAXCPU /* defined in sys/${ARCH}/include/param.h */

/*
 * Amount of caller data to maintain for each caller data slot.  Applications



Re: MAXCPU preparations

2010-09-27 Thread Robert Watson



On Mon, 27 Sep 2010, Sean Bruno wrote:

wouldn't it be better to do a sysctlbyname() and use the real value for the 
system?


libmemstat contains some useful sample code showing how this might be done.


That was my initial thought (as prodded by scottl and peter).

If it is made dynamic, could this be opening a race condition where the call 
to sysctlbyname() returns a count of CPUs that is in turn changed by the 
offlining of a CPU?  Or am I thinking too much about this?


Yes, you are.  MAXCPU is a compile-time constant for kernel builds, so (at 
least as the world is today), that can't happen.


I think there's a reasonable argument that MEMSTAT_MAXCPU should be phased out 
and all internal structures in libmemstat should be dynamically sized. 
However, core counts aren't growing that fast, and it's quite a bit of work, 
and probably not worth it just yet.


I'm somewhat averse to using MAXCPU in libmemstat, however, because MAXCPU is 
actually not a constant in the general case: FreeBSD/i386, for example, 
regularly uses two different values: 1 for !SMP kernels, and 32 for SMP 
kernels.  That's why libmemstat encodes its own value, for better or worse.


A reasonable alternative would be to replace 32 with MAXCPU * 2, or if we're 
feeling particularly optimistic, MAXCPU * 4.  Or just another big number, like 
256.


Robert

Re: MAXCPU preparations

2010-09-27 Thread Robert Watson



On Mon, 27 Sep 2010, John Baldwin wrote:

Also, I think we should either fix MAXCPU to export the SMP value to 
userland, or hide it from userland completely.  Exporting the UP value is 
Just Wrong (tm).


Well, it's useful in the sense that it tells you what the maximum number of 
CPUs a kernel can support is, which is helpful, especially if you're futzing 
with MAXCPU as a kernel option :-).


But, more generally, many things that use MAXCPU should probably use either 
mp_maxid or DPCPU.  Not everything, but most things.


Robert

Re: MAXCPU preparations

2010-09-28 Thread Robert Watson



On Mon, 27 Sep 2010, Joshua Neal wrote:

I hit this bug at one point, and had to bump MEMSTAT_MAXCPU.  It's already 
asking the kernel for the max number and throwing an error if it doesn't 
agree:


Yes, it looks like MAXCPU was bumped in the kernel without bumping the limit 
in libmemstat.  The bug could be in not having a comment by the definition of 
MAXCPU saying that MEMSTAT_MAXCPU needs to be modified as well.


I was thinking a more future-proof fix would be to get rid of the static 
allocations and allocate the library's internal structures based on the 
value of kern.smp.maxcpus.


Agreed.  I'm fairly preoccupied currently, but would be happy to accept 
patches :-).


Robert



- Joshua

On Mon, Sep 27, 2010 at 2:42 PM, Robert Watson  wrote:


On Mon, 27 Sep 2010, John Baldwin wrote:


Also, I think we should either fix MAXCPU to export the SMP value to
userland, or hide it from userland completely.  Exporting the UP value is
Just Wrong (tm).


Well, it's useful in the sense that it tells you what the maximum number of
CPUs a kernel can support is, which is helpful, especially if you're futzing
with MAXCPU as a kernel option :-).

But, more generally, many things that use MAXCPU should probably use either
mp_maxid or DPCPU.  Not everything, but most things.

Robert

Re: [PATCH] Netdump for review and testing -- preliminary version

2010-10-08 Thread Robert Watson


On Fri, 8 Oct 2010, Attilio Rao wrote:


GENERAL FRAMEWORK ARCHITECTURE

Netdump is composed, right now, of a userland "server" and a kernel 
"client". The former is run on the target machine (where the dump will 
physically happen) and it is responsible for receiving the packets 
containing coredump frames and for correctly writing them on-disk. The 
latter is part of the kernel installed on the source machine (where the 
dump is initiated) and is responsible for correctly building UDP packets 
containing the coredump frames, pushing them through the network interface 
and routing them appropriately.


Hi Attilio:

Network dumps would be a great addition to the FreeBSD debugging suite!  A few 
casual comments and questions follow; I need to spend some more time pondering the 
implications of the current netdump design later in the week.


(1) Did you consider using tftp as the network dump protocol, rather than a 
custom protocol?  It's also a simple UDP-based, ACKed file transfer protocol, 
with the advantage that it's widely supported by existing tftp daemons, packet 
sniffers, security devices, etc.  This wouldn't require using a stock tftpd, 
although that would certainly be a benefit as well.  The post-processing of 
crashdumps that seems to happen in the netdump server could, presumably, 
happen "offline" as easily...?


(2) Have you thought about adding a checksum to the dump format, since packets 
are going over the wire on UDP, etc?  Even an MD5 sum wouldn't be too hard to 
arrange: all the code is in the kernel already, requires relatively little 
state storage, and is designed for streamed data.
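
To show what "designed for streamed data" buys you, here is a small sketch of a running checksum that can be fed one dump block at a time and finalised at the end. Note the substitution: it uses CRC-32 (reflected polynomial 0xEDB88320) rather than MD5, only because it fits in a few lines; the `dump_crc_*` names are hypothetical and nothing here is taken from the netdump patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Initial state for a new running checksum. */
static uint32_t
dump_crc_init(void)
{
	return (0xFFFFFFFFu);
}

/*
 * Fold len bytes into the running state.  Only 32 bits of state are kept
 * between calls, so blocks can be checksummed as they are sent.
 */
static uint32_t
dump_crc_update(uint32_t state, const void *buf, size_t len)
{
	const unsigned char *p = buf;

	assert(buf != NULL || len == 0);
	for (size_t i = 0; i < len; i++) {
		state ^= p[i];
		for (int bit = 0; bit < 8; bit++)
			state = (state >> 1) ^
			    (0xEDB88320u & (0u - (state & 1u)));
	}
	return (state);
}

/* Produce the final checksum value from the running state. */
static uint32_t
dump_crc_final(uint32_t state)
{
	return (state ^ 0xFFFFFFFFu);
}
```

The sender would call `dump_crc_update()` on each block before transmitting it and place the final value in a trailer; the receiver repeats the computation over what it wrote to disk and compares.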


(3) As the bounds checking/etc behavior in dumping grows more complex, it 
seems a shame to replicate it in architecture-specific code.  Could we pull 
the mediaoffset/mediasize checking into common code?  Also, a reserved 
size/offset of "0" meaning "no limit" sounds slightly worrying; it might be 
slightly more conservative to add a flags field to dumperinfo and have a flag 
indicating "size is no object" for netdump, rather than overloading the 
existing fields.


Some specific patch comments:

+ * XXX: This should be split into machdep and non-machdep parts

What MD parts are in the file?

The sysctl parts of the patch have a number of issues:

- Please use sysctl_handle_string rather than manual calls and string bounds
  checking with SYSCTL_IN/SYSCTL_OUT.  In general, type-specific sysctl
  handlers are named sysctl_handle_, which I recommend here as well.

- Please use stack-local, fixed-length string buffers as the
  source/destination of copyin/copyout on ifnet names, rather than the fields
  in the structure itself.  We support interface renaming, so the field can
  change, and sysctl has the potential to block for extended periods
  mid-copyin/copyout.

- Because we support ifnet renaming, "none" is probably not a good reserved
  name.  Could you use the empty string or something similar, or just a flag
  to indicate that netdump is disabled?

- "ifp" rather than "ifn" and "nic" is the preferred variable name for ifnet
  pointers throughout the kernel.

- ifnets can be, and are, removed at runtime.  We have a refcount, but that's
  not really what you want.  I suggest simply looking up the ifnet by name
  when it's needed during a dump or for sysctl output, rather than maintaining
  a long-term pointer, which can become stale.  Especially with the
  auto-detect code.

- Since sysctl_nic is only used once, I'd prefer it were inlined into a
  use-specific handler function, avoiding the arg1/arg2 complications that
  make this code a bit harder to read.  sysctl_ip is useful because it's used
  more than once, though.  However, it also very much wants you to call
  sysctl_handle_string!

+sysctl_force_crash(SYSCTL_HANDLER_ARGS)

Does this belong in the netdump code?  We already have some of these options 
in debug.kdb.*, but perhaps others should be added there as well.


+   /*
+* get and fill a header mbuf, then chain data as an extended
+* mbuf.
+*/
+   MGETHDR(m, M_DONTWAIT, MT_DATA);

The idea of calling into the mbuf allocator in this context is just freaky, 
and may have some truly awful side effects.  I suppose this is the cost of 
trying to combine code paths in the network device driver rather than have an 
independent path in the netdump case, but it's quite unfortunate and will 
significantly reduce the robustness of netdumps in the face of, for example, 
mbuf starvation.


+   if (ntohs(ah->ar_hrd) != ARPHRD_ETHER &&
+   ntohs(ah->ar_hrd) != ARPHRD_IEEE802 &&
+   ntohs(ah->ar_hrd) != ARPHRD_ARCNET &&
+   ntohs(ah->ar_hrd) != ARPHRD_IEEE1394) {
+   NETDDEBUG("nd_handle_arp: unknown hardware address fmt "
+   "0x%2D)\n", (unsigned char *)&ah->ar_hrd, "");
+   return;
+   }

Are you sure you don't want to just check for ETHER here?

+   /* XXX: Prob

Re: [PATCH] Netdump for review and testing -- preliminary version

2010-10-15 Thread Robert Watson


On Thu, 14 Oct 2010, Attilio Rao wrote:

No, what I'm saying is: UMA needs to not call its drain handlers, and 
ideally not call into VM to fill slabs, from the dumping context. That's 
easy to implement and will cause the dump to fail rather than causing the 
system to hang.


My point is, however, still the same: that should not happen just for the 
netdump specific case but for all the dumping/KDB/panic cases (I know it is 
unlikely current code !netdump calls into UMA but it is not an established 
pre-requisite and may still happen that some added code does). I still see 
this as a weakness on the infrastructure, independently from netdump. I can 
see that your point is that it is vital to netdump correct behaviour though, 
so I'd wonder if it is worth fixing it now or later.


Quite a bit of our kernel and dumping infrastructure special cases debugging 
and dumping behavior to avoid sources of non-robustness.  For example, serial 
drivers avoid locking, and for disk dumps we bypass GEOM to avoid the memory 
allocation, freeing, and threading that it depends on.


The goal here is to be robust when handling dumps: hanging is worse than not 
dumping, since you won't get the dump either way, and if you don't reboot then 
the system requires manual intervention to recover.  Examples of things that 
are critical to avoid include:


- The dumping thread tripping over locks held by the panicked thread, or by
  another now-suspended thread, leading to deadlock against a suspended
  thread.

- Corrupting dumps by increasing concurrency in the panic case.  We ran into a
  case a year or two ago where changing VM state during the dump on amd64
  caused file system corruption as the dump code assumed that the space
  required for a dump didn't change while dumping took place.

Any code dependency we add in the panic / KDB / dump path is one more risk 
that we don't successfully dump and reboot, so we need to minimize that code.


Robert

Re: DTrace bindings are missing in FreeBSD 9.0 - CURRENT for userland apps

2010-10-19 Thread Robert Watson


On Tue, 19 Oct 2010, Rui Paulo wrote:

you think adding pgsql to wheel might help? cc freebsd-security@ and see 
their opinion about the topic.


dof needs to inject the probes in /dev/dtrace/helper, so the user needs rw 
access to the /dev/dtrace/helper. I specifically added write access to the 
wheel group for this.


In the medium term, part of the solution here will be to finish adding a 
role-based privilege system.  I had this on my todo list for 8.0 but didn't 
manage to get it finished.  With any luck, it will make 9.0 in plenty of time. 
This would allow specific kernel privileges to be delegated to specific users 
and groups (among other things).  Many of the kernel changes to support this 
have been done since 7.0 when I added priv(9), but we've not yet selected a 
specific policy and API to bind to them.  Some appliances are already using 
priv(9) via extensible MAC modules to delegate privilege, but for a role-based 
privilege system, I think a tighter integration is preferable (especially in 
light of the risks from composing incorrectly with the root user model).


In some sense, however, a privilege system is also exactly the wrong answer. 
Ideally, you should be able to run dtrace on any process that you have 
debugging rights on, which is calculated with respect to the credentials of 
the two processes involved (subject and object).  You might also reasonably 
key certain kernel probes, such as systrace probes, to the same authorization 
scheme.  The remainder of kernel tracing presumably should remain a privilege, 
as should the use of kernel probes.


In general, I would prefer it if the kernel didn't know any more about 
specific users and groups than it already does -- in practice, this is 
somewhat unavoidable due to the way we do devfs, but minimizing it would be a 
good idea.  In the past, where we have had special things we need to delegate 
that bypass some but not all system integrity protections (such as shutdown, 
reboot, and backup), we've assigned them via the operator group, FYI.


Robert

Re: www/chromium crashing whole system

2010-11-13 Thread Robert Watson



On Sat, 13 Nov 2010, Alexander Best wrote:

i tried detaching and attaching my keyboard after chromium crashed my 
system and the lights of the keyboard didn't even go on. so in fact 
everything crashed and not just X.
In case I was unclear, let me repeat: the usermode crash dump you got 
probably has nothing in common with the kernel issue.


oh sorry. indeed i misunderstood you there. well i guess this is the problem 
most regular users have. we don't own any serial/firewire consoles. all i 
can offer is to add kernel OPTIONS. however none of them seem to be able to 
prevent the lock up and instead let me enter the debugger or trigger a 
kernel core dump.


i even have watchdog running, but without any success. i guess all i can hope 
for is that maybe at some point a kernel dump does make it to disk.


Do you have a second box you can run X11 on, and SSH into the box that will 
run Chromium?  If it is really Chromium triggering the crash, this might allow 
you to access the console when it crashes.  However, if it's Chromium 
triggering an X11-related crash, it might well not.  (Also, it might well not 
because of timing differences, but it is worth a try).


Another thing to consider is starting Chromium when switched to a text virtual 
console from X11, which would leave you in text mode for DDB, or at least let 
you see something interesting on the console.


If regular crashdumps appear unreliable, try setting up a textdump with an 
automatic reboot; that might prove more reliable (small chance, but it 
could).


Robert

Re: www/chromium crashing whole system

2010-11-13 Thread Robert Watson



On Sat, 13 Nov 2010, Garrett Cooper wrote:


   Isn't there also DEADLKRES that might be helpful in this case (if
Alex is really dealing with a livelock in the kernel)...?


The deadlock resolver is compiled into the GENERIC kernel on -CURRENT, so I'm 
assuming it hasn't helped (or perhaps is even part of the problem).  I think 
the best thing to do at this point is to try to get into DDB.  Of the schemes 
I suggested to work around the X11 issue, switching to a virtual console 
before starting Chromium may work best, since it will continue to use the 
local X server, etc.


Robert

Re: [head tinderbox] failure on powerpc64/powerpc

2010-12-22 Thread Robert Watson



On Tue, 21 Dec 2010, Mike Tancsa wrote:

I think Tinderbox has a bad source tree.  Lines 557 and 569 make sense from 
the old version of kern_fail.c before either of my commits.  So is 
Tinderbox somehow building with an old kern_fail.c but an updated 
sys/fail.h?  That would explain the build error, but I have no idea how it 
could have gotten into such a situation.


Sometimes the Subversion->CVS exporter becomes upset.  It would be worth 
checking out the CVS version of those files and making sure they contain what 
you expect -- if not, it could well be a problem there.


Robert

It updates from my local mirror which updates from cvsup18.  The last update 
was


CVSup update ends at 2010-12-21 16:44:39
CVSup update begins at 2010-12-21 17:43:00
Updating from cvsup18.freebsd.org
Connected to cvsup18.freebsd.org
Updating collection cvs-all/cvs
Append to CVSROOT-ports/commitlogs/ports
Append to CVSROOT-src/commitlogs/sys
Edit ports/devel/p5-DateTime-Format-Strptime/Makefile,v
Edit ports/devel/p5-DateTime-Format-Strptime/distinfo,v
Edit ports/multimedia/playd/Makefile,v
Edit ports/multimedia/playd/distinfo,v
Edit src/sys/nfsserver/nfs_srvsubs.c,v
src/sys/nfsserver/nfs_srvsubs.c,v: Checksum mismatch -- will transfer
entire file
Edit src/sys/sparc64/include/cpufunc.h,v
src/sys/sparc64/include/cpufunc.h,v: Checksum mismatch -- will transfer
entire file
Edit src/sys/sparc64/include/vmparam.h,v
src/sys/sparc64/include/vmparam.h,v: Checksum mismatch -- will transfer
entire file
Edit src/sys/sparc64/sparc64/tick.c,v
src/sys/sparc64/sparc64/tick.c,v: Checksum mismatch -- will transfer
entire file
Edit src/sys/sys/mount.h,v
src/sys/sys/mount.h,v: Checksum mismatch -- will transfer entire file
Skipping collection gnats/current
Skipping collection www/current
Skipping collection mail-archive/current
Updating collection distrib/self
Applying fixups for collection cvs-all/cvs
Fixup src/sys/nfsserver/nfs_srvsubs.c,v
Fixup src/sys/sparc64/include/cpufunc.h,v
Fixup src/sys/sparc64/include/vmparam.h,v
Fixup src/sys/sparc64/sparc64/tick.c,v
Fixup src/sys/sys/mount.h,v
Finished successfully
CVSup update ends at 2010-12-21 17:45:11





Re: TCP resident expert?

2011-01-16 Thread Robert Watson



On Sat, 15 Jan 2011, William Allen Simpson wrote:

Who's the kernel expert on TCP around here?  ISC wants me to port TCPCT to 
FreeBSD.  Although I've joined this list (some time ago), I've not seen any 
traffic discussing TCP'ish things.  Need somebody willing to walk me through 
the processes and check my code.


I don't think there's any single "the" expert -- rather, work on TCP is 
distributed over a number of developers who take various interests in the 
topic.  At the risk of pointing fingers:


Lawrence Stewart has recently been involved in pluggable 
congestion control, new congestion control algorithms, TCP tracing, and 
various other things, and has been among our most active hands in TCP for the 
last year especially.  He might be the best first port of call because of this 
recent activity.


Rui Paulo did our TCP ECN support.

I've had my hands in TCP data structure/locking/etc on several occasions in 
the last couple of years, especially relating to SMP scalability, and most 
recently, TCP connection CPU affinity and hardware-driven load balancing (RSS, 
etc) as part of work for Juniper.


Andre Oppermann has done significant work on features like TSO, LRO, 
timers, etc in the last couple of years, and before that reworked our TCP 
syncache implementation (so might be of particular interest).


Drew Gallatin was the originator of our LRO code as part of his 
work at Myricom, and has taken a more general interest in stack performance.


Kip Macy (kmacy@) did our TCP offload implementation as part of work for 
Chelsio.


George Neville-Neil has been involved in TCP regression testing, as 
well as other TCP-related problems in the data centre.


Bjoern Zeeb has been involved in our ongoing network stack 
virtualisation project, and has of necessity had his hands dirty in TCP.


And I feel certain there are others who, entirely accidentally and much to my 
embarrassment, I have omitted.


As Doug points out, however, the best way to reach folks interested in TCP is 
via the freebsd-net@ mailing list, as people come and go some over time, and 
taking any questions to that list will let the answers get archived.  Also, as 
people do come and go, the mailing list may help your requests not be dropped 
:-).


(I've CC'd that list)

Robert

Re: Ethernet Drivers: Question on Sending Received Packets to the FreeBSD Network Stack

2011-02-04 Thread Robert Watson



On Thu, 3 Feb 2011, Julian Elischer wrote:


On 2/3/11 10:08 AM, David Somayajulu wrote:
Hi All, While sending the Received Ethernet Frames (non - LRO case) to the 
FreeBSD Network Stack via (struct ifnet *)->if_input((struct ifnet *), 
(struct mbuf *));


Is it possible to send multiple Ethernet Frames in a single invocation of 
the above callback function?


In other words should (struct mbuf *) above always correspond to a single 
Ethernet Frame? I am not sure if I missed something, but I gathered from a 
quick perusal of ether_input() in net/if_ethersubr.c, that only ONE 
Ethernet Frame may be sent per callback.


yes only one. the linkages you see in the mbuf definition are for when you 
are putting it into some queue (interface, socket, reassembly, etc).


I had never considered passing a set of packets, but after my initial 
scoffing thoughts I realized that it would actually be a very interesting 
thought experiment to see if the ability to do that would be advantageous in 
any way. It may be a way to reduce some sorts of overhead if using interrupt 
mitigation.


This was discussed quite a lot at the network-related devsummit sessions a few 
years ago.  One idea that was bandied about was introducing an mbuf vector 
data structure (I'm sure at least two floated around in Perforce at some 
point, and another ended up built into the Chelsio driver I think).  The idea 
being that indirection through queues as with mbufs is quite inefficient when 
you want to pass them around in sets when the set may not be in the cache. 
Instead, vectors of mbuf pointers would be passed around, each entry being a 
chain representing a packet.
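
A toy userland sketch of the shape of such a structure, under the assumption of a fixed-capacity vector: each slot points at the head of one packet's buffer chain, so a set of packets is a contiguous array of pointers rather than a linked queue. `pkt_buf`, `pkt_vec` and `PKTVEC_MAX` are hypothetical names; none of this is the Perforce or cxgb code.

```c
#include <assert.h>
#include <stddef.h>

#define PKTVEC_MAX 16 /* hypothetical capacity of one vector */

/* Stand-in for an mbuf: one buffer in a packet's chain. */
struct pkt_buf {
	struct pkt_buf *next; /* next buffer of the same packet, or NULL */
	size_t len;           /* bytes of data in this buffer */
};

/* A set of packets passed around together; each entry is a chain head. */
struct pkt_vec {
	size_t count;
	struct pkt_buf *pkts[PKTVEC_MAX];
};

/* Append one packet (chain) to the vector; -1 if the vector is full. */
static int
pktvec_add(struct pkt_vec *v, struct pkt_buf *chain)
{
	assert(chain != NULL);
	if (v->count == PKTVEC_MAX)
		return (-1);
	v->pkts[v->count++] = chain;
	return (0);
}

/* Total length of one packet, walking its buffer chain. */
static size_t
pkt_length(const struct pkt_buf *chain)
{
	size_t total = 0;

	for (; chain != NULL; chain = chain->next)
		total += chain->len;
	return (total);
}

/* Total bytes across every packet in the vector. */
static size_t
pktvec_bytes(const struct pkt_vec *v)
{
	size_t total = 0;

	for (size_t i = 0; i < v->count; i++)
		total += pkt_length(v->pkts[i]);
	return (total);
}
```

Iterating the packet set touches only the pointer array until a given packet is actually processed, which is the cache argument made above.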


I think one reason that idea never really went anywhere was that the use cases 
were fairly artificial for anything other than link layer bridging between 
exactly two interfaces or systems with exactly one high-volume TCP connection. 
In most scenarios, packets may come in small bursts going to the same 
destination (etc), as is exploited by LRO, but not in a way that you're able 
to maintain passing around in sets that remain viable as you get above the 
link layer.  I seem to recall benchmarking one of the prototypes and finding 
that it increased the working set on memory noticeably since it effectively 
meant much more queueing was taking place, whereas our current direct dispatch 
model helped latency a great deal.  It could be that with more deferred 
dispatch in a parallel setting, it helps, but you'd definitely want to take a 
measurement-oriented approach in looking at it any further.


(For bridged ethernet filtering devices that act as a "bump in the wire", it 
might well prove a reasonable performance optimisation.  I'm not sure if the 
cxgb mvec implementation would be appropriate to this task or not.)


Robert

Re: DTrace Broken?

2011-02-20 Thread Robert Watson


On Fri, 18 Feb 2011, Shawn Webb wrote:


Hey fellow current users,

Looks like dtrace is broken in current:

# dtrace -l -f acl
dtrace: invalid probe specifier acl: "/usr/lib/dtrace/psinfo.d", line 37: syntax error near "uid_t"


Error messages along these lines almost always mean that the kernel was built 
without WITH_CTF (causing dtrace to be unable to find the type information it 
requires).


Robert



Line 37 shows:
  uid_t   pr_uid; /* real user id */

Looks good to me, but why is dtrace complaining?

Thanks,

Shawn Webb

Re: 5.1 setfacl problem

2003-07-19 Thread Robert Watson

On Sat, 19 Jul 2003, Branko F. Gračnar wrote:

> Hi there! 
> 
> I'm running 5.1 on i386 platform and i have silly problem with acls. 
> 
> I have disks mounted with acl option (of course they are formatted with
> ufs2)  and acls generally work okay. 
> 
> But when i try to set default directory acl entry i get 'Invalid
> argument' error. 
> 
> Here is example command usage: 
> 
> # setfacl -dm m::rwx,u:some_user:rwx test_directory
> setfacl: acl_set_file() failed for test_directory: Invalid argument
> 
> This is really annoying... 
> 
> Any ideas, how to solve this? 

POSIX.1eD17 23.1.3 requires that default ACLs have the same minimum
entries as an access ACL, meaning that all default ACLs must contain at
least object owner, object group, and other fields.  If you have extended
entries, you must also have a mask field.  If the test_directory above
doesn't already have an ACL on it to modify, the command you're using will
specify what POSIX.1e considers an incomplete ACL, which is rejected.  Try
using:

  setfacl -dm u::rwx,g::rx,o::rx,u:some_user:rwx,m::rwx test_directory

and see if that works better for you.  If so, that was probably the
problem.  I haven't checked to see if other implementations have different
interpretations of POSIX.1e, or bend the rules in various ways, but they
might well do.  We could, in theory, weaken the rules, but the logic to
combine partial default ACLs, requested creation mode, and umask would be
complicated...
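
As a rough illustration of the minimum-entry rule described above, here is
a small Python sketch.  It is not taken from setfacl or libc: the helper
name missing_entries and its parsing rules are assumptions made for this
example, and it only checks which required entries are absent from a
comma-separated entry list in tag:qualifier:perms form.

```python
# Illustrative only: models the POSIX.1e minimum-entry rule described above.
# The function name and the parsing details are assumptions, not libc code.
TAGS = {"u": "user", "user": "user", "g": "group", "group": "group",
        "o": "other", "other": "other", "m": "mask", "mask": "mask"}


def missing_entries(acl_text):
    """Return the required entries that a comma-separated ACL spec lacks."""
    present = set()
    has_extended = False
    for entry in acl_text.split(","):
        fields = entry.strip().split(":")
        if len(fields) != 3 or fields[0] not in TAGS:
            raise ValueError("malformed ACL entry: %r" % entry)
        tag, qualifier = TAGS[fields[0]], fields[1]
        if tag in ("user", "group") and qualifier:
            has_extended = True          # named user or named group
        elif tag in ("user", "group"):
            present.add(tag + "_obj")    # owner / owning group
        else:
            present.add(tag)             # other or mask
    required = {"user_obj", "group_obj", "other"}
    if has_extended:
        required.add("mask")             # extended entries need a mask
    return sorted(required - present)


print(missing_entries("m::rwx,u:some_user:rwx"))
print(missing_entries("u::rwx,g::rx,o::rx,u:some_user:rwx,m::rwx"))
```

The first call reports the owner, group, and other entries as missing,
which is the situation when the directory has no default ACL yet; the
second reports nothing missing.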

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories


Re: 5.1 setfacl problem

2003-07-20 Thread Robert Watson


On Sun, 20 Jul 2003, Branko F. Gracnar wrote:

> >specify what POSIX.1e considers an incomplete ACL and rejects.  Try using:
> >
> > setfacl -dm u::rwx,g::rx,o::rx,u:some_user:rwx,m::rwx test_directory

Looks like you might have some typos:

> # setfacl -dm u::rwx,g::rx,o:rx,g:some_group:rwx,m:rwx test_directory
                             ^^^^                  ^^^^^
                             o::rx                 m::rwx

Try with those changes and let me know if it's still causing problems.

> setfacl: acl_from_text() failed: Invalid argument
> 
> ...
> 
> Brane
> 
> 


Re: Possible problem with ACL masks and getfacl (fwd)

2003-07-24 Thread Robert Watson


On Thu, 24 Jul 2003, Glen Gibb wrote:

> Whoops - it helps if I attach the patch :)

Glen,

This looks good to me -- I've committed the patch.  If you pick up
acl_to_text.c:1.11, it should have it.  Let me know if there are any
problems. 

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories



"authenticated tftp"

2003-07-25 Thread Robert Watson


Yeah, seems like an oxymoron, but this is a legitimate question, I
promise.  My Linksys wireless router requires me to disable the admin
password on it to tftp a firmware update to it--however, the Windows tftp
client that Linksys ships appears to support some form of "Oh yeah, and
here's a password".  It probably really doesn't make a difference
security-wise, but it would be a lot more convenient to update wireless
routers if our tftp client spoke whatever extension they use to carry the
password.  Does anyone know anything about that protocol extension, or if
there are existing tweaks to add it to our tftp?  (I saw nothing in the
man page).  If there's a pointer to the on-the-wire bits, I can always
stick it in myself, but I have yet to find one.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories


Re: LOR with filedesc structure and Giant

2003-07-28 Thread Robert Watson

On Sun, 27 Jul 2003, Kris Kennaway wrote:

> After upgrading last night, one of the package machines found this:

I've bumped into some similar problems -- it's a property of how we
currently lock select().  We hold the file descriptor lock for the duration
of polling each object being "selected", and if any of those objects has
to grab a lock for any reason, it has to implicitly fall after the file
descriptor lock.  I actually run into this in some of our MAC code,
because I need to grab a vnode lock to authorize polling the vnode using
VOP_POLL(), and since the vnode lock is a sleep lock, this generates a
WITNESS warning.  Unfortunately, it's not immediately clear what a better
locking scheme would look like without going overboard on the fine-grained
side.  We probably need to grab Giant before entering the select code
since it's highly likely something in there will require Giant -- it
reaches down into VFS, the device stuff, socket code, etc.

> lock order reversal
>  1st 0xc6c1c334 filedesc structure (filedesc structure) @
> /a/asami/portbuild/i386/src-client/sys/kern/sys_generic.c:902
>  2nd 0xc04aa120 Giant (Giant) @
> /a/asami/portbuild/i386/src-client/sys/fs/specfs/spec_vnops.c:372
> Stack backtrace:
> backtrace(c043d4af,c04aa120,c0439aa4,c0439aa4,c0434e3d) at backtrace+0x17
> witness_lock(c04aa120,8,c0434e3d,174,1bc) at witness_lock+0x672
> _mtx_lock_flags(c04aa120,0,c0434e3d,174,c043daba) at _mtx_lock_flags+0xba
> spec_poll(d8dddaf8,d8dddb18,c02d119c,d8dddaf8,c04939a0) at spec_poll+0x134
> spec_vnoperate(d8dddaf8,c04939a0,c520b124,40,c675e300) at spec_vnoperate+0x18
> vn_poll(c44c5e14,40,c675e300,c6222d10,c675e300) at vn_poll+0x3c
> selscan(c6222d10,d8dddb98,d8dddb88,6,4) at selscan+0x13e
> kern_select(c6222d10,6,bfbff5c0,0,0) at kern_select+0x36f
> select(c6222d10,d810,c0455899,3ee,5) at select+0x66
> syscall(2f,2f,2f,8055050,bfbff5b8) at syscall+0x273
> Xint0x80_syscall() at Xint0x80_syscall+0x1d
> --- syscall (93), eip = 0x280ccacc, esp = 0x2832eb68, ebp = 0x2832ebc0 ---
> Debugger("witness_lock")
> Stopped at  Debugger+0x54:  xchgl   %ebx,in_Debugger.0
> 
> Kris


5.2-RELEASE TODO

2003-07-29 Thread Robert Watson

Issue:        Merge of Darwin msdosfs, other fixes
Status:       --
Responsible:  --
Description:  [...] Darwin operating system has fairly extensive
              improvements to msdosfs and other kernel services; these
              fixes must be reviewed and merged to the FreeBSD tree.

Issue:        sparc64 adaptation of syscons
Status:       In progress
Responsible:  Jake Burkholder
Description:  Port syscons to sparc64. Add device drivers for sun mice
              and keyboards. Allow for more than 3 bits of background
              colour in syscons. Creator frame buffer device driver. In
              the process, generally improve the MI-ness of syscons.

Issue:        ACL_MASK override of umask support in UFS
Status:       In progress
Responsible:  Robert Watson
Description:  Many systems supporting POSIX.1e ACLs permit a minor
              violation to that specification, in which the ACL_MASK
              entry overrides the umask, rather than being intersected
              with it. The resulting semantics can be useful in
              group-oriented environments, and as such would be very
              helpful on FreeBSD.

Issue:        Fine-grained network stack locking without Giant
Status:       In progress
Responsible:  Jeffrey Hsu, Seigo Tanimura
Description:  Significant parts of the network stack (especially IPv4
              and IPv6) now have fine-grained locking of their data
              structures. However, it is not yet possible for the netisr
              threads to run without Giant, due to dependencies on
              sockets, routing, etc. A 5.2-RELEASE goal is to have the
              network stack running largely without Giant, which should
              substantially improve performance of the stack [...]

Re: STEP 2, fixing dhclient behaviour with multiple interfaces

2003-07-29 Thread Robert Watson


On Tue, 29 Jul 2003, Terry Lambert wrote:

> Martin Blapp wrote:
> > It is my goal to make dhclient really functional, so it can not only
> > be used with one interface, but several.
> > 
> > On a well known OS this works just fine. A first interface gets
> > initialized and the GW gets set as usual. But if a second interface
> > gets added, and the first one is still active and has a working lease,
> > the GW will not be overwritten. If you remove now the first interface,
> > the default GW changes to the one of the second interface.
> [ ... ]
> > If there are other ideas, I'm open to them.
> 
> You could add kevents for interface arrival and departure, and add a
> kqueue to the dhcpd to catch the arrival/departure events, and then just
> act on them. 

Some of those events already exist for routing sockets, so in a worst case
scenario, you can hook up a routing socket to a kqueue :-).

Martin -- you might want to try the "route monitor" command sometime and
take a look at the event stream there for things to consider.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories



Re: [PATCH] jail NG script patch for mounting devfs and procfs automatically

2003-07-29 Thread Robert Watson

On Tue, 29 Jul 2003, Jens Rehsack wrote:

> I updated the rcng jail start script to mount devfs and procfs into the
> jail if wanted. Adding entries to /etc/fstab didn't work properly,
> because the jail filesystem wasn't mounted when the startup process
> wants to mount it. 
> 
> Going this way allows us to control which jail could be used via ssh (or
> another remote shell), too. 
> 
> Any comments gladly welcome. 
> 
> If it's useful for FreeBSD, I will write the rc.conf(5) update, too.
> Please inform me to do this. 

Neat.

Someone, and unfortunately I appear to have lost track of who, had some
tweaks to the rcNG scripts to set up some reasonable devfs rules for a
jail, and apply them to the devfs mounted in a jail.  Otherwise, you risk
exposing "undesired" device nodes to the virtual environment.  I suspect a
search of the -current archives will turn up who, but I think a necessary
part of a solution here will be to make sure jails are set up with the
right devfs contents. 

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories


Re: STEP 2, fixing dhclient behaviour with multiple interfaces

2003-07-29 Thread Robert Watson


On Tue, 29 Jul 2003, Terry Lambert wrote:

> > Some of those events already exist for routing sockets, so in a worst case
> > scenario, you can hook up a routing socket to a kqueue :-).
> > 
> > Martin -- you might want to try the "route monitor" command sometime and
> > take a look at the event stream there for things to consider.
> 
> Does that work if you don't have an IP address assigned to the interface
> at all yet?  I was under the impression that it only sent out route
> change events (maybe I need to update my copy of the -current sources,
> though).  What I was talking about is the idea that naked interface
> (0.0.0.0) arrivals and departures could be signalled, which would cause
> dhclient to try to get a lease on the interface. 

got message of size 24 on Tue Jul 29 13:27:59 2003
RTM_IFANNOUNCE: interface arrival/departure: len 24, if# 6, what: arrival

got message of size 96 on Tue Jul 29 13:28:45 2003
RTM_IFINFO: iface status change: len 96, if# 6, flags:

got message of size 24 on Tue Jul 29 13:28:45 2003
RTM_IFANNOUNCE: interface arrival/departure: len 24, if# 6, what: departure

The event that you currently have to get using kqueue() is the link state,
which isn't announced using routing sockets.  If only for consistency, I'd
like it if there were an ifnet level announcement in routing sockets for a
link state change on capable interfaces.
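
For anyone who wants to experiment before writing routing socket code, a
minimal Python sketch along these lines could pick the arrival and
departure announcements out of "route monitor" output.  The regular
expression is written against the sample output quoted above, so the
exact line format is an assumption, and interface_events is just a name
chosen for this example.

```python
import re

# Pattern written against the sample "route monitor" output quoted above;
# the exact output format is an assumption and may vary between releases.
ANNOUNCE = re.compile(
    r"^RTM_IFANNOUNCE: .*\bif# (\d+), what: (arrival|departure)$")


def interface_events(lines):
    """Yield (interface index, 'arrival' or 'departure') for announcements."""
    for line in lines:
        match = ANNOUNCE.match(line.strip())
        if match:
            yield int(match.group(1)), match.group(2)


sample = [
    "got message of size 24 on Tue Jul 29 13:27:59 2003",
    "RTM_IFANNOUNCE: interface arrival/departure: len 24, if# 6, what: arrival",
    "",
    "got message of size 96 on Tue Jul 29 13:28:45 2003",
    "RTM_IFINFO: iface status change: len 96, if# 6, flags:",
    "RTM_IFANNOUNCE: interface arrival/departure: len 24, if# 6, what: departure",
]
print(list(interface_events(sample)))
```

Feeding the function the lines of a live "route monitor" pipe instead of
the sample list would give a stream of events to act on, for example by
starting dhclient on arrival.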

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories


Re: STEP 2, fixing dhclient behaviour with multiple interfaces

2003-07-29 Thread Robert Watson

On Tue, 29 Jul 2003, Daniel C. Sobral wrote:

> > You could add kevents for interface arrival and departure, and
> > add a kqueue to the dhcpd to catch the arrival/departure events,
> > and then just act on them.
> 
> Instead of just adding the stuff to devd? 

Currently, devd is in the business of dealing with attachment and removal
from the hardware management subsystem.  Network subsystem events, such as
"interface has arrived" are semantically different, but "close enough" in
many cases.  In the past, routing sockets have been the means by which
topology-relevant changes are announced to the user processes.  More
recently, kqueue has permitted monitoring of a plethora of event types.  I
think there's a decent argument for a neteventd, perhaps integrated as a
thread into devd, listening on network events rather than device
attach/detach events.  The only real problem is that it would be very nice
if the DHCP client code were available in a library so it could be linked
into a network event manager. 

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories


Re: STEP 2, fixing dhclient behaviour with multiple interfaces

2003-07-30 Thread Robert Watson

On Tue, 29 Jul 2003, Terry Lambert wrote:

> This is part of the problem.  The other parts are that this is really
> networking code, and should be a separate thing, if possible, and, as
> Martin just pointed out, the OMAPI stuff is not really cooked yet.

Hence the notion of a neteventd -- I don't really mind what container it's
in, as long as it's structured around the network-centric pieces.

> It's really a lot easier to process a small list of events in dhacpd as
> a result of a kqueue or kqueue/select combo, if you want to avoid
> rewriting as much code as humanly possible, and still be able to pull
> this feature out of the project.

Absolutely agreed.  If OMAPI is working well, that may well provide a lot
of what we want.  One benefit to central management of the network
configuration would be that we'd get things like GUI control and profiles
a lot more easily.

> I still haven't been able to repeat your test; are you sure you are
> listening on a routing socket for the configuration change events? 
> Maybe I'm doing something silly with my dumb little test program that
> you aren't doing with yours?  I'm not seeing my Linksys my 3COM
> interfaces showing up and disappearing as kevents, but they are
> definitely still being seen by the laptop.  Maybe it's my local hacks to
> make it work at all (it's an older Sony VAIO PCG-XG29). 

I can't speak to your configuration, but I can describe mine :-).  I've
tested two different situations with "route monitor": creation and removal
of virtual interfaces (in this case, vlan interfaces), and of physical
interfaces (in this case, the wi0 card in my notebook).  Both appear to be
generating proper notifications during arrival and departure.  The only
events that I expected to see, but didn't, were IPv6 link layer address
arrival and departure.  I ran these tests with 5.0-CURRENT from around
July 3, but I seem to recall seeing this work properly previously.  I
don't have any RELENG_4 boxes with removable physical interfaces at this
point. 

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories


Re: panic: spin lock sched lock held

2003-07-31 Thread Robert Watson

On Thu, 31 Jul 2003, Lars Eggert wrote:

> got this panic overnight. Machine was wedged solid and didn't enter ddb
> after the panic. I'll recompile with verbose diagnostics and see if it
> happens again. In the meantime, maybe the message will give someone a
> clue: 
> 
> panic: spin lock sched lock held by 0xca462390 for > 5 seconds
> cpuid = 0; lapic.id = 
> Debugger("panic")

If this is reproducible, you might try setting 'debug.trace_on_panic' so
that you get an automatic trace even if you can't get into DDB.  This
might or might not work, but it's worth a try.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories



5.2-RELEASE TODO

2003-08-01 Thread Robert Watson

Issue:        Merge of Darwin msdosfs, other fixes
Status:       --
Responsible:  --
Description:  [...] Darwin operating system has fairly extensive
              improvements to msdosfs and other kernel services; these
              fixes must be reviewed and merged to the FreeBSD tree.

Issue:        sparc64 adaptation of syscons
Status:       In progress
Responsible:  Jake Burkholder
Description:  Port syscons to sparc64. Add device drivers for sun mice
              and keyboards. Allow for more than 3 bits of background
              colour in syscons. Creator frame buffer device driver. In
              the process, generally improve the MI-ness of syscons.

Issue:        ACL_MASK override of umask support in UFS
Status:       In progress
Responsible:  Robert Watson
Description:  Many systems supporting POSIX.1e ACLs permit a minor
              violation to that specification, in which the ACL_MASK
              entry overrides the umask, rather than being intersected
              with it. The resulting semantics can be useful in
              group-oriented environments, and as such would be very
              helpful on FreeBSD.

Issue:        Fine-grained network stack locking without Giant
Status:       In progress
Responsible:  Jeffrey Hsu, Seigo Tanimura
Description:  Significant parts of the network stack (especially IPv4
              and IPv6) now have fine-grained locking of their data
              structures. However, it is not yet possible for the netisr
              threads to run without Giant, due to dependencies on
              sockets, routing, etc. A 5.2-RELEASE goal is to have the
              network stack running largely without Giant, which should
              substantially improve performance of the stack [...]

Re: Problems with bktr on -current

2003-08-03 Thread Robert Watson

On Sun, 3 Aug 2003, Guido Berhoerster wrote:

> I've got some trouble with the bktr-driver on FreeBSD 5.x. With fxtv the
> video-output is distorted and choppy, it appears that only odd scanlines
> are redrawn regularly while even scanlines remain for like half a second
> as "ghost images". When the fxtv window is overlapped by some other
> window the video is only updated about every 30 seconds. When using
> mplayer's bsdbt848-driver I get an undistorted image but also choppy
> video.  I wasn't able to test it with xawtv since it's still broken on
> 5.x. 

Interesting.  I've also seen some visual peculiarities with fxtv lately on
my 5.1-CURRENT box under similar circumstances: if I move windows around,
sometimes video garbage is left behind -- especially alternating
scanlines.  I haven't tried backing out to earlier versions, though, so
I'll give that a try and see if the problem goes away.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories


Change in application of default ACLs in UFS

2003-08-03 Thread Robert Watson


Just an FYI to users of ACLs on UFS -- I've modified the semantics of the
application of the default ACL in combination with the umask.  The result
is that the application of default ACLs is now more conservative than
previously, so you may want to keep an eye out and make sure all the ACLs
still mean what you thought they meant.

I'm still exploring what the best default ACL semantics to use are --
we're now implementing POSIX.1e "as spec" (a bitwise AND of the
ACL-derived mode and the requested creation mode).  It's worth observing
this is not quite the same semantics as Solaris and Linux, in which the
ACL mask overrides the umask.  I have an ACL development branch in
Perforce where I'm experimenting with these semantics, and will probably
merge support for that prior to 5.3, probably as an option.
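
To show the difference between the two semantics in numbers, here is a
simplified Python sketch.  It is not the kernel's
acl_posix1e_newfilemode() logic: acl_mode stands for the rwx bits derived
from the default ACL (owner, mask or group, other), and both function
names are assumptions made for this example.

```python
# Illustrative only: a simplified model of the two semantics discussed above.
# acl_mode stands for the rwx bits derived from the default ACL (owner,
# mask-or-group, other); the function names are assumptions.

def mode_posix1e(acl_mode, requested_mode, umask):
    """Intersect the ACL-derived mode with the mode after umask is applied."""
    return acl_mode & requested_mode & ~umask & 0o777


def mode_mask_overrides_umask(acl_mode, requested_mode, umask):
    """Ignore the umask when a default ACL is present."""
    return acl_mode & requested_mode & 0o777


acl_mode = 0o770   # default ACL: u::rwx, mask rwx, o::---
requested = 0o666  # open(..., O_CREAT, 0666)
umask = 0o022

print(oct(mode_posix1e(acl_mode, requested, umask)))
print(oct(mode_mask_overrides_umask(acl_mode, requested, umask)))
```

With a umask of 022 the intersecting semantics yield 0640, while the
override semantics yield 0660, which is why the latter is attractive for
group-oriented setups.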

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories

-- Forwarded message --
Date: Sun, 3 Aug 2003 20:29:13 -0700 (PDT)
From: Robert Watson <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: cvs commit: src/sys/ufs/ufs acl.h ufs_acl.c ufs_vnops.c

rwatson 2003/08/03 20:29:13 PDT

  FreeBSD src repository

  Modified files:
sys/ufs/ufs  acl.h ufs_acl.c ufs_vnops.c 
  Log:
  Now that the central POSIX.1e ACL code implements functions to
  generate the inode mode from a default ACL and creation mask,
  implement ufs_sync_inode_from_acl() using acl_posix1e_newfilemode().
  
  Since ACL_OVERRIDE_MASK/ACL_PRESERVE_MASK are defined, we no
  longer need to explicitly pass in a "preserve_mask" field: this
  is implicit in the use of POSIX.1e semantics.
  
  Note: this change contains a semantic bugfix for new file creation:
  we now intersect the ACL-generated mode and the cmode requested by
  the user process.  This means permissions on newly created file
  objects will now be more conservative.  In the future, we may want
  to provide alternative semantics (similar to Solaris and Linux) in
  which the ACL mask overrides the umask, permitting ACLs to broaden
  the rights beyond the requested umask.
  
  PR: 50148
  Reported by:Ritz, Bruno <[EMAIL PROTECTED]>
  Obtained from:  TrustedBSD Project
  
  Revision  ChangesPath
  1.5   +1 -2  src/sys/ufs/ufs/acl.h
  1.18  +8 -78 src/sys/ufs/ufs/ufs_acl.c
  1.232 +4 -8  src/sys/ufs/ufs/ufs_vnops.c


Re: Any patch for ICMP in a jail?

2003-08-04 Thread Robert Watson

On Mon, 4 Aug 2003, Rus Foster wrote:

> Is there a patch that will allow ping from inside a jail on 5.x? Google
> didn't show anything? 

The problem is that, to generate pings, you have to have access to a raw
socket.  And unfortunately, raw sockets imply access to a lot more than
just the ability to send/receive ICMP: a number of management components
in the IP stack assume that if you have a raw socket, you're also allowed
to configure those components.  Take a look at rip_ctloutput() in raw_ip.c
for some examples.  We have some local in-progress changes to modify this
as part of our capabilities work, but there's no timeline for integrating
it.  The best short-term suggestion would be to write a
privilege-separated ping tool -- a pingd running outside the jail,
providing UNIX domain sockets in each jail that needs the ability to ping; 
ping then becomes a client that RPC's to pingd. 
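
As a sketch of what the narrow interface of such a pingd might look like,
the Python fragment below covers only the request validation step.
Everything in it is hypothetical: the one-line "PING <address> <count>"
request format, the count limit, and the function name are invented for
illustration, and the UNIX domain socket and jail plumbing are left out.

```python
import ipaddress

# Illustrative only: the one-line "PING <address> <count>" request format,
# the count limit, and the function name are all hypothetical.
MAX_COUNT = 10


def build_ping_command(request):
    """Validate a jail's request and return the argv pingd would run."""
    fields = request.split()
    if len(fields) != 3 or fields[0] != "PING":
        raise ValueError("expected: PING <address> <count>")
    address = ipaddress.IPv4Address(fields[1])   # rejects host names, options
    count = int(fields[2])
    if not 1 <= count <= MAX_COUNT:
        raise ValueError("count out of range")
    return ["/sbin/ping", "-c", str(count), str(address)]


print(build_ping_command("PING 192.0.2.1 3"))
```

The point of the design is that the jailed client never holds a raw
socket; it can only ask for the one operation the daemon is willing to
perform on its behalf.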

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories


Re: ACLS on UFS2 from FreeBSD 5.1-RELEASE install.

2003-08-05 Thread Robert Watson

On 2 Aug 2003, Scott M. Likens wrote:

> Has anyone noticed the ACLS being disabled? 
> 
> tunefs -p /dev/da1s1c shows that ACLS are disabled on every partition I
> have, i've gone through them all. 
> 
> any reason why? 

Yes -- they are disabled by default because they're not required by most
users, and as a new (and slightly experimental) feature, involve a
slightly greater risk of problems.  I believe I added support to
sysinstall to enable ACLs during the partition process; if not, you can
enable them later using tunefs.  One of the difficulties associated with
ACLs is that not all applications understand them -- while the failure
mode is predictable and relatively clean, it means that you may sometimes
lose ACLs on objects when they are replaced by an application without ACL
support.  For example, some applications will move a file out of the way
and create a new copy when updating a file -- if they don't understand
ACLs, they can't propagate the ACL from the old object to the new object.
Also, several of the base system utilities (such as mv) don't currently
propagate ACLs.  I hope to fix up a number of them for 5.3, but I suspect
we'll bump into such programs once in a while as we move forwards.

Most of the performance loss associated with ACLs on UFS1 has been
eliminated in UFS2, which is a point in favor of enabling ACLs by
default.

Once they've settled for some time and the feedback is all looking good,
we might choose to enable them by default.  Disabling by default is
consistent with several other systems also supporting ACLs. 

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories


Re: ffsinfo missing in 5.1-RELEASE

2003-08-14 Thread Robert Watson


On Thu, 14 Aug 2003, Lukas Ertl wrote:

> On Wed, 13 Aug 2003, Robert Watson wrote:
> 
> > The only problem with this patch is that we lose the ability to do the
> > "START BLOCK SUMMARY AND POSITION TABLE" display for UFS1.  I'm not sure
> > this is a big issue; I will go ahead and commit it with those #ifdef'd
> > out (rather than removed as is the case in the patch) and look for advice
> > on reintroducing them from someone with more UFS expertise than me.
> 
> I'll have a look at it once it's back in the tree, maybe I find out what
> to do about it. 

It's in my commit queue for today, so hopefully that opportunity will come
quickly :-).  I haven't managed to track down much about cg_blks() yet --
it disappeared with UFS2 coming in, so it will probably just take checking
out an older tree and pushing it forward again for UFS1 dumping.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories



Re: openpam_load_module():no pam_wheel.so found

2003-08-14 Thread Robert Watson


On Mon, 11 Aug 2003, Christoph P. Kukulies wrote:

> > Did you mergemaster when updating last?  pam_wheel has, I believe, been
> > replaced with pam_group.  A coredump is an undesirable result, of course,
> > but I suspect that this is the trigger.  If you want to follow up on the
> > core dump, build a copy of su with debugging symbols, and enable
> > kern.sugid_coredump to get a coredump and stack trace from su (turn it
> > off again when done).
> 
> How can I find out which module is using pam_wheel.so? It is annoying
> not to have a functioning 'su'. Since locate updatedb uses su also I'm
> additionally impeded since I cannot locate libs and stuff efficiently. 

grep '^[^#]*pam_wheel' /etc/pam.d/*

should show you the list.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories



Re: 5.1-R acl problem (again)

2003-08-14 Thread Robert Watson

On Sat, 9 Aug 2003, Branko F. Gračnar wrote:

> Now i create directory /export/a. I want to be owned by root:wheel,
> others will no have any access at all and i want that user branko will
> have rw access to it. 
> 
> # mkdir a
> 
> # getfacl a
> #file:a
> #owner:0
> #group:0
> user::rwx
> group::r-x
> other::r-x
> 
> # setfacl -m u::rwx,g::rx,o::---,u:branko:rwx a
> 
> # getfacl a
> #file:a
> #owner:0
> #group:0
> user::rwx
> user:branko:rwx
> group::r-x
> mask::rwx
> other::---
> 
> (testing as branko - works okay)
> 
> Now, if root creates some files (or dirs) in 'a', owner of that file
> will be root and only standard unix triple acl will be assigned, so that
> user branko will not be able to access that file read/write.
> 
> Well, it seems, that default directory acl need to be set to achieve
> above goal. 
> 
> # setfacl -b a

This strips your extended access ACL from a, so it now just has owner,
group, and other fields; however, there appears to be an inconsistency in
the POSIX.2c spec regarding using -b without -n -- to make all the entries
disappear and not recalculate a mask, you need "-bn".  We might want to
change this behavior.

> # setfacl -dm u::rwx,g::rwx,o::--,u:branko:rwx a

A default ACL should now be set, and will be visible if you use "getfacl
-d a".

> # getfacl a
> #file:a
> #owner:0
> #group:0
> user::rwx
> group::r-x
> mask::r-x
> other::---
> 
> WHOOPS, where is user branko?! Why group's acls was not altered from
> 'r-x' to 'rwx' ?! 

Do you mean to use "getfacl -d" here?  This looks like the correct access
ACL.  Try touch a/b, then getfacl a/b, and you'll see the ACL derived from
the default ACL.

> Ofcourse, trying to access directory 'a' as branko doesn't succeed.

The commands you used denied access to user branko.  In POSIX.1e, there
are two kinds of ACLs: access, and default.  Access ACLs are used for
access control, and default ACLs are used to determine the default and
access ACLs of new objects created in a directory.  So if you create a/b,
b will have the access ACL derived from the default ACL on a.
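
A toy Python model may make the access/default distinction easier to see.
It deliberately ignores the merge with the creation mode and umask, and
the dictionary layout and the create_child name are assumptions made for
this example rather than anything in the ACL library.

```python
# Illustrative only: a toy model of access vs. default ACLs. It ignores the
# merge with the creation mode and umask; names are assumptions.

def create_child(parent, is_directory):
    """Return the ACLs a new object would start with under `parent`."""
    default = parent.get("default")
    if default is None:
        # No default ACL: the child gets only ordinary mode-based permissions.
        return {"access": None, "default": None}
    return {
        # The child's access ACL is derived from the parent's default ACL.
        "access": list(default),
        # Only new directories carry the default ACL forward.
        "default": list(default) if is_directory else None,
    }


directory_a = {
    "access": ["u::rwx", "g::r-x", "o::---"],
    "default": ["u::rwx", "u:branko:rwx", "g::rwx", "m::rwx", "o::---"],
}
print(directory_a["access"])
print(create_child(directory_a, is_directory=False)["access"])
```

The first line printed is what governs access to a itself, and it has no
entry for branko; the second is what a newly created a/b would start
with.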

Note that in 5.1-CURRENT, we've changed the semantics for merging the
umask, creation mode, and default ACL, and will probably tweak them a bit
more, but you should be able to see fairly reasonable default ACL behavior
in 5.1 -- certainly visible behavior.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories


Re: openpam_load_module():no pam_wheel.so found

2003-08-14 Thread Robert Watson

On Sun, 10 Aug 2003, Christoph Kukulies wrote:

> When I do an su command from a normal user on my 5.1-current of
> yesterday I'm getting a segfault/core dump. 
> 
> /var/log/messages then shows:
> Aug 10 15:27:44 kukuboo2k su: in openpam_load_module(): no pam_wheel.so found
> Aug 10 15:27:44 kukuboo2k kernel: pid 54586 (su), uid 0: exited on signal 10 (core dumped)
> 
> I also get this when I try to do a sh /etc/periodic/weekly/310.locate. 
> 
> I'm almost there again with 5.1-current and X on my notebook after a disk
> crash and data loss. Compiled the XFree86-4.3.1 tree and NVIDIA drivers
> today after a complete cvsup and kernel build yesterday. 

Did you mergemaster when updating last?  pam_wheel has, I believe, been
replaced with pam_group.  A coredump is an undesirable result, of course,
but I suspect that this is the trigger.  If you want to follow up on the
core dump, build a copy of su with debugging symbols, and enable
kern.sugid_coredump to get a coredump and stack trace from su (turn it
off again when done).

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories

Re: ffsinfo missing in 5.1-RELEASE

2003-08-14 Thread Robert Watson

On Wed, 13 Aug 2003, Ulrich Spoerlein wrote:

> On Wed, 13.08.2003 at 14:32:32 +0200, Lukas Ertl wrote:
> > > I just used growfs (nice tool btw) and noticed that growfs(8) has a
> > > reference to ffsinfo(8). But neither ffsinfo(8) nor the binary are
> > > present on my 5.1 System.
> > 
> > I already submitted a PR bin/53517, which has a patch that repairs ffsinfo
> > on -CURRENT.
>  
> 
> Excellent!
> 
> 
> Sorry for the noise. 

The only problem with this patch is that we lose the ability to do the
"START BLOCK SUMMARY AND POSITION TABLE" display for UFS1.  I'm not sure
this is a big issue; I will go ahead and commit it with those #ifdef'd
out (rather than removed as is the case in the patch) and look for advice
on reintroducing them from someone with more UFS expertise than me.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories

Re: ACLS on UFS2 from FreeBSD 5.1-RELEASE install.

2003-08-14 Thread Robert Watson

On Fri, 8 Aug 2003, Terry Lambert wrote:

> "Daniel C. Sobral" wrote:
> > You'll also notice I'm not questioning the _existence_ of ACL. My point
> > is that FreeBSD is Unix (no matter what the lawyers say), and people
> > don't usually think of ACL when they think of Unix. Ergo, enabling ACL
> > by default violates POLA.
> 
> Not if you never *set* an ACL on anything.  It's only when there are
> ACL's set on things that POLA may be violated. 
> 
> One presumes that an ACL has to be set on purpose... 

Well, I think it's more a question of risk with a new feature: it is true
that the intended semantics of the POSIX.1e ACLs is that they are 100%
compatible: if you don't have any default or extended ACLs, you should get
permissions equivalent to not using ACLs.  However, ACLs both rely on UFS2
EAs, which are a new feature, and include a substantial chunk of logic. 
This suggests that for users never using ACLs, there's a lower risk (in
terms of security and stability) by disabling them by default.

There's also a small potential performance cost associated with ACLs: you
have to access the EAs (generally cheap on UFS2) and do a bit more memory
allocation and evaluation.  When we ran our original ACL performance
benchmarks with UFS1, the difference was fairly measurable for
directory-intensive create operations (since the worst case involves
accessing two ACLs on a parent directory, and writing two on the child) --
almost all of that cost was the EA cost.  With UFS2, EA contents have much
more locality to the file, make use of the buffer cache more effectively,
etc.  All my performance measurements with MAC have seen the EA cost go
almost to zero with UFS2, but I haven't rerun the ACL performance tests
since the move to UFS2.

There are also some application compatibility concerns, which I think is
where the POLA element comes into play: if your users do start using ACLs,
they may get surprises, which may surprise you :-).

I think that having ACLs as an option is lower risk -- in a few minor
revisions, once we have more deployed experience, and have rerun the
performance tests, and more applications have been adapted (for example,
they get backed up by common backup tools) it should be reasonable to
enable them by default.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories

Re: 5.1-R acl problem (again)

2003-08-14 Thread Robert Watson

On Sun, 10 Aug 2003, Branko F. Gracnar wrote:

> Thanks for quick and very informative answer. 
> 
> You're right about getfacl -d (i used linux + acl patch before, where
> default acls are displayed without any arguments and i didn't read
> getfacl man page). 

Yeah -- the Linux tool implementation is based more on Solaris than on
POSIX.1e.  That has some upsides, and some downsides.  I believe there's
an environment variable you can set on Linux to cause
getfacl/setfacl to behave in strict accordance with the spec
("POSIXLY_CORRECT" or the like). 

> Thanks alot again.

Sure.

> But there is one thing, i don't understand. 
> 
> if i issue the following command:
> 
> setfacl -dm u::rwx,g::rx,o::---,u:branko:rwx,m::rwx  directory
> 
> and then create file under that directory, why getfacl reports:
> 
> #file:a/c
> #owner:0
> #group:0
> user::rw-
> user:branko:rwx # effective: r--
> group::r-x  # effective: r--
> mask::r--
> other::r--
> 
> why is mask just 'r' ?!

One of the contentious issues in the design of POSIX.1e was how to set the
protections on a new object.  There are three variables of interest: the
creation mode requested by a process, the umask of that process, and the
default ACL on the parent directory where the object is being created.  In
5.0-R and 5.1-R, we combine them as follows: we mask all elements of the
creation mode using the umask; we then combine the ACL and combined mode
by converting the default ACL to the access ACL on the new object and
overwriting the access ACL fields with the equivalent fields in the mode.
So in the above example, a mask of r-- is likely a result of the creation
mode and umask having a group mode of 4.
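
As a rough illustration of the 5.0-R/5.1-R combination just described,
here is a minimal Python sketch.  The dictionary layout, the function
names, and the creation mode (0666) and umask (022) are assumptions made
for illustration, not taken from the kernel sources; the default ACL is
the one from the setfacl command quoted above.  The sketch also assumes
that, when a mask entry is present, the group bits of the mode land in
the mask rather than in the group entry, which is consistent with the
getfacl output quoted above.

```python
import copy


def rwx(bits):
    """Render three permission bits as an rwx string, e.g. 5 -> 'r-x'."""
    return "".join(c if bits & b else "-"
                   for c, b in (("r", 4), ("w", 2), ("x", 1)))


def access_acl_replace(default_acl, cmode, umask):
    """Sketch of the 5.0-R/5.1-R behavior described above: mask the
    creation mode with the umask, copy the default ACL, then overwrite
    the matching entries with the resulting mode bits."""
    mode = cmode & ~umask & 0o777
    acl = copy.deepcopy(default_acl)
    acl["user_obj"] = (mode >> 6) & 7
    # Assumption: with a mask entry present, the group bits of the mode
    # replace the mask, not the owning-group entry.
    if acl["mask"] is not None:
        acl["mask"] = (mode >> 3) & 7
    else:
        acl["group_obj"] = (mode >> 3) & 7
    acl["other"] = mode & 7
    return acl


# Default ACL from: setfacl -dm u::rwx,g::rx,o::---,u:branko:rwx,m::rwx
default_acl = {"user_obj": 7, "users": {"branko": 7},
               "group_obj": 5, "mask": 7, "other": 0}

# Assumed request: creation mode 0666 with a umask of 022.
acl = access_acl_replace(default_acl, 0o666, 0o022)
for key in ("user_obj", "group_obj", "mask", "other"):
    print(key, rwx(acl[key]))
print("branko effective", rwx(acl["users"]["branko"] & acl["mask"]))
```

With these assumed inputs the sketch yields user rw-, group r-x, mask
r-- and other r--, which lines up with the getfacl output in the
question.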

In 5.1-CURRENT, we recently switched these semantics to perform a further
intersection of rights in the ACL, rather than a replacement of rights. 
The result is that if the mask in your default ACL is --- and the
combination of creation mode and umask is r--, you get a mask of --- in
the final access ACL.  This implements the algorithm in the POSIX.1e spec
to the letter: at some point, these semantics got changed during a
retrofit of the ACL code, and it wasn't picked up (this might actually
have been after 5.0 but I haven't checked the logs).
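
The intersection variant can be sketched the same way.  This is an
illustrative Python model under the same assumptions as before (the
dictionary layout and function name are made up for the example, and
the group bits are applied to the mask when one is present); it is not
the kernel code.

```python
import copy


def access_acl_intersect(default_acl, cmode, umask):
    """Sketch of the intersection semantics: each ACL entry that has a
    counterpart in the mode is ANDed with it instead of being replaced."""
    mode = cmode & ~umask & 0o777
    acl = copy.deepcopy(default_acl)
    acl["user_obj"] &= (mode >> 6) & 7
    # Assumption: the group bits apply to the mask when one exists.
    if acl["mask"] is not None:
        acl["mask"] &= (mode >> 3) & 7
    else:
        acl["group_obj"] &= (mode >> 3) & 7
    acl["other"] &= mode & 7
    return acl


# The case from the paragraph above: default ACL mask of ---, and a
# creation mode/umask combination whose group bits are r--.
default_acl = {"user_obj": 7, "users": {"branko": 7},
               "group_obj": 5, "mask": 0, "other": 0}
acl = access_acl_intersect(default_acl, 0o644, 0o000)
print("mask bits:", acl["mask"])
```

Under replacement the mask here would have become r--; under
intersection it stays ---, because rights absent from the default ACL
are never added by the mode.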

I'm currently in the throes of implementing a mode of operation which uses
the Solaris/Linux algorithm, which works in the following manner: if a
default ACL is being used to create a new object, the default ACL replaces
the umask, rather than combining with it.  This allows directory default
ACLs to override the umask locally, producing more liberal rights, which
may be what you're expecting.  This is a violation of the spec, but it's a
common violation due to its utility (POSIX.1e doesn't allow the "create
more liberal protections" because it was deemed unsafe).  I hope to finish
prototyping this and get a patch out to the current@ list in the next
couple of weeks.  The complication is that currently, the umask and
requested creation mode are combined at the system call layer, above VFS,
so we need to expose them separately on the entry to the file system.  The
result is that all file systems would now have to combine the two
elements, and it touches a lot of code. 
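
To make the difference concrete, here is a small Python sketch that
compares the two models side by side.  The function, its
acl_overrides_umask switch, the dictionary layout and the example
values (creation mode 0666, umask 077) are all assumptions chosen for
illustration; this is a model of the semantics described above, not a
proposed implementation.

```python
import copy


def split_mode(mode):
    """Split a 9-bit mode into (user, group, other) permission triples."""
    return (mode >> 6) & 7, (mode >> 3) & 7, mode & 7


def create_acl(default_acl, cmode, umask, acl_overrides_umask):
    """Return the access ACL for a new object.

    acl_overrides_umask=False: POSIX.1e-style, the umask always applies.
    acl_overrides_umask=True:  Solaris/Linux-style, a default ACL takes
    the place of the umask.
    """
    if default_acl is None:
        # No default ACL: plain mode bits, and the umask always applies.
        u, g, o = split_mode(cmode & ~umask)
        return {"user_obj": u, "users": {}, "group_obj": g,
                "mask": None, "other": o}
    mode = cmode if acl_overrides_umask else cmode & ~umask
    u, g, o = split_mode(mode)
    acl = copy.deepcopy(default_acl)
    acl["user_obj"] &= u
    if acl["mask"] is not None:
        acl["mask"] &= g
    else:
        acl["group_obj"] &= g
    acl["other"] &= o
    return acl


default_acl = {"user_obj": 7, "users": {"branko": 7},
               "group_obj": 5, "mask": 7, "other": 0}
strict = create_acl(default_acl, 0o666, 0o077, acl_overrides_umask=False)
liberal = create_acl(default_acl, 0o666, 0o077, acl_overrides_umask=True)
print("POSIX.1e-style mask bits:", strict["mask"])
print("Solaris/Linux-style mask bits:", liberal["mask"])
```

With a restrictive umask of 077, the strict model leaves a mask of ---
(so user branko gets nothing), while the override model leaves rw-,
which is the "more liberal rights" case mentioned above.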

Hope this information is useful, and gives you a good picture of where
we're going. 

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories

Re: Change in application of default ACLs in UFS

2003-08-14 Thread Robert Watson

On Wed, 6 Aug 2003, Daniel C. Sobral wrote:

> >   Note: this change contains a semantic bugfix for new file creation:
> >   we now intersect the ACL-generated mode and the cmode requested by
> >   the user process.  This means permissions on newly created file
> >   objects will now be more conservative.  In the future, we may want
> >   to provide alternative semantics (similar to Solaris and Linux) in
> >   which the ACL mask overrides the umask, permitting ACLs to broaden
> >   the rights beyond the requested umask.
> 
> FWIW, I don't like it. This means I'll have to change my umask to o+rw
> for my ACLs to work correctly, since I use ACLs to _give_ rights in ways
> that umask cannot. 

I'm in the throes of implementing changes that push umask processing down
into individual file systems, permitting UFS ACLs to override the umask
using the ACL mask, which would reproduce the Solaris/Linux model
(non-POSIX.1e).  However, there are some interesting implementation
questions here, so it will probably be a bit (perhaps a couple of weeks)
before I have a useful prototype worth reviewing.  I agree that those
semantics are useful, however :-).

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories

5.2-RELEASE TODO

2003-08-15 Thread Robert Watson

 |                     |             |                 | arwin               |
 |                     |             |                 | operating system    |
 |                     |             |                 | has fairly          |
 |                     |             |                 | extensive           |
 | Merge of Darwin     |             |                 | improvements to     |
 | msdosfs, other      | --          | --              | msdosfs and other   |
 | fixes               |             |                 | kernel services;    |
 |                     |             |                 | these fixes must be |
 |                     |             |                 | reviewed and merged |
 |                     |             |                 | to the FreeBSD      |
 |                     |             |                 | tree.               |
 |---------------------+-------------+-----------------+---------------------|
 |                     |             |                 | Port syscons to     |
 |                     |             |                 | sparc64. Add device |
 |                     |             |                 | drivers for sun     |
 |                     |             |                 | mice and keyboards. |
 |                     |             |                 | Allow for more than |
 | sparc64 adaptation  | In progress | Jake Burkholder | 3 bits of           |
 | of syscons          |             |                 | background colour   |
 |                     |             |                 | in syscons. Creator |
 |                     |             |                 | frame buffer device |
 |                     |             |                 | driver. In the      |
 |                     |             |                 | process, generally  |
 |                     |             |                 | improve the MI-ness |
 |                     |             |                 | of syscons.         |
 |---------------------+-------------+-----------------+---------------------|
 |                     |             |                 | Many systems        |
 |                     |             |                 | supporting POSIX.1e |
 |                     |             |                 | ACLs permit a minor |
 |                     |             |                 | violation to that   |
 |                     |             |                 | specification, in   |
 |                     |             |                 | which the ACL_MASK  |
 |                     |             |                 | entry overrides the |
 | ACL_MASK override   | In progress | Robert Watson   | umask, rather than  |
 | of umask support in |             |                 | being intersected   |
 | UFS                 |             |                 | with it. The        |
 |                     |             |                 | resulting semantics |
 |                     |             |                 | can be useful in    |
 |                     |             |                 | group-oriented      |
 |                     |             |                 | environments, and   |
 |                     |             |                 | as such would be    |
 |                     |             |                 | very helpful on     |
 |                     |             |                 | FreeBSD.            |
 |---------------------+-------------+-----------------+---------------------|
 |                     |             |                 | Significant parts   |
 |                     |             |                 | of the network      |
 |                     |             |                 | stack (especially   |
 |                     |             |                 | IPv4 and IPv6) now  |
 |                     |             |                 | have fine-grained   |
 |                     |             |                 | locking of their    |
 |                     |             |                 | data structures.    |
 |                     |             |                 | However, it is not  |
 |                     |             |                 | yet possible for    |
 |                     |             |                 | the netisr threads  |
 |                     |             |                 | to run without      |
 |                     |             |                 | Giant, due to       |
 | Fine-grained        |             |                 | dependencies on     |
 | network stack       | In progress | Jeffrey Hsu,    | sockets, routing,   |
 | locking without     |             | Seigo Tanimura  | etc. A 5.2-RELEASE  |
 | Giant               |             |                 | goal is to have the |
 |                     |             |                 | network stack       |
 |                     |             |                 | running largely     |
 |                     |             |                 | without Giant,      |
 |                     |             |                 | which should        |
 |                     |             |                 | substantially       |
 |                     |             |                 | improve performance |
 |                     |             |                 | of the stac

Re: LOR with filedesc structure and Giant

2003-08-15 Thread Robert Watson

On Fri, 15 Aug 2003, Kris Kennaway wrote:

> The problem seems to be due to select() being called on the /dev/null
> device, and it is holding the filedesc lock when it reaches
> PICKUP_GIANT() in spec_poll.

Yeah, this is pretty much the same issue you've been bumping into for a
bit -- we hold filedesc lock over select(), which means every object we
poll can't grab a lock that either comes before the file descriptor lock in
the lock order, or that might sleep.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories

Re: LOR with filedesc structure and Giant

2003-08-16 Thread Robert Watson

On Sat, 16 Aug 2003, Poul-Henning Kamp wrote:

> >> The problem seems to be due to select() being called on the /dev/null
> >> device, and it is holding the filedesc lock when it reaches
> >> PICKUP_GIANT() in spec_poll.
> >
> >Yeah, this is pretty much the same issue you've been bumping into for a
> >bit -- we hold filedesc lock over select(), which means every object we
> >poll can't grab a lock that either comes before the file descriptor lock in
> >the lock order, or that might sleep.
> 
> Doesn't this effectively doom any attempt at getting rid of Giant from
> below ? 

I have mixed feelings about our current strategy.  On the one hand, it's a
very simple strategy to understand and implement -- it's also a reasonable
argument that poll operations for status might return "quickly" -- i.e.,
be safe while holding a mutex to prevent the filedesc array from changing.
On the other hand, the lock order and sleep implications are pretty
alarming, and have already caused a substantial number of problems.  It
would be interesting to know what consistency guarantees are provided for
the user app on other platforms with fine-grained kernel locking.

Approaches that come to mind include making a copy of the filedesc array
to prevent it from changing (sounds expensive for a select() call) to
avoid holding the mutex for long; we could move to an sx lock which would
fix the sleep issue at a slightly increased locking cost (but not solve
the lock order problem); if we push Giant past the file descriptor code in
one big throw that would resolve the lock order issue (but not the sleep
problem).  In a recent pass, I identified some of the locks with order
relationships with the file descriptor lock, and many of those will be
non-trivial to resolve.  For example, we grab file descriptor locks during
execve() to clean up the file descriptor array, and kevent interacts with
file descriptor locks.  Pushing Giant further off execve() might have
a fair number of interactions with VFS and VM we'd want to watch out for
(on the other hand, we are probably close..)  Most of the changes to push
Giant behind the filedesc lock are not too bad, though.  I think it would
be worth a concerted effort by an interested and competent party to push
Giant behind the file descriptor lock.
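
The "copy the filedesc array" approach mentioned above can be modelled
in user space.  The following Python sketch is only an analogy: the
class names, the toy "giant" lock and the always-ready device are
hypothetical, and threading.Lock stands in for a kernel mutex.  It
shows the table lock being held just long enough to copy the entries,
so the per-object poll routine can take other locks (or sleep) without
the table lock held.

```python
import threading


class FileDescTable:
    """Toy stand-in for a per-process file descriptor table."""

    def __init__(self):
        self.lock = threading.Lock()
        self.files = {}

    def install(self, fd, fileobj):
        with self.lock:
            self.files[fd] = fileobj

    def snapshot(self, fds):
        # Hold the table lock only long enough to copy the entries.
        with self.lock:
            return {fd: self.files[fd] for fd in fds if fd in self.files}


class NullDevice:
    """Object whose poll routine needs another lock (a toy 'giant')."""

    def __init__(self, table, giant):
        self.table = table
        self.giant = giant
        self.table_lock_seen_held = None

    def poll(self):
        # Record whether the table lock was held while we were polled.
        self.table_lock_seen_held = self.table.lock.locked()
        with self.giant:
            return True  # always ready, in the spirit of /dev/null


def select_ready(table, fds):
    files = table.snapshot(fds)  # table lock is dropped after this call
    return sorted(fd for fd, f in files.items() if f.poll())


giant = threading.Lock()
table = FileDescTable()
dev = NullDevice(table, giant)
table.install(3, dev)
ready = select_ready(table, [3, 4])
print(ready, dev.table_lock_seen_held)
```

A real kernel version would also have to keep the copied file objects
alive (for example with references) while polling, and the copy itself
is the per-call expense noted above.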

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories

Re: HEADS UP: dynamic root support now in the tree

2003-08-17 Thread Robert Watson

On Sun, 17 Aug 2003, Shin-ichi Yoshimoto wrote:

> make installworld broken.
> 
> ==>libexec/rtld-elf
> [snip]
> ln: /usr/libexec/ld-elf.so.1: Operation not permitted
> *** Error code 1
> 
> any idea ? 

I'm guessing we need to remove the schg flag from the old ld-elf.so.1
before trying to replace it with a symlink:

paprika:/usr/src/kerberos5/usr.bin/kadmin> ls -lo /usr/libexec/ld-elf.so.1
-r-xr-xr-x  1 root  wheel  schg 133292 Jul  3 21:07 /usr/libexec/ld-elf.so.1*

You can work around it locally, probably, by doing a "chflags noschg
/usr/libexec/ld-elf.so.1", but the official makefiles probably need to do
something about it also.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories

freebsd-current@freebsd.org

2003-08-19 Thread Robert Watson

On Tue, 19 Aug 2003, Mark Sergeant wrote:

> There are no other errors apart from those listed so I may try compiling
> as a module that gets loaded on boot. Just one problem, I successfully
> build an SMP kernel without PAE and then rebooted and the server is no
> longer responding, it seems it crashed just after coming up as I was
> able to ping it 5 or 6 times and then it went away again. If I've got a
> crash dump I'll post it. 

Can't speak to the specifics of this, but you want to be very careful not
to use kernel modules with PAE: modules are currently built without the context
of the kernel configuration file, and PAE introduces possible binary
incompatibility with modules that dig into VM (which many drivers do). The
only supported configuration is to not use modules, but link the driver
directly into the kernel when running PAE.  This is why the PAE kernel has
no_modules defined in the sample PAE configuration file.  Various
conversations have happened regarding how to address this problem, and I'm
not sure we've come up with the right answer yet.  There seem to be two
conflicting directions: build modules in the context of a kernel configuration
(and get conditionally compiled type/structure/code/... pieces), and try
to make the module build entirely independent of a kernel configuration. 
As someone who uses conditionally compiled components in modules, I tend
to fall into the first camp, and no doubt we'll figure out the right
answer in due course.

The above crash sounds unfortunate too, but quite possibly a separate
failure mode :-).

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories

Re: mksnap_ffs, snapshot issues, again

2003-08-19 Thread Robert Watson


On Tue, 19 Aug 2003, Branko F. Gracnar wrote:

> >The behaviour of filesystem activity stalling during snapshot creation
> >is intentional, but 30 minutes to snapshot an empty FS is not.  Is
> >there disk activity during this time?  It's not clear from your mail
> >whether bg fsck is in operation during this time.  If so, that's
> >probably the cause, since bg fsck itself uses a snapshot to check the
> >FS consistency.
> 
> Background fsck was NOT running. I formatted fs and then tried to make
> snapshot. 

When reporting bgfsck/snapshot/... problems, you may want to CC Kirk
McKusick <[EMAIL PROTECTED]> -- I don't believe he closely tracks
current@, and he's the best person to track down and fix problems in this
area.  I forwarded your earlier message to him, but haven't heard back as
yet.  Just FYI.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories


Re: 5.1-R: zero byte core file.

2003-08-20 Thread Robert Watson


On Wed, 20 Aug 2003, Yogeshwar Shenoy wrote:

> While using 5.1-RELEASE, I find that if my application program seg
> faults, it produces "programname.core"; but it is 0 bytes.  I ran the
> exact same program on another machine that was running 4.4-RELEASE, and
> I do get a core file that I can use with gdb.  I'd really appreciate if
> someone could help me resolve this. 
> 
> Additional details:  - It is not specific to the application program. I
> tried a 2 line program: 
> char p[8]; 
> memcpy(p, "1234567890123456789012345678901234567890", 40); 
>  with same results on 5.1-R(0 byte core file) and 4.4-R(usable core
> file) 
> 
> - "ulimit -a" on the 5.1-R machine gives
>core file size (blocks, -c) unlimited
> 
> - Just to be sure I used getrlimit() to find what the limit for
> RLIMIT_CORE is in my processes, and it is RLIM_INFINITY. 
> 
> - I did the basic checks like write permission on current directory, it
> looks fine. 
> 
> Can someone help me resolve this? 

With 5.1-CURRENT from July, I get:

paprika:~/tmp/core> ./tmp 
Segmentation fault (core dumped)
paprika:~/tmp/core> ls -l
total 348
drwxr-xr-x  2 rwatson  rwatson     512 Aug 20 17:23 ./
drwxr-xr-x  9 rwatson  rwatson     512 Aug 20 17:23 ../
-rwxr-xr-x  1 rwatson  rwatson    4677 Aug 20 17:23 tmp*
-rw-r--r--  1 rwatson  rwatson     131 Aug 20 17:23 tmp.c
-rw-------  1 rwatson  rwatson  323584 Aug 20 17:23 tmp.core

The corefile isn't very useful, since the stack is completely hosed by the
operations, but I do get a core that I can point gdb at.  Some elements of
core file generation are platform-specific: what architecture are you
running on?  And, just to confirm, "df ." indicates you have space, right? 

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories


Re: sysinstall spec_getpages panic (with VM overtones)

2003-08-20 Thread Robert Watson


On Wed, 20 Aug 2003, Gavin Atkinson wrote:

> On the 8th August [EMAIL PROTECTED] mentioned he was getting a panic
> with FreeBSD inside VMware where _mtx_lock is being called with a NULL
> mutex from spec_getpages. I'm also seeing this, 100% reproducible, on
> real hardware. (see message ID [EMAIL PROTECTED] for
> the original posters email and jhb's reply) For me, Sysinstall panics
> during the extraction of the base package: 
> 
> (note that I do not get to see a register dump)  kernel: type 12 trap,
> code=0
> 
> _mtx_lock_flags(0,0,c0529513,300,) at _mtx_lock_flags+0x43
> spec_getpages(cce33b3c,54,0,cce33b2c,0) at spec_getpages+0x26c
> ffs_getpages(cce33b80,0,c05459de,274,c05c63e0) at ffs_getpages+0x5f6
> vnode_pager_getpages(c0bebafc,cce33c70,1,0,cce33c20) at vnode_pager_getpages+0x73
> vm_fault(c1259900,819b000,1,0,c12534c0) at vm_fault+0x8e2
> trap_pfault(cce33d48,1,819b004,200,819b004) at trap_pfault+0x109
> trap(2f,2f,2f,82e533c,0) at trap+0x1fc
> calltrap() at calltrap+0x5

I've been getting similar reports locally from our trustedbsd_sebsd
branch.  We thought originally it was a local merge problem we introduced
due to some inconsistent merging of specfs changes, but I think we have
now eliminated that.  I suppose I'm relieved... (?)

> I first noticed this with the 20030811 JPSNAP, but have tried with the
> 9th July 2003 JPSNAP, and yesterdays snapshot, and see the same result
> on both. I see the same panic whether installing over the net or from
> CD.  With 64 meg of ram, it panics half way through the read the chunks
> that make up the base package, upping the ram to 256 allows it to read
> all of the chunks before panicking. 

Sounds identical.

> *c0529513 = "/usr/src/sys/fs/specfs/spec_vnops.c", line 0x300 is line 768:
> 
> 766 gotreqpage = 0;
> 767 VM_OBJECT_LOCK(vp->v_object);
> 768 vm_page_lock_queues();
> 769 for (i = 0, toff = 0; i < pcount; i++, toff = nextoff) {
> 
> so ap->a_vp is null. I'm afraid that's the limit of my ddb ability. 
> 
> Any suggestions as to where I should go from here? I don't really have
> the facility at the moment to make release to test patches but will try
> to if necessary. 

Is it ap->a_vp that's NULL, or vp->v_object that's NULL?  vp is
dereferenced several times before that in the code, so if vp is really
NULL at line 767, we're probably talking about memory corruption.  But if
vp->v_object is NULL, then it could be we're not creating a VM object
along some code path.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories

Re: malloc message with nfs transfer

2003-08-21 Thread Robert Watson

On Thu, 21 Aug 2003, cosmin wrote:

> malloc() of "64" with the following non-sleepable locks held:
> exclusive sleep mutex inp r = 0 (0xcef0) locked @ /usr/src/sys/netinet/udp_usrreq.c:378
> exclusive sleep mutex netisr lock r = 0 (0xc061be80) locked @ /usr/src/sys/net/netisr.c:215
> 
> I'm getting those on the console, and it seems that they only happen
> when users start an nfs transfer to the nfs exported filesystem.  The
> exported filesystem is a vinum raid5 array but I don't know if that has
> anything to do with the messages.

Sorry, just to be clear -- is the message you're getting on the NFS
client, or the NFS server?  Could you turn on debug.witness_ddb and get a
stack trace for the warning?

> Before I upgraded from 4.8, I used to be able to send at about 8mb/s to
> the nfs exported raid5.  After upgrading to 5.1-CURRENT, the maximum
> speed has been only 4mb/s.  I'm wondering if the messages above have
> anything to do with the performance drop. 

You appear to have the kernel debugging features turned to high (which
will be useful for resolving this problem :-).  Turn off WITNESS and
INVARIANTS and you should see a substantial performance improvement.  It
may not be back up to 4.x levels -- we hope that with ongoing network
stack locking work we'll be back to 4.x levels (and exceed them) in the next
few months.

Thanks,

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories

Re: malloc message with nfs transfer

2003-08-21 Thread Robert Watson

On Thu, 21 Aug 2003, cosmin wrote:

> > Sorry, just to be clear -- is the message you're getting on the NFS
> > client, or the NFS server?  Could you turn on debug.witness_ddb and get a
> > stack trace for the warning?
> 
> This is on the NFS server.  I turned on debug.witness_ddb, but I'm not
> sure if this will help, because the system isn't locking up, or
> otherwise stopping.  I have tried setting a breakpoint in ddb for
> 0xcef0, but it starts breaking right away.  The malloc() messages
> are many minutes apart.

With witness_ddb turned on, the kernel should drop into ddb whenever
there's a witness-related warning, which should include the warnings you
mentioned in your previous e-mail.

> I'm not sure if these messages indicate anything critical.  I was mainly
> concerned with the nfs performance.

Generally speaking, WITNESS warns about potential problems, as opposed to
actual problems: i.e., it warns when a deadlock "would have occurred", if
the timing had been just right.  This was a warning that a potentially
blocking activity was performed while holding a mutex, which is generally
a bad idea.  A little bit more detail on the stack trace should be
sufficient to track it down.  Turning off WITNESS should dramatically
improve performance at the risk of lowering debugging output (maybe that's
not a risk :-).
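
To illustrate the "potential rather than actual" point, here is a toy
Python model of the idea.  It is not how WITNESS is implemented: the
class, its method names and the message texts are invented for the
example, it tracks a single thread, and it treats every lock as
non-sleepable.  It remembers the order in which locks were taken and
complains when a later code path takes them the other way round, or
when something that may sleep is attempted with a lock held, even
though nothing actually deadlocks in the run.

```python
class ToyWitness:
    """Toy lock-order checker (single thread, hypothetical API)."""

    def __init__(self):
        self.order = set()    # (earlier, later) pairs seen so far
        self.held = []        # locks currently held, in acquisition order
        self.warnings = []

    def acquire(self, name):
        for earlier in self.held:
            # Seen the opposite order before?  Then this is a reversal.
            if (name, earlier) in self.order:
                self.warnings.append(
                    "lock order reversal: %s acquired after %s"
                    % (name, earlier))
            self.order.add((earlier, name))
        self.held.append(name)

    def release(self, name):
        self.held.remove(name)

    def may_sleep(self, what):
        # Simplification: every held lock counts as non-sleepable.
        if self.held:
            self.warnings.append(
                "%s with non-sleepable locks held: %s"
                % (what, ", ".join(self.held)))


w = ToyWitness()

# Code path 1 establishes the order filedesc -> Giant.
w.acquire("filedesc")
w.acquire("Giant")
w.release("Giant")
w.release("filedesc")

# Code path 2 takes them the other way round: no deadlock here, but one
# could occur if both paths ran concurrently with unlucky timing.
w.acquire("Giant")
w.acquire("filedesc")
w.release("filedesc")
w.release("Giant")

# Code path 3 does a possibly-blocking allocation with a mutex held.
w.acquire("inp")
w.may_sleep("malloc()")
w.release("inp")

for line in w.warnings:
    print(line)
```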

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories

Re: Top in jails

2003-08-24 Thread Robert Watson


On Fri, 22 Aug 2003, Rus Foster wrote:

> I'm playing with jail on FBSD5 and wondered if there was anyway I could
> use top without have to create /dev/mem. ATM anyone in the jail could
> just do cat /dev/mem | grep for_intresting_stuff. Any ideas? 
> 
> Tried using devfs and still no luck

top should run fine without /dev/mem -- by default, it uses sysctl() and
isn't setgid kmem, so it can't access /dev/mem and isn't using it even
outside a jail.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories

Re: sysinstall spec_getpages panic (with VM overtones)

2003-08-25 Thread Robert Watson

On Mon, 25 Aug 2003, Gavin Atkinson wrote:

> On Wed, 20 Aug 2003, Robert Watson wrote:
> > On Wed, 20 Aug 2003, Gavin Atkinson wrote:
> > > _mtx_lock_flags(0,0,c0529513,300,) at _mtx_lock_flags+0x43
> > > spec_getpages(cce33b3c,54,0,cce33b2c,0) at spec_getpages+0x26c
> > > ffs_getpages(cce33b80,0,c05459de,274,c05c63e0) at ffs_getpages+0x5f6
> > > vnode_pager_getpages(c0bebafc,cce33c70,1,0,cce33c20) at vnode_pager_getpages+0x73
> > > vm_fault(c1259900,819b000,1,0,c12534c0) at vm_fault+0x8e2
> > > trap_pfault(cce33d48,1,819b004,200,819b004) at trap_pfault+0x109
> > > trap(2f,2f,2f,82e533c,0) at trap+0x1fc
> > > calltrap() at calltrap+0x5
> > >
> > > *c0529513 = "/usr/src/sys/fs/specfs/spec_vnops.c", line 0x300 is line 768:
> > >
> > > 766 gotreqpage = 0;
> > > 767 VM_OBJECT_LOCK(vp->v_object);
> > > 768 vm_page_lock_queues();
> > > 769 for (i = 0, toff = 0; i < pcount; i++, toff = nextoff) {
> >
> > Is it ap->a_vp that's NULL, or vp->v_object that's NULL?  vp is
> > dereferenced several times before that in the code, so if vp is really
> > NULL at line 767, we're probably talking about memory corruption.  But if
> > vp->v_object is NULL, then it could be we're not creating a VM object
> > along some code path.
> 
> Although this panic is 100% reproducible during the initial install
> through sysinstall, I have tried hard but can not reproduce this once
> the system is installed and running multiuser, even by performing the
> same actions within sysinstall. I have also tried without success
> to get a crash dump of the panic, however after a fair bit of head
> scratching it looks from a grep of the source code like the "dumpdev"
> loader variable documented in loader(8) is not yet implemented... and as
> far as I can tell there is no other way I can get the installer off CD
> to generate a dump. 
> 
> I'm trying to make a release with extra debugging info, but won't be
> able to test this until at least Wednesday or Thursday. What extra
> debugging info would be useful? Who would be the best person to discuss
> this with?  From what kuriyama said, it appears that it is indeed
> vp->v_object that is null, so I have added the following to
> spec_vnops.c just before the lock that fails: 
> 
>   if (vp->v_object == NULL)
>           panic("vp->v_object is null in %s, rdev=%s", __func__,
>               devtoname(vp->v_rdev));
> 
> Hopefully that will help diagnose the cause a little further, but I'm
> really working blind here - this is not an area of the kernel I
> understand at all. If there is any other debugging info I can provide
> that may be useful, I'm happy to have a go. Kuriyama, if you have any
> spare time before I am able to do it, maybe you could add the above code
> and find out what message it panics with? 

Alan Cox just made a commit a couple of days ago that seems to resolve the
problem for us.  Here's the commit message so you can give it a try. 

alc 2003/08/22 10:50:32 PDT

  FreeBSD src repository

  Modified files:
    sys/fs/specfs        spec_vnops.c
  Log:
  Use the requested page's object field instead of the vnode's.  In some
  cases, the vnode's object field is not initialized leading to a NULL
  pointer dereference when the object is locked.
  
  Tested by:  rwatson
  
  Revision  Changes    Path
  1.208 +5 -2  src/sys/fs/specfs/spec_vnops.c


Re: nfs tranfers hang in state getblck or nfsread

2003-08-27 Thread Robert Watson

On Wed, 27 Aug 2003, Pawel Worach wrote:

> >In this configuration I see a lot of "nfs server ...: is not responding"
> >and "nfs server ...: is alive again" when I copy large files (e.g. a CD
> >image). All of them happen in the same second. I haven't looked at the
> >state or priority of the cp process when this happens.
> 
> I have seen this too and i can reproduce it. I have a diskless client
> and if i unplug the power and boot it up again i see the "nfs server not
> responding" messages for every filesystem being mounted. Both client and
> server are of course FreeBSD-current, i have seen this for about four
> mounts now. 

I have a very similar configuration, but it sounds like I'm not bumping
into the same problem.  Are you using NFSv2 or v3, and how many file
systems are you mounting?  Are you generally using UFS1 or UFS2?  Right
now, I'm mounting a single UFS2 file system as the root, and I believe
right now we always mount NFS roots at NFSv2, which could be why I'm not
seeing the same problem...

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories


Re: nfs tranfers hang in state getblck or nfsread

2003-08-28 Thread Robert Watson

On Wed, 27 Aug 2003, Pawel Worach wrote:

> I get the errors every time the nfs mounts are not unmounted "cleanly",
> if the client (which is a laptop and i often forget to plug in the power
> so the battery dies) dies and the server is rebooted the client boots
> fine, i.e. no "nfs server not responding errors". So it looks like there
> is some kind of state mismatch in the nfs server code. 

Ok, so let me see if I have the sequence of events straight:

(1) Boot a 4.8-RELEASE/STABLE NFS server
(2) Boot a 5.1-RELEASE/CURRENT NFS client
(3) Mount a file system using TCP NFSv3
(4) Reboot the client system and remount
(5) Thrash the file system a bit with large reads/writes, and it hangs

Is this correct?  I'd like to work out the minimum sequence of events
necessary to cause the problem.  Is (4) necessary to reproduce the hang,
or can you cause it without (4) if you wait long enough?  You mention a
server reboot here, also, so I want to make sure I'm not confused about
the steps to hit the problem.

Also, could you try enabling the all.log entry in syslogd, and looking for
messages that read something like "nfs send error" in it after this has
happened?

Once the hang is occurring on the client, can you drop into DDB and do a
ps, and in particular, paste into an e-mail any lines about nfsiod
threads, and any threads that are blocked in nfs?

Likewise, on the server, could you drop into DDB and do a ps, and paste in
the state of any nfsd threads?

> rc.conf parameters look like this:  server: rpcbind_enable="YES"
> nfs_server_enable="YES" mountd_enable="YES" 
> nfs_reserved_port_only="YES" rpc_lockd_enable="YES"
> rpc_statd_enable="YES"  client: rpcbind_enable="YES"
> nfs_client_enable="YES"  rpc_lockd_enable="YES" rpc_statd_enable="YES" 

For kicks, try disabling rpc.lockd on all sides, as well as rpc.statd.  I
don't think they're involved here, but it's worth disabling them to be
sure.

Thanks,

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories


Re: nfs tranfers hang in state getblck or nfsread

2003-08-28 Thread Robert Watson

On Thu, 28 Aug 2003, Terry Lambert wrote:

> Pawel Worach wrote:
> [ ... subject ... ]
> 
> > This only seems to happen for nfs over tcp.
> 
> That's strange; most of the problems I've ever seen are from using UDP,
> large read/write sizes, and then dropping one packet out of a bunch of
> frags caused by the MTU being much smaller than the read/write size
> (misguided attempt to emulate a fixed window size and get more packets
> in flight, without using TCP to do it). 

I'm wondering if large block sizes are causing the TCP socket buffer to
fill, resulting in some bad behavior on the client or server.  Most
probably the server, given that the scenarios in question seem to involve
reading.  Another interesting test case might be to use dd with various
block sizes to read from a file on the server and see whether a particular
size triggers the problem, or if it's less deterministic (and more likely
a race condition of some sort).
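
As a rough illustration of the block-size test suggested above, here is a
minimal C sketch that reads a file with a caller-chosen block size, much
as "dd if=file of=/dev/null bs=N" would.  The helper name
read_with_block_size() is made up for this example and is not part of
any FreeBSD tool; the idea is simply to run it against a file on the NFS
mount with several sizes and note which, if any, hang.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Read `path` sequentially using read() calls of `block_size` bytes.
 * Returns the total number of bytes read, or -1 on any error.
 * (Hypothetical helper for illustration only.)
 */
long long
read_with_block_size(const char *path, size_t block_size)
{
    char *buf;
    long long total = 0;
    ssize_t n;
    int fd;

    if (block_size == 0)
        return (-1);
    buf = malloc(block_size);
    if (buf == NULL)
        return (-1);
    fd = open(path, O_RDONLY);
    if (fd < 0) {
        free(buf);
        return (-1);
    }
    /* Each iteration issues one read request of at most block_size bytes. */
    while ((n = read(fd, buf, block_size)) > 0)
        total += n;
    close(fd);
    free(buf);
    return (n < 0 ? -1 : total);
}
```

Calling it in a loop over sizes such as 512, 8192 and 32768 and timing
each pass would show whether the problem tracks a particular size.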

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories


Re: Someone help me understand this...?

2003-08-28 Thread Robert Watson

On Wed, 27 Aug 2003, Joe Greco wrote:

> The specific OS below is 5.1-RELEASE but apparently this happens on 4.8
> as well. 

Could you confirm this happens with 4.8?  The access control checks there
are substantially different, and I wouldn't expect the behavior you're
seeing on 4.8...

<...>
> Well, the sending and receiving processes both clearly have equal uid/euid.
> 
> We're not running in a jail, so I don't expect any issues there.
<...>
> 
> The parent process did actually start as root and then shed privilege with
> 
> struct passwd *pw = getpwnam("news");
> struct group *gr = getgrnam("news");
> gid_t gid;
> 
> if (pw == NULL) {
> perror("getpwnam('news')");
> exit(1);
> }
> if (gr == NULL) {
> perror("getgrnam('news')");
> exit(1);
> }
> gid = gr->gr_gid;
> setgroups(1, &gid);
> setgid(gr->gr_gid);
> setuid(pw->pw_uid);
> 
> so that looks all well and fine...  so why can't it kill its own children,
> and why can't I kill one of its children from a shell with equivalent 
> uid/euid?
> 
> I know there's been some paranoia about signal delivery and all that, but
> my searching hasn't turned up anything that would explain this.  Certainly
> the manual page ought to be updated if this is a new expected behaviour or
> something...  at least some clue as to why it might fail would be helpful.

The man page definitely needs to be updated, but I think it's worth having
a conversation about whether the current behavior is too conservative
first...

These changes come in response to a class of application vulnerabilities
relating to the delivery of "unexpected signals".  The reason the process
in question is being treated as special from an access control perspective
is that it has undergone a credential change, resulting in the setting of
the process P_SUGID bit.  This bit remains set even if the remaining
credentials of the process appear "normal" (i.e., even if ruid==euid and
rgid==egid), and can only be reset by calling execve() on a "normal"
binary, which is considered sufficient to flush the state of the process.

These processes are given special protection properties because they
almost always have cached access to memory or resources acquired using the
original credential.  For example, the process accesses the password file
while holding root privilege, which means that the process may well have
password hashes in memory from its reading the shadow password file -- in
fact, it likely even has a file descriptor to the shadow password file
still open.  The same P_SUGID flag is used to prevent unprivileged
debugging of applications that have changed credentials and now appear
"normal".  P_SUGID is also used to determine the results of the
issetugid() system call, which is used by many libraries to see if they
are running with (or have run with)  privilege and need to behave in a
more conservative manner. 
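
To make the life cycle of the flag concrete, here is a toy user-space
model in C.  It is only a sketch of the behavior described above, not
kernel code: the struct, the MODEL_P_SUGID constant and the model_*()
functions are invented for illustration, and the real kernel logic has
more cases than this.

```c
#include <stdbool.h>

/* Invented constant standing in for the kernel's P_SUGID flag. */
#define MODEL_P_SUGID 0x1

struct model_proc {
    unsigned ruid, euid;
    int flags;
};

/* A credential change taints the process, even if the uids end up equal. */
void
model_setuid(struct model_proc *p, unsigned uid)
{
    if (p->ruid != uid || p->euid != uid)
        p->flags |= MODEL_P_SUGID;
    p->ruid = p->euid = uid;
}

/* Only execve() of a "normal" (non-setugid) binary clears the taint. */
void
model_exec(struct model_proc *p, bool binary_is_setugid)
{
    if (binary_is_setugid)
        p->flags |= MODEL_P_SUGID;
    else
        p->flags &= ~MODEL_P_SUGID;
}

/* What issetugid() would report in this simplified model. */
bool
model_issetugid(const struct model_proc *p)
{
    return ((p->flags & MODEL_P_SUGID) != 0);
}
```

In this model a daemon that starts as root and calls model_setuid() has
ruid==euid afterwards, yet model_issetugid() stays true until it execs a
normal binary.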

I don't remember the details, but there have been at least a couple of
demonstrated exploits of vulnerable applications using signals in which
setuid applications rely on certain signals (such as SIGALRM, SIGIO,
SIGURG) only being delivered as a result of system calls that set up
timers, IO, etc. I seem to recall it might have involved a setuid
application such as sendmail on OpenBSD, but I'll have to do some googling
and get back to you.  These protections probably fall into the same class
of conservative behavior as our preventing setuid programs from being
started with closed stdin/stdout/stderr descriptors.

Giving up privilege without performing an exec() is very difficult in
UNIX, unfortunately, since the trappings of privilege may be maintained by
libraries, etc, without the knowledge of application writers.  Right now,
signal delivery in 5.x is pretty conservative if a process has changed
credentials, to protect against tampering with a class of applications
that has, historically, been vulnerable to a broad variety of exploits. 
I've attached an (untested) patch that makes this behavior run-time
configurable using a sysctl -- when the sysctl is disabled, special-case
handling for P_SUGID processes is disabled.  I believe that this will
cause the problem you're experiencing in 5.x to go away -- please let me
know.

Clearly, unbreaking applications like Diablo by default is desirable.  At
least OpenBSD has similar protections to these turned on by default, and
possibly other systems as well.  As 5.x sees more broad use, we may well
bump into other cases where applications have similar behavior: they rely
on no special protections once they've given up privilege.  I wonder if
Diablo can run unmodified on OpenBSD; it could be they don't include
SIGALRM on the list of "protect against" signals, or it could be that they
modify Diablo for their environment to use an alternative signaling
mechanism.  Another alternative to this patch would simply be to

Re: Someone help me understand this...?

2003-08-28 Thread Robert Watson

On Thu, 28 Aug 2003, Joe Greco wrote:

> > On Wed, 27 Aug 2003, Joe Greco wrote:
> > > The specific OS below is 5.1-RELEASE but apparently this happens on 4.8
> > > as well. 
> > 
> > Could you confirm this happens with 4.8?  The access control checks there
> > are substantially different, and I wouldn't expect the behavior you're
> > seeing on 4.8...
> 
> Rather difficult.  I'll see if the client will let me trash a production
> system, but usually people don't like $40K servers handing out a few
> hundred megabits of traffic going out of service.  We were trying to fix
> it on the scratch box (which happens to have 5.1R on it) and then were
> going to see how it fared on the production systems. 

I think it's safe to assume that if you're seeing a similar failure,
there's a different source given my reading of the code, but I'm willing
to be proven wrong.  It's probably not worth the investment if you're
talking about large quantities of money, though.

> > Clearly, unbreaking applications like Diablo by default is desirable.  At
> > least OpenBSD has similar protections to these turned on by default, and
> > possibly other systems as well.  As 5.x sees more broad use, we may well
> > bump into other cases where applications have similar behavior: they rely
> > on no special protections once they've given up privilege.  I wonder if
> > Diablo can run unmodified on OpenBSD; it could be they don't include
> > SIGALRM on the list of "protect against" signals, or it could be that they
> > modify Diablo for their environment to use an alternative signaling
> > mechanism.  Another alternative to this patch would simply be to add
> > SIGALRM to the list of acceptable signals to deliver in the
> > privilege-change case.
> 
> I wonder if it would be reasonable to have some sort of interface that
> allowed a program to tell FreeBSD not to set this flag...  if not, at
> least if there was a sysctl, code could be added so that the daemon
> checked the flag when starting and errored out if it wasn't set. 

We actually have such an interface, but it's only enabled for the purposes
of regression testing.  If you compile "options REGRESSION" into the
kernel configuration, a new system call __setsugid(), is exposed to
applications.  It's used by src/tools/regression/security/proc_to_proc to
make it easier to set up process pairs for regression testing of
inter-process access control.  When I added it, there was some interest in
just making it setsugid() and exposing it to all processes.  Maybe we
should just go this route for 5.2-RELEASE.  Invoking it with a (0)
argument would mean the application writer accepted the inherent risks.

However, this would open the application to the risks of debugging
attachment, which are probably greater than the signal risks in most
cases.  It's not clear what the best way to express "I want to accept
[signal risks] but not [debugging risks]" would be...  So far, it sounds like
we have three work-arounds in the pot, perhaps we can think of something
better:

(1) Remove SIGALRM from the list of prohibited signals in the P_SUGID
case.  Not clear what the risks are here based on common application
use, but this is an easy change to make.

(2) Add setsugid() to allow applications to give up implicit protections
associated with credential changes.  This comes with greater risks, I
suspect, since it opens up applications to more explicit
vulnerabilities:  signal attacks require more sophistication and luck,
but debugging attacks are "easy".

(3) Allow administrators to selectively disable the more restrictive
signal checks at a system scope using a sysctl.  This is easy, and
comes with no risks as long as the setting is unchanged (the default
in the patch I sent out earlier). 

I'm tempted to commit (1) immediately to allow a workaround if we get
nothing else figured out, and to think some more about (2) and (3).
Another possibility would be to encourage application writers to avoid
overloading signals that already have "meanings", and rely on the USR
signals.  I assume the reason Diablo uses ALRM is that the USR signals
already have assigned semantics?

> > BTW, it's worth noting that the mechanism Diablo is using to give up
> > privilege actually does retain some "privileges" -- it doesn't, for
> > example, synchronize its resource limits with those of the user it is
> > switching to, so it retains the starting resource limits (likely those of
> > the root account). 
> 
> That's actually preferred in most cases.  News servers almost always eat
> far more resources than whatever limits you might set by default, which
> just turns into telling people to remove the limits or use root's
> limits.  Generally if a news package bumps limits bad things happen. 

Right now, most applications in the base system make use of the
setusercontext() call to modify their protections as part of a switch of
users.  They often pass in the flag LOGIN_SETALL and then remove the bits
they don't need, such a

5.2-RELEASE TODO

2003-08-29 Thread Robert Watson

   |
 |                     |           |                 | post-procfs world. |
 |                     |           |                 | Dag-Erling         |
 |                     |           |                 | Smorgrav had       |
 |                     |           |                 | prototype patches; |
 |                     |           |                 | Robert Drehmel is  |
 |                     |           |                 | developing and     |
 |                     |           |                 | testing patches    |
 |                     |           |                 | now.               |
 |---------------------+-----------+-----------------+--------------------|
 |                     |           |                 | Apple's Darwin     |
 |                     |           |                 | operating system   |
 |                     |           |                 | has fairly         |
 |                     |           |                 | extensive          |
 | Merge of Darwin     |           |                 | improvements to    |
 | msdosfs, other      | --        | --              | msdosfs and other  |
 | fixes               |           |                 | kernel services;   |
 |                     |           |                 | these fixes must   |
 |                     |           |                 | be reviewed and    |
 |                     |           |                 | merged to the      |
 |                     |           |                 | FreeBSD tree.      |
 |---------------------+-----------+-----------------+--------------------|
 |                     |           |                 | Port syscons to    |
 |                     |           |                 | sparc64. Add       |
 |                     |           |                 | device drivers for |
 |                     |           |                 | sun mice and       |
 |                     |           |                 | keyboards. Allow   |
 |                     |           |                 | for more than 3    |
 | sparc64 adaptation  | In        |                 | bits of background |
 | of syscons          | progress  | Jake Burkholder | colour in syscons. |
 |                     |           |                 | Creator frame      |
 |                     |           |                 | buffer device      |
 |                     |           |                 | driver. In the     |
 |                     |           |                 | process, generally |
 |                     |           |                 | improve the        |
 |                     |           |                 | MI-ness of         |
 |                     |           |                 | syscons.           |
 |---------------------+-----------+-----------------+--------------------|
 |                     |           |                 | Many systems       |
 |                     |           |                 | supporting         |
 |                     |           |                 | POSIX.1e ACLs      |
 |                     |           |                 | permit a minor     |
 |                     |           |                 | violation to that  |
 |                     |           |                 | specification, in  |
 |                     |           |                 | which the ACL_MASK |
 |                     |           |                 | entry overrides    |
 | ACL_MASK override   | In        |                 | the umask, rather  |
 | of umask support in | progress  | Robert Watson   | than being         |
 | UFS                 |           |                 | intersected with   |
 |                     |           |                 | it. The resulting  |
 |                     |           |                 | semantics can be   |
 |                     |           |                 | useful in          |
 |                     |           |                 | group-oriented     |
 |                     |           |                 | environments, and  |
 |                     |           |                 | as such would be   |
 |                     |           |                 | very helpful on    |
 |                     |           |                 | FreeBSD.           |
 |---------------------+-----------+-----------------+--------------------|
 |                     |           |                 | Significant parts  |
 |                     |           |                 | of the network     |
 |                     |           |                 | stack (especially  |
 |                     |           |                 | IPv4 and IPv6) now |
 |                     |           |                 | have fine-grained  |
 |                     |           |                 | locking of their   |
 |                     |           |                 | data structures.   |
 |                     |           |                 | However, it is not |
 |                     |           |                 | yet possibl

Re: buildworld failure

2003-08-29 Thread Robert Watson


On Fri, 29 Aug 2003, Mike Jakubik wrote:

> I have re cvsuped 2 days later (Fri Aug 29 10:19:29 EDT 2003) and I am
> still getting the same error, can anyone shed some light here? 

Here's the build output from my build of pam_echo a couple of days ago:

cc -O -pipe -mcpu=pentiumpro
-I/usr/src/lib/libpam/modules/pam_echo/../../../../contrib/openpam/include
-I/usr/src/lib/libpam/modules/pam_echo/../../libpam -Wsystem-headers
-Werror -Wall -Wno-format-y2k -W -Wstrict-prototypes -Wmissing-prototypes
-Wpointer-arith -Wreturn-type -Wcast-qual -Wwrite-strings -Wswitch
-Wshadow -Wcast-align -Wno-uninitialized -c
/usr/src/lib/libpam/modules/pam_echo/pam_echo.c

The differences here seem to be:

(1) I'm using -O, not -O2
(2) I'm optimizing -mcpu as pentiumpro, not pentium4

I'd try changing your optimization settings and see if things improve --
the cast in question takes the pointer to a local (const char *) and
passes it to pam_get_item as a (const void **), which probably confuses
alias analysis, but is legitimate and necessary, I think.
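
For readers unfamiliar with the warning, here is a small C sketch of the
pattern in question and of one common way to avoid it.  The get_item()
function is a made-up stand-in that only mimics the out-parameter shape
of pam_get_item(); whether a change like this is appropriate for
pam_echo itself is not something this thread settles.

```c
#include <stddef.h>

static const char *stored_user = "news";

/* Hypothetical stand-in with a (const void **) out-parameter. */
static int
get_item(const void **item)
{
    *item = stored_user;
    return (0);
}

/*
 * Variant that avoids the type-punned cast: the callee writes through
 * a real (const void *) object, and the value is converted afterwards.
 *
 * The pattern that draws the strict-aliasing warning would instead be:
 *     const char *user;
 *     get_item((const void **)&user);
 */
const char *
fetch_user(void)
{
    const void *tmp = NULL;

    if (get_item(&tmp) != 0)
        return (NULL);
    return (tmp);   /* const void * converts implicitly in C */
}
```

The temporary costs nothing at run time and leaves the compiler free to
apply its alias analysis at -O2.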

> 
> Thanks.
> 
> ===> lib/libpam/modules/pam_echo
> cc -O2 -pipe -march=pentium4
> -I/usr/src/lib/libpam/modules/pam_echo/../../../../contrib/openpam/include
> -I/usr/src/lib/libpam/modules/pam_echo/../../libpam -Wsystem-headers
> -Werror -Wall -Wno-format-y2k -W -Wstrict-prototypes -Wmissing-prototypes
> -Wpointer-arith -Wreturn-type -Wcast-qual -Wwrite-strings -Wswitch
> -Wshadow -Wcast-align -Wno-uninitialized -c
> /usr/src/lib/libpam/modules/pam_echo/pam_echo.c
> /usr/src/lib/libpam/modules/pam_echo/pam_echo.c: In function `_pam_echo':
> /usr/src/lib/libpam/modules/pam_echo/pam_echo.c:92: warning: dereferencing
> type-punned pointer will break strict-aliasing rules
> *** Error code 1
> 
> Stop in /usr/src/lib/libpam/modules/pam_echo.
> *** Error code 1
> 


RE: buildworld failure

2003-08-29 Thread Robert Watson


On Fri, 29 Aug 2003, Mike Jakubik wrote:

> > Depends on how much work is involved in fixing it, and what the negative
> > impact is of leaving it.  Do you know what the impact is?
> 
>   I think the impact is more social. People will try to compile
> world and get failures. Specially people coming from the 4.x branch,
> where this sort of thing never occurred. If this is the only thing
> preventing a clean makeworld with -O2, I think it's worth taking a look
> at. 
> 
>   I've been using freebsd since the 2.x days, I have always
> compiled world and ports with -O2, and never had any instability issues
> due to the optimizations. I have switched back to -O and
> -march=pentium4, the buildworld finished ok. 

Well, it looks to me like the pam_echo code is correct, although I'm
willing to admit I'm not an expert in the code in question -- it's just not
optimizable to the desired level of optimization due to the interfaces
used.  As such, the real problem is either that the warning is generated,
or that the warning causes a build failure... 

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories




Re: 2 ports broken after gcc import

2003-08-29 Thread Robert Watson


On Fri, 29 Aug 2003, Kenneth Culver wrote:

> > Just for more info, when was the last time you updated your /etc? on my
> > 4th -CURRENT machine, with the same compiler etc... I haven't updated my
> > /etc/ since June 1, and that machine works, the other 3 have been updated
> > very recently, like within the last few weeks, and they're all broken. So
> > I guess it's not a compiler issue, but some kind of configuration issue. I
> > can't think of what the problem could be though.
> >
> OK, checked over my kernel configurations and found that ACL's were in my
> kernel configuration. I took that option out and things are working again.
> I have no idea how ACL's could've caused what I was seeing, but everything
> is working now. Thanks for your help.

Bizarre.  I use ACLs in my kernel daily, and I use nmap almost daily, and
haven't seen this.  If you re-add ACLs with a fresh kernel build, does the
problem come back?  Could you look at ktraces of nmap with and without
ACLs and see what causes it?  Do you have ACLs enabled on any file
systems, or are you just running with the kernel option? 

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories



Re: sysinstall spec_getpages panic (with VM overtones)

2003-08-30 Thread Robert Watson


On Sat, 30 Aug 2003, Gavin Atkinson wrote:

> On Sun, 24 Aug 2003, Robert Watson wrote:
> > Alan Cox just made a commit a couple of days ago that seems to resolve the
> > problem for us.  Here's the commit message so you can give it a try.
> >
> > [...]
> >   1.208 +5 -2  src/sys/fs/specfs/spec_vnops.c
> 
> I can confirm this fixes things for me too. Thanks all for the help, and
> sorry for the false alarm after the commit. 

Great, thanks for letting us know.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories



Re: 2 ports broken after gcc import

2003-08-30 Thread Robert Watson

On Fri, 29 Aug 2003, Kenneth Culver wrote:

> > Bizarre.  I use ACLs in my kernel daily, and I use nmap almost daily,
> > and haven't seen this.  If you re-add ACLs with a fresh kernel build,
> > does the problem come back?  Could you look at ktraces of nmap with and
> > without ACLs and see what causes it?  Do you have ACLs enabled on any
> > file systems, or are you just running with the kernel option?
> 
> I was running with just the kernel option, and nothing configured for
> it.  I can't think of what else the problem could be, when I recompiled
> the kernel it just started working again, it might not have anything at
> all to do with ACL's and more to do with the fact that I just recompiled
> it. One of my other -CURRENT machines is working now as well after a
> recompile.  I'll do more testing to see if I can pinpoint the problem
> and I'll probably have results by Tuesday (holiday weekend :-P ) 

I just built a fresh nmap on my -current box and it appears to work fine
for me, as did the older nmap.  So I guess that leaves me firmly in the
"unable to reproduce" camp.  I have noticed that, on my wi0 boxes, I tend
to get a fair number of ENOBUFS errors when nmaping, but that appears to
be unrelated to the presence of UFS_ACL in the kernel.

Are your different boxes using the same type of network interface?  Do you
rely on routed or use static routes?  If you tcpdump the interface, do any
nmap packets get out -- for example, the initial ping it performs before
scanning a host, or none? 

Have a good holiday weekend :-).

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories


Re: ifconfig -a blows up if /etc/mac.conf isn't installed

2003-08-30 Thread Robert Watson


On Fri, 29 Aug 2003, Bryan Liesner wrote:

> On Fri, 29 Aug 2003, Kenneth D. Merry wrote:
> 
> > Memory fault (core dumped)
> 
> Same here.  I took a look, and found that line 62 of
> /usr/src/sbin/ifconfig/ifmac.c returns ENOENT, but the docs say this
> should return a -1.  So this code looks correct. 

Ken, Bryan,

Thanks -- I've committed a variation on this patch which returns (-1) but
also sets errno to ENOENT.  You can pull down the results in mac.c#1.7.
Sorry about that!
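
For anyone wondering why the return value matters, here is a minimal C
sketch of the convention: report failure as -1 and put the reason in
errno.  The function name load_config() and the path used below are
invented for the example; this is not the code in mac.c.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical loader: returns 0 on success, or -1 with errno set.
 * Returning the errno value itself (e.g. "return (ENOENT);") would
 * slip past any caller that only tests for -1.
 */
int
load_config(const char *path)
{
    FILE *fp;

    fp = fopen(path, "r");
    if (fp == NULL) {
        /* fopen() has set errno, e.g. ENOENT for a missing file. */
        return (-1);
    }
    fclose(fp);
    return (0);
}
```

A caller that writes "if (load_config(path) == -1)" then takes the error
path instead of carrying on with state that was never initialized.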

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories



Re: 2 ports broken after gcc import

2003-08-30 Thread Robert Watson

On Fri, 29 Aug 2003, Kenneth Culver wrote:

> > Might devfs propagate ACL characteristics via /dev nodes into
> > applications? Otherwise, the symptom you described would have made me
> > point to the IP firewall first.
> 
> My machine that was showing the problem didn't have a firewall enabled. 
> I'll still mess with it some more to see what I can come up with, maybe
> it was the firewall, but I don't remember having ipfilter or ipfirewall
> in the kernel but it'd been a while since I edited that config file or
> compiled that kernel so maybe I took out the firewall options and never
> compiled, and then compiled today. (It's been about a month since I did
> anything kernel related on that machine). Anyway, when I pinpoint the
> problem I'll mail the list.

I think I missed the message that this is a response to, but here's an
answer to the question: UFS_ACL controls only the introduction of ACL code
into UFS1 and UFS2 file systems, and enables conditional use of ACL code
if the ACLs flag is set on a file system.  If the ACLs flag is not set on
a file system, the UFS1/UFS2 code is intended to run along its original
permissions-based code path.  Devfs isn't based on UFS, and so it should
be unaffected by the UFS_ACL flag.  If there's a definite causal
relationship between UFS_ACL and the nmap failure, I can't help but wonder
if it's a result of a timing, code layout, or memory allocation change of
some sort.  I wouldn't rule out a bug in the ACL code, but it seems
somewhat unlikely as, without the ACLs flag set, the code path in the UFS
code should be minimally changed... 

The best path to track this down is to try to figure out for sure which
system call is failing, compare expected vs. wire network transmissions,
and see if we can reproduce this in a simpler test program.

We've recently made some changes in how the permissions of new objects are
calculated using ACLs; they were made somewhat before the gcc changes, I
believe, but it might also be interesting to see test cases from before
both changes, in between the changes, and after both, to confirm that it
was definitely the gcc change that kicked off the problem, rather than the
UFS change.  Finally, I'd like to know what, if any, optimization flags
you're using for the kernel compile...

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED] Network Associates Laboratories


Re: Someone help me understand this...?

2003-08-30 Thread Robert Watson

On Sat, 30 Aug 2003, Jilles Tjoelker wrote:

> On Thu, Aug 28, 2003 at 11:34:09AM -0400, Robert Watson wrote:
> > > > Clearly, unbreaking applications like Diablo by default is desirable.  At
> > > > least OpenBSD has similar protections to these turned on by default, and
> > > > possibly other systems as well.  As 5.x sees more broad use, we may well
> > > > bump into other cases where applications have similar behavior: they rely
> > > > on no special protections once they've given up privilege.  I wonder if
> > > > Diablo can run unmodified on OpenBSD; it could be they don't include
> > > > SIGALRM on the list of "protect against" signals, or it could be that they
> > > > modify Diablo for their environment to use an alternative signaling
> > > > mechanism.  Another alternative to this patch would simply be to add
> > > > > SIGALRM to the list of acceptable signals to deliver in the
> > > > privilege-change case.
> 
> OpenBSD does not consider a process 'tainted' if it changes credentials
> while running. From the issetugid(2) manpage: 
> 
> The status of issetugid() is only affected by execve().

In OpenBSD, two flags are used to represent the credential change notion: 
P_SUGIDEXEC, and P_SUGID.  issetugid() checks the first of these, but
signal delivery checks P_SUGID.  P_SUGIDEXEC is set during execve().  In
FreeBSD, we have a combined notion used by both, since the same
protections generally apply.  You can find a comment comparing our use of
P_SUGID to the OpenBSD approach in our issetugid() implementation:

/*
 * Note: OpenBSD sets a P_SUGIDEXEC flag set at execve() time,
 * we use P_SUGID because we consider changing the owners as
 * "tainting" as well.   
 * This is significant for procs that start as root and "become"
 * a user without an exec - programs cannot know *everything*
 * that libc *might* have put in their data segment.
 */

Regarding specific signals: inspection of the OpenBSD implementation
reveals that the following signals are permitted in the P_SUGID case,
assuming a reasonable credential match:

case 0:
case SIGKILL:
case SIGINT:
case SIGTERM:
case SIGALRM:
case SIGSTOP:
case SIGTTIN:
case SIGTTOU:
case SIGTSTP:
case SIGHUP:
case SIGUSR1:
case SIGUSR2:

In FreeBSD, we permit:

case 0:
case SIGKILL:
case SIGINT:
case SIGTERM:
case SIGSTOP:
case SIGTTIN:
case SIGTTOU:
case SIGTSTP:
case SIGHUP:
case SIGUSR1:
case SIGUSR2:

So they permit SIGALRM in addition to the signals we support.  In light of
this thread, I think it would be reasonable to add SIGALRM to our list as
well.
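
The two lists above differ only in SIGALRM.  As a user-space sketch of
the filter (not the kernel code itself), the check can be written as a
single function; the name sugid_signal_permitted() and the allow_alrm
parameter are made up here, with allow_alrm standing for the proposed
addition.

```c
#include <signal.h>
#include <stdbool.h>

/*
 * Returns true if `signum` may be delivered to a process that has
 * changed credentials, assuming the usual credential match already
 * passed.  Illustrative model only.
 */
bool
sugid_signal_permitted(int signum, bool allow_alrm)
{
    switch (signum) {
    case 0:
    case SIGKILL:
    case SIGINT:
    case SIGTERM:
    case SIGSTOP:
    case SIGTTIN:
    case SIGTTOU:
    case SIGTSTP:
    case SIGHUP:
    case SIGUSR1:
    case SIGUSR2:
        return (true);
    case SIGALRM:
        /* Permitted on OpenBSD; proposed for FreeBSD in this thread. */
        return (allow_alrm);
    default:
        return (false);
    }
}
```

With allow_alrm false this matches the FreeBSD list quoted above; with
it true, the OpenBSD list.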

> > In most cases, fail-stop is a reasonable behavior for unexpected security
> > behavior from the system, but ignore is likely to shoot you later. :-)  I
> > tend to wrap even kill() calls as uid 0 in an assertion check, just to be
> > on the safe side.  If nothing else, it helps detect the case where the
> > other process has died, and you're using a stale pid.  It's particularly
> > useful if the other process has died, the pid has been reused, and it's
> > now owned by another user, which is a real-world case where kill() as a
> > non-0 uid can fail even when you're sure it can't :-). 
> 
> This can be avoided by careful programming: do not use SA_NOCLDWAIT and
> don't pass pids to kill() when they have been returned by wait() or
> similar functions. If the process has terminated in between, it's a
> zombie. In that case, FreeBSD probably returns ESRCH but SUSv3 mandates
> returning success (but performing no action). 

There's still a race possible here, it just becomes more narrow with
conservative programming.  And in the classic use of pids for signalling
(/var/run/foo.pid, or kill -9 pid), these approaches won't help.  The only
way to close this sort of race is to have a notion of a unique process
identifier that lasts beyond the lifetime of the process itself -- i.e.,
the ability to return EMYSINCERESTREGRESTS if you try to signal a process
after it has died, and have a guarantee that the handle won't be reused. 
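
The earlier advice in this thread about checking the result of kill()
can be sketched in C as below.  The wrapper name checked_kill() is
invented for the example.  Note that it does not close the pid-reuse
race described above; it only makes a failed delivery visible instead of
silently ignored.

```c
#include <sys/types.h>
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>

/*
 * Send `sig` to `pid` and report failure rather than ignoring it.
 * Returns 0 on success, or the errno value from kill() on failure
 * (for example ESRCH for a stale pid, EPERM if access is denied).
 */
int
checked_kill(pid_t pid, int sig)
{
    if (kill(pid, sig) == -1) {
        int error = errno;

        fprintf(stderr, "kill(%ld, %d): %s\n", (long)pid, sig,
            strerror(error));
        return (error);
    }
    return (0);
}
```

A daemon signalling its children could treat a non-zero return as a
fail-stop condition, or at least log it.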

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories

___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "[EMAIL PROTECTED]"

Re: Someone help me understand this...?

2003-08-30 Thread Robert Watson


On Thu, 28 Aug 2003, Joe Greco wrote:

> > > > Could you confirm this happens with 4.8?  The access control checks there
> > > > are substantially different, and I wouldn't expect the behavior you're
> > > > seeing on 4.8...
> > > 
> > > Rather difficult.  I'll see if the client will let me trash a production
> > > system, but usually people don't like $40K servers handing out a few
> > > hundred megabits of traffic going out of service.  We were trying to fix
> > > it on the scratch box (which happens to have 5.1R on it) and then were
> > > going to see how it fared on the production systems. 
> > 
> > I think it's safe to assume that if you're seeing a similar failure,
> > there's a different source given my reading of the code, but I'm willing
> > to be proven wrong.  It's probably not worth the investment if you're
> > talking about large quantities of money, though.
> 
> It's more like "large quantities of annoyance and work".  Can you
> describe the case you're envisioning?  If I can easily poke at it, I can
> at least get some clues. 

I guess all I'm looking for is confirmation that your original statement
(happens in 4.8 and 5.1) is completely correct: the 5.1 behavior is
expected, but I'm surprised it happens with 4.8. 

> Correct.  The USR signals control debug levels.  If it was a signal that
> was only used internally, it could be changed, of course, but changing a
> signal used by humans (and one used in the same manner as other
> programs)  is probably a bad idea. 

Try the patch attached, which introduces both the conservative_signals
sysctl, and adds SIGALRM to the list of acceptable signals for P_SUGID
processes.

> Yeah, if anything, we probably don't want to do that, because the
> resources set up as root are usually more attractive.  I don't have a
> problem with coding in some FreeBSD-isms, but I don't see it as buying
> us anything, does it?

I'm not sure there are explicit benefits in this specific situation,
except that you can run Diablo with the resource limits of the user you
configure, and potentially those might be similar to (but perhaps not
identical to) those given to root.  I.e., instead of hard-coding "use the
resource limits of root", you're saying "use the resource limits of the
user Diablo is run as, and set those to what you want".  Given that
heavy-weight news servers are likely to be dedicated machines, it's a
subtle but perhaps useful semantic difference.
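
To illustrate the difference, here is a hedged sketch of what "use the limits of the configured user" might look like in a daemon: it raises a soft limit to whatever hard limit the process inherited, instead of assuming root's values. The helper name raise_soft_limit() is hypothetical and is not taken from Diablo:

```c
#include <sys/resource.h>

/*
 * Hypothetical helper: raise the soft limit for one resource to the
 * hard limit inherited from the invoking user.  Returns 0 on success,
 * or -1 with errno set by getrlimit(2) or setrlimit(2).
 */
static int
raise_soft_limit(int resource)
{
	struct rlimit rl;

	if (getrlimit(resource, &rl) == -1)
		return (-1);
	rl.rlim_cur = rl.rlim_max;	/* never exceeds the inherited hard limit */
	return (setrlimit(resource, &rl));
}
```

The administrator then tunes the daemon by adjusting the hard limits of the account it runs as, rather than by editing the program.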

Updated patch below.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories

Index: kern_prot.c
===
RCS file: /home/ncvs/src/sys/kern/kern_prot.c,v
retrieving revision 1.175
diff -u -r1.175 kern_prot.c
--- kern_prot.c 13 Jul 2003 01:22:20 -  1.175
+++ kern_prot.c 30 Aug 2003 19:45:50 -
@@ -1367,6 +1367,20 @@
 	return (cr_cansee(td->td_ucred, p->p_ucred));
 }
 
+/*
+ * 'conservative_signals' prevents the delivery of a broad class of
+ * signals by unprivileged processes to processes that have changed their
+ * credentials since the last invocation of execve().  This can prevent
+ * the leakage of cached information or retained privileges as a result
+ * of a common class of signal-related vulnerabilities.  However, this
+ * may interfere with some applications that expect to be able to
+ * deliver these signals to peer processes after having given up
+ * privilege.
+ */
+static int conservative_signals = 1;
+SYSCTL_INT(_security_bsd, OID_AUTO, conservative_signals, CTLFLAG_RW,
+    &conservative_signals, 0, "Unprivileged processes prevented from "
+    "sending certain signals to processes whose credentials have changed");
 /*-
  * Determine whether cred may deliver the specified signal to proc.
  * Returns: 0 for permitted, an errno value otherwise.
@@ -1399,12 +1413,13 @@
 	 * bit on the target process.  If the bit is set, then additional
 	 * restrictions are placed on the set of available signals.
 	 */
-	if (proc->p_flag & P_SUGID) {
+	if (conservative_signals && (proc->p_flag & P_SUGID)) {
 		switch (signum) {
 		case 0:
 		case SIGKILL:
 		case SIGINT:
 		case SIGTERM:
+		case SIGALRM:
 		case SIGSTOP:
 		case SIGTTIN:
 		case SIGTTOU:

Re: Filesystem problem

2003-08-31 Thread Robert Watson

On Sun, 31 Aug 2003, Kevin Bockman wrote:

> Anyone have any suggestions?  I can not control-C out of 'man vmstat'. 
> While doing 'make' in /usr/src/sys/boot it was hanging on as, when I
> restarted it, it got to i386/libi386 and will not do anything else.  I'm
> running that through serial console, it let me ^C out of that.  I tried
> going into single user mode and running umount, now it just sits there
> and I can't ^C.  I have no ideas, this was all working yesterday!! :-) 
> 
> Any ideas on what else to check or other helpful hints would help
> bunches. 
> 
> Sorry for the cross-posts.  Just not sure where to go with this one. 

Could you show the output of:

  ps axlwww

when things are hanging?  I'm particularly interested in the WCHAN entries
for hung processes and kernel threads.  That entry is the wait channel for
kernel thread sleeps, which should give us some sense of what they're
waiting for.  If it's a UFS bug of some sort, you'll likely see a lot of
processes blocked in "inode" -- this could also happen in a hardware
scenario, but should still be useful. In addition, do you have the entire
serial console log output since boot?  It would be interesting to know if
you've had kernel log messages regarding your hard disk controller, etc. 
This might help distinguish a hardware problem from a software problem. 

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories

> 
> Thanks,
> 
> Kevin

5.2-RELEASE TODO

2003-09-01 Thread Robert Watson

 |                      |             |                 | Robert Drehmel is developing and testing       |
 |                      |             |                 | patches now.                                   |
 |----------------------+-------------+-----------------+------------------------------------------------|
 | Merge of Darwin      | --          | --              | Apple's Darwin operating system has fairly     |
 | msdosfs, other fixes |             |                 | extensive improvements to msdosfs and other    |
 |                      |             |                 | kernel services; these fixes must be reviewed  |
 |                      |             |                 | and merged to the FreeBSD tree.                |
 |----------------------+-------------+-----------------+------------------------------------------------|
 | ACL_MASK override of | In progress | Robert Watson   | Many systems supporting POSIX.1e ACLs permit a |
 | umask support in UFS |             |                 | minor violation to that specification, in      |
 |                      |             |                 | which the ACL_MASK entry overrides the umask,  |
 |                      |             |                 | rather than being intersected with it. The     |
 |                      |             |                 | resulting semantics can be useful in           |
 |                      |             |                 | group-oriented environments, and as such would |
 |                      |             |                 | be very helpful on FreeBSD.                    |
 |----------------------+-------------+-----------------+------------------------------------------------|
 | Fine-grained network | In progress | Jeffrey Hsu,    | Significant parts of the network stack         |
 | stack locking        |             | Seigo Tanimura, | (especially IPv4 and IPv6) now have            |
 | without Giant        |             | Sam Leffler     | fine-grained locking of their data structures. |
 |                      |             |                 | However, it is not yet possible for the netisr |
 |                      |             |                 | threads to run without Giant, due to           |
 |                      |             |                 | dependencies on sockets, routing, etc. A       |
 |                      |             |                 | 5.2-RELEASE goal is to have the network stack  |
 |                      |             |                 | running largely without Giant, which should    |
 |                      |             |                 | substantially improve performance of the       |
 |                      |             |                 | stack, as well as other system components by   |
 |                      |             |                 | reducing contention on                         |

Re: .fsck_snapshot file

2003-09-02 Thread Robert Watson

On Tue, 2 Sep 2003, Peter Jeremy wrote:

> On Tue, Sep 02, 2003 at 08:27:00AM +0200, Christoph Kukulies wrote:
> >I have a file .fsck_snapshot in /usr (of 7 GB ?!)
> >-r   1 root   wheel  7220781056 Aug 22 18:08 .fsck_snapshot
> 
> The '7GB' does not mean you'll free up 7GB of disk space by freeing it. 
> IIRC, it's actually the size of the filesystem. 

Although, as time goes by since the creation of a snapshot, if you have a
fairly live file system, it will begin to approach it :-).  I've been
wondering if we ought to have some magic to GC these snapshots if/when
they are discovered during the boot or mount process just to be on the
safe side.  In theory, the window where the name exists is small (fsck
creates, opens, and immediately unlinks the snapshot).  Unfortunately, in
practice the window is apparently pretty wide since any system failure
during snapshot creation will (I believe) leave the snapshot reference
behind. 
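
The create, open, unlink sequence can be sketched with ordinary files. This is the generic idiom, not the fsck code, and open_unlinked() is a name invented for the example:

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Create a scratch file, keep it open, and remove its name right away.
 * The path is supplied by the caller; nothing here is specific to fsck
 * or to UFS snapshots.  Returns an open descriptor, or -1 on failure.
 */
static int
open_unlinked(const char *path)
{
	int fd;

	fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0600);
	if (fd == -1)
		return (-1);
	/* A failure between open() and unlink() leaves the name behind. */
	if (unlink(path) == -1) {
		close(fd);
		return (-1);
	}
	return (fd);
}
```

Once unlink() has returned, the file has no name and goes away when the last descriptor is closed. The exposure is whatever happens before that point, which for a snapshot includes the creation itself.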

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories

Re: Syncer "giving up" on buffers

2003-09-02 Thread Robert Watson


On Tue, 2 Sep 2003, Tim Robbins wrote:

> > Rev.1.3 of ext2fs/fs.h (etc.) abuses B_LOCKED to do little more than
> > make the sync() ignore ext2fs's private buffers (its complications are
> > mainly to handle the resulting B_LOCKED buffers).  It wants to brelse()
> > the buffers so that their BUF_REFCOUNT() is 0 and the sync() in boot()
> > is happy to handle them.  In the original fix, I think the buffers
> > could be B_DELWRI and then the sync() would flush them, but setting
> > B_DELWRI was wrong and was changed (in rev.1.4) to setting the private
> > flag B_DIRTY instead.  Rev.1.13 essentially removes the brelse() and
> > adds a new complication (BUF_KERNPROC()) and keeps the old ones.  I
> > think the BUF_KERNPROC() is less than useful -- without the brelse()'s,
> > the buffers are completely private to the file system.
> 
> Is there any particular reason why ext2fs keeps these buffers locked
> instead of reading/writing them in when it needs to, or storing them in
> malloc'd memory and reading/writing them at mount/unmount time? Do we
> need to ensure that group descriptors & inode/block bitmaps are not
> written to disk until the filesystem gets unmounted, or is it merely to
> improve performance and simplify the code? 

FWIW, I've seen similar behavior to this on ext2fs-free systems: in
particular, the "fsck in single user mode prevents clean shutdown" with
UFS.  It doesn't seem to happen regularly, and I thought the problem had
"gone away", but now that I think back, it could just be that I hardly
ever fsck the root file system in single-user mode since most of my
kernels crash on diskless dev boxes.  And since we don't log the last few
lines of the reboot when file systems are unmounted and buffers are
flushing... 

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories



Reminder: BSDCon next week in San Mateo!

2003-09-04 Thread Robert Watson


This is just a friendly reminder e-mail that the BSD Conference is taking
place in San Mateo next week, and that if you're planning to attend and
haven't yet registered, you might want to.  Or, just turn up and register
at the door.

There's a really strong lineup of FreeBSD-related papers, especially
relating to new features in the 5-CURRENT development line. I've attached
a list of just some of the interesting things that will be going on there:
they include a number of tutorials relating to development and
administration, technical session presentations relating to the
development of FreeBSD, development of products using FreeBSD, and the
deployment of FreeBSD-based systems.  And, as always, there will be a
variety of invited talks, BoFs and work-in-progress sessions. 

USENIX has extended their early registration pricing, and also (I believe) 
has an online registration discount.  Multi-employee discounts are also
available for companies sending more than one employee.  You can find out
more about the location, schedule of events, etc, at: 

  http://www.usenix.org/events/bsdcon03/

I look forward to seeing you there!

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories


  Several excellent tutorials including one on developing storage
extensions using GEOM

  Keynote: Computing Fallacies (or, What Is the World Coming To?)

  Reasoning about SMP in FreeBSD

  devd-A Device Configuration Daemon

  ULE: A Modern Scheduler for FreeBSD

  An Automated Binary Security Update System for FreeBSD

  Building a High-performance Computing Cluster Using FreeBSD

  build.sh: Cross-building NetBSD

  Invited Talk: Long Range 802.11 WANs

  BSD Status Reports

  GBDE-GEOM Based Disk Encryption

  Cryptographic Device Support for FreeBSD

  Enhancements to the Fast Filesystem to Support Multi-Terabyte Storage
Systems

  Invited Talk: Social and Technical Implications of Nonproprietary
Software

  Running BSD Kernels as User Processes by Partial Emulation and Rewriting
of Machine Instructions

  A Digital Preservation Network Appliance Based on OpenBSD

  Using FreeBSD to Render Realtime Localized Audio and Video

  Work in Progress Reports (WiPs)

  Tagging Data in the Network Stack: mbuf_tags

  Fast IPSec: A High-Performance IPsec Implementation

  The WHBA Project: Experiences "deeply embedding" NetBSD

  Invited Talk: Post-Digital Possibilities


Re: The bikeshed T-shirt

2003-09-11 Thread Robert Watson


On Fri, 12 Sep 2003, Poul-Henning Kamp wrote:

> The bikeshed T-shirt which has been referred to was only produced in
> 5 copies and I hadn't really expected that so many people would ask me
> about it. 
> 
> I don't want to get into the clothing business, so if you want one,
> you'll probably have to make it yourself.  I can ask the company which
> produced them if they will be willing to ship abroad, but I doubt they
> are set up for that sort of thing. 
> 
> The xfig source is here: 
>   http://phk.frebsd.dk/misc/bsdcon03.fig
>   http://phk.frebsd.dk/misc/bsdcon03.pdf
> 
> Enjoy... 

Note: before making a t-shirt, be sure to modify the image and select your
favorite bikeshed color.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories

5.2-RELEASE TODO

2003-09-15 Thread Robert Watson

 |                      |             |                 | Robert Drehmel is developing and testing       |
 |                      |             |                 | patches now.                                   |
 |----------------------+-------------+-----------------+------------------------------------------------|
 | Merge of Darwin      | --          | --              | Apple's Darwin operating system has fairly     |
 | msdosfs, other fixes |             |                 | extensive improvements to msdosfs and other    |
 |                      |             |                 | kernel services; these fixes must be reviewed  |
 |                      |             |                 | and merged to the FreeBSD tree.                |
 |----------------------+-------------+-----------------+------------------------------------------------|
 | ACL_MASK override of | In progress | Robert Watson   | Many systems supporting POSIX.1e ACLs permit a |
 | umask support in UFS |             |                 | minor violation to that specification, in      |
 |                      |             |                 | which the ACL_MASK entry overrides the umask,  |
 |                      |             |                 | rather than being intersected with it. The     |
 |                      |             |                 | resulting semantics can be useful in           |
 |                      |             |                 | group-oriented environments, and as such would |
 |                      |             |                 | be very helpful on FreeBSD.                    |
 |----------------------+-------------+-----------------+------------------------------------------------|
 | Fine-grained network | In progress | Jeffrey Hsu,    | Significant parts of the network stack         |
 | stack locking        |             | Seigo Tanimura, | (especially IPv4 and IPv6) now have            |
 | without Giant        |             | Sam Leffler     | fine-grained locking of their data structures. |
 |                      |             |                 | However, it is not yet possible for the netisr |
 |                      |             |                 | threads to run without Giant, due to           |
 |                      |             |                 | dependencies on sockets, routing, etc. A       |
 |                      |             |                 | 5.2-RELEASE goal is to have the network stack  |
 |                      |             |                 | running largely without Giant, which should    |
 |                      |             |                 | substantially improve performance of the       |
 |                      |             |                 | stack, as well as other system components by   |
 |                      |             |                 | reducing contention on                         |

Re: Fixing -pthreads (Re: ports and -current)

2003-09-24 Thread Robert Watson


On Wed, 24 Sep 2003, Scott Long wrote:

> I'm a big advocate of using libmap to deal with this.

Ditto.

Based on the results seen thus far, my preference would really be for:

(1) Keep -pthread, make it imply -lpthread, saving a lot of hassle.

(2) Ship all packages and binaries using threading with -lpthread -- i.e.,
a dynamic library dependency on libpthread.  This will mean that
administrators don't have to list each possible threading library in
/etc/libmap.conf in order to be sure they caught all of them.

(3) Use libmap to perform the necessary substitution on a per-application
or per-system basis.  If libpthread isn't available on an
architecture, default ship libmap.conf to substitute libthr for
libpthread on the platform for all applications.  Or libc_r, or
whatever.

This will result in all applications we ship having a consistent thread
library name so that administrators can substitute more easily. libpthread
would give you M:N threading by default, but it would be easy to perform
local changes to improve performance for applications that specifically
benefit from 1:1 threading, cothreads, etc.  Or if a serious compatibility
bug is found between libpthread and an application, they can substitute
easily as well.  I suppose this case might imply (4):

(4) If an application is known to be compatible only with a specific
threading model, do hard-code that in the application build somehow.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories



Re: Fixing -pthreads (Re: ports and -current)

2003-09-24 Thread Robert Watson

On Wed, 24 Sep 2003, Dan Nelson wrote:

> Does it really matter if you end up linked to multiple threads
> libraries?  The first library providing a symbol wins, so the other
> shlibs just won't get used at all.  Libraries linked from the executable
> trump libraries linked from libraries, and LD_PRELOAD wins above all. 
> If one threads library exports a symbol not in the others, I'd call that
> an API bug in the first library. 
> 
> This should be no different from explicitly linking in dmalloc to
> override the malloc functions in libc, for example. 

One potential downside to the LD_PRELOAD approach is that it will only
work for applications that aren't setuid/setgid.  While today most of our
privileged and credential-munging applications aren't threaded, I'd like
to avoid precluding that in the future as much as possible.  What
mechanism should be used when LD_PRELOAD is being ignored due to
issetugid() returning true?
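
For reference, the kind of check that causes LD_PRELOAD to be ignored can be approximated as below. This is a sketch and not the rtld source: ids_differ() and getenv_if_untainted() are hypothetical names, and comparing real and effective IDs is only a rough stand-in for issetugid():

```c
#include <stdlib.h>
#include <unistd.h>

/*
 * Approximation of the taint check.  FreeBSD's issetugid(2) also
 * reports credential changes made since execve(), which a comparison
 * of real and effective IDs cannot detect.
 */
static int
ids_differ(void)
{
	return (getuid() != geteuid() || getgid() != getegid());
}

/*
 * Hypothetical helper: return the value of an environment variable
 * only when the process is not running with changed credentials.
 */
static const char *
getenv_if_untainted(const char *name)
{
	if (ids_differ())
		return (NULL);
	return (getenv(name));
}
```

Any per-user override that travels through the environment is subject to this kind of filter, which is why a substitution mechanism for privileged binaries has to come from somewhere the administrator controls.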

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories

panic on yesterday's -CURRENT: linux emulation and vm (lockmgr: locking against myself)

2003-09-25 Thread Robert Watson


Running -CURRENT from yesterday:
FreeBSD paprika 5.1-CURRENT FreeBSD 5.1-CURRENT #1: Wed Sep 24 19:42:45
EDT 2003 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/PAPRIKAMAC  i386

MAC, mac_mls, mac_biba, X11, KDE, vic, sdr, xchat.  When I ran aim, the
system panicked. Trace below.  Please let me know if more information
would be useful.

panic: lockmgr: locking against myself
panic messages:
---
panic: lockmgr: locking against myself

syncing disks, buffers remaining... 3850 3850 3849 3849 3849 3849 3849
3849 3849
 3849 3849 3849 3849 3849 3849 3849 3849 3849 3849 3849 3849 3849 
giving up on 3349 buffers
Uptime: 14h13m32s
Dumping 511 MB
 16 32 48 64 80 96 112 128 144 160 176 192 208 224 240 256 272 288 304 320
336 352 368 384 400 416 432 448 464 480 496

#0  doadump () at /usr/src/sys/kern/kern_shutdown.c:240
240 dumping++;
(kgdb) bt
#0  doadump () at /usr/src/sys/kern/kern_shutdown.c:240
#1  0xc034bed8 in boot (howto=256) at
/usr/src/sys/kern/kern_shutdown.c:372
#2  0xc034c267 in panic () at /usr/src/sys/kern/kern_shutdown.c:550
#3  0xc032baad in lockmgr (lkp=0xc6fc173c, flags=2, interlkp=0x100, 
td=0xc6498ab0) at /usr/src/sys/kern/kern_lock.c:439
#4  0xc04a424a in _vm_map_lock_read (map=0x0, file=0x0, line=0)
at /usr/src/sys/vm/vm_map.c:378
#5  0xc04a7d78 in vm_map_lookup (var_map=0xdd699a30, vaddr=0, 
fault_typea=1 '\001', out_entry=0xdd699a34, object=0x0, pindex=0x0, 
out_prot=0x0, wired=0xdd6999f8) at /usr/src/sys/vm/vm_map.c:2888
#6  0xc049f355 in vm_fault (map=0xc6fc1700, vaddr=0, fault_type=1 '\001', 
fault_flags=0) at /usr/src/sys/vm/vm_fault.c:219
#7  0xc04eddd9 in trap_pfault (frame=0xdd699b18, usermode=0, eva=0)
at /usr/src/sys/i386/i386/trap.c:709
#8  0xc04eda50 in trap (frame=
  {tf_fs = -1070333928, tf_es = -1067384816, tf_ds = 16, tf_edi = 0,
tf_esi = -1068054086, tf_ebp = -580281484, tf_isp = -580281532, tf_ebx =
441, tf_edx = -968258896, tf_ecx = 0, tf_eax = -968258896, tf_trapno = 12,
tf_err = 0, tf_eip = -1070325008, tf_cs = 8, tf_eflags = 66178, tf_esp =
0, tf_ss = -1068146900})
at /usr/src/sys/i386/i386/trap.c:418
#9  0xc04dd9e8 in calltrap () at {standard input}:102
#10 0xc04adbde in vm_page_sleep_if_busy (m=0x1b9, also_m_busy=1, msg=0x0)
at /usr/src/sys/vm/vm_page.c:441
#11 0xc04abcfb in vm_object_split (entry=0xc6eea8ac)
at /usr/src/sys/vm/vm_object.c:1226
#12 0xc04a702a in vm_map_copy_entry (src_map=0xc6fc1700,
dst_map=0xc6fc1e00, 
src_entry=0xc6eea8ac, dst_entry=0xc701803c)
at /usr/src/sys/vm/vm_map.c:2347
#13 0xc04a73ee in vmspace_fork (vm1=0xc6fc1700)
at /usr/src/sys/vm/vm_map.c:2488
#14 0xc04a201e in vm_forkproc (td=0xc6498ab0, p2=0xc714d1e4,
td2=0xc714bab0, 
flags=20) at /usr/src/sys/vm/vm_glue.c:624
#15 0xc032340e in fork1 (td=0xc6498ab0, flags=20, pages=0,
procp=0xdd699cc4)
at /usr/src/sys/kern/kern_fork.c:654
#16 0xc032239b in fork (td=0xc6498ab0, uap=0xdd699d10)
at /usr/src/sys/kern/kern_fork.c:102
#17 0xc4260fab in linux_fork (td=0xc6498ab0, args=0x0)
at /usr/src/sys/i386/linux/linux_machdep.c:280
#18 0xc04ee503 in syscall (frame=
  {tf_fs = 47, tf_es = 47, tf_ds = 47, tf_edi = 671827304, tf_esi = 0,
tf_ebp = -1090520360, tf_isp = -580280972, tf_ebx = 671837948, tf_edx = 1,
tf_ecx = 1, tf_eax = 2, tf_trapno = 12, tf_err = 2, tf_eip = 675477303,
tf_cs = 31, tf_eflags = 582, tf_esp = -1090520388, tf_ss = 47})
at /usr/src/sys/i386/i386/trap.c:1006
#19 0xc04dda3d in Xint0x80_syscall () at {standard input}:144




Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories

Re: recent changes prohibit vinum swap.

2003-09-26 Thread Robert Watson

On Fri, 26 Sep 2003, David Gilbert wrote:

> Recent changes to -CURRENT prohibit vinum swap: 
> 
> [1:6:[EMAIL PROTECTED]:~> swapon /dev/vinum/swapmu
> swapon: /dev/vinum/swapmu: Operation not supported by device

In order to support swapping, Vinum will need to be modified to use struct
disk and the disk(9) API, rather than exposing its storage devices
directly via struct cdevsw and make_dev(9).  I.e., Vinum probably needs to
start approaching things as "disks" rather than "devices", a distinction
that's becoming more mature in -CURRENT.

From a quick read of vinumconfig.c, I'm guessing this wouldn't be hard to
implement.  Some subset of struct sd, struct plex, and struct volume will
need to start holding a struct disk instance which would be passed to
disk_create() instead of a call to make_dev().  Much of the remainder will
just consist of a bit of tweaking to make Vinum extract its data from
bp->bio_disk->d_drv1 instead of bp->b_dev, replacing the ioctl dev_t
argument with a disk argument, etc.

I recently noticed that Vinum may be averse to blocksizes other than 512
bytes.  Or at least, I can get Vinum mirrors up and running on md devices
backed to memory, but not to swap, and the usual reason for problems on
that front is the 4k blocksize for swap-backed md devices.

I also noticed that the vinum commandline tool is a bit devfs-unfriendly,
or at least, it gets pretty verbose about how all the files/directories it
wants to create are already present.  It could be that a test for devfs
conditionally causing a test for EEXIST would go a long way in muffling
the somewhat loud complaining :-). 
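
A sketch of that kind of EEXIST tolerance, using mkdir(2) as the example. mkdir_quiet() is a hypothetical helper, and which calls the vinum tool actually uses to create its nodes is an assumption here:

```c
#include <sys/stat.h>
#include <sys/types.h>
#include <errno.h>

/*
 * Hypothetical helper: create a directory, treating "already exists"
 * as success.  Returns 0 if the directory was created or the path was
 * already present, or the errno value from mkdir(2) otherwise.
 */
static int
mkdir_quiet(const char *path, mode_t mode)
{
	if (mkdir(path, mode) == -1 && errno != EEXIST)
		return (errno);
	return (0);
}
```

Note that EEXIST is also returned when the path exists but is not a directory, so a careful version would follow up with stat(2) before trusting the result.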

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories

Re: recent changes prohibit vinum swap.

2003-09-26 Thread Robert Watson

On Sat, 27 Sep 2003, Greg 'groggy' Lehey wrote:

> > I recently noticed that Vinum may be averse to blocksizes other than
> > 512 bytes.
> 
> It shouldn't be.  There's never been any dependency on it.

I've attached the output from trying to use a swap md set below the malloc
md set.

> > I also noticed that the vinum commandline tool is a bit
> > devfs-unfriendly, or at least, it gets pretty verbose about how all
> > the files/directories it wants to create are already present.  It
> > could be that a test for devfs conditionally causing a test for
> > EEXIST would go a long way in muffling the somewhat loud complaining
> > :-).
> 
> I'm not sure I understand this.  Can you give me a concrete example?

none# vinum
vinum -> quit
none# mdconfig -a -t malloc -s 4m
md0
none# mdconfig -a -t malloc -s 4m
md1
none# mdconfig -a -t malloc -s 4m
md2
none# mdconfig -a -t malloc -s 4m
md3
none# vinum
vinum -> concat -v /dev/md0 /dev/md1 /dev/md2 /dev/md3
volume vinum0
  plex name vinum0.p0 org concat
drive vinumdrive0 device /dev/md0
sd name vinum0.p0.s0 drive vinumdrive0 size 0
drive vinumdrive1 device /dev/md1
sd name vinum0.p0.s1 drive vinumdrive1 size 0
drive vinumdrive2 device /dev/md2
sd name vinum0.p0.s2 drive vinumdrive2 size 0
drive vinumdrive3 device /dev/md3
sd name vinum0.p0.s3 drive vinumdrive3 size 0
Can't create /dev/vinum/vinum0: File exists
Can't create /dev/vinum/vol/vinum0: No such file or directory
Can't create /dev/vinum/vol/vinum0.plex: No such file or directory
Can't create /dev/vinum/plex/vinum0.p0: File exists
Can't create /dev/vinum/vol/vinum0.plex/vinum0.p0: No such file or directory
Can't create /dev/vinum/vol/vinum0.plex/vinum0.p0.sd: No such file or directory
Can't create /dev/vinum/sd/vinum0.p0.s0: File exists
Can't create /dev/vinum/sd/vinum0.p0.s1: File exists
Can't create /dev/vinum/sd/vinum0.p0.s2: File exists
Can't create /dev/vinum/sd/vinum0.p0.s3: File exists
V vinum0                State: up       Plexes:       1 Size:         15 MB
P vinum0.p0           C State: up       Subdisks:     4 Size:         15 MB
S vinum0.p0.s0          State: up       D: vinumdrive0  Size:       3963 kB
S vinum0.p0.s1          State: up       D: vinumdrive1  Size:       3963 kB
S vinum0.p0.s2          State: up       D: vinumdrive2  Size:       3963 kB
S vinum0.p0.s3          State: up       D: vinumdrive3  Size:       3963 kB

Just to demonstrate it's not because of -v:

none# vinum
vinum -> concat /dev/md0 /dev/md1 /dev/md2 /dev/md3
Can't create /dev/vinum/vinum0: File exists
Can't create /dev/vinum/vol/vinum0: No such file or directory
Can't create /dev/vinum/vol/vinum0.plex: No such file or directory
Can't create /dev/vinum/plex/vinum0.p0: File exists
Can't create /dev/vinum/vol/vinum0.plex/vinum0.p0: No such file or directory
Can't create /dev/vinum/vol/vinum0.plex/vinum0.p0.sd: No such file or directory
Can't create /dev/vinum/sd/vinum0.p0.s0: File exists
Can't create /dev/vinum/sd/vinum0.p0.s1: File exists
Can't create /dev/vinum/sd/vinum0.p0.s2: File exists
Can't create /dev/vinum/sd/vinum0.p0.s3: File exists

Attached below is the swap-backed version.  Previously when I've bumped
into compatibility problems involving different types of md devices, it's
come down to the blocksize on swap-backed nodes.  Doesn't have to be the
actual cause, but it might be a good place to start.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories

none# mdconfig -a -t swap -s 4m
md0
none# mdconfig -a -t swap -s 4m
md1
none# mdconfig -a -t swap -s 4m
md2
none# mdconfig -a -t swap -s 4m
md3
none# vinum
vinum -> concat -v /dev/md0 /dev/md1 /dev/md2 /dev/md3
volume vinum0
  plex name vinum0.p0 org concat
drive vinumdrive0 device /dev/md0
sd name vinum0.p0.s0 drive vinumdrive0 size 0
drive vinumdrive1 device /dev/md1
sd name vinum0.p0.s1 drive vinumdrive1 size 0
drive vinumdrive2 device /dev/md2
sd name vinum0.p0.s2 drive vinumdrive2 size 0
drive vinumdrive3 device /dev/md3
sd name vinum0.p0.s3 drive vinumdrive3 size 0
Can't create /dev/vinum/vinum0: File exists
Can't create /dev/vinum/vol/vinum0: No such file or directory
Can't create /dev/vinum/vol/vinum0.plex: No such file or directory
Can't create /dev/vinum/plex/vinum0.p0: File exists
Can't create /dev/vinum/vol/vinum0.plex/vinum0.p0: No such file or directory
Can't create /dev/vinum/vol/vinum0.plex/vinum0.p0.sd: No such file or directory
Can't create /dev/vinum/sd/vinum0.p0.s0: File exists
Can't create /dev/vinum/sd/vinum0.p0.s1: File exists
Can't create /dev/vinum/sd/vinum0.p0.s2: File exists
Can't create /dev/vinum/sd/vinum0.p0.s3: File exists
V vinum0                State: down     Plexes:       1 Size:         15 MB
P vinum0.p0           C State: faulty   Subdisks:     4 Size:         15 MB
S vinum0.p0.s0          State: crashed  D: vinumdrive0  Size:       3963 kB
S vinum0.p0.s1          State: crashed  D: vinumdrive1  Size:       3963 kB
S vinum0.p0.

Re: HEADSUP: Change of makedev() semantics.

2003-09-28 Thread Robert Watson

On Mon, 29 Sep 2003, Greg 'groggy' Lehey wrote:

> On Sunday, 28 September 2003 at 23:22:07 +0200, Poul-Henning Kamp wrote:
> > Basically:
> >
> > 3. If you do a "normal" device driver, cache the result
> >from when you call make_dev().
> > ...
> >
> > ./dev/vinum
> > Failure to cache result of make_dev() ?
> 
> Where should this be cached?  Can you point to example code?

Actually, it looks like Vinum is caching the dev_t's, but it's not always
using the cached values--sometimes it's invoking makedev() instead.
However, this appears to happen only in the vinumrevive.c code, so I'm not
sure whether that's because the cached reference is unavailable there.  It
looks like it should be available in that context, though: there is a
valid (struct sd *) reference to follow, so you can use sd->dev instead of
VINUM_SD() and get to the dev_t without doing a makedev().
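
The difference between the two access paths can be sketched with a toy
model.  This is not kernel code: DevRegistry, Subdisk, and the device
numbers below are hypothetical stand-ins, and only the names make_dev,
makedev, and sd->dev are borrowed from the discussion above.

```python
# Toy model only: hypothetical classes, not the FreeBSD kernel API.

class DevRegistry:
    """Hands out device handles and counts number-based lookups."""

    def __init__(self):
        self._by_number = {}
        self.lookups = 0

    def make_dev(self, major, minor, name):
        # Create the device once and return a handle the driver can keep.
        handle = {"major": major, "minor": minor, "name": name}
        self._by_number[(major, minor)] = handle
        return handle

    def makedev(self, major, minor):
        # Number-based path: rebuild the reference from (major, minor).
        self.lookups += 1
        return self._by_number[(major, minor)]


class Subdisk:
    def __init__(self, registry, major, minor, name):
        self.major, self.minor = major, minor
        # Cache the handle returned at creation time (the "sd->dev" idea).
        self.dev = registry.make_dev(major, minor, name)


registry = DevRegistry()
sd = Subdisk(registry, 1, 0, "vinum0.p0.s0")  # numbers are arbitrary

via_numbers = registry.makedev(sd.major, sd.minor)  # recomputed on demand
via_cache = sd.dev                                  # no lookup needed
```

Both paths reach the same object here, but only the cached one avoids
going back through the device numbers.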

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories

___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "[EMAIL PROTECTED]"

5.2-RELEASE TODO

2003-09-29 Thread Robert Watson

; |
 |                     |           |                 | Robert Drehmel is  |
 |                     |           |                 | developing and     |
 |                     |           |                 | testing patches    |
 |                     |           |                 | now.               |
 |---------------------+-----------+-----------------+--------------------|
 |                     |           |                 | Apple's Darwin     |
 |                     |           |                 | operating system   |
 |                     |           |                 | has fairly         |
 |                     |           |                 | extensive          |
 | Merge of Darwin     |           |                 | improvements to    |
 | msdosfs, other      | --        | --              | msdosfs and other  |
 | fixes               |           |                 | kernel services;   |
 |                     |           |                 | these fixes must   |
 |                     |           |                 | be reviewed and    |
 |                     |           |                 | merged to the      |
 |                     |           |                 | FreeBSD tree.      |
 |---------------------+-----------+-----------------+--------------------|
 |                     |           |                 | Many systems       |
 |                     |           |                 | supporting         |
 |                     |           |                 | POSIX.1e ACLs      |
 |                     |           |                 | permit a minor     |
 |                     |           |                 | violation to that  |
 |                     |           |                 | specification, in  |
 |                     |           |                 | which the ACL_MASK |
 |                     |           |                 | entry overrides    |
 | ACL_MASK override   | In        |                 | the umask, rather  |
 | of umask support in | progress  | Robert Watson   | than being         |
 | UFS                 |           |                 | intersected with   |
 |                     |           |                 | it. The resulting  |
 |                     |           |                 | semantics can be   |
 |                     |           |                 | useful in          |
 |                     |           |                 | group-oriented     |
 |                     |           |                 | environments, and  |
 |                     |           |                 | as such would be   |
 |                     |           |                 | very helpful on    |
 |                     |           |                 | FreeBSD.           |
 |---------------------+-----------+-----------------+--------------------|
 |                     |           |                 | Significant parts  |
 |                     |           |                 | of the network     |
 |                     |           |                 | stack (especially  |
 |                     |           |                 | IPv4 and IPv6) now |
 |                     |           |                 | have fine-grained  |
 |                     |           |                 | locking of their   |
 |                     |           |                 | data structures.   |
 |                     |           |                 | However, it is not |
 |                     |           |                 | yet possible for   |
 |                     |           |                 | the netisr threads |
 |                     |           |                 | to run without     |
 |                     |           |                 | Giant, due to      |
 | Fine-grained        |           |                 | dependencies on    |
 | network stack       | In        | Jeffrey Hsu,    | sockets, routing,  |
 | locking without     | progress  | Seigo Tanimura, | etc. A 5.2-RELEASE |
 | Giant               |           | Sam Leffler     | goal is to have    |
 |                     |           |                 | the network stack  |
 |                     |           |                 | running largely    |
 |                     |           |                 | without Giant,     |
 |                     |           |                 | which should       |
 |                     |           |                 | substantially      |
 |                     |           |                 | improve            |
 |                     |           |                 | performance of the |
 |                     |           |                 | stack, as well as  |
 |                     |           |                 | other system       |
 |                     |           |                 | components by      |
 |                     |           |                 | reducing           |
 |                     |           |                 | contention on      |
 |                     |           |                 |

Re: Improvements to fsck performance in -current ...?

2003-09-30 Thread Robert Watson

On Tue, 30 Sep 2003, Marc G. Fournier wrote:

> > Current has two major changes re speeding up fsck.
> >
> > The most significant is the background operation of fsck on file systems
> > with soft updates enabled. Because of the way soft updates works, you are
> > assured of metadata consistency on reboot, so the file systems can be
> > mounted and used immediately with fsck started up in the background
> > about a minute after the system comes up.
> 
> Actually, I had this blow up on my -CURRENT desktop once ... didn't have
> a clue on how to debug it, so I switched from fsck -p to fsck -y to
> prevent it from happening again :(

No idea when this happened to you, but background fsck/snapshots have
become dramatically more stable since about halfway between 5.0-RELEASE
and 5.1-RELEASE.  Kirk chased down a lot of serious bugs and issues with
hangs.  So experience from before that time may not be characteristic of
current behavior.

> Now, I don't/wouldn't have softupdates enabled on / .. does the
> 'background fsck' know to not background if softupdates are not enabled? 
> I'm going to switch back to -p and look a bit closer the next time it
> happens (if it happens) to see if it is/was a softupdate file system
> that failed, now that I have a better idea of what I'm looking for ... 

sysinstall doesn't enable soft updates on / by default, because on small
partitions you increase the chance of running out of space.  Many of the
space concerns have been resolved by more recent behavioral changes in
UFS.  The issue is that soft updates trickles out changes in a
write-behind fashion, so in the event of a large delete followed by an
immediate large allocation, the deleted storage may not have been
reclaimed when the allocation comes along: for example, if you had a
really small / and did an installkernel.  In more recent 5.x, UFS tracks
the space that "will be freed" and reports it as freed, and includes some
support for waiting for space to become available (which it inevitably
will in that situation, once the soft updates have been processed).
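
The accounting described above can be illustrated with a small toy model.
It is only a sketch: ToyFilesystem, its method names, and the block
counts are made up for illustration and say nothing about how UFS
implements this internally.

```python
# Toy model only: names and numbers are illustrative, not UFS internals.

class ToyFilesystem:
    def __init__(self, total_blocks):
        self.total = total_blocks
        self.used = 0          # blocks occupied, including pending frees
        self.pending_free = 0  # blocks deleted but not yet reclaimed

    def delete(self, blocks):
        # Write-behind: the space is only marked as "will be freed".
        self.pending_free += blocks

    def flush(self):
        # The deferred work completes and the space is really reclaimed.
        self.used -= self.pending_free
        self.pending_free = 0

    def reported_free(self):
        # Report space that "will be freed" as if it were already free.
        return self.total - self.used + self.pending_free

    def allocate(self, blocks):
        # If space is short but frees are pending, wait for them
        # (modelled here by processing them immediately).
        if self.used + blocks > self.total and self.pending_free:
            self.flush()
        if self.used + blocks > self.total:
            raise OSError("file system full")
        self.used += blocks


fs = ToyFilesystem(100)
fs.allocate(90)                     # a nearly full, really small /
fs.delete(80)                       # large delete, not yet reclaimed
free_reported = fs.reported_free()  # counts the pending 80 blocks
fs.allocate(85)                     # large allocation right afterwards
```

Without the pending-free bookkeeping, the second allocation in this model
would fail even though the space is about to become available.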

So the picture may have improved a lot since you last used it, depending
on when that was.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories


5.2-RELEASE TODO

2003-10-01 Thread Robert Watson

rting |
 |                     |           |                 | POSIX.1e ACLs      |
 |                     |           |                 | permit a minor     |
 |                     |           |                 | violation to that  |
 |                     |           |                 | specification, in  |
 |                     |           |                 | which the ACL_MASK |
 |                     |           |                 | entry overrides    |
 | ACL_MASK override   | In        |                 | the umask, rather  |
 | of umask support in | progress  | Robert Watson   | than being         |
 | UFS                 |           |                 | intersected with   |
 |                     |           |                 | it. The resulting  |
 |                     |           |                 | semantics can be   |
 |                     |           |                 | useful in          |
 |                     |           |                 | group-oriented     |
 |                     |           |                 | environments, and  |
 |                     |           |                 | as such would be   |
 |                     |           |                 | very helpful on    |
 |                     |           |                 | FreeBSD.           |
 |---------------------+-----------+-----------------+--------------------|
 |                     |           |                 | Significant parts  |
 |                     |           |                 | of the network     |
 |                     |           |                 | stack (especially  |
 |                     |           |                 | IPv4 and IPv6) now |
 |                     |           |                 | have fine-grained  |
 |                     |           |                 | locking of their   |
 |                     |           |                 | data structures.   |
 |                     |           |                 | However, it is not |
 |                     |           |                 | yet possible for   |
 |                     |           |                 | the netisr threads |
 |                     |           |                 | to run without     |
 |                     |           |                 | Giant, due to      |
 | Fine-grained        |           |                 | dependencies on    |
 | network stack       | In        | Jeffrey Hsu,    | sockets, routing,  |
 | locking without     | progress  | Seigo Tanimura, | etc. A 5.2-RELEASE |
 | Giant               |           | Sam Leffler     | goal is to have    |
 |                     |           |                 | the network stack  |
 |                     |           |                 | running largely    |
 |                     |           |                 | without Giant,     |
 |                     |           |                 | which should       |
 |                     |           |                 | substantially      |
 |                     |           |                 | improve            |
 |                     |           |                 | performance of the |
 |                     |           |                 | stack, as well as  |
 |                     |           |                 | other system       |
 |                     |           |                 | components by      |
 |                     |           |                 | reducing           |
 |                     |           |                 | contention on      |
 |                     |           |                 | Giant.             |
 |---------------------+-----------+-----------------+--------------------|
 |                     |           |                 | Productionable     |
 |                     |           |                 | support for the    |
 |                     |           |                 | AMD64 platform. It |
 |                     |           |                 | currently meets    |
 |                     |           |                 | most of the        |
 | Tier-1 Support for  | In        | Peter Wemm,     | requirements for   |
 | AMD64 Hammer        | progress  | David O'Brien   | the Tier-1         |
 |                     |           |                 | classification,    |
 |                     |           |                 | but a formal       |
 |                     |           |                 | ruling must be     |
 |                     |           |                 | made in time for   |
 |                     |           |                 | 5.2-RELEASE.       |
 |---------------------+-----------+-----------------+--------------------|
 |                     |           |                 | Kernel modules are |
 |                     |           |                 | currently built    |
 |                     |           |                 | independently from |
 |                     |           |                 | a kernel           |
 |                     |           |                 | configur
