rc.d scripts to control multiple instances of the same daemon?

2013-06-25 Thread Garrett Wollman
I'm in the process of (re)writing an rc.d script for kadmind
(security/krb5).  Unlike the main Kerberos daemon, kadmind needs to
have a separate instance for each realm on the server -- it can't
support multiple realms in a single process.  What I need to be able
to do:

1) Have different flags and pidfiles for each instance.
2) Be able to start, stop, restart, and status each individual
instance by giving its name on the command line.
3) Have all instances start/stop automatically when a specific
instance isn't specified.

I've looked around for examples of good practice to emulate, and
haven't found much.  The closest to what I want looks to be
vboxheadless, but I'm uncomfortable with the amount of mechanism from
rc.subr that it needs to reimplement.  Are there any better examples?

-GAWollman

___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: NFS server bottlenecks

2012-10-12 Thread Garrett Wollman
On Fri, 12 Oct 2012 22:05:54 -0400 (EDT), Rick Macklem rmack...@uoguelph.ca 
said:

 I've attached the patch drc3.patch (it assumes drc2.patch has already been
 applied) that replaces the single mutex with one for each hash list
 for tcp. It also increases the size of NFSRVCACHE_HASHSIZE to 200.

I haven't tested this at all, but I think putting all of the mutexes
in an array like that is likely to cause cache-line ping-ponging.  It
may be better to use a pool mutex, or to put the mutexes adjacent in
memory to the list heads that they protect.  (But I probably won't be
able to do the performance testing on any of these for a while.  I
have a server running the drc2 code but haven't gotten my users to
put a load on it yet.)
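
The second alternative (keeping each mutex next to the list head it protects) might look roughly like the sketch below. This is a userland pthreads analog, not the kernel mtx(9) code from the patch; the names `struct bucket`, `table_init`, and `table_insert` are hypothetical, and the 64-byte cache-line size is an assumption.

```c
#include <pthread.h>
#include <stdalign.h>
#include <stddef.h>

#define CACHE_LINE_SIZE 64   /* assumption: typical x86-64 cache-line size */
#define HASH_SIZE       200  /* the hash size mentioned in the quoted patch */

struct entry {
	struct entry *next;
	unsigned int  xid;
};

/*
 * One hash bucket.  Aligning the struct to a cache line keeps each
 * lock on a different line from its neighbours, and the list head
 * sits directly after the lock that protects it.
 */
struct bucket {
	alignas(CACHE_LINE_SIZE) pthread_mutex_t lock;
	struct entry *head;
};

static struct bucket table[HASH_SIZE];

static void
table_init(void)
{
	for (size_t i = 0; i < HASH_SIZE; i++) {
		pthread_mutex_init(&table[i].lock, NULL);
		table[i].head = NULL;
	}
}

/* Insert at the head of the entry's bucket, locking only that bucket. */
static void
table_insert(struct entry *e)
{
	struct bucket *b = &table[e->xid % HASH_SIZE];

	pthread_mutex_lock(&b->lock);
	e->next = b->head;
	b->head = e;
	pthread_mutex_unlock(&b->lock);
}
```

The cost of this layout is memory: every bucket is padded out to a full cache line, which is the trade-off against a dense array of mutexes.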

-GAWollman


Re: NFS server bottlenecks

2012-10-10 Thread Garrett Wollman
On Tue, 9 Oct 2012 20:18:00 -0400 (EDT), Rick Macklem rmack...@uoguelph.ca 
said:

 And, although this experiment seems useful for testing patches that try
 and reduce DRC CPU overheads, most real NFS servers will be doing disk
 I/O.

We don't always have control over what the user does.  I think the
worst-case for my users involves a third-party program (that they're
not willing to modify) that does line-buffered writes in append mode.
This uses nearly all of the CPU on per-RPC overhead (each write is
three RPCs: GETATTR, WRITE, COMMIT).

-GAWollman



Re: NFS server bottlenecks

2012-10-03 Thread Garrett Wollman
On Wed, 3 Oct 2012 09:21:06 -0400 (EDT), Rick Macklem rmack...@uoguelph.ca 
said:

 Simple: just use a separate mutex for each list that a cache entry
 is on, rather than a global lock for everything. This would reduce
 the mutex contention, but I'm not sure how significantly since I
 don't have the means to measure it yet.
 
 Well, since the cache trimming is removing entries from the lists, I don't
 see how that can be done with a global lock for list updates?

Well, the global lock is what we have now, but the cache trimming
process only looks at one list at a time, so not locking the list that
isn't being iterated over probably wouldn't hurt, unless there's some
mechanism (that I didn't see) for entries to move from one list to
another.  Note that I'm considering each hash bucket a separate
list.  (One issue to worry about in that case would be cache-line
contention in the array of hash buckets; perhaps NFSRVCACHE_HASHSIZE
ought to be increased to reduce that.)

 Only doing it once/sec would result in a very large cache when bursts of
 traffic arrives.

My servers have 96 GB of memory so that's not a big deal for me.

 I'm not sure I see why doing it as a separate thread will improve things.
 There are N nfsd threads already (N can be bumped up to 256 if you wish)
 and having a bunch more cache trimming threads would just increase
 contention, wouldn't it?

Only one cache-trimming thread.  The cache trim holds the (global)
mutex for much longer than any individual nfsd service thread has any
need to, and having N threads doing that in parallel is why it's so
heavily contended.  If there's only one thread doing the trim, then
the nfsd service threads aren't spending time contending on the
mutex (it will be held less frequently and for shorter periods).

 The only negative effect I can think of w.r.t.  having the nfsd
 threads doing it would be a (I believe negligible) increase in RPC
 response times (the time the nfsd thread spends trimming the cache).
 As noted, I think this time would be negligible compared to disk I/O
 and network transit times in the total RPC response time?

With adaptive mutexes, many CPUs, lots of in-memory cache, and 10G
network connectivity, spinning on a contended mutex takes a
significant amount of CPU time.  (For the current design of the NFS
server, it may actually be a win to turn off adaptive mutexes -- I
should give that a try once I'm able to do more testing.)

-GAWollman


Re: NFS server bottlenecks

2012-10-02 Thread Garrett Wollman
[Adding freebsd-fs@ to the Cc list, which I neglected the first time
around...]

On Tue, 2 Oct 2012 08:28:29 -0400 (EDT), Rick Macklem rmack...@uoguelph.ca 
said:

 I can't remember (I am early retired now;-) if I mentioned this patch before:
   http://people.freebsd.org/~rmacklem/drc.patch
 It adds tunables vfs.nfsd.tcphighwater and vfs.nfsd.udphighwater that can
 be twiddled so that the drc is trimmed less frequently. By making these
 values larger, the trim will only happen once/sec until the high water
 mark is reached, instead of on every RPC. The tradeoff is that the DRC will
 become larger, but given memory sizes these days, that may be fine for you.

It will be a while before I have another server that isn't in
production (it's on my deployment plan, but getting the production
servers going is taking first priority).

The approaches that I was going to look at:

Simplest: only do the cache trim once every N requests (for some
reasonable value of N, e.g., 1000).  Maybe keep track of the number of
entries in each hash bucket and ignore those buckets that only have
one entry even if it is stale.

Simple: just use a separate mutex for each list that a cache entry
is on, rather than a global lock for everything.  This would reduce
the mutex contention, but I'm not sure how significantly since I
don't have the means to measure it yet.

Moderately complicated: figure out if a different synchronization type
can safely be used (e.g., rmlock instead of mutex) and do so.

More complicated: move all cache trimming to a separate thread and
just have the rest of the code wake it up when the cache is getting
too big (or just once a second since that's easy to implement).  Maybe
just move all cache processing to a separate thread.

It's pretty clear from the profile that the cache mutex is heavily
contended, so anything that reduces the length of time it's held is
probably a win.
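
As a rough illustration of the "simplest" approach, a minimal sketch of a request counter that allows a trim only on every N-th request follows. The names `should_trim` and `TRIM_INTERVAL` are hypothetical and not taken from the NFS server source; C11 atomics stand in for whatever counter the kernel code would actually use.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define TRIM_INTERVAL 1000u  /* the example value of N suggested above */

static atomic_uint request_count;

/*
 * Returns true on every TRIM_INTERVAL-th call and false otherwise.
 * The counter is bumped atomically, so concurrent service threads
 * never need a lock just to decide whether to trim.  (When the
 * counter wraps around, one interval is shorter than the others,
 * which is harmless for this purpose.)
 */
static bool
should_trim(void)
{
	unsigned int n = atomic_fetch_add(&request_count, 1) + 1;

	return (n % TRIM_INTERVAL == 0);
}
```

A service thread would call should_trim() once per RPC and run the trim only when it returns true, so the cache mutex is taken for trimming about one time in a thousand.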

That URL again, for the benefit of people on freebsd-fs who didn't see
it on hackers, is:

 http://people.csail.mit.edu/wollman/nfs-server.unhalted-core-cycles.png.

(This graph is slightly modified from my previous post as I removed
some spurious edges to make the formatting look better.  Still looking
for a way to get a profile that includes all kernel modules with the
kernel.)

-GAWollman


NFS server bottlenecks

2012-10-01 Thread Garrett Wollman
I had an email conversation with Rick Macklem about six months ago
about NFS server bottlenecks.  I'm now in a position to observe my
large-scale NFS server under an actual production load, so I thought I
would update folks on what it looks like.  This is a 9.1 prerelease
kernel (I hope 9.1 will be released soon as I have four more of these
servers to deploy!).  When under nearly 100% load on an 8-core
(16-thread) Quanta QSSC-S99Q storage server, with a 10G network
interface, pmcstat tells me this:

PMC: [INST_RETIRED.ANY_P] Samples: 2727105 (100.0%) , 27 unresolved
Key: q => exiting...
%SAMP IMAGE   FUNCTION            CALLERS
 29.3 kernel  _mtx_lock_sleep     nfsrvd_updatecache:10.0 nfsrvd_getcache:7.4 ...
  9.5 kernel  cpu_search_highest  cpu_search_highest:8.1 sched_idletd:1.4
  7.4 zfs.ko  lzjb_decompress     zio_decompress
  4.3 kernel  _mtx_lock_spin      turnstile_trywait:2.2 pmclog_reserve:1.0 ...
  4.0 zfs.ko  fletcher_4_native   zio_checksum_error:3.1 zio_checksum_compute:0.8
  3.6 kernel  cpu_search_lowest   cpu_search_lowest
  3.3 kernel  nfsrc_trimcache     nfsrvd_getcache:1.6 nfsrvd_updatecache:1.6
  2.3 kernel  ipfw_chk            ipfw_check_hook
  2.1 pmcstat _init
  1.1 kernel  _sx_xunlock
  0.9 kernel  _sx_xlock
  0.9 kernel  spinlock_exit
This does seem to confirm my original impression that the NFS replay
cache is quite expensive.  Running a gprof(1) analysis on the same PMC
data reveals a bit more detail (I've removed some uninteresting parts
of the call graph):


                                      called/total       parents
index %time       self  descendents   called+self    name               index
                                      called/total       children

               4881.00   2004642.70  932627/932627      svc_run_internal [2]
[4]    45.1    4881.00   2004642.70  932627         nfssvc_program [4]
              13199.00    504436.33  584319/584319      nfsrvd_updatecache [9]
              23075.00    403396.18  468009/468009      nfsrvd_getcache [14]
               1032.25    416249.44    2239/2284        svc_sendreply_mbuf [15]
               6168.00    381770.44   11618/11618       nfsrvd_dorpc [24]
               3526.87     86869.88  112478/112514      nfsrvd_sentcache [74]
                890.00     50540.89    4252/4252        svc_getcred [101]
              14876.60     32394.26    4177/24500       crfree <cycle 3> [263]
              11550.11     25150.73    3243/24500       free <cycle 3> [102]
               1348.88     15451.66    2716/16831       m_freem [59]
               4066.61       216.81    1434/1456        svc_freereq [321]
               2342.15       677.40     557/1459        malloc_type_freed [265]
                 59.14      1916.84     134/2941        crget [113]
               1602.25         0.00     322/9682        bzero [105]
                690.93         0.00      43/44          getmicrotime [571]
                287.22         7.33     138/1205        prison_free [384]
                233.61         0.00      60/798         PHYS_TO_VM_PAGE [358]
                203.12         0.00      94/230         nfsrv_mallocmget_limit [632]
                151.76         0.00      51/1723        pmap_kextract [309]
                  0.78        70.28       9/3281        _mtx_unlock_sleep [154]
                 19.22        16.88      38/400403      nfsrc_trimcache [26]
                 11.05        21.74       7/197         crsetgroups [532]
                 30.37         0.00      11/6592        critical_enter [190]
                 25.50         0.00       9/36          turnstile_chain_unlock [844]
                 24.86         0.00       3/7           nfsd_errmap [913]
                 12.36         8.57       8/2145        in_cksum_skip [298]
                  9.10         3.59       5/12455       mb_free_ext [140]
                  1.84         4.85       2/2202        VOP_UNLOCK_APV [269]

---

                  0.49         0.15       1/1129009     uhub_explore [1581]
                  0.49         0.15       1/1129009     tcp_output [10]
                  0.49         0.15       1/1129009     pmap_remove_all [1141]
                  0.49         0.15       1/1129009     vm_map_insert [236]
                  0.49         0.15       1/1129009     vnode_create_vobject [281]
                  0.49         0.15       1/1129009     biodone [351]
                  0.49         0.15       1/1129009     vm_object_madvise [670]
                  0.49         0.15       1/1129009     xpt_done [483]
                  0.49         0.15       1/1129009     vputx [80]
                  0.49         0.15       1/1129009     vm_map_delete <cycle 3> [49]
                  0.49         0.15       1/1129009     vm_object_deallocate <cycle 3> [356]
                  0.49         0.15       1/1129009     vm_page_unwire [338]
                  0.49         0.15       1/1129009     pmap_change_wiring [318]
                  0.98         0.31       2/1129009     getnewvnode [227]
0.98

Re: Replacing BIND with unbound (Was: Re: Pull in upstream before 9.1 code freeze?)

2012-07-09 Thread Garrett Wollman
On Sun, 8 Jul 2012 23:16:04 -0700, Avleen Vig avl...@gmail.com said:

 I could care less about the resolver daemon itself, I agree with what
 you're saying and I don't think most end users will care about that.
 But getting rid of dig and host in base would be bad.

I don't think it's as bad as you suggest, although I do think that we
would likely get a few black eyes just the same.  After all, as
Doug says, people can just install the bind-tools package.  So long as
there is *a* tool in the base, even if it's not the two we all are
used to, that's sufficient for the purpose of doing enough
troubleshooting to get package installs working.

I think the embedded people have a better argument, but that's
probably still not strong enough versus the benefits of making this
change.

(The black eyes will come from reviewers saying "WTF, FreeBSD! How can
you not have host and dig?!" without bothering to read the release
notes or investigate ports that they might want to have installed.
The fact that this same crowd always installs sudo from ports will not
prevent them from being astonished.  I think there's a good case to be
made for having a set of packages that are installed by default by the
installer unless you disable it -- in my lab, we'll be wanting to
install puppet -- and bind-tools is probably one of them.)

-GAWollman



Re: Replacing BIND with unbound (Was: Re: Pull in upstream before 9.1 code freeze?)

2012-07-08 Thread Garrett Wollman
On Sun, 08 Jul 2012 02:31:17 -0700, Doug Barton do...@freebsd.org said:

 Neither of which has any relevance to the actual root zone ZSK, which
 could require an emergency roll tomorrow.

Surely that's why there's a separate KSK.  The ZSK can be rolled at
any time.

-GAWollman


Replacing BIND with unbound (Was: Re: Pull in upstream before 9.1 code freeze?)

2012-07-07 Thread Garrett Wollman
On Sat, 07 Jul 2012 16:17:53 -0700, Doug Barton do...@freebsd.org said:

 BIND in the base today comes with a full-featured local resolver
 configuration, which I'm confident that Dag-Erling can do for unbound
 (and which I would be glad to assist with if needed). Other than that,
 what integration are you concerned about?

The utilities (specifically host(1) and dig(1)) are the only
user-visible interfaces I care about.  I don't see any need for there
to be an authoritative name server in the base system.  So long as the
resolver works properly and does DNSsec validation

-GAWollman


CFR: Exceedingly minor fixes to libc

2009-11-13 Thread Garrett Wollman
If you have a moment, please take a look at the following patch.  It
contains some very minor fixes to various parts of libc which were
found by the clang static analyzer.  They fall into a few categories:

1) Bug fixes in very rare situations (mostly error-handling code that
has probably never been executed).

2) Dead store elimination.

3) Elimination of unused variables.  (Or in a few cases, making use of
them.)

Some minor style problems were also fixed in the vicinity.  There
should be no functional changes except in very unusual conditions.

-GAWollman

Index: stdio/fvwrite.c
===
--- stdio/fvwrite.c (revision 199242)
+++ stdio/fvwrite.c (working copy)
@@ -60,7 +60,7 @@
char *nl;
int nlknown, nldist;
 
-   if ((len = uio->uio_resid) == 0)
+   if (uio->uio_resid == 0)
return (0);
/* make sure we can write */
if (prepwrite(fp) != 0)
Index: stdio/vfwprintf.c
===
--- stdio/vfwprintf.c   (revision 199242)
+++ stdio/vfwprintf.c   (working copy)
@@ -293,7 +293,7 @@
 * number of characters to print.
 */
p = mbsarg;
-   insize = nchars = 0;
+   insize = nchars = nconv = 0;
mbs = initial_mbs;
while (nchars != (size_t)prec) {
nconv = mbrlen(p, MB_CUR_MAX, &mbs);
Index: stdio/xprintf_time.c
===
--- stdio/xprintf_time.c(revision 199242)
+++ stdio/xprintf_time.c(working copy)
@@ -64,7 +64,6 @@
intmax_t t, tx;
int i, prec, nsec;
 
-   prec = 0;
if (pi->is_long) {
tv = *((struct timeval **)arg[0]);
t = tv->tv_sec;
@@ -78,6 +77,8 @@
} else {
tp = *((time_t **)arg[0]);
t = *tp;
+   nsec = 0;
+   prec = 0;
}
 
p = buf;
Index: stdio/fgetws.c
===
--- stdio/fgetws.c  (revision 199242)
+++ stdio/fgetws.c  (working copy)
@@ -89,7 +89,7 @@
if (!__mbsinit(&fp->_mbstate))
/* Incomplete character */
goto error;
-   *wsp++ = L'\0';
+   *wsp = L'\0';
FUNLOCKFILE(fp);
 
return (ws);
Index: rpc/getnetconfig.c
===
--- rpc/getnetconfig.c  (revision 199242)
+++ rpc/getnetconfig.c  (working copy)
@@ -412,13 +412,13 @@
  * Noone needs these entries anymore, then frees them.
  * Make sure all info in netconfig_info structure has been reinitialized.
  */
-q = p = ni.head;
+q = ni.head;
 ni.eof = ni.ref = 0;
 ni.head = NULL;
 ni.tail = NULL;
 mutex_unlock(&ni_lock);
 
-while (q) {
+while (q != NULL) {
p = q->next;
if (q->ncp->nc_lookups != NULL) free(q->ncp->nc_lookups);
free(q->ncp);
Index: rpc/svc_raw.c
===
--- rpc/svc_raw.c   (revision 199242)
+++ rpc/svc_raw.c   (working copy)
@@ -176,9 +176,8 @@
msg->acpted_rply.ar_results.proc = (xdrproc_t) xdr_void;
msg->acpted_rply.ar_results.where = NULL;
 
-   if (!xdr_replymsg(xdrs, msg) ||
-   !SVCAUTH_WRAP(SVC_AUTH(xprt), xdrs, xdr_proc, xdr_where))
-   stat = FALSE;
+   stat = xdr_replymsg(xdrs, msg) &&
+   SVCAUTH_WRAP(SVC_AUTH(xprt), xdrs, xdr_proc, xdr_where);
} else {
stat = xdr_replymsg(xdrs, msg);
}
Index: rpc/clnt_raw.c
===
--- rpc/clnt_raw.c  (revision 199242)
+++ rpc/clnt_raw.c  (working copy)
@@ -92,13 +92,13 @@
rpcprog_t prog;
rpcvers_t vers;
 {
-   struct clntraw_private *clp = clntraw_private;
+   struct clntraw_private *clp;
struct rpc_msg call_msg;
-   XDR *xdrs = &clp->xdr_stream;
-   CLIENT  *client = &clp->client_object;
+   XDR *xdrs;
+   CLIENT  *client;
 
mutex_lock(&clntraw_lock);
-   if (clp == NULL) {
+   if ((clp = clntraw_private) == NULL) {
clp = (struct clntraw_private *)calloc(1, sizeof (*clp));
if (clp == NULL) {
mutex_unlock(&clntraw_lock);
@@ -110,6 +110,9 @@
clp->_raw_buf = __rpc_rawcombuf;
clntraw_private = clp;
}
+   xdrs = &clp->xdr_stream;
+   client = &clp->client_object;
+
/*
 * pre-serialize the static part of the call msg and stash it away
 */
Index: rpc/svc_vc.c
===
--- rpc/svc_vc.c(revision 199242)
+++ rpc/svc_vc.c(working copy)
@@ -141,7 

Re: NFS client/buffer cache deadlock

2005-04-23 Thread Garrett Wollman
On Fri, 22 Apr 2005 11:08:35 -0400, Brian Fundakowski Feldman [EMAIL 
PROTECTED] said:


 Can you find any evidence that it's acceptable to interleave multiple
 writers that are doing O_APPEND?  At best, to do what you're asking,
 they could be kept from being interleaved from the context of one
 specific NFS client host...

As far as POSIX goes, the standard says that applications are expected
to handle serialization.  It makes no exception for O_APPEND.

-GAWollman



Re: NFS client/buffer cache deadlock

2005-04-22 Thread Garrett Wollman
On Thu, 21 Apr 2005 10:36:12 +0200, [EMAIL PROTECTED] (Dag-Erling Smørgrav) 
said:

 POSIX == SUSv3 these days.

Not quite.  POSIX and SUSv3 use the same specification, but don't
require the same things.  (Specifically, SUSv3 requires the XSI option
to be implemented.)

-GAWollman



Re: NFS client/buffer cache deadlock

2005-04-21 Thread Garrett Wollman
On Wed, 20 Apr 2005 16:38:42 +0200, Marc Olzheim [EMAIL PROTECTED] said:

 Btw.: I'm not sure write(),writev() and pwrite() are allowed to do short
 writes on regular files... ?

I believe it is the intent of the Standard to prohibit this (a
paragraph in the rationale says that short writes can only happen if
O_NONBLOCK is set, but this is clearly wrong because the normative
text says end-of-medium also results in a short write) but there does
not appear to be any language which requires atomic behavior for
descriptors other than pipes and FIFOs.

As a quality-of-implementation matter, for writes to regular files not
to be atomic would be considered surprising.
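
Since the normative text does not appear to guarantee full writes, a portable application that cannot rely on quality of implementation usually guards against short writes itself. A minimal sketch of such a loop follows; `write_all` is a hypothetical helper name, not a standard function, and the loop does nothing to stop other writers from interleaving between its retries.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Write all len bytes of buf to fd, retrying after short writes and
 * after interruption by a signal.  Returns 0 on success, or -1 with
 * errno set by the failing write(2).
 */
static int
write_all(int fd, const void *buf, size_t len)
{
	const char *p = buf;

	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted before any data was written */
			return (-1);
		}
		p += n;			/* short write: advance and try again */
		len -= (size_t)n;
	}
	return (0);
}
```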

-GAWollman



Re: NFS client/buffer cache deadlock

2005-04-21 Thread Garrett Wollman
On Wed, 20 Apr 2005 11:52:33 -0400, Brian Fundakowski Feldman [EMAIL 
PROTECTED] said:

 I think the first is more useful behavior than the last.  Supporting it
 should be exactly the same as supporting what happens if the actual
 filesystem fills up.  In this case, the filesystem is being requested to
 write more than there is room for.

Returning a short write for operations on regular files would
definitely be considered astonishing.  The changes that you have made
should be considered flow control, not admission control, and should
appear to the user no differently than if we were waiting for a slow
disk to write something; i.e., the user thread should be blocked until
either the entire write completes, or the process is interrupted by a
signal.

-GAWollman



Re: My project wish-list for the next 12 months

2004-12-02 Thread Garrett Wollman
On Wed, 01 Dec 2004 15:29:10 -0800, Jason C. Wells [EMAIL PROTECTED] said:

 This sounds very close to OpenAFS.  I don't know what distinguishes a SAN 
 from other types of NAS.  OpenAFS does everything you mentioned in the 
 above paragraph.  OpenAFS _almost_ works on FreeBSD right now.

AFS's consistency model is wholly unsuitable for clustering.

-GAWollman



Re: My project wish-list for the next 12 months

2004-12-02 Thread Garrett Wollman
[Cc list trimmed]

On Thu, 2 Dec 2004 07:55:30 +0100, Miguel Mendez [EMAIL PROTECTED] said:

 The lack of speed in some apps can be blamed mostly on the toolkits.

I'll second that.

 GTK+ 1.2 was a speed demon, GTK+ 2.x is a lot slower.

And either one is an enormous hog compared to Athena widgets.  (This
is something of an accomplishment, since people have been complaining
about the efficiency of Xt since it was first released nearly 20 years
ago.  All the credit goes to Gordon Moore: X was slow on a 1-MIPS
MicroVAX or Sun-2 workstation.)

I spend almost all of my time in two applications: xterm and emacs.
Both once set records for obscene memory consumption on an earlier
generation of systems.

-GAWollman



Problem detecting POSIX symbolic constants

2002-10-09 Thread Garrett Wollman

On Wed, 9 Oct 2002 22:23:07 -0400, Craig Rodrigues [EMAIL PROTECTED] said:

 I was advised by Terry Lambert to use:
 #ifdef _POSIX_REALTIME_SIGNALS

Terry was wrong.  If _POSIX_REALTIME_SIGNALS is undefined, it means
one of two things:

- The RTS option is not supported, or
- You can't tell whether or not the RTS option is supported.

The section of XBD you quoted refers only to the two symbolic constants
listed in that bullet point: _POSIX_CHOWN_RESTRICTED and
_POSIX_NO_TRUNC.  Both of these represent options from older versions
of POSIX which are now mandatory (which is also true of the following
bullet-point).  This change was made in POSIX.1-2001 for alignment
with the FIPS-151 standard.

 The change I made worked fine in -STABLE.
 However, in -CURRENT, this test breaks, because _POSIX_REALTIME_SIGNALS
 is defined, but it is -1.

POSIX.1-2001 requires that this constant be so defined if the option
is not supported.  See the unistd.h section.

 Can I appeal to the freebsd-standards team to leave these macros undefined
 instead of defining them to -1?  #ifdef/#ifndef is a pretty common way
 to detect if a feature is available on a system, especially when used
 in conjunction with something like autoconf.

It's also wrong when used in conjunction with POSIX options.

The correct test is as follows:

If the option constant is not defined, you really have no idea whether
the option is available or not.  If the corresponding sysconf(3) key
*is* defined, then you might be able to use sysconf(3) or getconf(1)
to determine whether the option is supported.  You will probably still
need to check for the headers and functions manually, because the
implementation is not being forthcoming and probably doesn't implement
the function even if sysconf() claims it does.

If the option constant is defined as -1, the option is guaranteed not
to be supported, and the implementation provides neither the library
functions nor the header files necessary to compile a program which
makes use of the option.

If the option constant is defined as a positive value, the option is
guaranteed to be supported under all configurations of the system.
The library functions and header files are supplied for use with the C
compiler, c99(1) (or c89(1) for older POSIX standards).  For some
functions, additional library specifications must be provided on the
c99 command line (see the standard for details).  The precise value of
the option constant will tell you what version of the option is
supported; for POSIX.1-2001, the option constant if positive must
usually be defined to 200112L.

If the option constant is defined but zero, the option may or may not
be supported depending on run-time configuration.  Library functions
and header files are supplied as described in the previous case, but
applications must call sysconf(3) with the appropriate key to
determine at run time whether the option is supported on the system as
currently configured.

These rules have changed somewhat as the POSIX standards have evolved,
but this is the current state of affairs as set out in 1003.1-2001.
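
The four cases above can be sketched as a compile-time test with a run-time fallback. This is only an illustration for the realtime signals option; the enum and the function name `realtime_signals_status` are hypothetical. In the zero case the sketch assumes the implementation also defines _SC_REALTIME_SIGNALS, since the option must then be queried with sysconf(3).

```c
#include <unistd.h>

enum opt_status { OPT_UNSUPPORTED, OPT_SUPPORTED, OPT_UNKNOWN };

static enum opt_status
realtime_signals_status(void)
{
#if !defined(_POSIX_REALTIME_SIGNALS)
	/*
	 * Undefined: nothing is promised.  sysconf(3) or getconf(1) may
	 * give a hint, but headers and functions still have to be probed
	 * by hand, so report "unknown".
	 */
	return (OPT_UNKNOWN);
#elif _POSIX_REALTIME_SIGNALS == -1
	/* Guaranteed absent: neither headers nor functions are provided. */
	return (OPT_UNSUPPORTED);
#elif _POSIX_REALTIME_SIGNALS > 0
	/* Guaranteed present under all configurations of the system. */
	return (OPT_SUPPORTED);
#else
	/* Defined as 0: compiles everywhere, but must be checked at run time. */
	return (sysconf(_SC_REALTIME_SIGNALS) > 0 ?
	    OPT_SUPPORTED : OPT_UNSUPPORTED);
#endif
}
```

A plain #ifdef cannot tell the -1 and 0 cases apart from the positive one, which is why it gives the wrong answer on -CURRENT.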

Hope this helps.

-GAWollman


To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-hackers in the body of the message



Re: sem_init help?

2002-09-18 Thread Garrett Wollman

On Wed, 18 Sep 2002 22:16:07 -0400 (EDT), Daniel Eischen [EMAIL PROTECTED] 
said:

 The semaphore remains active until it is destroyed.  If you don't
 want to track its page, can you hook it into ipcrm(1)?

A simple way of implementing process-shared anonymous semaphores,
using the kernel support, is to simply create a temporary semaphore,
and (important part) store the pathname in the sem_t.  Then, every
semaphore operation becomes sem_open, kernel operation, sem_close, and
destroy unlinks the temporary semaphore.

This would be a poor-quality implementation, but it would work.  A
better implementation would not use a temporary semaphore at all.

-GAWollman





Re: sys/types.h or not sys/types.h? [Was: cvs commit: src/include grp.h]

2002-02-25 Thread Garrett Wollman

On Mon, 25 Feb 2002 17:23:53 +0300, Andrey A. Chernov [EMAIL PROTECTED] said:

 From IEEE P1003.1 Draft 7:

You're looking at the wrong document.  FreeBSD is very far from being
ready to implement POSIX 2001 header files.  POSIX 1990, which we do
implement, requires sys/types.h almost everywhere.

-GAWollman





Some thoughts on if_ioctl()

2001-10-08 Thread Garrett Wollman

On Mon, 8 Oct 2001 11:32:14 +0400, Yar Tikhiy [EMAIL PROTECTED] said:

 Second, let's look at the handling of SIOCADDMULTI/SIOCDELMULTI.
 There is code obviously taken from if_loop.c and used in some
 drivers, which tries to do something with the third argument data
 of the if_ioctl() driver method if data isn't NULL.

The historic implementation passed SIOCADDMULTI directly down to the
interface to implement, which resulted in lots of duplicated code all
over the place to manage the list of multicast addresses.  Several
years ago, I rewrote the multicast management code to simply indicate
to the driver when the list has changed, obviating the need for the
driver itself to manage the list.

 If I understand the kernel code right, if_ioctl()'s third argument
 is always NULL

Not so.  Any ioctl() in class 'i' which is not handled by the generic
code will get passed down to the driver to handle; some of these
requests may require the data pointer.

-GAWollman





Re: [kernel patch] fcntl(...) to close many descriptors

2001-02-13 Thread Garrett Wollman

On Sun, 28 Jan 2001 10:37:16 -0800 (PST), Luigi Rizzo [EMAIL PROTECTED] said:

 basically i am thinking of something like

   generic_syscall("fdcloseall", );

No less clear than
ret = syscall(SYS_FDCLOSEALL, ...);

-GAWollman






Re: [kernel patch] fcntl(...) to close many descriptors

2001-01-29 Thread Garrett Wollman

On Sun, 28 Jan 2001 12:45:10 -0800 (PST), Luigi Rizzo [EMAIL PROTECTED] said:

 kind-of, though the function name should be a string and not
 an integer (easier to extend/allocate), and it should allow
 return values in user-supplied buffers, same as ioctl/fcntl
 calls do.

dlsym()

-GAWollman






Re: [kernel patch] fcntl(...) to close many descriptors

2001-01-29 Thread Garrett Wollman

On Mon, 29 Jan 2001 11:02:45 -0800 (PST), Luigi Rizzo [EMAIL PROTECTED] said:

 And, this mechanism would be explicitly used for "non portable" or
 experimental functions (such as the closeall() which started the
 thread, or next time someone comes up with a start_http_server_thread())
 and avoiding overloading an existing syscall or having to modify
 libc

This assumes that experimental functionality is always going to be
implemented as a system call.

-GAWollman






Re: [kernel patch] fcntl(...) to close many descriptors

2001-01-29 Thread Garrett Wollman

On Mon, 29 Jan 2001 11:26:29 -0800 (PST), Luigi Rizzo [EMAIL PROTECTED] said:

 but there is a problem with syscall() in that according to
 the manpages it cannot handle in/out parameters as instead
 it is supported by ioctl/fcntl

Of course it can, and the manual page doesn't even suggest what you
say.  It says:

 There is no way to simulate system calls that have multiple return values
 such as pipe(2).

pipe(2) is a special case in that it returns two values rather than
one.  The actual pipe(2) system call has *no* formal parameters; the
unpacking of the two values returned into the declared C formal
parameter is done by an assembly-language stub.  (This was done for
reasons of speed; it is much faster to return two values than it is to
copyout() a two-element array.)

If you were implementing a Lisp binding of POSIX, you would probably
define PIPE to be a niladic function which returns a list of two
descriptors.

-GAWollman






Re: [kernel patch] fcntl(...) to close many descriptors

2001-01-29 Thread Garrett Wollman

On Mon, 29 Jan 2001 12:07:11 -0800 (PST), Luigi Rizzo [EMAIL PROTECTED] said:

 ok, sorry for the confusion then (though, how does one tell from
 the manpage for pipe(2) what is going on there!)

You're not supposed to -- it's an implementation detail.

-GAWollman






Re: RFC: /dev/console - /var/log/messages idea/patch

2000-11-22 Thread Garrett Wollman

On Wed, 22 Nov 2000 15:20:13 -0800 (PST), Mikko Tyolajarvi [EMAIL PROTECTED] said:

 Do you mean something like this?

Yes, exactly like that!

-GAWollman

--
Garrett A. Wollman   | O Siem / We are all family / O Siem / We're all the same
[EMAIL PROTECTED]  | O Siem / The fires of freedom 
Opinions not those of| Dance in the burning flame
MIT, LCS, CRS, or NSA| - Susan Aglukark and Chad Irschick





mtree verification output format

2000-10-02 Thread Garrett Wollman

On Mon, 02 Oct 2000 23:53:28 +0200, Poul-Henning Kamp [EMAIL PROTECTED] said:

   make "extra" and "missing" attributes in the output
   rather than prefixes which can be confused with filenames.

   Don't do the "run-in" of the first attribute with a short
   filename

This looks like a good change, but while you're there:

   size (13134, 13135)
   cksum (2005920215, 873112433)

This is still very obscure; I'd like to see:

size (was 1234, should be 5678)
cksum (was 42424242, should be 69696969)

...so that it's clear what the meaning of the numbers is.
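
A minimal sketch of a formatter for that style, not mtree's actual
code.  The function name format_mismatch and its parameters are
hypothetical; which of mtree's two numbers is "was" and which is
"should be" is left to the caller.

```c
#include <inttypes.h>
#include <stdio.h>

/*
 * Write "attr (was X, should be Y)" into buf.
 * Returns what snprintf() returns, so truncation can be detected.
 */
static int
format_mismatch(char *buf, size_t len, const char *attr,
    uintmax_t was, uintmax_t should_be)
{
	return (snprintf(buf, len, "%s (was %ju, should be %ju)",
	    attr, was, should_be));
}
```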

-GAWollman






Announcement: Two new FreeBSD-related mailing-lists

2000-09-26 Thread Garrett Wollman

Apologies in advance for the cross-posting.  (Hopefully you have a
mail system with duplicate suppression.)

I've created two new FreeBSD-related mailing-lists which people may
wish to subscribe to.  FreeBSD's postmaster did not think there would
be sufficient interest in these lists to justify their creation on
FreeBSD.org, so they will instead live on my machine.  (This means
that there will be no public archives of the discussion, unless
someone else volunteers to create one.)

The two lists are [EMAIL PROTECTED] and
[EMAIL PROTECTED]; subscribe in the usual manner.
Here are the info files for both lists:


The freebsd-print mailing-list is intended for the discussion of
print systems and software in the FreeBSD environment.  Germane
topics would include:

- The standard FreeBSD print spooler system, lpr(1)/lpd(8).

- Other print spooler systems, such as `rlpr' and `LPRng'.

- Related document-management and translation software such as
  `a2ps', `psutils', and `ghostscript'.

- Document-formatting systems such as groff(1) and TeX.

- Standards relevant to printing, such as IPP.

- Support for foreign printing protocols in FreeBSD using
  tools like `CAP' and `samba'.

This list is maintained and retroactively moderated by Garrett
Wollman, wollman@{{FreeBSD,bostonradio,decalcomania}.org,lcs.mit.edu}

The purpose of the freebsd-standards list is to discuss the impact of
various formal and informal standards on the FreeBSD operating
system.  Discussion will include the new POSIX 1003.1-200x / Single
UNIX Specification v3 standards (currently in process), as well as
other existing and future standards.  This mailing-list specifically
excludes discussion of standards for network protocol and APIs; these
should take place exclusively on the [EMAIL PROTECTED] list
instead.

This list is maintained and retroactively moderated by Garrett
Wollman, wollman@{{FreeBSD,bostonradio,decalcomania}.org,lcs.mit.edu}


-GAWollman




kern.ipc.maxsockbuf vs reality?

2000-07-23 Thread Garrett Wollman

On Sun, 23 Jul 2000 15:33:45 -0700, Alfred Perlstein [EMAIL PROTECTED] said:

   if ((u_quad_t)cc > (u_quad_t)sb_max * MCLBYTES / (MSIZE + MCLBYTES))
           return (0);

I think the code here should clip the requested size into range rather
than failing the allocation.  That way, a program could just specify a
ridiculously-large buffer size and get whatever is the configured
maximum.
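
A minimal sketch of the clipping behavior, written as ordinary
user-space C rather than as a kernel patch.  sb_limit and sb_clip are
hypothetical names, and the two SKETCH_ constants are assumed stand-ins
for the kernel's MSIZE and MCLBYTES, whose real values come from the
kernel headers.

```c
#include <stdint.h>

#define SKETCH_MSIZE	256u	/* assumed mbuf size */
#define SKETCH_MCLBYTES	2048u	/* assumed cluster size */

/* Largest reservation allowed for a given sb_max (same formula as above). */
static uint64_t
sb_limit(uint64_t sb_max)
{
	return (sb_max * SKETCH_MCLBYTES / (SKETCH_MSIZE + SKETCH_MCLBYTES));
}

/* Clip the request into range instead of failing it. */
static uint64_t
sb_clip(uint64_t cc, uint64_t sb_max)
{
	uint64_t limit = sb_limit(sb_max);

	return (cc > limit ? limit : cc);
}
```

With this shape, a caller that asks for far too much simply receives
sb_limit(sb_max) bytes rather than an error.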

-GAWollman




Re: ACPI project progress report

2000-06-19 Thread Garrett Wollman

On Mon, 19 Jun 2000 10:07:26 -0700, Mike Smith [EMAIL PROTECTED] said:

 Hmm, this has me thinking again about suspend/resume.  In the current 
 context, can we expect a suspend veto from some function to actually 
 DTRT? (ie. drivers that have been suspended get a resume call).

That's how I originally implemented it, but I'm not sure whether that
has been maintained or not.

 Or should we make two passes over the suspend method?  One with
 "intention to suspend at this level", the second to actually perform the
 suspension once the first has been accepted?

I think this is a good idea, and better than my implementation.
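
A minimal sketch contrasting the two approaches, using simulated
devices rather than real driver methods.  struct dev, its fields, and
both function names are hypothetical.

```c
#include <stddef.h>

struct dev {
	int will_veto;	/* simulated answer when asked to suspend */
	int suspended;	/* 1 while the device is suspended */
	int resumes;	/* how many resume calls this device received */
};

/*
 * Single pass: suspend in order; on a veto, resume everything that
 * was already suspended.
 */
static int
suspend_one_pass(struct dev *devs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (devs[i].will_veto) {
			while (i-- > 0) {
				devs[i].suspended = 0;
				devs[i].resumes++;
			}
			return (-1);
		}
		devs[i].suspended = 1;
	}
	return (0);
}

/*
 * Two passes: first announce the intention and collect vetoes, then
 * suspend only if nobody objected.  A veto disturbs no device.
 */
static int
suspend_two_pass(struct dev *devs, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (devs[i].will_veto)
			return (-1);
	for (size_t i = 0; i < n; i++)
		devs[i].suspended = 1;
	return (0);
}
```

The difference shows up in the resumes counter: after a veto, the
single-pass version has already suspended and resumed the earlier
devices, while the two-pass version never touched them.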

-GAWollman






RE: stupid FS questions

2000-05-30 Thread Garrett Wollman

On Tue, 30 May 2000 16:20:53 -0400, "Yevmenkin, Maksim N, CSCIO" 
[EMAIL PROTECTED] said:

 i know that :) i guess my questions were
 1) why the same piece of code duplicated in all ``mount_xxx'' utilities?

Because the original loadable module system held strongly to the
religion that the kernel should never load anything of its own
accord.  The designers of the current loadable module system made
different design choices, but some traces of its predecessor still
remain.

-GAWollman




Re: NETGRAPH (proposal. FINAL)

2000-02-29 Thread Garrett Wollman

On Tue, 29 Feb 2000 18:59:50 +0100 (CET), Luigi Rizzo [EMAIL PROTECTED] said:

 can you clarify this ? Looong ago i used the '586 on a bridge and it did let
 me write the MAC header...

The 82586 has a mode bit which selects one of two possibilities:

1) The transmit command specifies the destination address and
length/ethertype field; the source address is inserted by hardware.
The receive buffer descriptor gets the source address and
length/ethertype.

2) The transmit and receive buffers include a full Ethernet header.

I can't say off the top of my head which the `ie' driver uses, but I
would bet on (2) because that's easier for the driver to deal with.

These sorts of controllers are the reason why ether_input takes the
Ethernet header as a separate parameter.
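
A minimal sketch of why a separate header argument suits both modes.
Nothing here is the real ether_input or `ie' driver code: struct
eth_hdr, sketch_input, rx_full_frame, and rx_header_split are
hypothetical, and the header in the split case is whatever the driver
assembles from the fields the controller reports.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN	6
#define HDR_LEN	14	/* two addresses plus length/ethertype */

struct eth_hdr {
	uint8_t  dst[MAC_LEN];
	uint8_t  src[MAC_LEN];
	uint16_t type;		/* length/ethertype, host byte order */
};

/* Stand-in for the upper layer; it just records what it was handed. */
static struct eth_hdr last_hdr;
static const uint8_t *last_payload;
static size_t last_len;

static void
sketch_input(const struct eth_hdr *eh, const uint8_t *payload, size_t len)
{
	last_hdr = *eh;
	last_payload = payload;
	last_len = len;
}

/* Mode (2): the buffer holds a full frame, so the driver splits it. */
static int
rx_full_frame(const uint8_t *frame, size_t len)
{
	struct eth_hdr eh;

	if (len < HDR_LEN)
		return (-1);
	memcpy(eh.dst, frame, MAC_LEN);
	memcpy(eh.src, frame + MAC_LEN, MAC_LEN);
	eh.type = (uint16_t)(frame[12] << 8 | frame[13]);
	sketch_input(&eh, frame + HDR_LEN, len - HDR_LEN);
	return (0);
}

/* Mode (1): header fields arrive out of band; the buffer is payload only. */
static void
rx_header_split(const struct eth_hdr *eh, const uint8_t *payload, size_t len)
{
	sketch_input(eh, payload, len);
}
```

Either way the upper layer sees the same call, so it does not need to
know which kind of controller received the packet.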

-GAWollman




RE: NETGRAPH patches (proposal)

2000-02-23 Thread Garrett Wollman

CC's trimmed!

On Wed, 23 Feb 2000 13:43:17 -0500, "Yevmenkin, Maksim N, CSCIO" 
[EMAIL PROTECTED] said:

 this looks more and more like STREAMS

Which is part of the reason why Netgraph will always remain an
optional add-on, rather than the way the protocol stack is normally
constructed.

-GAWollman




New victim.. er.. committer

1999-12-17 Thread Garrett Wollman

On Fri, 17 Dec 1999 23:17:00 -0500 (EST), Robert Watson [EMAIL PROTECTED] 
said:

 although I actually live near Amherst, Massachusetts much of the
 time.

Anyone up for a Southern New England FreeBSD ftf gtg?  I can host.  If
you think you might be interested, please reply privately.

-GAWollman




Re: mbuf shortage situations (followup)

1999-09-13 Thread Garrett Wollman

On Sun, 12 Sep 1999 23:19:13 -0400 (EDT), Bosko Milekic bmile...@dsuper.net 
said:

   This message is in MIME format.  The first part should be readable text,
   while the remaining parts are likely unreadable without MIME-aware tools.
   Send mail to m...@docserver.cac.washington.edu for more info.

It would be preferable if text were sent as text, since MIME-encoded
patches require more effort to read.

   I'm also aware of the possibility of some people not liking the
 fact that we tsleep() forever (e.g. tsleep(x,x,x,0)).


I don't have any problem with sleeping forever -- but I am concerned
about the possibility of deadlock, especially when client-NFS is
involved.  If the problem just moves around and has harder-to-recover
symptoms, the change isn't helping.

The 4.3BSD code had two different behaviors:

- For clusters, if M_WAIT was specified and there was no space
left in mb_map, it panicked.  However, m_clalloc was never called with
M_WAIT, so that panic was effectively dead code.

- For mbufs, if M_WAIT was specified and there were no mbufs
available, it would sleep at PZERO - 1 (which was interruptible).

In 4.3, the code was able to deal with cluster allocation failing.  We
have a somewhat different situation now, because many network
interface devices have less-flexible DMA mechanisms which don't allow
packet reception into non-contiguous buffers, so we need to have at
least a certain number of clusters available for this purpose.

-GAWollman

--
Garrett A. Wollman   | O Siem / We are all family / O Siem / We're all the same
woll...@lcs.mit.edu  | O Siem / The fires of freedom 
Opinions not those of| Dance in the burning flame
MIT, LCS, CRS, or NSA| - Susan Aglukark and Chad Irschick


To Unsubscribe: send mail to majord...@freebsd.org
with unsubscribe freebsd-hackers in the body of the message



Re: Mandatory locking?

1999-08-22 Thread Garrett Wollman

On Mon, 23 Aug 1999 10:06:54 +0930, Greg Lehey g...@lemis.com said:

 Correct.  I suppose it's worth discussing what the default should be.
 Should they get EAGAIN or block?  Obviously you'd want a way of
 specifying which, but there would have to be a default for
 non-lock-aware programs.  I think I'd go for blocking; it's less error
 prone.

I'd be strongly opposed to any sort of mandatory locking.  The whole
notion is unspeakably evil, although this is mitigated somewhat if it
does not apply to processes with appropriate privilege.

-GAWollman




Re: cvs commit: src/usr.sbin/ac ac.8 ac.c

1999-07-02 Thread Garrett Wollman

On Fri, 2 Jul 1999 14:18:37 -0400 (EDT), Brian F. Feldman 
gr...@unixhelp.org said:

 Remember, the question was, Do we need to spend the effort making all
 of our programs support the use of - to denote std{in,out}?

No, because most of them (for which such an option might be relevant)
already do, or else don't need it (because they default to stdin).
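
A minimal sketch of the convention in question; open_input is a
hypothetical helper, not code from any particular utility.

```c
#include <stdio.h>
#include <string.h>

/*
 * Return the stream to read from: stdin when no file is named or
 * when the name is "-", otherwise the named file (NULL on failure).
 */
static FILE *
open_input(const char *path)
{
	if (path == NULL || strcmp(path, "-") == 0)
		return (stdin);
	return (fopen(path, "r"));
}
```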

-GAWollman




timeconsuming processes on FreeBSD 3.1

1999-05-19 Thread Garrett Wollman

On Wed, 19 May 1999 18:01:57 +0200 (MET DST), Andre Rikkert de Koe 
arikk...@surf.iae.nl said:

 logonserver. Since then almost every day we find timeconsuming processes
 running while the user isn't even logged in (anymore). These programs are
 mostly tin and lynx and such interactive programs. We are sure that they

Some broken interactive programs don't bother to check whether the
terminal I/O they do succeeds or not, and will happily sit there
spinning at a revoked tty forever.  It is possible for these programs
to persist after logout if they either (1) ignore SIGHUP or (2) were
started in such a way as to block the propagation of SIGHUP to them
(some shells can do this).
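
A minimal sketch of an input loop that does check its results, so it
stops instead of spinning once reads stop succeeding.  consume_input is
a hypothetical name; exactly what a read on a revoked tty returns is
system-dependent, so the loop treats both end-of-file and any error
other than EINTR as a reason to stop.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Read from fd until end-of-file or a hard error.
 * Returns the number of bytes consumed, or -1 with errno set.
 */
static long
consume_input(int fd)
{
	char buf[256];
	long total = 0;

	for (;;) {
		ssize_t n = read(fd, buf, sizeof(buf));

		if (n > 0) {
			total += n;	/* a real program would use the data */
			continue;
		}
		if (n == 0)
			return (total);	/* end-of-file: stop reading */
		if (errno == EINTR)
			continue;	/* interrupted by a signal: retry */
		return (-1);		/* hard error: give up, do not loop */
	}
}
```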

-GAWollman
