from:"Terry Lambert"

Re: Linux mode is not enabled help

2002-12-01 Thread Terry Lambert

suken woo wrote:
 After a recent cvsup, the Linux emulator was disabled. I rebuilt it
 and still get the same error:
 Linux mode is not enabled.
 Loading linux kernel module now...
 kldload: can't load linux: No such file or directory

This is common in -current.  You either have to give an
absolute path to the module file, or it has to be in one of
the new path locations.  The loader has the same problem,
when you try to load a kernel installed in /.

There is apparently a problem with the parsing of the path
elements, though I have not bothered to find out what it is.

I don't think current directory works, either; maybe it's
one of those over-zealous security things that keep happening.

To see your current path:

sysctl kern.module_path
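
As a rough illustration of the search that has to succeed, here is a
minimal C sketch that walks a module path and reports the first
directory holding the module file.  It assumes the path is a
semicolon-separated list of directories and that module files are
named <name>.ko; find_module() is a hypothetical helper, not the
kernel linker's actual code.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Walk a semicolon-separated list of directories and store the first
 * readable "<dir>/<name>.ko" in out.  Returns 0 if found, -1 otherwise.
 */
static int
find_module(const char *module_path, const char *name, char *out, size_t outlen)
{
    const char *p = module_path;

    while (*p != '\0') {
        size_t len = strcspn(p, ";");   /* length of this path element */

        if (len > 0) {
            int n = snprintf(out, outlen, "%.*s/%s.ko", (int)len, p, name);

            /* Skip elements whose full name would not fit in out. */
            if (n > 0 && (size_t)n < outlen && access(out, R_OK) == 0)
                return (0);
        }
        p += len;
        if (*p == ';')
            p++;                        /* step over the separator */
    }
    return (-1);
}
```

If this returns -1 for the value sysctl printed and the name "linux",
then no directory on the path holds the module, and an absolute path
to kldload is the way around it.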

-- Terry

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message

Re: Trivial patch: fdisk doesn't recognize my partitions

2002-12-01 Thread Terry Lambert

[ ... Partition ID changes ... ]

Nate Lawson wrote:
  But as I said, this is  rather marginal and I really don't feel
  it should go in unless this xor-0x10 convention is more widespread.
 
 partition magic does this too. isn't the correct failure mode just to
 print the part. id in hex instead of expanding it?


Frankly, who cares?

You guys still haven't told us, if these partitions are being
hidden... WHY ARE WE NOT RESPECTING THE DECISION TO HIDE THE
THINGS?  A user installed the software doing the hiding on
purpose.  The software changed the ID to hide it, on purpose.
Windows ignores these partitions -- on purpose.


If you're not going to respect the user's wishes in this, then
that's a different kettle of fish... like not respecting the
user disabling things in the BIOS, because the probe routines
still detect it.

If you're going that route, why does FreeBSD care about partition
ID at all? All it is is a *hint*; it's not definitive.  It's not
like a protocol type encapsulation on a packet.

It doesn't matter what the ID says, the rest of the partition
table entry demarcates a region of a linear array of bytes that
contains data.

I think looking at the content of that linear array is what
should determine what the content is, in the absence of a valid
hint.

Specifically, if it has a valid disklabel on the thing, I don't
care what partition ID it has on it, I give it to the disklabel
handler.  If it has a valid FAT32 FS on it, I give it to the
FAT32FS.  If it has a valid FFS superblock on it, I give it to
FFS.  Etc..
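
A minimal C sketch of that kind of content probing, under some loud
assumptions: an i386-style layout with the BSD disklabel magic
(0x82564557, little-endian) at the start of the second 512-byte
sector, and a FAT32 boot sector identified by the 0x55 0xAA signature
at offset 510 plus the "FAT32   " type string at offset 82.
probe_content() and the enum are hypothetical names; a real handler
would validate far more (second magic, checksum, sane sizes) before
trusting the data.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum content { CONTENT_UNKNOWN, CONTENT_DISKLABEL, CONTENT_FAT32 };

/* Read a little-endian 32-bit value from a byte buffer. */
static uint32_t
le32(const unsigned char *p)
{
    return ((uint32_t)p[0] | (uint32_t)p[1] << 8 |
        (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24);
}

/* Guess what the first sectors of a partition contain, ignoring its ID. */
static enum content
probe_content(const unsigned char *buf, size_t len)
{
    /* Assumed: disklabel magic at the start of sector 1. */
    if (len >= 512 + 4 && le32(buf + 512) == 0x82564557u)
        return (CONTENT_DISKLABEL);
    /* FAT32 boot sector: signature bytes plus the type string. */
    if (len >= 512 && buf[510] == 0x55 && buf[511] == 0xAA &&
        memcmp(buf + 82, "FAT32   ", 8) == 0)
        return (CONTENT_FAT32);
    return (CONTENT_UNKNOWN);
}
```

The order of the checks is itself a policy decision: here a disklabel
wins over a FAT32 boot sector when both appear to be present.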

-- Terry

Re: Trivial patch: fdisk doesn't recognize my partitions

2002-12-01 Thread Terry Lambert

Garance A Drosihn wrote:
 My own opinion is that if I have explicitly hid a partition,
 then freebsd should ignore it.  There are times that I do this
 specifically so *freebsd* will ignore it, and I don't want
 freebsd trying to second-guess what I meant.

Exactly.  If you wanted the dratted thing unhidden, then you
would use the tool you hid it with to unhide it.


 Specifically, if it has a valid disklabel on the thing, I don't
 care what partition ID it has on it, I give it to the disklabel
 handler.  If it has a valid FAT32 FS on it, I give it to the
 FAT32FS.  If it has a valid FFS superblock on it, I give it to
 FFS.  Etc..
 
 The fact that the disklabel is valid does not mean that the
 filesystem in that partition is still valid.  If I hide a
 partition, it may be that I had a very good reason for hiding
 it, and freebsd shouldn't be giving it to anything when the
 partition ID is not a recognized ID.

That's really for the FS code to deal with.

Handling it any other way means that a corrupt disk can panic
the machine.  That's a really dumb thing to allow, particularly
with removable media.

That's just a general principle, totally independent of hiding
things; I only point it out because the people who were wanting
the hidden partition types known to FreeBSD are totally missing
the point about what a partition is or isn't, and who's responsible
for validating the data therein.

-- Terry

Re: Any ideas at all about network problem?

2002-12-01 Thread Terry Lambert

Craig Reyenga wrote:
 It worked fine in 4.7 and all previous versions, just DP2 dunno about DP1.

Well, you will have to back up to a version of the source code
before DP2 that didn't have the problem, perform a binary search
to find the exact delta that caused the problem, and examine the
code differences in order to find the problem change, and why it
causes the problem.
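
The search itself is ordinary bisection.  A minimal C sketch, assuming
you have n source snapshots ordered oldest to newest, that the oldest
is known good and the newest known bad, and that is_bad() stands in
for "check out, build, boot, and test" (first_bad() and bad_from()
are hypothetical names):

```c
#include <stddef.h>

/*
 * Snapshots 0..n-1 are ordered oldest to newest; snapshot 0 is known
 * good and snapshot n-1 is known bad (so n must be at least 2).
 * Returns the index of the first bad snapshot.
 */
static size_t
first_bad(size_t n, int (*is_bad)(size_t idx, void *ctx), void *ctx)
{
    size_t good = 0, bad = n - 1;

    while (bad - good > 1) {
        size_t mid = good + (bad - good) / 2;

        if (is_bad(mid, ctx))
            bad = mid;      /* breakage is at mid or earlier */
        else
            good = mid;     /* breakage is after mid */
    }
    return (bad);
}

/* Example predicate: everything from index *ctx onward is broken. */
static int
bad_from(size_t idx, void *ctx)
{
    return (idx >= *(size_t *)ctx);
}
```

Each probe halves the range, so even a couple of thousand candidate
deltas need only about a dozen builds, provided every candidate
actually builds.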

Personally, I'd start with DP1, but that's because I have a
CDROM locally, and the CVS tree is not always buildable, since
there is no software enforcement of buildability before a change
is committed.

There are almost 2 years' worth of changes which were not
brought back to the -STABLE branches from -CURRENT, so
diffing 4.7 and DP2 isn't likely to get you anywhere, I think.

-- Terry

Re: Update to UFS2 Superblock Format

2002-11-30 Thread Terry Lambert

Kirk McKusick wrote:
 Ah
 No wonder, I tried editing the /sys/boot/i386/boot2/Makefile
 to enable UFS2 bootblock but then disklabel complained that
 boot2 was too big. I will have to revert to UFS1
 Thanks
 Manfred
 
 You have hit upon the exact problem. UFS2 has a much bigger area
 reserved for the boot block, but the programs that set up disk labels
 and boot blocks don't know about it yet so assume that they have to
 cram into the much smaller UFS1 boot-block area.

Seems to be a candidate to explain the disklabel corruption,
actually.  The disklabel is expected to follow the initial
boot code, and precede the region(s) it describes...

Basically, the boot blocks are going to have to know the
disklabel offset, as promiscuous knowledge (i.e. hard-wired
into the code).

-- Terry

Re: Problem with ntpdate

2002-11-30 Thread Terry Lambert

Daniel C. Sobral wrote:
 Giorgos Keramidas wrote:
  On 2002-11-28 17:00, Daniel C. Sobral [EMAIL PROTECTED] wrote:
   I found out that ntpdate just doesn't seem to be working at all
   during boot. Ntpd dies because of the time differential (windows
   changes the time two hours because of the TZ). No message from
   ntpdate (I'll next try to divert it to syslog).

If you want to add code to fix this, it's trivial:

1)  Read the CMOS clock directly

2)  Read the CMOS clock via vm86()

3)  If there is a difference measured in round units, apply
it as an adjustment to the value each time you read directly.

Problem solved.  Basically, it comes down to initializing an
integer at boot time via a SYSCONFIG() created for that purpose.
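
A minimal C sketch of step 3, with the assumptions spelled out: both
readings are already converted to seconds, a "round unit" is taken to
be half an hour, and a couple of seconds of slop is allowed because
the two reads cannot happen at the same instant.  cmos_adjustment()
and both constants are hypothetical, not anything in machdep.c.

```c
#include <stdlib.h>

#define ROUND_UNIT 1800L    /* assumed round unit: half an hour */
#define SLOP       2L       /* assumed tolerance between the two reads */

/*
 * Return the number of seconds to add to every direct CMOS reading,
 * or 0 if the two readings do not differ by a round amount.
 */
static long
cmos_adjustment(long direct_secs, long bios_secs)
{
    long diff = bios_secs - direct_secs;
    long sign = diff < 0 ? -1 : 1;
    long units = (labs(diff) + ROUND_UNIT / 2) / ROUND_UNIT;
    long rounded = sign * units * ROUND_UNIT;

    if (labs(diff - rounded) > SLOP)
        return (0);         /* not a timezone-style offset; ignore it */
    return (rounded);
}
```

The result is the integer that would be computed once at boot and then
added to each later direct read.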

Personally, I don't have any boxes with a BIOS that's still
broken.  I used to have one, but I disassembled it with Frank
van Gilluwe's Sourcer, hacked the timezone adjustment out
of the code, assembled the new code with MASM, and burnt some
new PROMs for it.  That was back in 1997.

If you are looking for advice, my advice is to fix your BIOS...
it's easier.  8-).  Otherwise, it wouldn't be hard at all to
hack the described fix into machdep.c, to make FreeBSD more
tolerant of broken hardware (always one in the plus column).

-- Terry

Re: Are SysV semaphores thread-safe on CURRENT?

2002-11-30 Thread Terry Lambert

Brian Smith wrote:
 On Mon, 18 Nov 2002 22:05:34 -0800, Terry Lambert wrote:
 Use mmap of a backing-store file, and then use file locking to
 do record locking in the shared memory segment.
 
 Ok, I did this, and it actually works considerably better than
 the SysV shared memory.  However flock() has the same problem
 as the SysV semaphores, where they block the entire process,
 allowing the same deadlock situation to occur.  Has this flock()
 behavior changed in CURRENT?
 
 It seems like this behavior is much more likely to change than
 the SysV code.

Do you mean flock(), or do you mean fcntl(fd, F_SETLKW, ...)?

If you are using range locks, then you mean fcntl().

That's unfortunate: there's an easy way to convert blocking
file locks into non-blocking, plus a context switch.  I
thought the threads library already did this for you in the
fcntl() wrapper, in /usr/src/lib/libc_r/uthread/uthread_fcntl.c,
but apparently it doesn't.  8-(.


The easy way to do this is to convert the blocking request into
a non-blocking request, with a retry; e.g., where you have a
call to:

err = fcntl( fd, F_SETLKW, flock);

Replace it with:

while( ( err = fcntl( fd, F_SETLK, flock)) == -1 && errno == EAGAIN) {
        sleep( 1);  /* use nanosleep(), if 1 second is too big */
}

This will cause the processor to be yielded to other threads for
as long as the lock can't be acquired, and acquisition will be
retried until it succeeds (effectively, blocking only that thread
in sleep()).  The difference between F_SETLKW and F_SETLK is why
I suggested the approach in the first place (FWIW).

The cost of doing this is that blocking requests will not be
serviced in FIFO order, as they would if F_SETLKW were being
used.  This may get expensive if you have a highly contended
resource, because you are effectively implementing a low cost
polling to obtain the lock.  The answer to this is that you
are not supposed to use semaphores for highly contended resources,
or if you do, use a spin-lock before you use the semaphore, so
you can fail early at reduced expense.

Probably making the above code into an inline function and/or
actually modifying the _fcntl() implementation in the threads
library is the way to go.
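
A minimal sketch of such an inline function, assuming a user-space
threads library in which nanosleep() suspends only the calling thread.
setlkw_polling() and its interval parameter are hypothetical names.
POSIX permits either EAGAIN or EACCES when the lock is held by someone
else, so the sketch retries on both.

```c
#include <errno.h>
#include <fcntl.h>
#include <time.h>

/*
 * Polling stand-in for fcntl(fd, F_SETLKW, fl): retry a non-blocking
 * lock request, sleeping interval_ns nanoseconds between attempts.
 * Returns 0 once the lock is held, -1 (errno set) on any other error.
 */
static inline int
setlkw_polling(int fd, struct flock *fl, long interval_ns)
{
    struct timespec ts;

    ts.tv_sec = interval_ns / 1000000000L;
    ts.tv_nsec = interval_ns % 1000000000L;

    while (fcntl(fd, F_SETLK, fl) == -1) {
        if (errno != EAGAIN && errno != EACCES)
            return (-1);            /* a real error, not contention */
        nanosleep(&ts, NULL);       /* let other threads run, then retry */
    }
    return (0);
}
```

A caller would replace fcntl(fd, F_SETLKW, &fl) with something like
setlkw_polling(fd, &fl, 100000000L) for a 100 ms retry interval.  The
non-FIFO ordering described above still applies.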


If worst comes to worst, I can give you a kernel patch so that an
fcntl() to assert a blocking lock on a non-blocking fd returns
the EWOULDBLOCK error, with a patch against _fcntl() similar to
the code in _read().

I didn't do that this time, because I don't know how much code
really depends on a lock assert on a non-blocking fd blocking
anyway, and no matter how you slice it, it's still going to
have the same non-FIFO ordering, unless I implemented a FIFO
ordered request queue, as well (it'd have to, to be correct).

-- Terry

Re: Are SysV semaphores thread-safe on CURRENT?

2002-11-30 Thread Terry Lambert

Daniel Eischen wrote:
  No, libc_r doesn't properly handle flock.  Usually, all syscalls
  that take file descriptors as arguments honor the non-blocking
  mode of the file if set.  I guess flock(2) doesn't and has its
  own option to the operation argument (LOCK_NB).
 
  I hacked libc_r to periodically check (every 100msecs) the
  flock.  See if this fixes things:

Same thing I suggested, only I think he was really using fcntl(),
not flock()?

My patch wasn't integral to the library (it was more of a hack),
and my default time was 1 s, not 100 ms.

Same non-FIFO request ordering, too.  8-(.

I guess the real question is what is an fcntl()/flock() supposed
to do on a blocking call against a non-blocking fd?  I could not
tell, so I punted.

-- Terry

Re: system locks with vnode backed md(4)

2002-11-30 Thread Terry Lambert

Michal Mertl wrote:
 I'm now unable to make it dead-lock again. Yet it happened quite easily. I
 had more md backing files in the same directory at the beginning (to test
 Terry's suspicion mentioned in thread 'jail' on hackers@).
 
 After the first lock-up I tried 'while(1);tar xzf ports.tgz; rm -rf
 ports;end' on normal filesystem, let it run for long time ( 1h) and then
 I found the system almost dead-locked too (the system worked, but anything
 accessing disk was painfully slow - it might be the same problem or it
 might be different. It never ended (at least for  ~30 mins when I didn't
 (weren't able) anything on it). syncer and bufdaemon and others were in
 wdrain. Disk as seen in systat -v showed maximal usage yet no inodes were
 resolved. Sometimes during that test I had lock order reversal:

Hmm.  This isn't actually the same, I think.  This is just the
point at which you have run out of available memory to maintain
additional dependencies, with Giant held.

The key to this diagnosis is that you let it run for a long time
before it locked up.

The deadlock condition requires that two people do directory
traversals at the same time in vnode backed files in the same
directory.  It has to do with the locks on the backing files
as a result of root vnode traversal vs. the backing vnode in
the parent directory.  I haven't characterized it better than
that.

-- Terry

Re: system locks with vnode backed md(4)

2002-11-30 Thread Terry Lambert

Robert Watson wrote:
 On Sat, 30 Nov 2002, Michal Mertl wrote:
  I'm now unable to make it dead-lock again. Yet it happened quite easily.
  I had more md backing files in the same directory at the beginning (to
  test Terry's suspicion mentioned in thread 'jail' on hackers@).
 
 I've noticed that chroot() environments tend to make existing deadlock
 opportunities more likely.  I'm not quite sure why that is.  :-)

Lock to parent.  It's the same reason you can lock up if you
use automount, with all the automount mount points happening
in the same subdirectory.

 There are a fair number of vnode locking deadlock scenarios that are
 unavoidable where we rely on grabbing vnode locks out of the directory
 structure lock order.  This occurs for vnode-backed md devices, quotas,
 and UFS1 extended attributes, and probably some other situations.  I
 suspect that Terry is correct that operations on the vnode backing file
 storage directory are triggering the problem, since that increases the
 chances that a vnode lock race to root will occur from both the file
 system backed into the md device, and for the md backing vnodes during
 blocking I/O.

See other postings.  The race to root is the one I was
originally commenting on.  I'm not sure that it applies in
this case, I think this case might be the out of memory to
create new soft dependencies case, where you can end up
holding a lock on a buffer that needs to be flushed to recover
memory, until you can satisfy the request to create a dependency
(starvation deadlock).  The race to root is a deadly embrace
deadlock.


 If you can avoid directory operations on the md backing
 directory, that would probably be one way to avoid triggering the bug.

Yes.  By placing each vnconfiged device in its own subdirectory,
you avoid them.  There's still a window on your host OS doing
its own traversal, but that's (effectively) a whole FS lock,
so it doesn't trigger a problem.

 Seeing it reproduced would probably confirm that this is the case.

It's a pain.  I wasted a couple of days trying to reproduce,
without a box I could wipe and make into a scratch box, with
little luck.  I think that it requires reproducing the failing
box in detail, which I wasn't willing to do (hence the workaround).


 On the
 other hand, there may be other deadlocks in the vnode/ufs/md code that can
 be more easily corrected than this general VFS problem, so details there
 would be very useful.

There are a number of them; they are all a pain.  It's really
tempting to just refactor the code so that all locking occurs
at the same logical layer, without being held across function
calls.  That'd be a heck of a lot of work, though... probably
worth it, in the end.

-- Terry

Re: C++ Issue On -CURRENT

2002-11-30 Thread Terry Lambert

Cy Schubert - CITS Open Systems Group wrote:
  does the problem still occur if you add in 'using namespace std'?
 
 Thanks. That also fixed it.


Yeah.  Just remember that the standard namespace isn't.

-- Terry

Re: Trivial patch: fdisk doesn't recognize my partitions

2002-11-29 Thread Terry Lambert

Bruce Evans wrote:
 On Thu, 28 Nov 2002, Poul-Henning Kamp wrote:
  In message [EMAIL PROTECTED], Riccardo Torrini write
  As far as I know it use an EXOR 0x10 to hide/unhide but fdisk doesn't
  recognize 0x0B/0x0C fat32 when hidden (0x1B/0x1C)
  But as I said, this is  rather marginal and I really don't feel
  it should go in unless this xor-0x10 convention is more widespread.
 
 Hiding partitions is a bug IMO, so it should have negative support.
 This convention would break many OS's conventions.
 E.g., NextSTEP | 0x10 gives BSDI.

If you think about it, if there is no one to claim it, it's
reasonable to treat it as raw disk space, and try to find a
partition on it.

Really, there's no reason to care about partition type at all,
since the contents will have the right magic numbers and the
right data layout for a FATFS: you don't really care.

That's really only meaningful if you decide the hiding that
magic.com does doesn't apply to you; if it applies to you,
then, in fact, it's a good thing that it's not recognized: the
magic.com program has successfully accomplished what it was
written to accomplish -- so it's a non-problem.
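
For reference, the xor-0x10 convention quoted above fits in a few
lines of C.  This is only a sketch with hypothetical helper names; the
IDs in the note that follows (0x0B/0x0C for FAT32, 0xA7 for NeXTSTEP,
0xB7 for BSDI) are the commonly listed values, not something this
thread defines.

```c
#include <stdint.h>

#define PART_HIDE_BIT 0x10u     /* the bit the quoted convention toggles */

/* Toggle between the visible and hidden form of a partition ID. */
static uint8_t
toggle_hidden(uint8_t id)
{
    return ((uint8_t)(id ^ PART_HIDE_BIT));
}

/* True if the hide bit is set; this says nothing about intent. */
static int
looks_hidden(uint8_t id)
{
    return ((id & PART_HIDE_BIT) != 0);
}
```

Hiding 0x0B gives 0x1B, and toggling again restores it.  The collision
Bruce describes shows up here too: 0xB7 has the bit set, and
"unhiding" it yields 0xA7, so the bit alone cannot tell a hidden
partition from a different type.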

-- Terry

Re: [PATCH] Searching for users of netncp and nwfs to help

2002-11-27 Thread Terry Lambert

Julian Elischer wrote:
 Where does the passed in thread come from?

Your changes to make certain functions which are exported interfaces
take a thread * instead of a proc * argument.


 Generally don't use a thread pointer other than yourself unless you have
 a lock on the proc structure, or the schedlock. Certainly never store it
 anywhere.. Particularly anywhere that may persist while you sleep in any
 way.   -exception.. kernel threads- .. they are persistent.

The received lock response is going to come in on IPX, which is
like UDP, so it's connectionless.  The NCP is an el-cheapo
timeout-and-retransmit layer on top of formatted IPX datagrams
(NetWare runs on IPX, not SPX).  Basically, this means that the
response and the async response to a lock request, or the
async server-to-client notifications (shutdowns, etc.) can
come in and activate any listener.  The connection has to be
looked up, and then the operation has to be processed in the
context of the process that has the connection open.  It does
not care which thread it's processed on, only that it's in the
process that owns the connection (there are address space issues).

The main problem here is that lockmgr() is being called to lock
things that technically don't need to be locked, at all, really,
to ensure that operations are not attempted concurrently.  It's
not really necessary: the server will refuse additional requests
on a connection, when there is one request outstanding.

The only exception isn't really relevant here, because the code
that I've seen written doesn't really support packet burst
mode data transfers (pseudo-windowed data streams layered on
top of datagrams).

Basically, this means unless someone is willing to do the work
to set up a virtual circuit -- three network handles per -- per
each potentially outstanding thread, and then, further, maintain
an idle pool for them, everyone should treat the code as if it
were not thread reentrant... because it's not.

Gary Tomlinson, Duck, Ted Cowan, and others literally put man
years into getting that working, and they had access to Novell
source code in order to do the work in the NUC (NetWare UNIX
Client) product.  It's unlikely that it can be reverse-engineered
without at least a PNW/NWU (Portable NetWare/NetWare for UNIX)
source license.

The *only* reasons there's a thread in there now as a parameter
are that (1) the top level interfaces require it and (2) the
lockmgr() calls, that shouldn't need to be there, IMO, require
it as a parameter.

-- Terry

Re: [PATCH] Searching for users of netncp and nwfs to help

2002-11-27 Thread Terry Lambert

Terry Lambert wrote:
 The main problem here is that lockmgr() is being called to lock
 things that technically don't need to be locked, at all, really,
 to ensure that operations are not attempted concurrently.  It's
 not really necessary: the server will refuse additional requests
 on a connection, when there is one request outstanding.

In case this wasn't clear to whoever was thinking of doing the
work: add a serialization barrier at the ncp_* layer.  You can
remove it later, without any other code being adversely affected,
if you add a connection pool later.

Note also that the credentials can be passed on the VC, if you
don't mind not running on NetWare prior to 3.1b.  I recommend
this, since it means connection, but not credential, sharing
between processes for threads in the work-to-do pool.

-- Terry

Re: Problem pulling particular directory from CVS

2002-11-27 Thread Terry Lambert

Paul A. Scott wrote:
 I do the following:
 cvs co src/contrib
 cvs checkout: in directory src/contrib/cvs:
 cvs checkout: cannot open CVS/Entries for reading: No such file or directory
 cvs [checkout aborted]: cannot write CVS/Template file: No such file or
 directory
 
 cvs co stops on the src/contrib/cvs directory and will not go further. I
 have plenty of space available on the file system. The problem may be a
 corrupt repository.
 
 Is there any way to do a checkout on src/contrib while bypassing
 src/contrib/cvs? Or, can this be fixed to work?

You are not being quite forthright, I think.

This normally happens on a cvs -R co, rather than a cvs co, when
you are asking for a specific date tag or a release tag which
no longer exists, when running against a read-only repository.

I ran into a similar problem recently, when someone suggested
I use cvs against a FreeBSD server in Germany, in order to match
their version of the source code so I could create a patch for
a problem they were having.

The answer is that the val-tags file is not writeable, and is
being used.

There was a long discussion on this file about 6 months back;
I believe the resolution of that discussion was to make the
${CVSROOT}/CVSROOT/val-tags file unnecessary, but advisory, in
the case that it was not writeable.

Probably you can get around the problem by updating your 'cvs',
though it may also be necessary to update the 'cvs' on the
remote host to have the new code, as well.

You can also checkout without a tag, or with a tag that is already
in the val-tags file on the serving host.

Alternately, have them add a line to val-tags with the tag you
want to checkout, e.g. [indentation mine]:

RELENG_2_2_2_RELEASE y

-- Terry

Re: mbuf header bloat ?

2002-11-27 Thread Terry Lambert

Andrew Gallatin wrote:
 What I (as a 3rd party driver author working in a GNUish
 autoconf/gnumake environment) do is to require a user building from
 source to specify the location of a configured kernel tree where make
 depend has been run (defaulting to GENERIC).  I then pickup the
 various option and bus files out of that directory.  When I build binary
 modules, I build from source as a normal user (using a 4.1.1 system in
 a chroot).  Using an approach like this, a vendor could ship a MAC
 aware driver by picking up the options files from a MAC kernel build
 directory.

I believe he was talking about modules for which source code
is not available.


 How is one supposed to build a 3rd party module these days?

One is not.  The vendor supplies only a binary.


   I think you under-estimate the complexity of variably sized key kernel
   data structures.  mbuf.h is included all over the kernel, as well as in
   many user applications (although often for bogus reasons).  My proposed
   strategy is the following:
 
 Bizarre.  I had no idea userland apps used mbuf.h.  That does indeed
 sound bogus.

On the contrary: it's a very clever thing to do.

-- Terry

Re: mbuf header bloat ?

2002-11-27 Thread Terry Lambert

Andrew Gallatin wrote:
 Terry Lambert writes:
   Andrew Gallatin wrote:
What I (as a 3rd party driver author working in a GNUish

This is how I do it.

 ...
 
How is one supposed to build a 3rd party module these days?

How are you supposed to do it?

   One is not.  The vendor supplies only a binary.
 
 Damn it Terry,  I AM the vendor.  Sometimes I wonder if you even read
 the articles you reply to.   I'm asking how the vendor (me) is
 supposed to build a binary module and I gave an example of how I currently
 do it.

You're the vendor in the first statement, and a consumer in the
second.

The topic of the post to which you were replying was third party
binary compatibility.

The answer is that if the structures change, then there is no
binary compatibility without source code, period.

It seemed to me that you were assuming access to the source code
for consumers of third party modules.

I think the issue that Robert is concerned about is MAC modules
that are provided by a third party to a consumer of FreeBSD and
the modules, and for which the structure changes and so on can
not be permitted.

This makes sense, because the MAC code is being developed under
a DARPA contract, and it's likely that the module source code and
the modules won't be available to the end users, let alone the
general public, without some kind of security clearance, and
probably not even then.

-- Terry

Re: mbuf header bloat ?

2002-11-27 Thread Terry Lambert

Andrew Gallatin wrote:
 If you're a vendor of a device which inserts MAC mtags and needs
 options MAC, you put this code in your driver:
 
 if (mbstat.m_mhlen != MHLEN) {
         printf("Please rebuild your kernel with 'options MAC'\n");
         goto atach_failed_no_mac;
 }
 
 I've already got code like this in my driver to check that m_mclbytes
 and m_mlen is what I expect it to be, since people sometimes change
 them.


I think you are still not getting it, but it's not worth arguing
over.

-- Terry

Re: pw_user.c change for samba

2002-11-27 Thread Terry Lambert

David W. Chapman Jr. wrote:
 I know we're in a code freeze right now, but would anyone have a
 problem with this patch once the freeze is up?  This brings us closer
 to allowing samba to automatically join machines to the domain.

This change permits '$' in the account name, group name, and
login class fields.

Why is this actually necessary for SAMBA?

Is it necessary for all three of these to permit this, or is
it sufficient to (for example) allow it in the group name?

-- Terry

Re: pw_user.c change for samba

2002-11-27 Thread Terry Lambert

David W. Chapman Jr. wrote:
  Why is this actually necessary for SAMBA?
 
  Is it necessary for all three of these to permit this, or is
  it sufficient to (for example) allow it in the group name?
 
 
 Samba needs a user account for the domain machine account
 
 the machine account always ends with a $
 
 So it would only have to be for the account name

I gathered that from the SAMBA site, too.

The '$' is a pain.  None of the examples in the original post
would have worked, because the '$' was not '\$', and the shell
would have blown chunks over the variable expansion.

It seems to me that this could cause a great many problems
for scripts that process the password files, as they currently
exist, if they use constructs like eval, or back-ticks, etc..

If it's allowed, it should probably only be allowed in the
user name (i.e. the patch is wrong; it should probably add
another parameter to the allowable values of 'int gecos', and
change it to 'int checktype' or similar).
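
A minimal C sketch of what a check keyed on a check type might look
like.  The enum names follow the ones used in the patch later in this
thread; name_ok() and its reject list are hypothetical and are not the
checks pw(8) really performs.  The only point is that '$' is accepted
solely as the last character of a login name.

```c
#include <string.h>

enum _checktype { PWC_DEFAULT, PWC_GECOS, PWC_LOGIN };

/* Return 1 if name is acceptable for the given kind of field, else 0. */
static int
name_ok(const char *name, enum _checktype checktype)
{
    /* Illustrative reject list; not the list pw(8) really uses. */
    static const char reject[] = " ,\t:+&#%^()!@~*?<>=|\\/\"";
    size_t i, len = strlen(name);

    if (len == 0)
        return (0);
    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char)name[i];

        if (c < ' ' || c == 127 || c == ':')
            return (0);         /* never acceptable in any field */
        if (checktype == PWC_GECOS)
            continue;           /* gecos is treated as free-form here */
        if (c == '$') {
            /* Only a trailing '$' on a login name passes. */
            if (checktype != PWC_LOGIN || i != len - 1 || i == 0)
                return (0);
            continue;
        }
        if (strchr(reject, c) != NULL)
            return (0);
    }
    return (1);
}
```

With this shape, the login class and group callers keep rejecting '$'
without any change on their side.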

It seems to me that another alternative is that all these
names end in '$'; therefore, when you are expecting one of
these names, you could imply a '$', without needing to actually
have it in the password file -- in other words, it's an
attribute, not really part of the account name.

Will this open up a security hole for a normal user account
being used to compromise the domain system security?  Is it
absolutely necessary to use an in-band method to distinguish
these records from ordinary user accounts?

If the answer to either of these is no, then it seems that
implying the '$', rather than permitting it directly, would be
best, to keep scripts working.

-- Terry

Re: pw_user.c change for samba

2002-11-27 Thread Terry Lambert

David W. Chapman Jr. wrote:
  If it's allowed, it should probably only be allowed in the
  user name (i.e. the patch is wrong; it should probably add
  another parameter to the allowable values of 'int gecos', and
  change it to 'int checktype' or similar).
 
 I don't have a problem with this, but the patch I sent in is the
 extent of my abilities to give me desired results(making pw like
 samba)

See attached patch.  It could still screw scripts (e.g. the perl
script version of adduser) by allowing the $ in the login
field, but at least it keeps it out of the login class and group
fields.

See below, though: I don't think '$' should be permitted.


  It seems to me that another alternative is that all these
  names end in '$'; therefore, when you are expecting one of
  these names, you could imply a '$', without needing to actually
  have it in the password file -- in other words, it's an
  attribute, not really part of the account name.
 
  Will this open up a security hole for a normal user account
  being used to compromise the domain system security?  Is it
  absolutely necessary to use an in-band method to distinguish
  these records from ordinary user accounts?
 
 I don't think the samba people would be willing to make this type of
 change just for FreeBSD since it works for most everyone else.  I
 also don't think there is currently a way to store attributes about
 machines/users permanently in samba.

I think you misunderstand.

The intent is to allow accounts without $ appended to be used
as machine logins.  Samba would see the '$', remove it, and check
normally.

The potential problem is that normal user accounts could be used
in place of machines.

The proper BSD way to avoid this hack would be to add a login
class samba_server (or whatever), and make Samba permit this
type of check only if the user was in the correct login class.
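
A minimal C sketch of that lookup, assuming the account data is
available as a simple table.  struct account, machine_lookup(), and
the class name passed in are all hypothetical; a real implementation
would go through the password database and its login class field.

```c
#include <stddef.h>
#include <string.h>

struct account {                /* hypothetical stand-in for a passwd entry */
    const char *name;
    const char *login_class;
};

/*
 * If request is "<name>$" and the account called <name> is in
 * machine_class, return that account; otherwise return NULL.
 */
static const struct account *
machine_lookup(const struct account *tab, size_t n, const char *request,
    const char *machine_class)
{
    size_t len = strlen(request);
    size_t i;

    if (len < 2 || request[len - 1] != '$')
        return (NULL);          /* not a machine-style request */
    len--;                      /* drop the implied '$' */
    for (i = 0; i < n; i++) {
        if (strlen(tab[i].name) == len &&
            strncmp(tab[i].name, request, len) == 0 &&
            strcmp(tab[i].login_class, machine_class) == 0)
            return (&tab[i]);
    }
    return (NULL);
}
```

An ordinary user account then cannot stand in for a machine: a request
for "alice$" fails unless the account "alice" was deliberately placed
in the machine class.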

-- Terry

Index: pw.h
===================================================================
RCS file: /cvs/src/usr.sbin/pw/pw.h,v
retrieving revision 1.13
diff -c -r1.13 pw.h
*** pw.h	5 Jul 2001 08:01:15 -0000	1.13
--- pw.h	27 Nov 2002 17:21:03 -0000
***************
*** 62,67 ****
--- 62,74 ----
  	W_NUM
  };
  
+ enum _checktype
+ {
+ 	PWC_DEFAULT,
+ 	PWC_GECOS,
+ 	PWC_LOGIN
+ };
+ 
  struct carg
  {
  	int		  ch;
***************
*** 105,111 ****
  
  int pw_user(struct userconf * cnf, int mode, struct cargs * _args);
  int pw_group(struct userconf * cnf, int mode, struct cargs * _args);
! char    *pw_checkname(u_char *name, int gecos);
  
  int addpwent(struct passwd * pwd);
  int delpwent(struct passwd * pwd);
--- 112,118 ----
  
  int pw_user(struct userconf * cnf, int mode, struct cargs * _args);
  int pw_group(struct userconf * cnf, int mode, struct cargs * _args);
! char    *pw_checkname(u_char *name, enum _checktype checktype);
  
  int addpwent(struct passwd * pwd);
  int delpwent(struct passwd * pwd);
Index: pw_user.c
===
RCS file: /cvs/src/usr.sbin/pw/pw_user.c,v
retrieving revision 1.51
diff -c -r1.51 pw_user.c
*** pw_user.c   24 Jun 2002 11:33:17 -  1.51
--- pw_user.c   27 Nov 2002 17:30:43 -
***
*** 231,237 
}
}
if ((arg = getarg(args, 'L')) != NULL)
!   cnf->default_class = pw_checkname((u_char *)arg->val, 0);
  
if ((arg = getarg(args, 'G')) != NULL && arg->val) {
int i = 0;
--- 231,237 
}
}
if ((arg = getarg(args, 'L')) != NULL)
!   cnf->default_class = pw_checkname((u_char *)arg->val, PWC_DEFAULT);
  
if ((arg = getarg(args, 'G')) != NULL && arg->val) {
int i = 0;
***
*** 293,299 
}
  
if ((a_name = getarg(args, 'n')) != NULL)
!   pwd = GETPWNAM(pw_checkname((u_char *)a_name->val, 0));
a_uid = getarg(args, 'u');
  
if (a_uid == NULL) {
--- 293,299 
}
  
if ((a_name = getarg(args, 'n')) != NULL)
!   pwd = GETPWNAM(pw_checkname((u_char *)a_name->val, PWC_LOGIN));
a_uid = getarg(args, 'u');
  
if (a_uid == NULL) {
***
*** 455,461 
if ((arg = getarg(args, 'l')) != NULL) {
if (strcmp(pwd->pw_name, "root") == 0)
errx(EX_DATAERR, "can't rename `root' account");
!   pwd->pw_name = pw_checkname((u_char *)arg->val, 0);
edited = 1;
}
  
--- 455,461 
if ((arg = getarg(args, 'l')) != NULL) {
if (strcmp(pwd->pw_name, "root") == 0)
errx(EX_DATAERR, "can't rename `root' account");
!   pwd->pw_name = pw_checkname((u_char *)arg->val, PWC_LOGIN);
edited = 1;
}
  
***
*** 595,601 
 * Shared add/edit code

Re: pw_user.c change for samba

2002-11-27 Thread Terry Lambert

Garance A Drosihn wrote:
 the machine account always ends with a $
 
 So it would only have to be for the account name
 
 I think I'd prefer a somewhat more involved change, one which
 allowed $ only for account-name, and only as the last character.
 That seems like a good idea to me.
 
 But then, I'm not volunteering to write it...   :-)

My change doesn't allow it only for the last, but it does restrict
it to the login name.

I notice that pw.h exports the code.  If someone is using the
function from outside, that's probably something that needs to
be considered.  I've changed the prototype, so that it will
at least complain on compilation, if someone is using the code
that way.

I think the $ on the end worked because of the dangling $
handling in the shell they happened to be using; the
original example names are still broken for some shells, with
no back-quoting.
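
For comparison, the stricter rule Garance describes ('$' only in the
login name, and only as the last character) could be sketched roughly
as below.  This is not the attached patch: name_ok(), the name_kind
enum, and the set of otherwise-permitted characters are assumptions
made for illustration.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical kinds of name being checked (the patch uses its own enum). */
enum name_kind { NAME_OTHER, NAME_LOGIN };

/*
 * Return 1 if `name` is acceptable, 0 otherwise.
 * Assumed character set: letters, digits, '_', '-', '.'.
 * A '$' is accepted only for NAME_LOGIN, only as the final character,
 * and never as the whole name.
 */
int
name_ok(const char *name, enum name_kind kind)
{
    size_t i;

    if (name[0] == '\0' || name[0] == '-')
        return 0;

    for (i = 0; name[i] != '\0'; i++) {
        unsigned char c = (unsigned char)name[i];

        if (isalnum(c) || c == '_' || c == '-' || c == '.')
            continue;
        /* Trailing '$' on a login name (machine account convention). */
        if (c == '$' && kind == NAME_LOGIN && i > 0 && name[i + 1] == '\0')
            continue;
        return 0;
    }
    return 1;
}
```

Note that this does nothing for the shell quoting problem: a name such
as ws1$ still has to be escaped on the command line.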

-- Terry


Re: pw_user.c change for samba

2002-11-27 Thread Terry Lambert

Oops.  Better patch attached (damn Makefile dependencies are
broken unless you manually build them via make depend).

-- Terry
Index: pw.h
===
RCS file: /cvs/src/usr.sbin/pw/pw.h,v
retrieving revision 1.13
diff -c -r1.13 pw.h
*** pw.h5 Jul 2001 08:01:15 -   1.13
--- pw.h27 Nov 2002 17:21:03 -
***
*** 62,67 
--- 62,74 
  W_NUM
  };
  
+ enum _checktype
+ {
+   PWC_DEFAULT,
+   PWC_GECOS,
+   PWC_LOGIN
+ };
+ 
  struct carg
  {
int   ch;
***
*** 105,111 
  
  int pw_user(struct userconf * cnf, int mode, struct cargs * _args);
  int pw_group(struct userconf * cnf, int mode, struct cargs * _args);
! char*pw_checkname(u_char *name, int gecos);
  
  int addpwent(struct passwd * pwd);
  int delpwent(struct passwd * pwd);
--- 112,118 
  
  int pw_user(struct userconf * cnf, int mode, struct cargs * _args);
  int pw_group(struct userconf * cnf, int mode, struct cargs * _args);
! char*pw_checkname(u_char *name, enum _checktype checktype);
  
  int addpwent(struct passwd * pwd);
  int delpwent(struct passwd * pwd);
Index: pw_group.c
===
RCS file: /cvs/src/usr.sbin/pw/pw_group.c,v
retrieving revision 1.13
diff -c -r1.13 pw_group.c
*** pw_group.c  22 Jun 2000 16:48:41 -  1.13
--- pw_group.c  27 Nov 2002 17:44:10 -
***
*** 135,141 
grp->gr_gid = (gid_t) atoi(a_gid->val);
  
if ((arg = getarg(args, 'l')) != NULL)
!   grp->gr_name = pw_checkname((u_char *)arg->val, 0);
} else {
if (a_name == NULL) /* Required */
errx(EX_DATAERR, "group name required");
--- 135,141 
grp->gr_gid = (gid_t) atoi(a_gid->val);
  
if ((arg = getarg(args, 'l')) != NULL)
!   grp->gr_name = pw_checkname((u_char *)arg->val, PWC_DEFAULT);
} else {
if (a_name == NULL) /* Required */
errx(EX_DATAERR, "group name required");
***
*** 145,151 
extendarray(&members, &grmembers, 200);
members[0] = NULL;
grp = &fakegroup;
!   grp->gr_name = pw_checkname((u_char *)a_name->val, 0);
grp->gr_passwd = "*";
grp->gr_gid = gr_gidpolicy(cnf, args);
grp->gr_mem = members;
--- 145,151 
extendarray(&members, &grmembers, 200);
members[0] = NULL;
grp = &fakegroup;
!   grp->gr_name = pw_checkname((u_char *)a_name->val, PWC_DEFAULT);
grp->gr_passwd = "*";
grp->gr_gid = gr_gidpolicy(cnf, args);
grp->gr_mem = members;
Index: pw_user.c
===
RCS file: /cvs/src/usr.sbin/pw/pw_user.c,v
retrieving revision 1.51
diff -c -r1.51 pw_user.c
*** pw_user.c   24 Jun 2002 11:33:17 -  1.51
--- pw_user.c   27 Nov 2002 17:30:43 -
***
*** 231,237 
}
}
if ((arg = getarg(args, 'L')) != NULL)
!   cnf->default_class = pw_checkname((u_char *)arg->val, 0);
  
if ((arg = getarg(args, 'G')) != NULL && arg->val) {
int i = 0;
--- 231,237 
}
}
if ((arg = getarg(args, 'L')) != NULL)
!   cnf->default_class = pw_checkname((u_char *)arg->val, PWC_DEFAULT);
  
if ((arg = getarg(args, 'G')) != NULL && arg->val) {
int i = 0;
***
*** 293,299 
}
  
if ((a_name = getarg(args, 'n')) != NULL)
!   pwd = GETPWNAM(pw_checkname((u_char *)a_name->val, 0));
a_uid = getarg(args, 'u');
  
if (a_uid == NULL) {
--- 293,299 
}
  
if ((a_name = getarg(args, 'n')) != NULL)
!   pwd = GETPWNAM(pw_checkname((u_char *)a_name->val, PWC_LOGIN));
a_uid = getarg(args, 'u');
  
if (a_uid == NULL) {
***
*** 455,461 
if ((arg = getarg(args, 'l')) != NULL) {
if (strcmp(pwd->pw_name, "root") == 0)
errx(EX_DATAERR, "can't rename `root' account");
!   pwd->pw_name = pw_checkname((u_char *)arg->val, 0);
edited = 1;
}
  
--- 455,461 
if ((arg = getarg(args, 'l')) != NULL) {
if (strcmp(pwd->pw_name, "root") == 0)
errx(EX_DATAERR, "can't rename `root' account");
!   pwd->pw_name = pw_checkname((u_char *)arg->val, PWC_LOGIN);
edited = 1;
}
  
***
*** 595,601 
 * Shared add/edit code
 */
if ((arg = getarg(args, 'c')) != NULL) {
!

Re: bonobo-activation core dump help

2002-11-27 Thread Terry Lambert

suken woo wrote:
 hi, all:
 setting the env with zh_CN.EUC ,and run X but got the following errors.
 pid 495 (bonobo-activation-s), uid 1001: exited on signal 11 (core dumped)

Compile the bonobo-activation-s binary with debugging symbols,
so that you can debug the core file and see where it's crashing,
and then correct the bonobo source code, so that it doesn't crash,
and once that's done, fix the locale file, which appears to be
out of date, to include the messages which are missing that the
code is not handling properly.

Basically, bonobo is crashing on bad data, when it should handle
it and not crash.

-- Terry


Re: pw_user.c change for samba

2002-11-27 Thread Terry Lambert

Juli Mallett wrote:
  The '$' is a pain.  None of the examples in the original post
  would have worked, because the '$' was not '\$', and the shell
  would have blown chunks over the variable expansion.
 
 Your foundation is flawed, we allow $ in passwd just fine, and
 the only problem here is whether a pw should let someone do
 something we support which they might need to do.

Apply the patch.

Then try to add a user with a trailing $ via adduser(1);  Note
the failure.

-- Terry


Re: pw_user.c change for samba

2002-11-27 Thread Terry Lambert

Giorgos Keramidas wrote:
 On 2002-11-27 12:55, Terry Lambert [EMAIL PROTECTED] wrote:
  Will this open up a security hole for a normal user account
  being used to compromise the domain system security?
 
 Probably 'yes'.  I haven't tried this, but I guess one could name his
 machine Administrator.  When that username is passed around, is it
 clear that it is a machine name and not a user name?  I guess that if
 this way someone just might trick a remote SMB server that his
 username is 'Administrator' by changing his local machine's name, we
 have a problem...

That's a namespace issue... they would still need a password.
I think that a login class would fix it.  That would mean that
you could not have a user and a machine with the same name,
but if you want to be technical, doing it the other way, I
can't have a user named Administrator$ and a machine named
Administrator, so either way, there's a namespace incursion.

-- Terry


Re: pw_user.c change for samba

2002-11-27 Thread Terry Lambert

NAKAJI Hiroyuki wrote:
David W. Chapman Jr. [EMAIL PROTECTED] wrote:
 
 David Wouldn't pw still have to be updated.  I haven't looked at adduser but I
 David thought it was a wrapper for pw?
 
 No.
 
 My /usr/sbin/adduser, updated on Nov/23/2002 21:58 JST, does not call
 pw command. It adds account to /etc/master.passwd and invokes
 'pwd_mkdb'.
 
 See 'sub new_users' function in /usr/sbin/adduser.

There are two adduser scripts.  One is perl, and one was written
to use pw and provide the same semantics, in a shell script, as
part of the perl purge that happened recently.

One of them pukes on the trailing $, and the other doesn't.

It's confusing, unless you caught that we were talking about
most recent -current.

-- Terry


Re: Problem pulling particular directory from CVS

2002-11-27 Thread Terry Lambert

Paul A. Scott wrote:
 setenv CVSROOT :pserver:[EMAIL PROTECTED]:/home/ncvs
 cvs login
 cvs co src/contrib
 
 When it gets to directory src/contrib/cvs, I get:
 
 cvs checkout: cannot open CVS/Entries for reading: No such file or directory
 cvs [checkout aborted]: cannot write CVS/Template file: No such file or
 directory
 
 Nothing hidden, totally forthright.

Except that's a different error than the one you said before.  8-).

This particular error usually occurs when you are doing this
as root, and have an overly-anal umask set.  To correct it, you
should delete the subtree from that point, and at an upper level,
type:

cvs update -d

The subdirectories that would have been included in the original
checkout will be brought in and created (-d), without you
needing to repeat the checkout.


  Probably you can get around the problem by updating your 'cvs',
 
 Running 'cvs -v' on FreeBSD 4.5:
 Concurrent Versions System (CVS) 1.10 `Halibut' (client/server)
 
 This version breaks on checkout of src/contrib/cvs
 
 Running 'cvs -v' on FreeBSD 4.7:
 Concurrent Versions System (CVS) 1.11.1p1-FreeBSD (client/server)
 
 This version works.
 
 Thanks. I'll update my cvs.

I still find it hard to believe you aren't using a particular tag;
the other procedure outlined above should work for you with the
old CVS against the error message you are getting now.

One possibility is that the source tree you are working in has a
sticky tag set?

In any case, if you have a workaround, you're probably more
interested in the fact it works than in why.  8-) 8-).

-- Terry


Re: ACPI problem with laptop?

2002-11-26 Thread Terry Lambert

John Angelmo wrote:
 Terry Lambert wrote:
  Is this a Dell Latitude?  They are known to have heat problems.
 
  There's also the possibility that the CPU is a desktop CPU in
  the laptop; people aren't supposed to do that, either, but it
  can crank up the heat.
 
 No it's a Evo N114 with an Athlon 4 in it, I think that this is a mobile CPU

It may be that Windows ensures that the computer runs cooler
by down-clocking it.

Have you applied the most recent ACPI patches, and turned on
debugging output (at least hw.acpi.verbose=1) to see if it
fixes the problem (and if it doesn't, at least report what's
going on)?

-- Terry


Re: ACPI problem with laptop?

2002-11-26 Thread Terry Lambert

Terry Lambert wrote:
 Have you applied the most recent ACPI patches, and turned on
 debugging output (at least hw.acpi.verbose=1) to see if it
 fixes the problem (and if it doesn't, at least report what's
 going on)?

It looks like the author of the ACPI code has already replied
to your post; apply the patch he suggests, and turn on the
debugging he suggests.  He knows far more about ACPI than the
rest of us.

-- Terry


[PATCH] Searching for users of netncp and nwfs to help debug 5.0 problems

2002-11-26 Thread Terry Lambert

Nate Lawson wrote:
  It's not so much that I volunteered as I said that I'd help with
  thread/proc issues..
  The trouble was that there are places where it used a proc in the old
  code, but in some cases it needs to be a proc, and in other cases it now
  needs to be a thread. But all they stored was the proc. Also, from
  my memories of the code you needed to understand the protocol to know
  which needed to be which, and I don't know that protocol.
 
  In addition whoever does it needs to remember that any structure that
  stores a thread pointer is probably in error, as threads
  are transient items and any stored thread pointer is probably a wild
  pointer within a few milliseconds of being stored. :-)
 
 I'll take a whack at it and send it out by tomorrow, working or not.

Don't bother.  8-).

The attached patch makes it compile, and takes a shot at doing the
right thing.

The thread stuff is problematic; it's useful only for a blocking
context.  The process stuff is there to identify the connection,
actually, which can mean huge latencies (hence the caching of
procp).

It helps to know that the protocol is exclusively request/response
per session, the current code handles only a single session per
process (not one per thread), and that lock requests are answered
both synchronously and asynchronously (request/response, then async
message on timeout or success).

-- Terry


smbfs_thr.diff.gz
Description: GNU Zip compressed data

Re: [PATCH] Searching for users of netncp and nwfs to help debug 5.0 problems

2002-11-26 Thread Terry Lambert

Terry Lambert wrote:
  I'll take a whack at it and send it out by tomorrow, working or not.
 
 Don't bother.  8-).
 
 The attached patch makes it compile, and takes a shot at doing the
 right thing.


Just a followup... select definitely won't work (IMO), but needs
someone who is threads-savvy with kernel locks to deal with it;
I cribbed lock flow from elsewhere, and it looks wrong to me.

So this is *definitely* 56K of diffs that *only* address compilation
completely, and not full function.

-- Terry


Re: [PATCH] Searching for users of netncp and nwfs to help debug 5.0 problems

2002-11-26 Thread Terry Lambert

First of all, the patch was just to get to the point of compilability;
other people said they would take it from there.  I don't
have a NetWare server to test against in my apartment.  I'd be just as
happy to _let_ the other people who wanted to take it from there do
so, now that I made it compile.

That said...

Julian Elischer wrote:
 some comments:
 firstly:
 
   ncp_conn_locklist(int flags, struct proc *p)
   {
 !   struct thread *td = TAILQ_FIRST(&p->p_threads); /* XXX */
 !
 !   return lockmgr(&listlock, flags | LK_CANRECURSE, 0, td);
   }
 
 can't you use unidiffs? :-)

Only if I want the code to be unreadable.  8-) 8-).  Can't you
apply the patch to a -current tree checked out for that purpose,
and get whatever flavor of diffs you want, e.g. cvs diff -u?
Unidiffs don't support fuzz, and unless you are committing the
thing, I'd rather not have to recreate it iteratively until it
passes someone's filter.  A context diff gives me that.


 ok
 there is a Macro to find the first thread in a process.
 FIRST_THREAD_IN_PROC(p)

I didn't see this.  It's definitely the way to go.

Any chance of that being documented any time soon, so that no
one else does the obvious thing, instead, like I did?


 I use it because it makes it easy to find the
 places that are DEFINITELY BROKEN. The marker for a KSE breakage is:
 XXXKSE, but places that use FIRST_THREAD_IN_PROC(p) are marked
 implicitly since nearly any time it is used, the code is trying to
 do something that doesn't make sense. (except in fork and
 maybe exit and exec, and in things like linux emulation
 where you KNOW there is only one thread).
 
 if you see TAILQ_FIRST(&p->p_threads) (or FIRST_THREAD_IN_PROC(p)) you
 can pretty much guarantee that what is happening there needs to be
 completely rewritten, possibly to the extent of rewriting its callers
 and the caller's callers. That is the problem I hit when trying to
 convert it to start with...

Actually, I really wanted a LAST_THREAD_IN_PROC(p).  The
reality is that I want the most recently inserted thread to use
as a blocking context, on the theory that it would not be used
until much later, and that all the pages it cares about are much
more likely to be in core, particularly on a process with tons
of threads.

The only reason it's being used at all is that the lockmgr()
class needs a blocking context for its calls, and it's an
explicit parameter (arguably, it should be curthread).

I can edit the patch (since it's not a unidiff 8-) 8-) 8-)),
or I can post a new one, if you want (it's only 9K, compressed).

But realize that all it means is give me a thread that I can
use as a blocking context while I'm waiting on lockmgr().


 Since I don't know the way that process IDs come into the session
 control (you have to understand the protocol for that) I basically hit
 a wall on trying to work out what to rewrite, and how.

The wire protocol is always request/response.  Always.  As I
stated before, the only exception is a lock, with/without a
timeout.  In that case, you get the synchronous response to
your synchronous request, which basically means "request has
been queued for servicing", and you later get a separate
notification.

The notification is by connection.  The connection is per process,
because we are talking about a connection where the credentials
are associated with the connection in question.  The connection
provides a state context (waiting for request vs. request
in progress).  Making additional requests over the same VC will
result in a "request in progress, go away you dork" response to
the client.

When an async response to a lock request comes in, it comes in on
a separate VC (each connection has 3 VCs, only 2 of which are
normally used: the one for request/response, and the one for
async notifications, which is overloaded to handle lock responses).
The connection is mapped back to a process to map it back to the
blocker.

What this basically means is that NCPs can't be handled as
multithreaded, without establishing a VC per thread.  It just
does not work.  Therefore NCP requests are going to serialize in
the kernel, no matter what you do in happy-thread-town.
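
To make the "one request in flight per connection" rule concrete,
here is a tiny userland sketch.  It is not the netncp code; the
ncp_conn_sketch structure and the function names are hypothetical, and
a real implementation would also need locking around the state change.

```c
#include <assert.h>

/* Hypothetical per-connection state: idle, or a request is outstanding. */
enum conn_state { CONN_IDLE, CONN_BUSY };

struct ncp_conn_sketch {
    enum conn_state state;
};

/*
 * Try to start a request on the connection.
 * Returns 0 if the request may proceed, or -1 if another request is
 * already in progress (the "go away" case described above).
 */
int
conn_begin_request(struct ncp_conn_sketch *c)
{
    if (c->state == CONN_BUSY)
        return -1;
    c->state = CONN_BUSY;
    return 0;
}

/* Mark the outstanding request as answered; the connection is idle again. */
void
conn_end_request(struct ncp_conn_sketch *c)
{
    assert(c->state == CONN_BUSY);
    c->state = CONN_IDLE;
}
```

Two threads sharing one such connection gain nothing: the second
caller has to wait or fail until the first response arrives.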

If you are asking for the code to be thread safe, then you are
basically talking about multithreading the whole stack:

NWFS -> NCP Client -> IPX -> IP

Probably it would be a better idea if TCP/IP was multithreaded
first?


 BTW the obnoxious FIRST_THREAD_IN_PROC will go away when
 we have got rid of most of the broken code and be replaced in places
 for which it is ok with p2td() which will be:
 
 __inline struct thread *
 p2td(struct proc *p)
 {
 KASSERT(((p->p_flag & P_KSES) == 0),
 ("Threaded program uses p2td()"));
 return (TAILQ_FIRST(&p->p_threads));
 }

Uh, how exactly is that less obnoxious, given it's the same code
with a different name and an obnoxious inline instead of a macro?
8-).


 You can always get from a thread to a single process but the reverse
 always presents

Re: [PATCH] Searching for users of netncp and nwfs to help debug 5.0 problems

2002-11-26 Thread Terry Lambert

Terry Lambert wrote:
 Did you want me to update the patch to use your FIRST_THREAD_IN_PROC
 macro and resend it?

OK; here it is, whether you wanted it or not.

-- Terry


smbfs_thr.diff.gz
Description: GNU Zip compressed data

Re: [PATCH] Searching for users of netncp and nwfs to help debug 5.0 problems

2002-11-26 Thread Terry Lambert

Julian Elischer wrote:
  The answer is that the code doesn't care what thread; it would
  prefer to not have to think in terms of threads at all, but if
  you want to force it to, then it's going to think in terms of
  blocking contexts for the benefit of FreeBSD code it calls,
  and nothing else.
 
 Hence the confusion as to whether to use a thread or a proc..

Not confusing at all.  The only issue is that the connection
structure caches the proc, so references through it use the first
thread on the cached proc; otherwise, the code uses the thread that
was passed in.


  Did you want me to update the patch to use your FIRST_THREAD_IN_PROC
  macro and resend it?
 
 you could but the fact that FIRST_THREAD_IN_PROC() is used indicates
 that the whole thing is broken anyway. Your edits are mostly mechanical
 and don't actually solve the problem. To do that you probably need
 to actually rewrite some of it I think.

They were _intended_ to be mechanical edits.  It fixes the problem
for the people who were willing to fix it, but didn't have any
idea of how to do the edits.

I can't really rewrite the code for you, without risking that
Novell would claim that I did it with knowledge of the NUC
implementation... you _do_ remember the last time Novell and BSD
had an issue over code, right, back in 1994, after they bought
USL?

It's probably better that the patch I've done get to the people
who volunteered to fix the code, once it could be compiled, and
that the people who volunteered to help them with the threads
issues do so.  I've done as much as I can without legal risk.

-- Terry


Re: ACPI and apm_saver?

2002-11-25 Thread Terry Lambert

John Baldwin wrote:
 On 22-Nov-2002 Terry Lambert wrote:
  Someone needs to write an acpi_saver.ko.
 
 No, they need to write a dpms_saver.ko instead. :)  acpi doesn't
 really have the same functionality as far as screen blanking IIRC.

You're right.

Is it just me, or is there a lot of month old mail stuck in a
queue somewhere?

-- Terry


Re: -current unusable after a crash

2002-11-25 Thread Terry Lambert

Kris Kennaway wrote:
 On Mon, Nov 25, 2002 at 10:24:46AM -0500, Robert Watson wrote:
   I thought, this might be due to the priority of the background fsck and
   have once left it alone for several hours -- with no effect. The usual
   fsck takes a few minutes.
 
 We really need to disable background fsck if the system panicked.
 I've seen far too much bizarre filesystem behaviour that went away the
 next time I did a full fsck.

I don't think this is really possible.

I went looking for a generic application-use CMOS area for this
sort of thing a while back, and I was unable to find one.

If you made system dumps mandatory (or marked swap with a non-dump
header in case of panic), this still would not handle the silent
reboot, double panic, or single panic with disk I/O trashed
cases.  8-(.

There was a discussion about these issues when background fsck
first went in.  My opinion of having it on by default is that if
you are going to play that loose, you might as well mount the FSs
async, and be done with it.

-- Terry


Re: -current unusable after a crash

2002-11-25 Thread Terry Lambert

Mikhail Teterin wrote:
 On Monday 25 November 2002 12:24 pm, Kris Kennaway wrote:
 = On Mon, Nov 25, 2002 at 10:24:46AM -0500, Robert Watson wrote:
 =
 =   I thought, this might be due to the priority of the background
 =   fsck and have once left it alone for several hours -- with no
 =   effect. The usual fsck takes a few minutes.
 =
 = We really need to disable background fsck if the system panicked.
 
 Otherwise, is there a need for fsck at all? Can sudden powerloss be
 reliably distinguished from a panic?

No, nor from hardware failures (disk/controller/other), without
NVRAM to save the crash reason in the case there is one.

-- Terry


Re: mbuf header bloat ?

2002-11-25 Thread Terry Lambert

Bosko Milekic wrote:
   This is not entirely true.  You can allocate an mbuf chain without
   holding Giant if the caches are well populated - and they should be
   in the common/general case.  You can in fact modify the allocator to
   just not do a kmem_malloc() if called with M_DONTWAIT, but I don't
   think you'd want to do this at this point.

In fact, one of the first changes I make in a kernel when I go
to do a networking product of any kind is to allocate the mbufs
in machdep.c out of physical RAM, and then pre-link them onto a
free-list, instead of using the standard (comparatively very
slow) mbuf allocator.
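
A minimal userland sketch of the pre-linked free list idea follows.
It is only an illustration: the pool size, buffer size, and the
pool_init()/pool_get()/pool_put() names are assumptions, the storage is
a static array rather than physical RAM carved out in machdep.c, and
there is no locking.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_BUFS 4      /* assumed pool size, tiny for illustration */
#define BUF_SIZE  256    /* assumed buffer size */

/* Hypothetical fixed-size buffer, linked through `next` while free. */
struct buf {
    struct buf   *next;
    unsigned char data[BUF_SIZE];
};

static struct buf  pool_storage[POOL_BUFS];  /* allocated once, up front */
static struct buf *pool_free;                /* head of the free list */

/* Link every buffer onto the free list; call once at startup. */
void
pool_init(void)
{
    size_t i;

    pool_free = NULL;
    for (i = 0; i < POOL_BUFS; i++) {
        pool_storage[i].next = pool_free;
        pool_free = &pool_storage[i];
    }
}

/* Pop a buffer in constant time; returns NULL when the pool is empty. */
struct buf *
pool_get(void)
{
    struct buf *b = pool_free;

    if (b != NULL) {
        pool_free = b->next;
        b->next = NULL;
    }
    return b;
}

/* Push a buffer back onto the free list in constant time. */
void
pool_put(struct buf *b)
{
    assert(b != NULL);
    b->next = pool_free;
    pool_free = b;
}
```

Allocation and release are each a couple of pointer operations, and
exhaustion is an explicit NULL return that the caller has to handle.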


   The gist of the argument boils down to the fact that network buffer
   allocations have different requirements than general all-purpose
   allocations (by design, the last time I checked), and that is why
   an mbuf/cluster allocator exists.

Everything allocated at interrupt has pretty much the same
requirements.  The only real difference in mbuf's is that the
allocation failure cases are generally better handled than all
other allocation failure cases within the kernel (or people
would not have been beating up Jeff about a month ago for the
kmem_map space issue).

-- Terry


Re: mbuf header bloat ?

2002-11-25 Thread Terry Lambert

Bosko Milekic wrote:
[ ... packet size distribution ... ]
   I am equally curious about this.  One of the design assumptions for
   mbufs and clusters, according to McKusick et al. (and I believe
   another text which currently escapes me) is that packets are typically
   either very small or fairly large.  Given the MAC label additions
   (yes it would be nice if this was done using the m_tag interface but
   at the very least one can say that they are implemented fairly
   'consistently' despite the fact that they appear imposing to the
   general mbuf structure), and the currently available data region in
   the mbuf, it is absolutely necessary to know whether the assumption of
   packet size distribution still holds before a decision is made on how
   to modify the MAC label implementation - if at all.

In fact, it is even more useful to consider the idea of variable
sized mbufs.  The actual size you want is whatever size is needed
for the incoming packets for the MTU of the sender.  Practically,
this means 8K (a compromise on the 9K jumbograms vs. page size),
1536 (512*3), etc..

I get concerned with all this decoration of mbufs (MAC vs. m_tag
vs. whatever) that people are doing, since this type of thing is
going to reduce overall capacity more than m_pullup(), etc..

-- Terry


Re: I'm impressed, but ...

2002-11-25 Thread Terry Lambert

Philip Paeps wrote:
 On 2002-11-25 14:41:22 (-0500), Hiten Pandya wrote:
  On Mon, Nov 25, 2002 at 01:49:34AM +0100, Philip Paeps wrote:
 | unknown: PNP0401 can't assign resources (port)
 | unknown: PNP0501 can't assign resources (port)
 
  Can you try changing the hardware tunable,
  hw.pci.allow_unsupported_io_range, to the value of 1 in your loader.conf.  I
  think this should do it.  You can then check this value after you booted by
  `sysctl hw.pci`.
 
 I'm afraid that doesn't cure the 'problem'.

I think Hiten responded based on the "can't assign resources"
messages, without reading all the way through; I sometimes do
knee-jerk responses to problem reports, as well.  The reason
his advice didn't help you suppress the messages is that the
failure is in port and IRQ assignments, not in memory window
assignments.

The problem is related to multiple claimants for the device: the
BIOS, vs. the OS.  If you change the BIOS settings for PnP OS,
the messages should go away.  Note that the messages are just
warnings; they will not make anything not work, given your
configuration.


The maildirs issue, I won't comment on, at this time.

-- Terry


Re: mbuf header bloat ?

2002-11-25 Thread Terry Lambert

Bosko Milekic wrote:
[ ... memory allocator ... ]

FWIW: The Sequent Dynix allocator paper has been converted, and
is now available online:

Experience With an Efficient Parallel Kernel Memory Allocator
Paul E. McKenney, Jack Slingwine, Phil Krueger
Sequent Computer Systems, Inc.
Software Practice And Experience
http://citeseer.nj.nec.com/484408.html

This is the same reference that is in the books "UNIX Internals:
The New Frontiers" and "UNIX Systems for Modern Architectures".

It's the reference I always give out, when locking in allocators
comes up, but now I have something other than a ratty photocopy
to give people.  8-).

-- Terry


Re: -current unusable after a crash

2002-11-25 Thread Terry Lambert

Kris Kennaway wrote:
 On Mon, Nov 25, 2002 at 02:02:14PM -0800, Terry Lambert wrote:
  I don't think this is really possible.
 
 Yeah :(
 
  If you made system dumps mandatory (or marked swap with a non-dump
  header in case of panic), this still would not handle the silent
  reboot, double panic, or single panic with disk I/O trashed
  cases.  8-(.
 
 And the panics that affect the disk/filesystem are likely to not give
 a crashdump, but at the same time are likely to cause FS problems for
 bgfsck :-(

Actually, the worst problems come when the corruption does not
result in a crash subsequently.

If you just crashed again, you could simply set in the superblock
a flag that said "background fsck in progress", and if that flag
was set at boot time, then do a full fsck (knowing you died during
a background fsck).
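
The boot-time decision that such a flag would allow can be sketched as
below.  The flag names and bit values are hypothetical (they are not
real FFS superblock fields), and the sketch only covers the "crashed
again during background fsck" case described above.

```c
/* Hypothetical superblock flag bits, for illustration only. */
#define FS_CLEAN          0x01u  /* filesystem was unmounted cleanly */
#define FS_BGFSCK_ACTIVE  0x02u  /* a background fsck was in progress */

enum fsck_mode { FSCK_NONE, FSCK_BACKGROUND, FSCK_FULL };

/*
 * Decide what to do at boot from the (hypothetical) superblock flags:
 *  - a background fsck was interrupted: fall back to a full fsck;
 *  - otherwise clean: nothing to do;
 *  - otherwise dirty: a background fsck is attempted.
 */
enum fsck_mode
choose_fsck(unsigned int flags)
{
    if (flags & FS_BGFSCK_ACTIVE)
        return FSCK_FULL;
    if (flags & FS_CLEAN)
        return FSCK_NONE;
    return FSCK_BACKGROUND;
}
```

The background fsck would set FS_BGFSCK_ACTIVE when it starts and clear
it when it completes.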

If you don't get a second crash, and you reboot, you're screwed.

You could add another utility to say "force full fsck" -- basically,
to set the flag manually.  This is a pain because you have to do it
through an fcntl() or ioctl(), since there are no block devices to
use to do the work, and you can't open a mounted device to write it,
even if you know what you are doing; the OS enforces this as if it
were smarter than you.

We ran into exactly this same problem in the InterJet, when we first
paid Kirk to have soft updates ported to FreeBSD (I actually did the
preliminary "make it compile" work, and Julian did most of the
debugging; I helped some after that, but my boss didn't like me
doing it).  The point was to get rid of the need for a UPS in the
InterJet.

A log structured FS doesn't actually have this problem, but is a
real pain because of the need for a cleaner to run constantly,
to garbage collect, which makes things that used to be deterministic
time take variable time.  Not very good for multimedia or streaming
content serving.

The InterJet handled this by having a DC holdup time following AC
failure notification, which was enough to throw a stick into the
spokes, to prevent the wheels from turning, and the bicycle falling
over the cliff.

Another way to handle it would be CMOS, with a BIOS initialization
(e.g. set bit 1 of the crash state) that didn't affect the bits
that indicated the failure mode.

Unfortunately, the computer manufacturers have not really agreed
on a standard for this sort of thing, nor do they think anyone in
OS space or userland should be able to own a section of CMOS
memory (no OS allocation policy, tagging, etc.).

-- Terry

Re: -current unusable after a crash

2002-11-25 Thread Terry Lambert

Brad Knowles wrote:
 At 2:02 PM -0800 2002/11/25, Terry Lambert wrote:
   If you made system dumps mandatory (or marked swap with a non-dump
   header in case of panic), this still would not handle the silent
   reboot, double panic, or single panic with disk I/O trashed
   cases.  8-(.
 
 How about we do the safe thing, and only do background fsck if we
 can prove that the system state is something where it would be
 suitable?  Or would that mean that we almost never do background fsck?

It would mean that you can *never* background fsck safely.

The problem is that you need to distinguish a power failure,
which is technically the only safe time to do it, from all
other failure modes.

You can distinguish, at least on R/W FS's, whether or not to
do any fsck (by looking at the clean bit), but all other bets
are off.

One approach that works well for desktop systems is to implement
a soft read-only.  We did this at Artisoft in 1995/1996, when
we ported the VFS stacking framework to Windows 95, and first
implemented soft updates for FFS/UFS, which we ported to run
on Windows 95 under the stacking framework.

The way a soft read-only works is to leave the FS mounted
read/write, and then insert at write attempts, everywhere that
read-only is checked, a check for a soft read-only bit on
the in-core superblock.

Basically, we flush out all writeable state to the FS, and then
set the clean bit in the superblock, and flush it to disk, if
I/O on the FS has been idle for a while.

Then, when someone wants to write to it, we clear the clean bit,
flush the superblock back out to disk, and, once we know that
the change has been committed to stable storage, we permit the
write operation to continue.
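
A minimal sketch of the soft read-only state machine described above,
assuming a toy in-core structure.  All of the names here are
hypothetical, and the "flush" is just a counter standing in for a
synchronous superblock write.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct soft_ro_fs {
    bool clean_on_disk;  /* clean bit as last flushed to disk */
    bool soft_readonly;  /* in-core only; the FS stays mounted R/W */
    int  sb_flushes;     /* counts superblock writes, for illustration */
};

/* Stand-in for a synchronous superblock write to stable storage. */
static void flush_superblock(struct soft_ro_fs *fs, bool clean)
{
    fs->clean_on_disk = clean;
    fs->sb_flushes++;
}

/* Called once I/O on the FS has been idle for a while. */
void soft_ro_idle(struct soft_ro_fs *fs)
{
    assert(fs != NULL);
    if (fs->soft_readonly)
        return;
    /* A real implementation flushes all writeable state first. */
    flush_superblock(fs, true);
    fs->soft_readonly = true;
}

/* Called everywhere read-only is checked, before a write proceeds. */
void soft_ro_before_write(struct soft_ro_fs *fs)
{
    assert(fs != NULL);
    if (!fs->soft_readonly)
        return;
    /* The dirty state must be on disk before the write is allowed. */
    flush_superblock(fs, false);
    fs->soft_readonly = false;
}
```

The cost is one extra superblock write on the first write after an
idle period, which is why this suits relatively quiescent systems.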


There are actually some problems that now exist in the sync code
in FreeBSD that result in unnecessary writes to the disk, these
days, which make it hard to implement this (the system basically
sync's disk buffers that don't need to be sync'ed, at intervals);
that would have to be fixed before such a system can be used.

The result is a box you can just turn off, without trashing the
FS, assuming it's relatively quiescent, relative to FS writes
(e.g. desktop systems, as I said at the top).

Similarly, if the system were to panic, lose power, whatever, at
this point, then the FS's would be clean, and come back up with
no need to fsck.

-- Terry

Re: -current unusable after a crash

2002-11-25 Thread Terry Lambert

Marcin Dalecki wrote:
  I don't think this is really possible.
 
  I went looking for a generic application use CMOS area for this
  sort of thing a while back, and I was unable to find one.
 
 Well you should please take a look at the fast boot option
 of moderately modern BIOS-es. Something along those lines went right now
 in to the linux kernel. Seems pretty adequate to me, since you would
 be even able to control it through the BIOS setup...

Is there documentation available for this anywhere?  The BIOS
vendor documentation, not the Linux source code.

My gut feeling is that this isn't going to be too helpful,
without AC failure notification with a DC holdup time.

The problem is that the best case is power failure, and the
worst case is a corrupted GDT and a double panic off a trap 12
in the trap 12 handler (such that you would get a trap 12 when
you tried to write back to the CMOS that this was the worst
case, not the best case).

Basically, you are still stuck needing power failure notification,
so you can write the cause of the failure back.

At startup, you have to set the saved state to worst possible
failure: no way to update cause of failure in CMOS, and then
back off to softer failure modes from there.
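
As a rough sketch of the "assume the worst, then back off" ordering
described above: the state values and the nvram structure below are
hypothetical stand-ins, since there is no agreed CMOS layout for
this.  The sketch also assumes that states are ordered by severity and
that a recorded cause may only move toward a softer state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical crash states, ordered from softest to worst. */
enum crash_state {
    CS_CLEAN_SHUTDOWN = 0,
    CS_POWER_FAIL     = 1,
    CS_PANIC          = 2,
    CS_UNKNOWN_WORST  = 3   /* cause could not be written back */
};

/* Stand-in for a byte of CMOS owned by the OS. */
struct nvram {
    enum crash_state state;
};

/* At boot: fetch what the last run left, then arm the worst case. */
enum crash_state boot_read_and_arm(struct nvram *nv)
{
    assert(nv != NULL);
    enum crash_state previous = nv->state;
    nv->state = CS_UNKNOWN_WORST;
    return previous;
}

/* On a notified failure: back off to a softer state, never a worse one. */
void record_cause(struct nvram *nv, enum crash_state cause)
{
    assert(nv != NULL);
    if (cause < nv->state)
        nv->state = cause;
}
```

If nothing manages to call record_cause() before the machine dies, the
next boot sees the worst-case value, which is the whole point.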

I think this "Fast boot" stuff is useful, but the way it's
useful is if your main memory is reflected to a separate area
of the disk, so that you can bring up the system image very
quickly.

Basically, it means that it's not at all useful for the problem
at hand, unless it provides for power fail notification.

-- Terry

Re: I'm impressed, but ...

2002-11-25 Thread Terry Lambert

Philip Paeps wrote:
  The maildirs issue, I won't comment on, at this time.
 
 I hope I can provide enough information for someone to solve it though :-)  It
 would be nice to be able to read my mail 'reliably' :-)

The problem is not the amount, but the type of information.

You need to characterize the problem well enough that you can
write a little program that can repeat it on someone else's
machine, without them having to create an installation identical
to yours on a scratch box ...particularly when it looks like if
they tried that, it would work for them.

Right now, there are other people using the same software that
can't repeat the problem.

Without knowing whether or not you are both/neither/one-or-the-other
using NFS, etc., it's really impossible to even point you in the
right direction (NFS is my hunch, in this case; it's a common reason
for use of maildirs, to try and side-step locking issues).

You probably need to get together with the other person who
said they were *not* having a problem, and do a detailed
compare on system configuration, if all other things are equal.

-- Terry

Re: -current unusable after a crash

2002-11-25 Thread Terry Lambert

Dan Nelson wrote:
  Is there documentation available for this anywhere?  The BIOS vendor
  documentation, not the Linux source code.
 
 http://www.microsoft.com/hwdev/resources/specs/simp_bios.asp
 http://www.microsoft.com/hwdev/resources/specs/simp_boot.asp
 
 is the best I could find; you'll need a Word doc viewer.  It's mainly
 geared toward detecting boot failure rather than abnormal shutdowns,
 though. What we need is a matching Simple Shutdown Flag variable.

The license you have to agree to to download it permits implementation
for firmware, but not for the OS: 1(a)(iii), 1(b), 2(b)(b), 3.

According to the documentation at the end of the page of the URL
you posted above, the OS has to be fully PnP compliant for it to
work as expected, and multiboot is not supported.

For those interested in pursuing this, more information (no license
agreement required) is available from:

http://www.microsoft.com/hwdev/platform/performance/fastboot/fastboot-winxp.asp

Though I expect you won't be able to implement it without the specification.

I guess looking at the Linux code as a reference is OK... they
violated the license, not you, so it's not the same thing as you
violating the license.

-- Terry

Re: Objective-C threads

2002-11-24 Thread Terry Lambert

[ ... Objective C ... ]

Chad David wrote:
 And I thought this thread was dead :).

It just showed up in the inbox last night; it must have been stuck
in your mail server.  Sorry about that.

 I don't really feel a need to convince.  If people are too busy (or
 just do not care) to maintain ObjC within FreeBSD, then I'll just have
 to do it locally.

That's kind of what I was implying would be the correct course
of action for a while.  8-).


  I have gotten literally hundreds of patches into FreeBSD by
  ignoring the FreeBSD process, and submitting the patches back
  to the vendor from which FreeBSD obtains the code, so this is
  a success strategy.
 
 Manipulation is a life strategy :).

Anything that works is better than anything that doesn't.  8-).

-- Terry

Re: DP2 Fatal Trap

2002-11-23 Thread Terry Lambert

Scott Sipe wrote:
 I didn't make a backup copy (or mark down the errors) of the bad file or try
 rebooting which in retrospect would have been a good idea..sorry--I just
 fixed the file and saved it so I could compile some ports--and that worked.

Just FWIW: if it was a transfer error, a backup copy would have
been made from cache.  As such, it would only be useful in seeing
if the data was zeroed.  It wouldn't help in the lost word during
burst transfer case, or the reboot to test for self-healing case.
If it happens again, then: (1) hd the file, and determine if the data
was lost, or is merely not displaying because it was zeroed, and (2)
reboot to see if it heals.  Both of these are very important tests.


 I have an IWILL KK266 motherboard which has a AMI MegaRaid controller and a
 VIA Apollo KT133A chipset.  The FreeBSD drive is primary master ad0 on the
 via ide line (both Current and Stable are on the same disk).  I have a dvd
 drive and a cdrw on the secondary channel.  Then 2 harddisks, one each on the
 RAID controller (I use the bios to alternate which drives are used for
 booting--the RAID or the IDE)

[ ... other information ... ]

OK.  None of this resembles hardware for which bugs have been
reported to the -current or -hackers mailing lists.  This is
an affirmation (but not positive confirmation) that the problem
is not in the disk controller, disk, or FreeBSD driver.  The
fact that you've not had strange problems with -stable indicates
for certain that it's not a disk or controller problem.  That
leaves "other bug" (which is what I thought in the first place)
or "driver bug".

I don't think it's a driver bug, but I can't prove it isn't.


 1) Yes it happened with a generic kernel straight off the DP2 install CD.

OK.  No recompiles, fancy driver load directives, etc..  If
John Baldwin wants to try and repeat your problem, he *may*
need a copy of your rc.conf.  DO NOT SEND THIS UNLESS REQUESTED
TO DO SO.


 2) I had the problems directly off DP2 iso image burned cd install, so can
 that tell you what you need to know about the cvs date or do you want me to
 do more?

OK; what this means, because there was no tag laid down, and
there was not a published checkout datestamp that can be used
to duplicate a -current system (according to John, it's a
checkout of -CURRENT, hacked to change the name to DP2 for
the build), is that I will have to build a known kernel locally,
so that I have a source tree that duplicates the failure for you.

Do you have it booting DP2 enough to replace the kernel, or is
it fully reverted to -STABLE at this point?  It would be very
hard for me to build a full release CDROM ISO image and transfer
it, without sending it through the mail.

 3) Yes, I'm at college on a fast connection (though with a limited upload) so
 if you need to I can setup an ftp login for you on my computer.

If you can live with just kernel replacements, then if you can
set this up, I can give you a kernel, which we will then hope
*fails*, as soon as tomorrow, or whenever is convenient,
and, after you verify that it does, indeed, fail, then I can do
the fix and give you another kernel 2-3 days after that (depending
on the porting required, since it involves assembly language).

Let me know.

-- Terry

Re: Lots of swapping from 'kldload acpi'

2002-11-23 Thread Terry Lambert

Bruce Evans wrote:
 The existence of SYSINIT is a bug, but shouldn't the order of SYSINITs be
 such that they are run before MOD_LOAD events?
 
 SYSINITs have no way to communicate failure, so they are especially broken
 when they are used to allocate resources.  Users of the resources have no
 way to tell whether the SYSINITs worked except to check for the existence
 of the resources before using them, but if they can do that properly
 then they can just call the resource allocation functions.

The SYSINITs in the module load case can communicate failure,
since it's the module load routine that handles them.

The real answer here is that SYSINIT is intended for system
initialization time operations.  If the system fails to
initialize properly, there's really no way to communicate
failure (for example, a failure prior to the console being
up is very hard to report on a still-born console that has
not been set up or initialized to do reporting).  The function
is void, because it makes no sense to trap an error condition
you can't handle (Dennis Ritchie said this).

It's true that the functions are void; however, in the context
of the driver, they call functions which are locally defined,
and therefore could update a static errno-type value, very
easily, to report their status.  Alternately, they could call
a callback function, which was defined only in the module load
case.
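
Something along these lines, as a user-space sketch only: it does not
use the real SYSINIT() macro or the kernel module event handler, and
every mydrv_* name is made up for illustration.  The init routine
stays void, and reports through a static errno-type value and an
optional callback.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Static errno-type status, updated by the void init routine. */
static int mydrv_init_error;

/* Optional callback; would be set only in the module load case. */
static void (*mydrv_init_cb)(int err);

/* Example callback that just remembers what it was told. */
static int mydrv_cb_seen = -1;
void mydrv_record_cb(int err)
{
    mydrv_cb_seen = err;
}

/* Locally defined allocation routine; a non-NULL arg fakes a failure. */
static int mydrv_alloc_resources(const void *arg)
{
    return (arg != NULL) ? ENOMEM : 0;
}

/* Same shape as a SYSINIT routine: takes a void *, returns nothing. */
void mydrv_sysinit(void *arg)
{
    mydrv_init_error = mydrv_alloc_resources(arg);
    if (mydrv_init_cb != NULL)
        mydrv_init_cb(mydrv_init_error);
}

/* Stand-in for the module load path, which *can* return an error. */
int mydrv_modload(void *arg)
{
    mydrv_sysinit(arg);
    assert(mydrv_init_error >= 0);  /* errno values are non-negative */
    return mydrv_init_error;
}
```

At boot time nobody looks at the status, which matches the "can't
handle it anyway" argument; at module load time the loader can.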

I can provide sample code, if necessary.

-- Terry

Re: Autodefaults in disklabel on 5.0dp2 install

2002-11-23 Thread Terry Lambert

Garance A Drosihn wrote:
 This is something I noticed while installing 5.0-dp2.  I'm not sure
 how much we'd want to change it.
 
 I'm installing dp2 on a 4-gig disk.  I want to split that in two,
 with dos for the first 2 gig and freebsd in the last 2 gig.  When
 I got to the disklabel step, I tried the Auto Defaults option to
 split up the freebsd partition.  It picked partition sizes of:
 
  128 meg   - /
 1231 meg   - swap space
  208 meg   - /var
  208 meg   - /tmp
   83 meg   - /usr
 
 This is a machine with 768 meg of memory, but I think the install
 is more likely to work with less swap space and something more
 than 83 meg for /usr.  I know it's tricky to come up with an
 algorithm which will pick decent sizes for every combination of
 disk and memory sizes, but perhaps it should wire in some kind of
 minimum size for /usr.  Also, maybe something to the effect that
 neither /var nor /tmp should end up larger than /usr.
 
 I have not looked at the source, so maybe it's just a simple case
 of the swap calculation being done based on the size of the hard
 drive instead of the size of the freebsd partition.

The default swap size calculation is done on the basis of a multiple
of the physical memory size.  Specifically, the physical memory may
be completely consumed by kernel structures, up to the KVA size, and
therefore in the case of a system dump, it can require up to the
size of the physical memory, plus 64K, at a bare minimum for a
successful system dump.

So even if you were to reduce the swap size, you should not reduce
it below 768M + 64K.
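
For the record, the arithmetic is trivial; here is a sketch, with a
made-up function name, that takes the 64K figure above at face value.

```c
#include <assert.h>
#include <stdint.h>

#define KIB UINT64_C(1024)
#define MIB (UINT64_C(1024) * KIB)

/* Bare minimum swap for a system dump: physical memory plus 64K. */
uint64_t min_dump_swap_bytes(uint64_t physmem_bytes)
{
    assert(physmem_bytes <= UINT64_MAX - 64 * KIB);
    return physmem_bytes + 64 * KIB;
}
```

For 768M of RAM this comes to 805371904 bytes.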

You will, of course, agree that a prerelease named DP2 should
have the ability to successfully system dump, as that is one of
the primary reasons it's being handed out: to catch problems, and
to provide detailed bug reports about them, sufficient to correct
them before the official release.

You should be able to take space away from swap, even after the
Auto, and give it to /usr.

If you choose to give less swap than is necessary for a system
dump, expect no help with problems with the system which may
arise during your testing: not because no one wants to help,
but because you aren't going to be able to provide sufficient
information to enable them to help.

If you can accept that limitation (i.e. you are trying DP2 to
find problems in user space software which needs to run on it,
ONLY), then fine; otherwise, you might want to consider using
a bigger disk and/or removing some RAM.

-- Terry

Re: DP2 Fatal Trap

2002-11-23 Thread Terry Lambert

Vallo Kallaste wrote:
 You got me wrong. I'm user and do not know and don't want to know
 about any CPU architecture and bugs. But I've got problems and
 simply trying to provide any data possible to gather by myself.
 Either CPU hardware or software bug, fine. You're claiming to know
 the bug and possible fix, but don't want to publish it, fine.

I do not object to publication of code that embodies a workaround
to the problem, so long as that workaround doesn't specifically
disclose the root cause problem itself.


 I don't want to think about it because with my knowledge this is going
 to nowhere and only wasting my time. Things you see above are my
 results using consistent testing environment, take it or leave it.
 I'll stick with DISABLE_PSE enabled and DISABLE_PG_G disabled for
 the time being.

I'll make the same offer of a fixed kernel binary, for testing
purposes, if you are willing to test two: one to be sure that
there is no serendipity involved, and one with the patch.  We
can skip the first one if you can give me a CVS date or tag to
checkout to get code identical to code you have locally, which
has the problem.  E.g. if you have a local copy of the CVS tree,
and you check out with a date tag of, say, last Wednesday, and
the kernel you build from that code has the problem, then I can check
out identical code, patch it, and give you a binary to try.

-- Terry

Re: Autodefaults in disklabel on 5.0dp2 install

2002-11-23 Thread Terry Lambert

Garance A Drosihn wrote:
 Hmm.  I hadn't really thought much about the specifics of what
 is needed.  I was just wondering if we might want to think about
 the auto size algorithm a bit more.

The value, by default, is 2 * `sysctl hw.physmem`.

This is kind of ugly, because it doesn't include the space for the
kernel itself.  There isn't a sysctl variable with the actual size
of the real physical RAM, without the kernel and page table space
subtracted out of it, unfortunately (I tried to get a patch in on
this a year ago, last June).


 So even if you were to reduce the swap size, you should not reduce
 it below 768M + 64K.
 
 Well, I can see that the auto default partition mechanism should
 probably take that into account too.  I'm just saying that the
 current algorithm gives the user (any generic user, not me specifically)
 a useless result.  It would be nice if it came up with more usable sizes.

The 2 * physical memory *is* a useful size, for most cases.  The
base assumption here is that you installed that much memory for a
reason, and you intend to use it.  That yields a proper swap size.

It also permits you to (almost) double the amount of physical RAM
installed on the machine, and still successfully crash dump, so it
gives you room to upgrade your hardware.
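
A small sketch of that headroom calculation, assuming the 64K dump
overhead quoted earlier in the thread.  The function names are
invented, and the real sysinstall code may clamp the result to the
space actually available.

```c
#include <assert.h>
#include <stdint.h>

#define KIB UINT64_C(1024)
#define MIB (UINT64_C(1024) * KIB)

/* Default swap size: twice the reported physical memory. */
uint64_t default_swap_bytes(uint64_t physmem_bytes)
{
    assert(physmem_bytes <= UINT64_MAX / 2);
    return 2 * physmem_bytes;
}

/* Largest RAM size that could still be dumped into a given swap. */
uint64_t max_dumpable_ram_bytes(uint64_t swap_bytes)
{
    return (swap_bytes > 64 * KIB) ? swap_bytes - 64 * KIB : 0;
}
```

With 768M installed, the default is 1536M of swap, and the largest
dumpable RAM size is 1536M less 64K, hence "(almost) double".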


 You will, of course, agree that a prerelease named DP2 should
 have the ability to successfully system dump, as that is one of
 the primary reasons it's being handed out: to catch problems, and
 to provide detailed bug reports about them, sufficient to correct
 them before the official release.
 
 Well, I mentioned this now with an eye towards 5.0-release, although
 I realize my original message didn't indicate that.  This same issue
 comes up when installing recent releases from 4-stable.  My point is
 that the resulting partition sizes (in this case) are unusable.  There
 is no point in worrying about the ability to save a system dump on a
 system where the initial install has pretty close to zero chance of
 succeeding.

It's very difficult to get disks that small these days, without
partitioning for multiboot, or going to a back room somewhere, and
blowing dust off a box.  8-).  I think the most common place this
would happen is someone trying out FreeBSD. That makes it an
important problem for -RELEASE, I think.

Not knowing what people intend to install up front, it's hard to
know how much space is necessary for various files.

I think someone needs to come up with minimum ratios and "hog"
partitions -- like swap -- that can be stolen from in order
to get a running system (it should also spit out an alarming message
with just an "OK" button (to steal from Windows 8-)) about not being
able to support crash dumps).
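
One possible shape for this, purely as a sketch: the structure, the
function name, and the minimum sizes are all hypothetical, and this is
not how label.c currently works.  Partitions below their minimum take
space from "hog" partitions, which never shrink below their own
minimum.

```c
#include <assert.h>
#include <stddef.h>

struct part {
    const char *name;
    unsigned    size_mb;  /* current size */
    unsigned    min_mb;   /* hypothetical minimum for a usable system */
    int         is_hog;   /* nonzero if space may be stolen from it */
};

/* Returns 0 if every partition reached its minimum, -1 otherwise. */
int steal_from_hogs(struct part *p, size_t n)
{
    assert(p != NULL);
    for (size_t i = 0; i < n; i++) {
        if (p[i].size_mb >= p[i].min_mb)
            continue;
        unsigned need = p[i].min_mb - p[i].size_mb;
        for (size_t h = 0; h < n && need > 0; h++) {
            if (h == i || !p[h].is_hog || p[h].size_mb <= p[h].min_mb)
                continue;
            unsigned spare = p[h].size_mb - p[h].min_mb;
            unsigned take = (spare < need) ? spare : need;
            p[h].size_mb -= take;
            p[i].size_mb += take;
            need -= take;
        }
        if (need > 0)
            return -1;  /* not enough to steal; caller must complain */
    }
    return 0;
}
```

Run against the sizes reported at the start of this thread, with an
assumed 512M minimum for /usr, this would move 429M from swap to /usr.
The caller would then compare what is left of swap against the dump
requirement, and put up the alarming message if it falls short.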

I don't think anyone would object to patches; the code you want
to hack will be in /usr/src/usr.sbin/sysinstall/label.c.


 Let me repeat that I'm not sure how much we'd want to change it.  I
 just wanted to point out how the current algorithm behaves when given
 this particular combination of disk and memory sizes.

Spitting out "Insufficient resources" is a sure way to scare
someone off.  8-(.

-- Terry

Re: fsck's, current vs earlier releases

2002-11-23 Thread Terry Lambert

Garance A Drosihn wrote:
 I have 4.6.2-release, 4.7-release, and 5.0-dp2-release on a single PC.
 After some bouncing between versions, and an occasional 'disklabel'
 command, I seem to have the partitions for 4.6.2 in an odd state.
 Both 4.7 and 5.0-dp2 have no problem mounting them, but if I try to
 boot up the 4.6.2 system it fails because 4.6.2 finds that values
 in super block disagree with those in first alternate.  4.6.2 wants
 me to 'fsck' the partitions manually, but I *think* I remember that
 using the older fsck might cause trouble.

Yes.

You need to recompile a -STABLE fsck on the older version of
the OS, so that it can do the right thing about the "don't care"
areas of the superblock.

A generic fix would grow "don't care" regions down, and "care"
regions up, with a boundary offset (which started in the top
25% for forward compatibility, and grew down), above which you
didn't complain about differences.
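
As a sketch of what such a comparison might have looked like, treating
the superblock as a flat byte array: the function and parameter names
are invented, and the real fsck compares individual fields rather than
raw bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Compare a primary superblock against an alternate, but only below
 * the boundary offset.  Bytes at or above care_limit are "don't care".
 */
bool superblocks_agree(const unsigned char *primary,
                       const unsigned char *alternate,
                       size_t sb_size, size_t care_limit)
{
    assert(primary != NULL && alternate != NULL);
    assert(care_limit <= sb_size);
    return memcmp(primary, alternate, care_limit) == 0;
}
```

An older fsck built this way would keep working as newer releases
claimed fields from the top of the "don't care" region downward, so
long as the boundary it was built with stayed at or below the lowest
such field.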

It's too late to implement that, now that people have hacked
up the superblock, and there is an existing installed base with
it hacked the wrong way for automatic binary compatibility.

Right now, all you can do is compile a new version of fsck for
your old version of the OS, to make it ignore differences in
those areas.

-- Terry

Re: ACPI problem

2002-11-23 Thread Terry Lambert

Ertan Kucukoglu wrote:
 I want to use the power key to shutdown the system. It is
 compaq evo 300, P4 1.6ghz, 368MB RAM
 
 Yesterday OS was 5.0DP2. I can not power it off. It comes
 to a point when it should cut the power off 'System is
 shutting down using ACPI' like message is displayed and
 after a while. It just panics at free().
 
 This morning I cvsuped and buildworld the machine. This
 time it do not panic, but 'Timeout' error message comes and
 system reset itself leading a new boot.' Is this problem
 because of my hardware or something else?

You will most likely need to dump your ACPI table and file
a bug report, assigning it to the ACPI code owner, who will
then tell you what is wrong with your BIOS, and what you need to
do to fix it (or give you a patch to the ACPI code in FreeBSD,
to make it tolerate your BIOS).

The first step will be a bug report.

-- Terry

Re: gcc 3.2.1 release import?

2002-11-23 Thread Terry Lambert

Vallo Kallaste wrote:
 On Fri, Nov 22, 2002 at 10:03:50AM -0800, David O'Brien
  I would like to see GCC 3.2.1 release be our 5.0-R compiler.  However,
  the GCC 3.2.1 release date kept slipping and in fact was nebulous for a
  while.  The same for our 5.0-R.  So this has made it hard to decide what
  to do.  I suspect GCC 3.2.1-R wouldn't cause us much or any problems.
  But the question is does the project as a whole have the resources to
  deal with any problems that do creep up?  It is a hard judgement call.
 
 Somebody with knowledge and time should generate patches, so it's
 possible to at least test and report problems (or success). Given
 that enough people give it a try and report, there's possibility for
 import, IMHO.

But who will bell the cat?  I vote for Snuffles.

-- Terry

Re: gcc 3.2.1 release import?

2002-11-23 Thread Terry Lambert

Vallo Kallaste wrote:
 On Sat, Nov 23, 2002 at 02:52:27PM -0800, Terry Lambert
 [EMAIL PROTECTED] wrote:
   Somebody with knowledge and time should generate patches, so it's
   possible to at least test and report problems (or success). Given
   that enough people give it a try and report, there's possibility for
   import, IMHO.
 
  But who will bell the cat?  I vote for Snuffles.
 
 Don't understand. Some inside joke or something based on US centric
 TV? What are you trying to tell me? Remember I'm not native.

1930's/1940's cartoon character, Snuffles, the mouse.  Was in
a lot of cartoons.  Played the "Why?" game that children play,
to the annoyance of his chosen victim, and the amusement of the
people.

One story was a script based on the Aesop's Fable about belling
the cat.  Here is a short reprise:

1)  All of the mice decided that Something Needs To Be Done
About The Cat(tm)
2)  They had a big meeting
3)  Finally, one mouse, who no one listened to very often,
suggests that they put a bell around its neck, so they
will be able to tell when it's coming, and escape, to
live in peace, in their mousey ways
4)  No one wants to bell the cat; it's a perfect idea, which
lacks for implementation, and there are no potential
implementors to take the risk on behalf of the group

In the cartoon version, at this point, the mouse who made the
suggestion is volunteered by his comrades for the deadly duty
(Snuffles).

-- Terry

Re: gcc 3.2.1 release import?

2002-11-22 Thread Terry Lambert

Steve Kargl wrote:
  Don't worry about it; it's being totally blown out of proportion;
  there's no way anyone will commit to importing a 2 day old 3.2.1,
  which is why I put the smileys there.
 
 Well, the 2-day old 3.2.1 fixes numerous problems
 with our 3.2.1 [FreeBSD] 20021009 (prerelease).
 
 Compiling this
 
 void ice(int m, int n, double *f) {
 int i, j;
 for (j = 0; j < n; j++) {
  for (i = 1; i < m; i++) {
  f[i] = (double) (i * j);
  f[i + j] = (double) ((i + 1) * j);
  }
  }
  }
 
 with gcc -O2 -c yields an ICE in FreeBSD-current.
 The 2-day old gcc 3.2.1 does not blow chunks on the
 above code.

What does it do for all the other code in -ports, and in the
comp.source.* archives, and that anyone else has ever written,
such that you know it doesn't cause more problems than it
solves?

Supposedly, bringing in 3.2 was going to solve more problems
than it caused.  It turns out the 4.x compiler, GCC 2.95.3,
also does not have an ICE as a result of compiling that code.

What is food to one, is to others bitter poison.
-- Titus Lucretius Carus

When you are updating tools, it's actually about risk/reward;
the risk of not supporting IA64, and the risk of the object
file compatibility has (supposedly) been addressed.

The only other reasonable path would be to tie FreeBSD releases
to GCC releases, plus some period of time for burn-in, and that
really isn't reasonable: 3.3 was supposed to be out already;
should FreeBSD's release schedule slip every time GCC's slips?

-- Terry

Re: Searching for users of netncp and nwfs to help debug 5.0 problems

2002-11-22 Thread Terry Lambert

Robert Watson wrote:
 It sounds like there are a couple of problems here
 -- that we need a debugging guide ("How to prepare a useful bug
 report for a kernel panic", "How to prepare a useful bug report
 for a sysinstall failure", etc)

A bug-filing wizard would be useful.  The send-pr system
doesn't cut it, and most people are unaware of how to file a
decent bug report.  It doesn't help when the process involves
another computer, a serial cable, recompiling a kernel to use
a serial console and turn DDB support on, special configuration
for system dump images, and changing the size of your swap
partition to support the amount of RAM you have put into the
machine.

The bug-filing process has to be self-contained, and would be
best served by a literal transcription of the message that comes
up as a result of the problem.

The number one thing that can be done, though, is completely and
totally within FreeBSD's control: unique error reports for each
possible error.  This is more or less a design issue.


 -- that
 we need a better way to find developers on a particular topic who are
 willing to pick up more debugging burden.

Most developers do not like to clean up after the messes they
make; this isn't unique to FreeBSD, but FreeBSD does seem to
have a larger number of prima donnas than other projects.  There
has also been a lot of kingdom-building; John Dyson, who I
still greatly admire, used to squat on six or seven distinct areas
of the kernel, but only had time to work on one or two of them,
even when Oracle was paying him to work on FreeBSD full time for
their Network Computer product.  There were warm bodies willing
to work on several of the areas he felt he owned.  Simultaneous
progress was possible in all of these areas at the same time...
IFF the people had been allowed to work on the code, rather than
being told "No, I know the answer there, and I will fix it soon."

The time of "soon" never came, and the effort proceeded serially,
and less progress was made.  Respectfully, I submit that the same
thing happens on a daily basis, and that John Dyson was not the
only one who was trying to juggle too many balls at the same time,
though I will not name names: you all know who you are, and most
of us in the peanut gallery know, too.


 I would have guessed that, in general, problems with finding a
 responsible party developer would lie more in the areas of the
 system that don't have an active maintainer (vis owner), which
 is a harder problem to address.  If that's not a correct impression,
 then it's something that's probably easier to fix :-).

I think, in general, that FreeBSD attracts just as many developers
as Linux, or any other project, but fails to let them in.

One approach might be to decide rough ownership of areas of the
system: if people are going to act like they own the areas, then
make them explicitly responsible.  Do this at a sufficiently high
granularity, and you'll see that certain individuals own perhaps
dozens of areas.

Then add a field to the PR: area owner.  Go through each of the
PR's, and add this field.

Don't let an owner do any additional work until they've closed that
PR, one way or another, to the satisfaction of the submitter (this is
to ensure that "screw you" is not a satisfactory resolution).  Put a
time limit on it.

If the PR contains a patch, and the owner does nothing in the
allotted time, then give the PR submitter a commit bit, and give
ownership of the area over to them.

At the very least, PR's will be closed, and more people actively
writing code will end up with commit bits.

-- Terry

Re: No entries in /proc :: feature or problem ??

2002-11-22 Thread Terry Lambert

Robert Watson wrote:
 (2) truss currently relies on procfs, albeit not working very well.  There
 were a set of patches floating around to make truss use ptrace(),
 which is the direction we probably do want to take this.  If someone
 could finish up that work, it would be great.
 
 The reasons to deprecate procfs are many-fold -- not least that there are
 existing interfaces in the kernel that provide most or all of its features
 at a substantially lower risk.  You just have to see the kernel-related
 security advisories for FreeBSD, Linux, Solaris, etc, over the last five
 years to understand why we want to turn it off if we can.  :-)

It would be nice if a condition of turning it off were a working
truss.  A priori.

-- Terry


Re: ACPI and apm_saver?

2002-11-22 Thread Terry Lambert

Donn Miller wrote:
 ACPI works pretty well on my HP Pavilion Notebook.  But is it possible
 to get the apm saver (apm_saver.ko) to work with ACPI? From what I
 understand, acpi is a superset of apm, and is able to provide some
 emulation of apm functionality.  So, by this principle, shouldn't
 apm_saver work with acpi?

APM and ACPI are mutually exclusive.

A similar question to yours might be:

"I had a Toyota Corolla, and I've traded it in on a Mack
 Semi Tractor; can I use the floor mats from my Corolla
 in the new Semi?"

8-).

Someone needs to write an acpi_saver.ko.

-- Terry


Re: gcc 3.2.1 release import?

2002-11-22 Thread Terry Lambert

Steve Kargl wrote:
  Supposedly, bringing in 3.2 was going to solve more problems
  than it caused.  It turns out the 4.x compiler, GCC 2.95.3,
  also does not have an ICE as a result of compiling that code.
 
 You know the reason why 3.2 pre-release was brought into
 the tree, right?  GCC has changed the C++ ABI between
 3.1.1 and 3.2.  If FreeBSD 5.0 shipped with 2.95.3, then
 5.x would use 2.95.3 until 6.0 was released.  Try getting
 support from the GCC folks for 2.95.3.

I'm well aware of that.  I was merely pointing out that all
compiler versions have different bugs, and you might as well
suggest a known quantity instead of an unknown one, if your
sole goal in life is to avoid a particular internal compiler
error, instead of looking at all the code involved.


 I respect David's judgement about bringing 3.2.1 into the
 tree, but your statement above (totally blown out...)
 suggests you don't follow GCC development.  Several
 significant bugs were fixed between our pre-release version
 and 3.2.1.

I *understand* that they fixed several bugs that are present
in the pre-release, and they *hope* they didn't introduce any
new ones.  Given their track record in this regard (e.g. the
internal compiler error in 3.2.1-prerelease that wasn't there
in 2.95.3), I have little faith in their hope.

Unless someone is willing to stand up as a shield to personally
take the slings and arrows from any new compiler bugs, which
*might* range up to and including delaying the 5.0-RELEASE as
a result of it, after import and bmake, not compiling some
things that worked with 3.2.1-prerelease, it can wait until
after the 5.0-RELEASE.

As you yourself pointed out: the C++ ABI change is in already,
so it's no longer the substantial risk it used to be.  Unless
there's another ABI change (which the advocates of importing
the prerelease assured us there would not be), then the only
thing that not updating breaks is the example code that was
posted, and I think we can all live with that until at least
the day after the 5.0-RELEASE.  8-).

-- Terry


Re: DP2 Fatal Trap

2002-11-22 Thread Terry Lambert

John Baldwin wrote:
   Is it any help to know that my problems on P4 stopped after enabling
   DISABLE_PSE? Initially I had both of these enabled, but seems that
   one is enough. Just FYI.
 
  If we can verify that DISABLE_PG_G has no effect then that would be
  nice.
 
  It has an effect: writing CR3 or a TSS resulting in a changed CR3
  will not invalidate TLB entries with the G flag set, if PGE is set
  in CR4.
 
 I know what PG_G does, Terry.  My question is that if DISABLE_PG_G has
 no effect on the _problems_ people are having.

It can have an effect, if the problem is being exhibited on a P3
or an AMD processor, but not on a P4, unless it has 512M of memory;
the jury is out on other memory sizes, after Matt Dillon's dynamic
sizing changes (my own suggestion in this area was to conservatively
not go overboard in allocating a multiplier of physical memory for
page mappings, when doing so would push the space the mappings could
cover well over the available physical address space, if you'll
remember).

I think the processor in the bug report that started this thread
was an AMD K6?

There really is a CPU bug, John, and the new FreeBSD locore.s code
is triggering it.  A stock FreeBSD 4.4, for example, will not
exhibit this problem, on the same hardware.

-- Terry


Re: DP2 Fatal Trap

2002-11-22 Thread Terry Lambert

Vallo Kallaste wrote:
 I have now definitive answer for _my_ case and environment. Just
 finished full package build for my workstation bundle port (92
 ports), including XFree-4, KDE3, mozilla-devel and whatnot. It all
 went very well running kernel which had:
 
 DISABLE_PSE   enabled
 DISABLE_PG_G  disabled
 
 Are you interested in the reverse? Can it be that enabling
 DISABLE_PSE incorporates DISABLE_PG_G somehow?

I give up.

You guys obviously still think it's a software problem that you
can characterize and fix using binary elimination to find the
offending code.  It's not.

-- Terry


Re: DP2 Fatal Trap

2002-11-22 Thread Terry Lambert

Scott Sipe wrote:
 It was a trap 12, and definitely that address...I think something more
 overarching must be going on though.  I'm able to login with /bin/sh (not
 csh/tcsh) and so I've been trying various things--I can't compile a kernel
 because I get bus errors, same with many ports I've been trying to install.
 pkg_adding seems fine.  Any chance this could be acpi related?

How about this...

o   Are you using a GENERIC kernel?

o   Do you have a timestamp that can be used to check out a
/usr/src/sys from CVS that will let me build the same
kernel?

o   Do you have a place I can upload two or more 3/4MB kernel
files for you to try?

Let's say the answer to all three questions is yes.

Assuming I can build you a binary kernel from your sources which
then fails on your machine, I believe I can fix the problem, and
give you a new binary kernel that fixes it, if it's the problem I
think it is.

That way, we all win: you get a working kernel, and I get to
convince people that the problem is what I said it was in the
first place: a CPU bug that has to be specifically worked around.

-- Terry


Re: gcc 3.2.1 release import?

2002-11-22 Thread Terry Lambert

David Schultz wrote:
  It really comes down to a question of living with known bugs, or risking
  gaining a new set of unknown bugs.
 
 In theory, the set of bugs in an actual release should be smaller
 than the set of bugs in a prerelease.

In theory, practice will be the same as theory.  8-) 8-).

-- Terry


Re: Searching for users of netncp and nwfs to help debug 5.0 problems

2002-11-22 Thread Terry Lambert

[ ... bug filing wizard ... ]

Brad Knowles wrote:
 Speaking as someone who is about to step off the deep end and
 start trying to actually run and test -CURRENT on my system here at
 home, I believe that this kind of resource would be vitally important.
 
 In contrast, I've had a few crashes this past week from other
 programs here on my PowerBook G4 running MacOS X (primarily Chimera,
 based on the Mozilla Gecko engine with native Aqua interface), and
 they have made it very easy for me to report crashes.  They have
 integrated tools to extract the maximum amount of information from
 the system as to exactly what other programs were running, what the
 program stack was, and a whole host of other things.  All I have to
 do is type in my e-mail address, optionally describe what I was
 trying to do at the time, and have a functioning Internet connection
 so that they can upload the reports.  I'd share some examples with
 you, but they are *huge*.

The amount of information is much less important than the utility
of the information to someone who is attempting to resolve the bug.
Most bug reports contain a lot of information that is really useless,
not topical to the bug in question ("I was running XYZ and the kernel
panicked!"), etc.

In terms of kernel problems, the absolutely most useful information
is the DDB traceback, followed by a DDB traceback mapped against a
debug kernel, followed by a system dump image and a debug kernel, etc.
By default, the system, as distributed, is not set up to supply this.

In terms of general problems, diagnostic messages are pretty lame;
setting aside the argument against data interfaces not being ripped
out being in itself lame, one big example is the libkvm error message
"kinfo_proc size mismatch (expected %d, got %d)".

An error message should give you enough information that you
could deal with the problem reasonably; for the example message, a
better message would be:

 _kvm_err(kd, kd->program,
    "kinfo_proc size mismatch (expected %d, got %d)",
    sizeof(struct kinfo_proc),
    kd->procbase->ki_structsize);
 _kvm_err(kd, kd->program,
    "recompile libkvm and recompile and reinstall %s",
    kd->programsrcdir);


 Now, we can say that running -CURRENT is not for people who want
 to be molly-coddled.  But I believe it's a good idea to give people
 better tools to help make a better system.  I am convinced that we
 can find a better compromise.

Plus we aren't really talking about -CURRENT, we are talking about
5.0-DP2, or almost 5.0-RC1, if we're being honest.


   If the PR contains a patch, and the owner does nothing in the
   allotted time, then give the PR submitter a commit bit, and give
   ownership of the area over to them.
 
   At the very least, PR's will be closed, and more people actively
   writing code will end up with commit bits.
 
 Gack.  I'm not sure even I would be quite this radical -- any
 moron (like me ;-) can generate a PR that might include a patch.
 IMO, better would be to give the area to another person who is
 suitably qualified, has the available cycles, and presumably already
 has a commit bit.

Moron PR's are easily filterable: they are closable by the owner
with little effort.

PR's that are ignored, on the other hand, rather than being closed,
are most likely legitimate, but inconvenient.

-- Terry


[PATCH] Re: smbfs panic

2002-11-21 Thread Terry Lambert

Tim Robbins wrote:
 Here's a backtrace of a smbfs panic. Looks like it does not correctly
 handle the smbfs_getpages error it is encountering and leaves garbage
 vnodes lying around. The panic probably comes from the VI_LOCK macro
 call on smbfs_node.c line 321.
 
 # cp blah.tar.gz ~tim
 cp: /home/tim/blah.tar.gz: Bad address


Read the list archives.  We discussed this to death.  The correct
thing to do is to back off and retry -- that's decimal 60, which
is hex 0x3c, which, according to /sys/netsmb/smb_rq.h is:

#define SMBR_REXMIT     0x0004  /* request should be retransmitted */
#define SMBR_INTR       0x0008  /* request interrupted */
#define SMBR_RESTART    0x0010  /* request should be repeated if possible */
#define SMBR_NORESTART  0x0020  /* request is not restartable */

If you don't want to try to implement this, note that the read/write
works (try 'dd' instead of 'cp', so you aren't using mmap'ed pages,
and see that 'dd' works).

I don't think it's worth writing the code to attempt the retry, until
you know that the code will work -- that it's backed up far enough
that the retry won't fail.  This basically means you need to know why
it's failing in the first place, which means you need to be familiar
with how the server is implemented (not likely, unless your name is
Luke Howard, you're a Microsoft employee, or you have a Windows
source license -- otherwise it's probably about two weeks worth of
work).

The attached patch works around the problem by disabling the getpages
and putpages code in the smbfs.  This basically turns paging operations
into reads and writes, which we know from using 'dd' instead of 'cp'
will work.

Note: this is only a workaround: it disables obviously incorrect code,
but doesn't provide replacement code for the bogus code.

-- Terry
Index: smbfs_io.c
===
RCS file: /cvs/src/sys/fs/smbfs/smbfs_io.c,v
retrieving revision 1.13
diff -c -r1.13 smbfs_io.c
*** smbfs_io.c  4 Aug 2002 10:29:30 -   1.13
--- smbfs_io.c  21 Nov 2002 05:53:23 -
***
*** 400,405 
--- 400,406 
return error;
  }
  
+ #if BROKEN_PAGE_IO
  /*
   * Vnode op for VM getpages.
   * Wish wish  get rid from multiple IO routines
***
*** 655,660 
--- 656,662 
return rtvals[0];
  #endif /* SMBFS_RWGENERIC */
  }
+ #endif /* BROKEN_PAGE_IO */
  
  /*
   * Flush and invalidate all dirty buffers. If another process is already
Index: smbfs_node.h
===
RCS file: /cvs/src/sys/fs/smbfs/smbfs_node.h,v
retrieving revision 1.2
diff -c -r1.2 smbfs_node.h
*** smbfs_node.h 18 Sep 2002 09:27:04 -  1.2
--- smbfs_node.h 21 Nov 2002 05:53:47 -
***
*** 75,83 
  #define VTOSMB(vp) ((struct smbnode *)(vp)->v_data)
  #define SMBTOV(np) ((struct vnode *)(np)->n_vnode)
  
  struct vop_getpages_args;
- struct vop_inactive_args;
  struct vop_putpages_args;
  struct vop_reclaim_args;
  struct ucred;
  struct uio;
--- 75,85 
  #define VTOSMB(vp) ((struct smbnode *)(vp)->v_data)
  #define SMBTOV(np) ((struct vnode *)(np)->n_vnode)
  
+ #if BROKEN_PAGE_IO
  struct vop_getpages_args;
  struct vop_putpages_args;
+ #endif /* BROKEN_PAGE_IO */
+ struct vop_inactive_args;
  struct vop_reclaim_args;
  struct ucred;
  struct uio;
***
*** 89,96 
--- 91,100 
struct smbfattr *fap, struct vnode **vpp);
  u_int32_t smbfs_hash(const u_char *name, int nmlen);
  
+ #if BROKEN_PAGE_IO
  int  smbfs_getpages(struct vop_getpages_args *);
  int  smbfs_putpages(struct vop_putpages_args *);
+ #endif /* BROKEN_PAGE_IO */
  int  smbfs_readvnode(struct vnode *vp, struct uio *uiop, struct ucred *cred);
  int  smbfs_writevnode(struct vnode *vp, struct uio *uiop, struct ucred *cred, int ioflag);
  void smbfs_attr_cacheenter(struct vnode *vp, struct smbfattr *fap);
Index: smbfs_vnops.c
===
RCS file: /cvs/src/sys/fs/smbfs/smbfs_vnops.c,v
retrieving revision 1.24
diff -c -r1.24 smbfs_vnops.c
*** smbfs_vnops.c   26 Sep 2002 14:07:43 -  1.24
--- smbfs_vnops.c   21 Nov 2002 05:52:32 -
***
*** 96,102 
--- 96,104 
{ vop_create_desc, (vop_t *) smbfs_create },
{ vop_fsync_desc,  (vop_t *) smbfs_fsync },
{ vop_getattr_desc,(vop_t *) smbfs_getattr },
+ #if BROKEN_PAGE_IO
{ vop_getpages_desc,   (vop_t *) smbfs_getpages },
+ #endif /* BROKEN_PAGE_IO */
{ vop_inactive_desc,   (vop_t *) smbfs_inactive },
{ vop_ioctl_desc,  (vop_t *) smbfs_ioctl },
{ vop_islocked_desc,   (vop_t *) vop_stdislocked },
***
*** 108,114 
--- 110,118 
{ vop_open_desc,   (vop_t *) smbfs_open },
{ vop_pathconf_desc,

Re: Cross-Development with NetBSD

2002-11-21 Thread Terry Lambert

Ruslan Ermilov wrote:
 On Thu, Nov 21, 2002 at 12:10:14AM -0700, M. Warner Losh wrote:
  In message: [EMAIL PROTECTED]
  Wilkinson,Alex [EMAIL PROTECTED] writes:
  : Is FreeBSD likely to follow in the footsteps of NetBSD and create
  : a framework to do crossbuilds ?
  :
  : http://ezine.daemonnews.org/200211/xdevnetbsd.html
 
  FreeBSD already has cross builds for a while, since before NetBSD's
  cross build infrastructure.  However, NetBSD's infrastructure is a
  little more extensive because it is possible to do incremental builds
  and build full releases that work in a cross build environment.

 What do you mean by "incremental builds" and "full releases that work ..."?

You know, like changing one line in /usr/src/lib/libstand on
a source tree on an x86 box, typing "make release", and having
only the things that need to be rebuilt being rebuilt, resulting
in a working FreeBSD-Alpha or FreeBSD-SPARC64 release CDROM image.

-- Terry


Re: smbfs problems

2002-11-21 Thread Terry Lambert

[ ... smbfs ... ]

Vallo Kallaste wrote:
 Now the writing part:
 
 After creating 5MB file using /dev/urandom, I'm trying to copy it
 over to users/vallo smb share mounted at /mnt, which fails. The copy
 is interruptible using Ctrl-C. Examination at NT4 server shows 0
 byte file. Umount of /mnt fails with device busy. Umount -f /mnt
 fails to return prompt, but after interrupting the smbfs is
 unmounted.  There is no kernel messages or something in syslog. The
 copy operation returns failure ~3 seconds after start.

Try using 'dd' instead of 'cp', or the patch I posted last night.

The "shows 0 byte file" is a normal artifact of how file
metadata is handled in Windows filesystems: unlike UNIX, a
partial file does not have the metadata, including the file
size, updated until the file is closed.  Therefore FTP restart
is pretty meaningless on Windows, unless you have an FTP client
that closes and reopens for writing the file it is transferring
at intervals, among other things -- one of which is that any
interrupted create operation will leave a 0 length file.

I don't know why your umount fails with "device busy"; what you
need to do is look at the connections which are open, and why it
cares about whether or not they are abandoned, in the unmount
case.  I rather expect that the connection(s) are jammed up, so
you can't close them, so you have virtual circuit instances that
you can't get rid of.  I would expect even a "force" to take
whatever time it takes to dump the open handles, plus 2MSL plus
however much time it keeps the connection in the half-close
state, waiting for a FIN/ACK from the server.

-- Terry


Re: smbfs problems

2002-11-21 Thread Terry Lambert

Vallo Kallaste wrote:
 Sorry forgot to add one detail. Although dd'ing the same file to
 smbfs mount works, it'll sometimes modify the file being copied
 (size is different). It doesn't happen reliably, sometimes the file
 is copied fine, sometimes not. At the times the file isn't copied
 right there's an error message:
 
 root:vallo# dd if=testfile of=/mnt/vallo/test1
 dd: /mnt/vallo/test1: Bad address
 9356+0 records in
 9355+0 records out
 4789760 bytes transferred in 20.350003 secs (235369 bytes/sec)
 
 It seems to me that adding conv=sync flag to dd removes the
 abovementioned failure case. 10 tries of dd with this flag added did
 fine.

The 'conv=sync' flag to 'dd' pads the operation out to a record
boundary, if the input of the operation is not a full record in
length.

This observation is consistent with an incomplete final write,
for lack of data.  Probably this has to do with the TCP_PUSH
option and/or the SMB server's connection flags.

-- Terry


Re: Cross-Development with NetBSD

2002-11-21 Thread Terry Lambert

Ruslan Ermilov wrote:
  NetBSD builds a directory full of tools that you can later use to
  incrementally build, say, 'ls' or 'cat' because one can define
  USETOOLS to be 'yes' and have the make automatically pick them up when
  rebuilding.  There are a few of the details I'm a little unclear on,
  but that's the gist of it.

 We also can, this just requires a few really tiny tweaks to Makefile.inc1,
 and I've posted them already some time ago -- basically, for each architecture
 you should build the subset of buildworld targets (WMAKE_TGTS), up to and
 including _libraries (if you want to build roughly any bit later), and
 then you can ``make {depend|all} SUBDIR_OVERRIDE=bin/cat'' for each of
 the desired TARGET_ARCH.

Any ETA on when this will be committed?


 I know that the Alpha and sparc64 binaries produced on i386 work.

I thought that the Alpha boot blocks ended up too large in the
cross-build case?  They did, last time I tried it.


 I know that cross-compiling i386 on either Alpha or sparc64 is
 broken (GCC sometimes produces different assembler output than
 the native compiler).  I lack the necessary hardware to actually
 test/fix the issues with cross-releases.

I don't think he was attacking you, personally, to ask you to
fix the problem, I think he was just noting the problem exists.


One thing that would help a lot -- and probably be helpful in
general -- would be a binary compare tool that ignored date
stamps in things like libraries, tar images, etc., so that you
could compare where things differ, easily, allowing someone to
track down differences.

It would be helpful in general to be able to compare what you
built vs. a release version, to assemble binary only delta lists,
for preparations for upgrade tools, etc.

I keep meaning to do this, but I really don't want to have to
release the tool under the GPL, if I don't have to.

-- Terry


Re: recommended VAIO for ACPI hacking (Re: cvs commit: www/en/releases/5.0R todo.sgml)

2002-11-21 Thread Terry Lambert

Mitsuru IWASAKI wrote:
   Also, development resources are limited.  For example, none of ACPI
   developers has VAIO.
 
  Well, I don't know enough to be a developer but I do have a VAIO (Z600TEK)
  and can test things. Just ask.
 
 BTW, I'm planning to buy VAIO (maybe used one) to improve ACPI support.
 Any recommendations?

While I personally want you to buy a PCG-XG29 (what I have 8-)),
I think the most problems have been reported on the Z505 and
Z5xx series.

If you are going to buy used, the Z5?? and PCG-XG2? (especially
the PCG-XG28, not 29) are probably what will be available to you.

-- Terry


Re: DP2 Fatal Trap

2002-11-21 Thread Terry Lambert

John Baldwin wrote:
 On 21-Nov-2002 Scott Sipe wrote:
  On Thursday 21 November 2002 01:36 pm, John Baldwin wrote:
  Hmm, is this from a GENERIC kernel?
 
  This is from straight from DP2 iso image cd install, X-Developer install,
  first boot after the install finished, generic kernel etc.
 
 Ok, generic kernel is the only really important part. :)  Can you
 do me a favor and see if you have a /boot/kernel.GENERIC/kernel.debug
 or a /boot/kernel/kernel.debug?  If so, can you please do
 'gdb -k kernel.debug' and then at the prompt do 'l *instruction pointer'
 where instruction pointer is the second part of the instruction pointer
 from the panic message?  (I.e., w/o the leading '0x8:' part.)

It's the PSE and PGE, John.  Are you sure you won't agree to
not disclose, so I can tell you what's happening?

Bosko has a patch which he will give you if you ask him for it
that (mostly) works around the problem.

-- Terry


Re: DP2 Fatal Trap

2002-11-21 Thread Terry Lambert

John Baldwin wrote:
  Is it any help to know that my problems on P4 stopped after enabling
  DISABLE_PSE? Initially I had both of these enabled, but seems that
  one is enough. Just FYI.
 
 If we can verify that DISABLE_PG_G has no effect then that would be
 nice.

It has an effect: writing CR3 or a TSS resulting in a changed CR3
will not invalidate TLB entries with the G flag set, if PGE is set
in CR4.

-- Terry


Re: 5.0 DP2 on Vaio Z600

2002-11-21 Thread Terry Lambert

Richard Tobin wrote:
 I'm trying to install DP2 on a Sony Vaio Z600TEK laptop, but it hangs
 at "Probing devices, please wait (this can take a while)"

[ ... ]

 Any suggestions?

Tell it to not load ACPI.

-- Terry


Re: gcc 3.2.1 release import?

2002-11-21 Thread Terry Lambert

Marc Recht wrote: 
  There is neither a gcc 3.2.1 nor a gcc 3.3 yet, so I wouldn't use any of
  them in a stable release.
 gcc 3.2.1 has been uploaded on ftp.gnu.org at Nov. 19th.

So it's been extensively tested by the full user base for the
last two days, and you should have known about it before you
posted.  8-) 8-).

-- Terry


Re: gcc 3.2.1 release import?

2002-11-21 Thread Terry Lambert

Marc Recht wrote:
  So it's been extensively tested by the full user base for the
  last two days, and you should have known about it before you
  posted.  8-) 8-).
 
 My original question was only if it will be imported before 5.0R. David
 O'Brien already answered it with no. That's fine with me.

Don't worry about it; it's being totally blown out of proportion;
there's no way anyone will commit to importing a 2 day old 3.2.1,
which is why I put the smileys there.

-- Terry


Re: DP2 (and earlier) Problems *addition

2002-11-20 Thread Terry Lambert

Scott Sipe wrote:
 Sorry, should have done this with the first email.  The dmesg from my stable
 boot:

Yank half your memory, and try it again, and let us know.

-- Terry


Re: DP2 (I think!) crash booting from floppies

2002-11-20 Thread Terry Lambert

local.freebsd.current wrote:
 I got a pair of floppies from:
 
 ftp://ftp.freebsd.org/pub/FreeBSD/snapshots/i386/5.0-20021103-SNAP/floppies/
 
 and booted them on a Dell Dimension XPS D300 which is currently
 running 4.7. It's a PII/300 with an Adaptec 2940 SCSI and an
 STB Riva graphics card.
 
 When booting the kernel off the second floppy I get:
 
 Booting [/kernel]...
 /
 
 Fatal trap 12: page fault while in vm86 mode
 fault virtual address  = 0x9f800
 instruction pointer= 0xf000:0x8c3e

Patch which was never integrated.  Build a new kernel.

http://docs.freebsd.org/cgi/getmsg.cgi?fetch=341812+0+archive/2002/freebsd-current/20021027.freebsd-current

-- Terry


Re: Device permissions with DEVFS

2002-11-19 Thread Terry Lambert

Kip Macy wrote:
 Sorry, if I'm repeating something already said, but
 the tone of your mail would indicate that I'm not.
 
 This doesn't sound like an intrinsic limitation of
 devfs, just an issue with how it is structured now.
 There should just be a central file for all the
 devices which devfs sucks in at build (or maybe boot)
 time specifying the appropriate permissions and any
 other configuration information.

A separate ELF data section for this information would allow
kernel modules to have this information edited with a tool,
as well as permitting the kernel itself to be edited with the
same tool, so that site defaults could be persistently changed
from the source tree defaults.

Indeed, this would allow the permissions to be listed in the
case that Bruce was complaining about, which is the inability
to see what will happen when the hardware is present, if it's
not available to the tester.

An extension of this would permit chmod's against the devfs
to be written back to the kernel or driver module affected,
assuming your secure level is low enough and your flags are
set to permit it, which also gets rid of the common complaint
about persistence (which is really just a handy thing to use
to bitch about devfs when you can't come up with any legitimate
complaints, IMO).

-- Terry


Re: Device permissions with DEVFS

2002-11-19 Thread Terry Lambert

Marcin Cieslak wrote:
 
 What's wrong with having /etc/minor_perm et consortes
 a la Solaris? With sensible kernel defaults to allow
 booting without your favourite root partition.

What's wrong with just having /dev?

-- Terry


Re: Device permissions with DEVFS

2002-11-19 Thread Terry Lambert

Poul-Henning Kamp wrote:
 I have to say that the ownership issue has been a pet peeve of mine for
 some time: I would really like the kernel to know about exactly two magic
 id values: uid 0 (suser uid, default uid, default devfs owner), and gid 0
 (default gid, default devfs owner).  Hard-coding of other non-0 values in
 the kernel leads to many potential (and real) problems.
 
 While you are right in principle, I think we should not overengineer
 here.
 
 People who are likely to give operator a different gid are also
 very likely to compile their own kernels (which I admit does not
 solve the 3rd party KLD issue but...)
 
 Devfs(8) provides a mechanism for setting the m/o/g and a few other
 attributes, and it would in theory be possible to let all devices
 come up 0/0/0 and have /etc/rc set the policy from /etc/rc.

One fix for this would be to have a UID/GID list that's used to
derive both the default uid/gid values in devices, and the contents
of the default passwd file, so that they matched.

It seems to me that this issue is merely one of getting the UNIX
auth database and the default device attributes to agree, right?

-- Terry


Re: New NVidia drivers on -current

2002-11-18 Thread Terry Lambert

David Holm wrote:
 Note that the patch has already been applied so no need to patch your kernel!
 
 BTW, why hasn't anyone set the mailing list to automatically set the reply-to
 address to [EMAIL PROTECTED]?

Because the poster may not be a list subscriber, and the most
important person to reply to is the poster.  It's up to the
poster's MUA to know that the poster is a subscriber to the
list, and set the Reply-To: on the basis of that knowledge.

This could be done on the server, but there are two reasons not
to do it in the mailing list manager:

1)  It's computationally expensive, and all processing that
could be done on either the server or the client, should
be done on the client, to ensure that the deployment
scales.

2)  There is a draft RFC which is under consideration by the
IETF, and is likely to become an issued RFC, which requires
that certain headers not be altered by mailing list managers;
specifically, all non hop-to-hop headers should not be
modified by the mailing list manager, and Reply-To: is an
end-to-end header, not a hop-to-hop header.  So if it isn't
in violation of an RFC for the mailing list software to set
the Reply-To: now, it likely will be, in the near future.

-- Terry


Re: Run two copies of named from rc.conf?

2002-11-18 Thread Terry Lambert

Brad Knowles wrote:
 It depends on how you do it.  You could $INCLUDE the exterior
 file inside the interior file, if that subset of information is the
 same.  You could also use BIND 9 views.  Otherwise, split-horizon
 can be a pain.

If you have a LAN behind a transient network connection, and you
want your LAN to function without degradation as a result of losing
the link (Who ever heard of DSL going out?), then you want to have
your on site DNS server be authoritative.

But.  If you are transiently connected, then if the on site DNS
server is authoritative, then there is no way to look up externally
hosted services via DNS, unless the external DNS, also a hosted
service, and therefore not transiently connected, is authoritative.

One potential answer to this is that the external DNS is a secondary
of a stealth primary running at your local site.  However, this
has the unfortunate effect that a persistent outage will become a
general outage, should it last longer than the TTL for the externally
visible records.

In addition, there are no NOTIFY updates sent to the secondaries, if
the primary is offline when it is updated.

In addition, making the primary MX on site means a 3 minute delay
on all external mail send attempts to the site domain(s), as the
connection attempt times out and falls back to the secondaries,
which are externally hosted.

Finally, externally hosted resources may require changes as the
actual facilities are changed around.  This includes relocation
of primary and secondary external MX's, relocation of web services,
relocation of database and other outsourced services, relocation of
shopping cart services, etc..  This may include relocation of the
primary IP address of the customer site, which would also require a
change to the IP address configured into the secondaries of the
stealth primaries.

Basically, what this boils down to is that you are never fully
authoritative for a domain for which there exist externally hosted
services, and such services must have priority over transiently
connected services.

For this to work, you have to have a DNS server that's external
(hosted, and therefore always available), as well as being seen
to be authoritative.

For local authority, then, you must delegate authority, without
delegating it as a subdomain, to the external server.  The easiest
way to do this is to, on a local lookup miss, forward the request
to an external server, even if you are the authoritative server,
AND to replicate local DNS information to the external authoritative
server, as well.

DNS does not support this right now, even with BIND 9's views.
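
As a rough illustration of the lookup order described above (answer from local data when possible, forward on a miss even when authoritative), here is a minimal C sketch. The record table, `lookup_local`, `resolve`, and the stand-in forwarder `fake_external` are all hypothetical names made up for this example; none of this is BIND code, and the addresses are placeholders.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory zone entry: host name -> address string. */
struct rec {
    const char *name;
    const char *addr;
};

/* Hypothetical forwarder: asks the external (hosted) server; NULL on failure. */
typedef const char *(*forward_fn)(const char *name);

/* Search the locally held zone data; NULL means a local miss. */
const char *
lookup_local(const struct rec *zone, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(zone[i].name, name) == 0)
            return zone[i].addr;
    return NULL;
}

/*
 * Answer from local data when possible.  On a miss, forward to the
 * external server instead of returning an authoritative "no such name",
 * even though this server is authoritative for the zone.
 */
const char *
resolve(const struct rec *zone, size_t n, const char *name, forward_fn fwd)
{
    const char *addr;

    assert(name != NULL);
    addr = lookup_local(zone, n, name);
    if (addr != NULL)
        return addr;
    return fwd != NULL ? fwd(name) : NULL;
}

/* Stand-in for the external server: knows one externally hosted name. */
const char *
fake_external(const char *name)
{
    return strcmp(name, "www.example.com") == 0 ? "192.0.2.80" : NULL;
}
```

The sketch covers only the forwarding half; replicating the local records out to the external authoritative server is a separate step and is not shown.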


The entire point of people coming onto the Internet for the first
time is to make themselves appear real, clueful, etc., and
that means a virtual non-transient connection, which basically
means external hosting of visible services by a third party, so
that it looks like the company has a full time Internet connection,
rather than looking like a Mom and Pop with only a dialup or
other transient connection.

Yeah, that doesn't sit very well with you, if you are a company
who wants to sell one server to each of 100 customers, rather
than 6 servers to a hosting provider, but tough: there's no law
that requires me to protect your business model, unless you are
a member of the music or motion picture industry, and have bribed
enough senators.

-- Terry

Re: Run two copies of named from rc.conf?

2002-11-18 Thread Terry Lambert

Brad Knowles wrote:
 Sorry, I wasn't thinking of transient networks.  Indeed, that does
 make things a lot uglier.  I'll have to think some more about all the
 various implications, however.

One of the draft RFC's in the FTP directory I referenced is a
Best Current Practices document.

-- Terry

Re: Run two copies of named from rc.conf?

2002-11-18 Thread Terry Lambert

John De Boskey wrote:
 This is an interesting thread, but it seems to be getting
 a bit off target. I need to kick off 2 name servers. The
 first is authoritative for the domain as seen externally
 and the 2nd is authoritative for the internal network.
 
 The internal forwards to the external when appropriate.
 These networks are not transient.

Then you want a single BIND 9 install with two views, one
bound to the internal IP, the other to the external.

You don't want what I've suggested.  And you don't want what
you originally asked for.  8-).

-- Terry

Re: Are SysV semaphores thread-safe on CURRENT?

2002-11-18 Thread Terry Lambert

Brian Smith wrote:
 Sure SysV semaphores are thread-safe.  When a thread blocks on
 one, the entire process blocks (no threads run).  You won't
 get any safer than that ;-)
 
 Yikes that isn't good.  Is that only in STABLE?  or does CURRENT
 do that as well?  I guess I'll have to protect the semop() call
 with a pthread mutex to prevent two threads locking a single
 semaphore by the same process (creating a deadlock situation).
 
 Is this the recommended method of preventing these problems?

Yeah: don't make blocking system calls for which there are no
asynchronous equivalents.  Use the POSIX interfaces for use
by pthreads, instead.


 (the SysV semaphore is protecting shared memory accessed by
 multiple processes).
 
 Thanks for the info... it explains a lot!

Use mmap of a backing-store file, and then use file locking to
do record locking in the shared memory segment.
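
A minimal C sketch of that suggestion, assuming fixed-size records: the file is mapped shared with mmap(2), and fcntl(2) byte-range locks cover one record at a time. `REC_SIZE`, `map_records`, and `try_lock_record` are hypothetical names for illustration. The lock attempt uses F_SETLK (non-blocking) rather than F_SETLKW, in keeping with the advice above about blocking calls; the caller decides how to retry.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

#define REC_SIZE 64   /* hypothetical fixed record size, in bytes */

/*
 * Size the backing-store file for nrec records and map it shared.
 * Returns MAP_FAILED on error.
 */
void *
map_records(int fd, size_t nrec)
{
    assert(nrec > 0);
    if (ftruncate(fd, (off_t)(nrec * REC_SIZE)) == -1)
        return MAP_FAILED;
    return mmap(NULL, nrec * REC_SIZE, PROT_READ | PROT_WRITE,
        MAP_SHARED, fd, 0);
}

/*
 * Try to apply a lock of the given type (F_RDLCK, F_WRLCK, or F_UNLCK)
 * to the byte range of record idx.  Does not block: returns -1 with
 * errno set to EACCES or EAGAIN if another process holds a
 * conflicting lock.
 */
int
try_lock_record(int fd, size_t idx, short type)
{
    struct flock fl;

    memset(&fl, 0, sizeof(fl));
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = (off_t)(idx * REC_SIZE);
    fl.l_len = REC_SIZE;
    return fcntl(fd, F_SETLK, &fl);
}
```

fcntl locks are owned by the process, so they arbitrate between processes only; threads inside one process that touch the same record would still need a pthread mutex of their own.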

-- Terry

Re: Run two copies of named from rc.conf?

2002-11-17 Thread Terry Lambert

John De Boskey wrote:
 It would be nice if rc.conf could start a 2nd copy
 of named (split dns). Comments on the following simplistic
 patch?

Interior and exterior DNS is a useful case; however, there
are multiple ways to set it up; in general, it's not possible
to have interior authoritative DNS at the same time you have
exterior authoritative DNS (this was a mistake we made on the
InterJet, for a long time), without modifying the DNS server
to forward requests for which it has incomplete information
(e.g. the PNS draft RFC I wrote).

See the files in:

ftp://ftp.whistle.com/pub/terry/drafts/

for details.

-- Terry

Re: DISABLE_PSE DISABLE_PG_G still needed?

2002-11-15 Thread Terry Lambert

Vallo Kallaste wrote:
 
 Hi
 
 The kernel compiled from yesterday sources and with the
 abovementioned options disabled will not finish make -j2 buildworld
 on P4. Dies with bus error:
 
 /usr/src/lib/libc/gen/termios.c: In function `tcgetpgrp':
 /usr/src/lib/libc/gen/termios.c:104: internal error: Bus error
 Please submit a full bug report,
 with preprocessed source if appropriate.
 See URL:http://www.gnu.org/software/gcc/bugs.html for instructions.
 
 Are those options still needed? They are commented out in NOTES and
 shouldn't be necessary, right?

What happens if you add those options?  Does it still crash?

-- Terry

Re: DISABLE_PSE DISABLE_PG_G still needed?

2002-11-15 Thread Terry Lambert

Vallo Kallaste wrote:
  This may be a bit overstated. I removed those options from my kernel a few
  weeks ago and have no problems at all. Are you certain the problem is not
  specific to a particular CPU?
 
 Sorry, this can be CPU specific, but I'm not sure. I'll try to
 reproduce it on my home P2 system and P3-SMP lying under my desk at
 work.

How much memory do these systems have?

-- Terry

Re: DISABLE_PSE DISABLE_PG_G still needed?

2002-11-15 Thread Terry Lambert

Robert Watson wrote:
 On Fri, 15 Nov 2002, John Baldwin wrote:
  It only happens with P4's.  I haven't seen it locally on a p4 test
  machine at work that I have built test releases on.  Also, it would be
  nice to see if just adding one of the options fixed the problems.  As
  for NOTES, those options should not be enabled in NOTES as they would
  defeat the purpose of LINT since they disable code.
 
 Does this apply generally to all P4's, or just a subset?  If all, it may
 be we want to add a P4-workaround to GENERIC so that P4's work better out
 of the box.  If it's a select few, I wonder if there's some way to test
 for the problem early in the boot...
 
 One of the recurring themes here has (a) been P4 processors, and (b) been
 a fear that because of timing changes introduced by the DISABLE_FOO flags,
 the real bug is still there, but less visible in the tests people are
 running.

The amount of RAM will also affect it.  It can also happen on P3's
and AMD K6's.  It is a CPU bug related to the use of 4M pages.

Bosko understands the problem (I have explained it to him under
non-disclosure), and he has a patch which avoids it without really
disclosing the problem, which I'm OK with.  Using the patch cranks
the amount of base memory required for a minimal FreeBSD up to 16M,
and loads the kernel at 4M, instead of 1M.  This avoids the problem
on purpose that the older FreeBSD locore.s used to avoid by accident.

The alternative is to take up to a 15% performance hit by not using
4M and global pages, or to revert the locore.s code so that it does
not tickle the hardware bug.

-- Terry

Re: DISABLE_PSE DISABLE_PG_G still needed?

2002-11-15 Thread Terry Lambert

Wesley Morgan wrote:
 Based on this, are you recommending that the DISABLE_* still be used? Will
 I never see the problem with 512mb of ram?

When Matt Dillon made some of the machdep.c allocation sizes
dependent on the physical RAM size, it made the problem much
less predictable, based on the amount of RAM, so without
sitting down and doing some math to find out exactly where
each byte of memory was going, I could not tell you for a
given amount of RAM and CPU.

What I will tell you is that there is a stair function involved
in the amount of RAM you can install, and there is a following
function that looks similar, for the allocations made by machdep.c
now.  The problem will occur when there is a gap between the
stair and the follower, e.g.:

RAM available
|  .
|  +..
|  | .         <-- bug triggered
|  `-+.
|    .+..
|     | .      <-- bug triggered
|     `-+.
|       .+..
|        | .   <-- bug triggered
V
RAM used --->


  Bosko understands the problem (I have explained it to him under
  non-disclosure), and he has a patch which avoids it without really
  disclosing the problem, which I'm OK with.  Using the patch cranks
 
 So basically, there is a DEFECT in something that either Intel or AMD has
 sold me (you, everyone) and they will not disclose the defect, honor any
 warranties, or provide fixes for the problem?

No.  The non-disclosure was mine.  I am not an Intel/AMD employee;
I discovered the defect independently.  As far as I know, they are
aware of the problem from Microsoft, but have no idea as to its
root cause.  It is likely that AMD licensed Intel designs, in order
for AMD to have the same problem.

You should be aware that Microsoft recommends a registry setting
that disables the use of 4M pages to work around the problem on
customer systems that have the problem.  They don't have the PG_G
problem that FreeBSD 5.x has, for the same reason that FreeBSD 4.3
didn't have it: serendipity.


 How... crappy. Reminds me of the Redhat/DMCA suppressed patch. I think
 consumers have a right to know about any defects in something they have
 bought.

Argue with your congressman; it was a U.S. law that suppressed the
patch, in that case.

 And I also think that the marketer should assume some liability
 for selling defective hardware (even though software makers seem to be
 able to get away with it).

Even defects that haven't been discovered or characterized by them?
Argue with the U.S. Supreme Court and the tobacco industry on that
one.  Degree of product liability is based on the prior knowledge of
potential harm.

-- Terry

Re: Calendar Changes.

2002-11-13 Thread Terry Lambert

Tony Harverson wrote:
 My attention was drawn a little while ago to the fact that the South
 African holidays in /usr/share/calendar/ were far out of date (most not
 having been celebrated since 1994), and so I decided to clean them up.
 As soon as I got into actually working in that directory, it struck me
 that it's hard to know just where to fix things.
[ ... ]
 I'm getting the impression the whole thing grew organically, rather than
 with a design in mind..


I'm getting the impression that South Africa's major historical
events have occurred at almost random times, with the resulting
list of official holidays growing organically, rather than with
a design in mind...

8-) 8-).

Maybe we should put a cap on major historical events?

Yes, Oingnatia, it would be nice for your people to be free, but
if you become a representative democracy, and make a holiday of it,
we will have to edit calendar.holiday, and that would be a pain;
could you keep your murderous dictator until next July 17th? Then
we will be able to just symlink you to your neighbor, Mugataland,
since that's when they killed off their former military Junta. Or
if you could at least wait to throw off the chains of oppression
until after 5.0 is released, we'd really appreciate it.  Thanks.

Things involving people often grow organically, rather than being
planned; I think we just have to live with it...

-- Terry

Re: -current make jdk13 with native_threads error

2002-11-13 Thread Terry Lambert

Bill Huey (Hui) wrote:
 On Wed, Nov 13, 2002 at 01:51:38AM -0800, Bill Huey wrote:
  That's all been removed from a MFC of libc_r recently. Native
 
 Uh, you're running on -current I presume (without reviewing the
 original post), but the same logic still applies.

They didn't say; I assumed they were, because of the line number
in the header file for the undefined timeval struct matching
the -current source code, but not 4.7, and because they posted
to the -current list.  8-).

Thanks for the HotSpot info, BTW; it was worth squirreling away
for me, and I'm sure they will find it useful...

-- Terry

Re: binutils symbol hiding and versioning (was Re: [PATCH] note the __sF change in src/UPDATING)

2002-11-12 Thread Terry Lambert

Loren James Rittle wrote:
 FYI, the libstdc++-v3 maintainers on the FSF side are only
 guaranteeing forward ABI compatibility of any sort if libstdc++.so is
 built with symbol versioning and symbol hiding.

FWIW: symbol versioning is incredibly broken.  It attempts to
do in UNIX what interface versioning does in Windows, through
the use of class factories accessed via IUnknown.  The point of
the exercise is to allow multiple simultaneous versions of an
interface to be exported by a single library.

The main reason this is bad is the same reason that Novell must,
in their SDK's, support interface versions all the way back to
NetWare 1.x: in order to have the largest possible user base, a
software vendor would have to be stupid to write to anything
other than the lowest common denominator of interfaces: it's
really stupid to limit your customers to running NetWare 4.2 or
above, when there is still such a large installed base of 3.x,
4.0, and 4.1 versions of NetWare.  The only thing you do when
you do that is to disenfranchise potential customers for your
product.

Windows uses this approach because they do not have the concept
of shared object versioning; VCRTL32.DLL is VCRTL32.DLL, no
matter what, so a version change that permits old applications
to continue working is the same DLL, plus extensions (since
there is no version in the file name, multiple versions can only
exist simultaneously if they exist in the same file).

It would be a very big mistake to actually utilize symbol name
versioning on a UNIX system, and buy into this model, even if
the idea was supported by the tools.  That Linux has bought into
the idea of using this is, frankly, Linux's problem, and they are
going to regret it in the future as much, or more, than they
regretted implementing shared library support in the SVR3.2 way,
of linking libraries to fixed locations, and carving up chunks
of the user virtual address space to implement them, back when
they first supported shared libraries.

-- Terry

Re: building -STABLE from within -CURRENT

2002-11-12 Thread Terry Lambert

Wilkinson,Alex wrote:
 In a dual boot situation, is it possible to be logged into -CURRENT and
 build -STABLE ( ie -STABLE filesystems live on separate fdisk partitions
 and are exported ) ?

Only if you unpack the contents of CDROM #2 from a -STABLE system
into a chroot environment.

Specifically, the compiler and FFS changes have added a number of
incompatibilities that are insurmountable for cross-building, from
my experience.

Check the -current mailing list archives, as it applies to being
able to cross-build versions of FreeBSD; this was covered in detail
last week (and about every two weeks, previous to that).

-- Terry

Re: Samba on -current

2002-11-12 Thread Terry Lambert

Jason Vervlied wrote:
 I am having problems with a Samba share on my -current box, I just
 installed from 20021103-SNAP. I did recompile my kernel with the following
 options.

[ ... ]

 I also added the SMP options to the kernel. I used the same options under
 -stable and experienced no issues. Here is the error I get when I try to
 copy a file from my samba share
 
 [jvervlied@current 80-85]$ cp bad_religion-yesterday.mp3 ~/
 cp: /home/jvervlied/bad_religion-yesterday.mp3: Bad address

This was discussed in detail about 3 weeks ago.  I suggested one
workaround, which would be to disable the Samba-specific
getpages/putpages code, since the timeout is in the getpages,
where an operational status code says that the attempt is both
not recoverable and should be retried.

Another partial fix is to retry for a count, but the unrecoverable
part of the error indicates that the operation needs to be retried
at a much higher level (potentially, all the way to the point of
reestablishing the session).

See the archived previous discussion for details.

One alternative is to use dd instead of cp to copy the file,
thus avoiding the mmap'ed data faults that come from cp to the
SMBFS.

-- Terry

Re: Samba on -current

2002-11-12 Thread Terry Lambert

Sheldon Hearn wrote:
 It's a known problem.  Consider reading the -current mailing list to
 keep up to date with known problems.  It was discussed last week.
 
 No solution is known at this time.  Use dd(1) instead of cp(1) as an
 interim workaround.

Actually, it's fixable 3 ways:

o   Full fix for cp, ugly

Remove the getpages/putpages from the implementation of
the SMBFS' VOPs table.  This will force it to fall back
to code that uses read/write instead, which doesn't have
the problem.  Performance will suck, but the copies will
work as expected, though mmap won't.

o   Incomplete fix, ugly, may be enough anyway

Put a retry loop in the getpages/putpages code (mostly,
the getpages code), to retry the operation at that level;
if the failure does not occur at the session or handle
level, then this will cover up the problem.  If the session
or handle reference is failing, there is insufficient data
to rewind state to the point where it can be retried, due
to the fact that you would need to go up several calls, and
then back down into the VOP, to reestablish the handle to
retry the call again.  If you had to reestablish the session,
you'd have to go even higher.

o   Complete fix, a lot of work

The code needs to be refactored, so that a restart with the
handle or session invalid works.  This means separating out
the session and handle management from the standard code
path, so that it can be restarted at any point, so that the
state doesn't need to be unwound.  The problem here is that
you are in a trap handler of a write on another FS, faulting
on a page that's backed by the SMBFS, so it's not like you
can recover enough information otherwise to recreate the
handle or session, if necessary.  So you would have to ask
for the handle from the cache, and then the session for the
handle from the cache, if the handle was not valid.
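
The second option (a bounded retry at the getpages level) can be sketched generically in C. This is not the SMBFS code: `pagein_fn`, `pagein_retry`, and the stand-in `flaky_op` are hypothetical, and treating EAGAIN as the retryable error is an assumption, since the thread does not say which error the real code returns.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical page-in operation: returns 0 on success or an errno value. */
typedef int (*pagein_fn)(void *ctx);

/*
 * Retry a transient failure up to max_tries times at this level.
 * Any other error is returned at once: as described above, a dead
 * handle or session cannot be rebuilt from here and has to be
 * dealt with higher up the call chain.
 */
int
pagein_retry(pagein_fn op, void *ctx, int max_tries, int retryable_err)
{
    int error = retryable_err;

    assert(op != NULL && max_tries > 0);
    for (int i = 0; i < max_tries; i++) {
        error = op(ctx);
        if (error != retryable_err)
            break;
    }
    return error;
}

/* Stand-in operation for demonstration: fails *ctx times, then succeeds. */
int
flaky_op(void *ctx)
{
    int *remaining = ctx;

    if (*remaining > 0) {
        (*remaining)--;
        return EAGAIN;
    }
    return 0;
}
```

If the retry budget runs out, the caller still sees the original error, which matches the "incomplete fix" label: the loop only hides failures that clear up on their own.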

-- Terry

Re: binutils symbol hiding and versioning (was Re: [PATCH] note the __sF change in src/UPDATING)

2002-11-12 Thread Terry Lambert

Loren James Rittle wrote:
  FWIW: symbol versioning is incredibly broken. It attempts to
  do in UNIX what interface versioning does in Windows, through
  the use of class factories accessed via IUnknown.
 
 You might be absolutely correct in general. However, please read
 http://gcc.gnu.org/onlinedocs/libstdc++/abi.txt . It is clear that
 symbol versioning is not being used at all like you supposed it might
 have been (mis-)used.

To provide multiple API's in a single shared library image, to
avoid a number of images being necessary?

That's exactly how it's being used; they call it age gracefully,
but what that means is that multiple API versions remain supported
for a very, very long time, without having separate libraries
involved.

Technically, the symbol decoration used in C++ is also in error;
it's there simply to avoid having to change the object file format
to accommodate interface attributes separately from the symbol name
(just as adding an @@version name suffix does). If this were
not there, the linker could automatically create glue code for doing
argument coercion, and it would save a lot of code that currently
has to be supplied by programmers.

 FWIW: There is no concept of IUnknown or
 implementation factories (and, yes, I do understand those concepts) in
 how libstdc++.so (v3) is using symbol versioning. I invite you to
 take a close look at how that library is actually using symbol hiding
 and versioning before you attempt to cast your judgment. If you have
 informed comments, then please direct them to [EMAIL PROTECTED]
 not me personally (as a libstdc++-v3 maintainer, I will read them over
 there like all others).

I'm well aware of how it's used; the IUnknown reference was an
analogy; the document you refer to specifically states that it's
to avoid a proliferation of shared library files. That's exactly
the purpose of IUnknown version information for class factories,
as well.

Part of the problem here is that GCC dropped the minor version
number from shared libraries in binutils, and FreeBSD and Linux
followed suit. Now this turns out to be a mistake, and rather
than admitting the mistake, instead now we have more decoration
occurring in the symbol name to fake up another orthogonal namespace.

Traditional UNIX systems have minor versions on shared libraries
to address this, and bump major versions only if existing interfaces
change, not when interfaces are added (thus program foo linked
against lib.so.M.N works just fine against lib.so.M.N+K).
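
The compatibility rule in that paragraph is small enough to state as code. A minimal C sketch, where `struct so_version` and `so_compatible` are hypothetical names for illustration and not part of any real run-time linker:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical major.minor pair, as in lib.so.M.N. */
struct so_version {
    unsigned maj;
    unsigned min;
};

/*
 * A program linked against `need` can run against the installed
 * library `have` when the major numbers match (no existing interface
 * changed) and the installed minor is at least the one linked against
 * (interfaces were only added).
 */
bool
so_compatible(struct so_version need, struct so_version have)
{
    return have.maj == need.maj && have.min >= need.min;
}
```

So a program linked against lib.so.3.2 would be accepted by lib.so.3.5, but rejected by lib.so.3.1 (missing additions) and by lib.so.4.0 (changed interfaces).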

If you don't like the Microsoft DLL version analogy, a different
analogy is the Aztec C support for directories, by naming files
with their complete path, and treating the character / as a
path component separator in order to achieve a namespace escape,
when the Mac FS didn't support directory hierarchies.

In all cases, what's happening is a namespace incursion in order to
permit coexistence of otherwise conflicting implementations.

 Short summation: We only mark the first version of the library that a
 new symbol is added. E.g. there will never be [EMAIL PROTECTED] and also [EMAIL PROTECTED]
 The first release after an ELF library version number bump resets all
 tags to be the same. As clearly documented, libstdc++.so (v3) will bump
 the major version number just like FreeBSD does on installed shared
 libraries to express other sorts of C++ compiler or library ABI change.

This still fails when the OS version does not bump each time the
compiler version bumps. I guess this is OK, if you are a compiler
vendor, but less OK, when you are an OS vendor. 8-).

 If the system compiler of FreeBSD still wanted to install multiple
 versions of libstdc++.so (v3) with major number bumps for other
 reasons (because it is considered safer for compatibility by the
 system designers), that would be quite fine. But completely ignoring
 the symbol hiding features will make the FreeBSD C++ system compiler
 and environment worse than the Linux version and worse than a g++
 installed from equivalent FSF sources IMHO (since we will leak all
 sorts of internal implementation symbols that are not supposed to
 influence user application link behavior). Anyways, Alex was already
 going to look into trying this for the FreeBSD 5.0 system compiler so
 hopefully this will not be the case.

No, it will make it incompatible, which is rather annoying, but
it's an introduced incompatibility that came from the compiler,
and we shouldn't pretend it isn't.

In any case, the issue was in attempting to prevent the exposure
of data interfaces, and symbols not part of the defined API; this
is still going to cause problems for the reasons this discussion
came up in the first place: other language compilers that need to
link against system libraries, and share implementation instances
so that they can be linked against foreign object files that use the
same underlying implementation. For the purposes of this discussion,
that's the stdio implementation, as exposed
