Re: [PATCH] Searching for users of netncp and nwfs to help debug5.0 problems

2002-11-26 Thread Julian Elischer
some comments:

firstly:

  ncp_conn_locklist(int flags, struct proc *p)
  {
!   struct thread *td = TAILQ_FIRST(p-p_threads); /* XXX
*/
!
!   return lockmgr(listlock, flags | LK_CANRECURSE, 0, td);
  }

can't you use unidifs? :-)

ok
there is a Macro to find the first thread in a process.
FIRST_THREAD_IN_PROC(p)

I use it because it makes it easy to find the
places that are DEFINITLY BROKEN. The marker for a KSE breakage is:
XXXKSE, but places that use FIRST_THREAD_IN_PROC(p) are marked 
implicitly since nearly any time it is used, the code is trying to 
do something that doesn't make sense. (except in fork and 
maybe exit and exec, and in things like linux emulation
where you KNOW there is only one thread).

if you see TAILQ_FIRST(p-p_threads) (or FIRST_THREAD_IN_PROC(p)) you
can pretty much guarantee that what is happenning there need to be
completely rewritten, possibly to the extent of rerwiting it's callers
and the caller's callers. That is the problem I hit when trying to
convert it to start with...

Since I don't know the way that process IDs come into the session
control (you have to understand the protocol for that) I basically hit
a wall on trying to work out what to rewrite, and how.

BTW the obnoxious FIRST_THREAD_IN_PROC will go away when
we have got rid of most of the broken code and be replaced in places
for which it is ok with p2td() which will be:


__inline struct thread *
p2td(struct proc *p)
{
KASSERT(((p-p_flag  P_KSES) == 0),
(Threaded program uses p2td()));
return (TAILQ_FIRST(p-p_threads));
}

You can always get from a thread to a single process but the reverse
always presents the question which thread?. This question can only be
answered with external information or if you know there is only one
thread at this moment.

julian

On Tue, 26 Nov 2002, Terry Lambert wrote:

 Nate Lawson wrote:
   It's not so much that I volunteered as I said that I'd help with
   thread/proc issues..
   The trouble was that there are places where it used a proc in the old
   code, but in some cases it needs to be a proc, and in other cases it now
   needs to be a thread. But all they stored was the proc. Also, from
   my memories of the code you needed to understand the protocol to know
   which needed to be which, and I don't know that protocol.
  
   In addition whoever does it needs to remember that any structure that
   stores a thread poitner is probably in error, as threads
   are transient items and any stored thread pointer is probably a wild
   pointer within a few milliseconds of being stored. :-)
  
  I'll take a whack at it and send it out by tomorrow, working or not.
 
 Don't bother.  8-).
 
 The attached patch makes it compile, and takes a shot at doing the
 right thing.
 
 The threasd stuff is problematic; it's useful only for a blocking
 context.  The process stuff is there to identify the connection,
 actually, which can mean huge latencies (hence the caching of
 procp).
 
 It helps to know that the protocol is exclusively request/response
 per session, the current code handles only a single session per
 process (not one per thread), and that lock requests are answered
 bith synchronously and asynchronously (request/response, then async
 message on timeout or success).
 
 -- Terry


To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message



Re: [PATCH] Searching for users of netncp and nwfs to help debug5.0 problems

2002-11-26 Thread Terry Lambert
First of all, the patch was just to get to the point of compilability,
which other prople said they would take it from there.  I don't
have a NetWare server to test against in my apartment.  I'd be just as
happy to _let_ the other people who wanted to take it from there do
do, now that I made it compile.

That said...

Julian Elischer wrote:
 some comments:
 firstly:
 
   ncp_conn_locklist(int flags, struct proc *p)
   {
 !   struct thread *td = TAILQ_FIRST(p-p_threads); /* XXX
 */
 !
 !   return lockmgr(listlock, flags | LK_CANRECURSE, 0, td);
   }
 
 can't you use unidifs? :-)

Only if I want the code to be unreadable.  8-) 8-).  Can't you
apply the patch to a -current tree checked out for that purpose,
and get whatever flavor of diffs you want, e.g. cvs diff -u?
Unidiffs don't support fuzz, and uless you are committing the
thing, I'd rather not have to recreate it interatively until it
passes someone's filter.  A context diff gives me that.


 ok
 there is a Macro to find the first thread in a process.
 FIRST_THREAD_IN_PROC(p)

I didn't see this.  It's definitely the way to go.

Any chance of that being documented any time soon, so that no
one else does the obvious thing, instead, like I did?


 I use it because it makes it easy to find the
 places that are DEFINITLY BROKEN. The marker for a KSE breakage is:
 XXXKSE, but places that use FIRST_THREAD_IN_PROC(p) are marked
 implicitly since nearly any time it is used, the code is trying to
 do something that doesn't make sense. (except in fork and
 maybe exit and exec, and in things like linux emulation
 where you KNOW there is only one thread).
 
 if you see TAILQ_FIRST(p-p_threads) (or FIRST_THREAD_IN_PROC(p)) you
 can pretty much guarantee that what is happenning there need to be
 completely rewritten, possibly to the extent of rerwiting it's callers
 and the caller's callers. That is the problem I hit when trying to
 convert it to start with...

Actually, I really wanted a LAST_THREAD_IN_PROC(p).  The
reality is that I want the most recently inserted thread to use
as a blocking context, on the theory that it would not be used
until much later, and that all the pages it cares about are much
more likely to be in core, particularly on a process with tons
of threads.

The only reason it's being used at all is that the lockmgr()
class need a blocking context for their calls, and it's an
explicit parameter (arguably, it should be curthread).

I can edit the patch (since it's not a unidiff 8-) 8-) 8-)),
or I can post a new one, if you want (it's only 9K, compressed).

But realize that all it means is give me a thread that I can
use as a blocking context while I'm waiting on lockmgr().


 Since I don't know the way that process IDs come into the session
 control (you have to understand the protocol for that) I basically hit
 a wall on trying to work out what to rewrite, and how.

The wire protocol is always request/response.  Always.  As I
stated before, the only exception is a lock, with/without a
timeout.  In that case, you get the synchronous response to
your synchornous request, which basically means request has
been queue for servicing, and you later get a seperate
notification.

The notification is by connection.  The connection is per process,
because we are talking about a connection where the credentials
are associated with the connection in question.  The connection
provides both a state context (waiting for request vs. request
in progress).  Making additional requests over the same VC will
result in a request in progress, go away you dork response to
the client.

When an async response to a lock request comes in, in comes in on
a seperate VC (each connection has 3 VCs, only 2 of which are
normally used: the one for request/response, and the one for
async notifications, which is overloaded to handle lock responses).
The connection is mapped back to a process to map it back to the
blocker.

What this basically means is that NCP's can't be handled as
multithreaded, without establishing a VC per thread.  It just
does not work.  Therefore NCP request are going to serialize in
the kernel, no matter what you do in happy-thread-town.

If you are asking for the code to be thread safe, then you are
basically talking about multithreading the whole stack:

NWFS - NCP Client - IPX - IP

Probably it would be a better idea if TCP/IP was multithreaded
first?


 BTW the obnoxious FIRST_THREAD_IN_PROC will go away when
 we have got rid of most of the broken code and be replaced in places
 for which it is ok with p2td() which will be:
 
 __inline struct thread *
 p2td(struct proc *p)
 {
 KASSERT(((p-p_flag  P_KSES) == 0),
 (Threaded program uses p2td()));
 return (TAILQ_FIRST(p-p_threads));
 }

Uh, how exactly is that less obnoxious, given it's the same code
with a different name and an obnoxious inline instead of a macro?
8-).


 You can always get from a thread to a single process but the reverse
 always presents 

Re: [PATCH] Searching for users of netncp and nwfs to help debug5.0 problems

2002-11-26 Thread Terry Lambert
Terry Lambert wrote:
 Did you want me to update the patch to use your FIRST_THREAD_IN_PROC
 macro and resend it?

OK; here it is, whether you wanted it or not.

-- Terry


smbfs_thr.diff.gz
Description: GNU Zip compressed data


Re: Searching for users of netncp and nwfs to help debug5.0 problems

2002-11-23 Thread Brad Knowles
At 1:04 PM -0800 2002/11/23, Nate Lawson wrote:


 I'd like to see this on in 5.0 for a while, given the number of new users
 and problem reports we are already getting and will soon get even more of.
 If it's too hard to complete this work, could we add DDB in by default in
 GENERIC?  (I shudder to think at the amount of debate such a request would
 cause but I'm making it nevertheless).


	IMO, for -CURRENT, we should turn on all possible debugging by 
default in the kernel.  Then, we should make it as easy as possible 
to capture and report the kind of trace information we need to help 
locate and fix the problem.

	Hmm.  I was thinking that I needed just one Cyclades terminal 
server box for my old Sun SPARC 5 clone (break-safe, of course). 
Now, I'm thinking that I'll need a multi-port model, which I can also 
connect to the Compaq Armada 4131T that I'll be running -DP2 on, and 
given all the network problems I've had lately, I may need to also 
connect it to the cisco ISDN/Ethernet router

--
Brad Knowles, [EMAIL PROTECTED]

They that can give up essential liberty to obtain a little temporary
safety deserve neither liberty nor safety.
-Benjamin Franklin, Historical Review of Pennsylvania.

GCS/IT d+(-) s:+(++): a C++(+++)$ UMBSHI$ P+++ L+ !E W+++(--) N+ !w---
O- M++ V PS++(+++) PE- Y+(++) PGP+++ t+(+++) 5++(+++) X++(+++) R+(+++)
tv+(+++) b+() DI+() D+(++) G+() e++ h--- r---(+++)* z(+++)

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message


Re: Searching for users of netncp and nwfs to help debug5.0 problems

2002-11-22 Thread Brad Knowles
At 6:45 PM -0500 2002/11/21, Robert Watson wrote:


 I appreciate the effort, and am interested in the idea, but in this case
 it was as much a solicitation for a developer as for the testing
 environment itself.  This won't just be testing of netncp and nwfs, this
 will probably require a developer to have local access to a netware
 configuration that they can do nasty things to in order to exercise the
 code properly.  Unfortunately, those seem to be in short supply.


	I understand.  My understanding is that part of the job of this 
team is to find people that have various different types of 
environments that could be used for testing like this, and put them 
together with the necessary development personnel.

 If I might suggest: there's a freebsd-qa mailing list.  It's a great place
 to organize QA efforts, whereas freebsd-chat is notorious for its lack of
 signal (it's where dead signals go to rot).


	There's been some talk of freebsd-qa, but so far the only thing I 
recall being said is that the sort of thing we're talking about doing 
is not what this list has been used for in the past.  We were kind of 
wondering where we could take this particular effort, and if we could 
commandeer the freebsd-qa list for this purpose.

If you moving the conversation there and get a bunch of
 people subscribed and interested, they'll be able to look there for the
 stream of bug fixes associated with the install process, and get easy
 access to the testing guide as it evolves, since we usually pass drafts
 through there, etc.


	We will be moving the conversation to somewhere more appropriate, 
but I don't know when or where.

	I believe that our biggest problem at the moment is finding the 
necessary development types to help us debug the problems and get 
them sorted out -- we've got people who have hardware, and would be 
more than willing to help test things out, but in the past they 
haven't gotten much help from the developers.

--
Brad Knowles, [EMAIL PROTECTED]

They that can give up essential liberty to obtain a little temporary
safety deserve neither liberty nor safety.
-Benjamin Franklin, Historical Review of Pennsylvania.

GCS/IT d+(-) s:+(++): a C++(+++)$ UMBSHI$ P+++ L+ !E W+++(--) N+ !w---
O- M++ V PS++(+++) PE- Y+(++) PGP+++ t+(+++) 5++(+++) X++(+++) R+(+++)
tv+(+++) b+() DI+() D+(++) G+() e++ h--- r---(+++)* z(+++)

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message


Re: Searching for users of netncp and nwfs to help debug5.0 problems

2002-11-22 Thread Brad Knowles
At 2:31 PM -0800 2002/11/22, Terry Lambert wrote:


 A bug-filing wizard would be useful.  The send-pr system
 doesn't cut it, and most people are unaware of how to file a
 decent bug report.  It doesn't help when the process involves
 another computer, a serial cable, recompiling a kernel to use
 a serial console and turn DDB support on, special configuration
 for system dump images, and changing the size of your swap
 partition to support the amount of RAM you have put into the
 machine.


	Speaking as someone who is about to step off the deep end and 
start trying to actually run and test -CURRENT on my system here at 
home, I believe that this kind of resource would be vitally important.

	In contrast, I've had a few crashes this past week from other 
programs here on my PowerBook G4 running MacOS X (primarily Chimera, 
based on the Mozilla Gecko engine with native Aqua interface), and 
they have made it very easy for me to report crashes.  They have 
integrated tools to extract the maximum amount of information from 
the system as to exactly what other programs were running, what the 
program stack was, and a whole host of other things.  All I have to 
do is type in my e-mail address, optionally describe what I was 
trying to do at the time, and have a functioning Internet connection 
so that they can upload the reports.  I'd share some examples with 
you, but they are *huge*.


	Now, we can say that running -CURRENT is not for people who want 
to be molly-coddled.  But I believe it's a good idea to give people 
better tools to help make a better system.  I am convinced that we 
can find a better compromise.

 If the PR contains a patch, and the owner does nothing in the
 allotted time, then give the PR submitter a commit bit, and give
 ownership of the area over to them.

 At the very least, PR's will be closed, and more people actively
 writing code will end up with commit bits.


	Gack.  I'm not sure even I would be quite this radical -- any 
moron (like me ;-) can generate a PR that might include a patch. 
IMO, better would be to give the area to another person who is 
suitably qualified, has the available cycles, and presumably already 
has a commit bit.

--
Brad Knowles, [EMAIL PROTECTED]

They that can give up essential liberty to obtain a little temporary
safety deserve neither liberty nor safety.
-Benjamin Franklin, Historical Review of Pennsylvania.

GCS/IT d+(-) s:+(++): a C++(+++)$ UMBSHI$ P+++ L+ !E W+++(--) N+ !w---
O- M++ V PS++(+++) PE- Y+(++) PGP+++ t+(+++) 5++(+++) X++(+++) R+(+++)
tv+(+++) b+() DI+() D+(++) G+() e++ h--- r---(+++)* z(+++)

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message


Re: Searching for users of netncp and nwfs to help debug5.0 problems

2002-11-22 Thread Terry Lambert
[ ... bug filing wizard ... ]

Brad Knowles wrote:
 Speaking as someone who is about to step off the deep end and
 start trying to actually run and test -CURRENT on my system here at
 home, I believe that this kind of resource would be vitally important.
 
 In contrast, I've had a few crashes this past week from other
 programs here on my PowerBook G4 running MacOS X (primarily Chimera,
 based on the Mozilla Gecko engine with native Aqua interface), and
 they have made it very easy for me to report crashes.  They have
 integrated tools to extract the maximum amount of information from
 the system as to exactly what other programs were running, what the
 program stack was, and a whole host of other things.  All I have to
 do is type in my e-mail address, optionally describe what I was
 trying to do at the time, and have a functioning Internet connection
 so that they can upload the reports.  I'd share some examples with
 you, but they are *huge*.

The amount of information is much less important than the utility
of the information to someone who is attempting to resolve the bug.
Most bug reports contain a lot of information that is really useless,
not topical to the bug in question (I was running XYZ and the kernel
paniced!), etc..

In terms of kernel problems, the absolutely most useful information
is the DDB traceback, followed by a DDB traceback mapped against a
debug kernel, followed by a system dump image and a debug kernel, etc..
By default, the system, as distributed, is not setup to supply this.

In terms of general problems, diagnostic messages are pretty lame;
setting aside the argument against data interfaces not being ripped
out being in itself lame, one bix example is the libkvm error message
kinfo_proc size mismatch (expected %d, got %d).

An error message should tell give you enough information that you
could deal with the problem reasonably; for the example message, a
better message would be:

 _kvm_err(kd, kd-program,
kinfo_proc size mismatch (expected %d, got %d),
sizeof(struct kinfo_proc),
kd-procbase-ki_structsize);
 _kvm_err(kd, kd-program,
recompile libkvm and recompile and reinstall %s,
kd-programsrcdir);


 Now, we can say that running -CURRENT is not for people who want
 to be molly-coddled.  But I believe it's a good idea to give people
 better tools to help make a better system.  I am convinced that we
 can find a better compromise.

Plus we aren't really talking about -CURRENT, we are talking about
5.0-DP2, or almost 5.0-RC1, if we're being honest.


   If the PR contains a patch, and the owner does nothing in the
   allotted time, then give the PR submitter a commit bit, and give
   ownership of the area over to them.
 
   At the very least, PR's will be closed, and more people actively
   writing code will end up with commit bits.
 
 Gack.  I'm not sure even I would be quite this radical -- any
 moron (like me ;-) can generate a PR that might include a patch.
 IMO, better would be to give the area to another person who is
 suitably qualified, has the available cycles, and presumably already
 has a commit bit.

Moron PR's are easily filterable: they are closable by the owner
with little effort.

PR's that are ignored, on the other hand, rather than being closed,
are most likely legitimate, but inconvenient.

-- Terry

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message



Re: Searching for users of netncp and nwfs to help debug5.0 problems

2002-11-21 Thread Brad Knowles
At 5:23 PM -0500 2002/11/21, Robert Watson wrote:


 (And, you have to bring your own test environment, as the second sentence
 suggests, but doesn't actually state).


	Over on -chat, we're in the process of putting together a list of 
volunteers, hardware, organizational talent, etc... to help test out 
-DP2.  Mark Murray is involved, but I personally would like to see at 
least one or two more core team members committed to making this 
happen.

	If we can get a suitable group of people together, with suitable 
hardware, and get the coordination effort done correctly, I believe 
that we can help make this a much more successful project.


	Your assistance in this effort would be greatly appreciated.

--
Brad Knowles, [EMAIL PROTECTED]

They that can give up essential liberty to obtain a little temporary
safety deserve neither liberty nor safety.
-Benjamin Franklin, Historical Review of Pennsylvania.

GCS/IT d+(-) s:+(++): a C++(+++)$ UMBSHI$ P+++ L+ !E W+++(--) N+ !w---
O- M++ V PS++(+++) PE- Y+(++) PGP+++ t+(+++) 5++(+++) X++(+++) R+(+++)
tv+(+++) b+() DI+() D+(++) G+() e++ h--- r---(+++)* z(+++)

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message