Re: [PATCH] Searching for users of netncp and nwfs to help
Julian Elischer wrote: Where does the passed in thread come from? Your changes to make certain functions which are exported interfaces take a thread * instead of a proc * argument. Generally don't use a thread pointer other than yourself unless you have a lock on the proc structure, or the schedlock. Certainly never store it anywhere.. Particularly anywhere that may persist while you sleep in any way. -exception.. kernel threads- .. they are persistant. The received lock response is going to come in on IPX, which is like UDP, so it's connectionless. The NCP is an el-cheapo timeout-and-retransmit layer on top of formated IPX datagrams (NetWare runs on IPX, not SPX). Basically, this means that the response and the async response to a lock requests, or the async server-to-client notifications (shutdowns, etc.) can come in and activate any listener. The connection has to be looked up, and then the operation has to be processed in the context of the process that has the connection open. It does not care which thread it's processed on, only that it's in the process that owns the connection (there are address space issues). The main problem here is that lockmgr() is being called to lock things that technically don't need to be locked, at all, really, to insure that operations are not attempted concurrently. It's not really necessary: the server will refuse additional requests on a connection, when there is one request outstanding. The only exception isn't really relevent here, because the code that I've seen writen doesn't really support packet burst mode data transfers (pseudo-windowed data streams layered on top of datagrams). Basically, this means unless someone is willing to do the work to set up a virtual circuit -- three network handles per -- per each potentially outstanding thread, and then, further, maintain an idle pool for them, everyone should treat the code as if it were not thread reentrant... because it's not. Gary Tomlinson, Duck, Ted Cowan, and others literally put man years into getting that working, and they had access to Novell source code in order to do the work in the NUC (NetWare UNIX Client) product. It's unlikely that it can be reverse-engineered without at least a PNW/NWU (Portable NetWare/NetWare for UNIX) source license. The *only* reasons there's a thread in there now as a paremeter is that (1) the top level interfaces require it and (2) the lockmgr() calls, that shouldn't need to be there, IMO, require it as a parameter. -- Terry To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-current in the body of the message
Re: [PATCH] Searching for users of netncp and nwfs to help
Terry Lambert wrote: The main problem here is that lockmgr() is being called to lock things that technically don't need to be locked, at all, really, to insure that operations are not attempted concurrently. It's not really necessary: the server will refuse additional requests on a connection, when there is one request outstanding. In case this wasn't clear to whoever was thinking of doing the work: add a serialization barrier at the ncp_* layer. You can remove it later, without any other code being adversely affected, if you add a connection pool later. Note also that the credentials can be passed on the VC, if you don't mind not running on NetWare prior to 3.1b. I recommend this, since it means connection, but not credential, sharing between processes for threads in the work-to-do pool. -- Terry To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-current in the body of the message
[PATCH] Searching for users of netncp and nwfs to help debug 5.0 problems
Nate Lawson wrote: It's not so much that I volunteered as I said that I'd help with thread/proc issues.. The trouble was that there are places where it used a proc in the old code, but in some cases it needs to be a proc, and in other cases it now needs to be a thread. But all they stored was the proc. Also, from my memories of the code you needed to understand the protocol to know which needed to be which, and I don't know that protocol. In addition whoever does it needs to remember that any structure that stores a thread poitner is probably in error, as threads are transient items and any stored thread pointer is probably a wild pointer within a few milliseconds of being stored. :-) I'll take a whack at it and send it out by tomorrow, working or not. Don't bother. 8-). The attached patch makes it compile, and takes a shot at doing the right thing. The threasd stuff is problematic; it's useful only for a blocking context. The process stuff is there to identify the connection, actually, which can mean huge latencies (hence the caching of procp). It helps to know that the protocol is exclusively request/response per session, the current code handles only a single session per process (not one per thread), and that lock requests are answered bith synchronously and asynchronously (request/response, then async message on timeout or success). -- Terry smbfs_thr.diff.gz Description: GNU Zip compressed data
Re: [PATCH] Searching for users of netncp and nwfs to help debug 5.0 problems
Terry Lambert wrote: I'll take a whack at it and send it out by tomorrow, working or not. Don't bother. 8-). The attached patch makes it compile, and takes a shot at doing the right thing. Just a followup... select definitely won't work (IMO), but needs someone who is threads-savvy with kernel locks to deal with it; I cribbed lock flow from elsewhere, and it looks wrong to me. So this is *definitely* 56K of diffs that *only* address compilation completely, and not full function. -- Terry To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-current in the body of the message
Re: [PATCH] Searching for users of netncp and nwfs to help debug5.0 problems
some comments: firstly: ncp_conn_locklist(int flags, struct proc *p) { ! struct thread *td = TAILQ_FIRST(p-p_threads); /* XXX */ ! ! return lockmgr(listlock, flags | LK_CANRECURSE, 0, td); } can't you use unidifs? :-) ok there is a Macro to find the first thread in a process. FIRST_THREAD_IN_PROC(p) I use it because it makes it easy to find the places that are DEFINITLY BROKEN. The marker for a KSE breakage is: XXXKSE, but places that use FIRST_THREAD_IN_PROC(p) are marked implicitly since nearly any time it is used, the code is trying to do something that doesn't make sense. (except in fork and maybe exit and exec, and in things like linux emulation where you KNOW there is only one thread). if you see TAILQ_FIRST(p-p_threads) (or FIRST_THREAD_IN_PROC(p)) you can pretty much guarantee that what is happenning there need to be completely rewritten, possibly to the extent of rerwiting it's callers and the caller's callers. That is the problem I hit when trying to convert it to start with... Since I don't know the way that process IDs come into the session control (you have to understand the protocol for that) I basically hit a wall on trying to work out what to rewrite, and how. BTW the obnoxious FIRST_THREAD_IN_PROC will go away when we have got rid of most of the broken code and be replaced in places for which it is ok with p2td() which will be: __inline struct thread * p2td(struct proc *p) { KASSERT(((p-p_flag P_KSES) == 0), (Threaded program uses p2td())); return (TAILQ_FIRST(p-p_threads)); } You can always get from a thread to a single process but the reverse always presents the question which thread?. This question can only be answered with external information or if you know there is only one thread at this moment. julian On Tue, 26 Nov 2002, Terry Lambert wrote: Nate Lawson wrote: It's not so much that I volunteered as I said that I'd help with thread/proc issues.. The trouble was that there are places where it used a proc in the old code, but in some cases it needs to be a proc, and in other cases it now needs to be a thread. But all they stored was the proc. Also, from my memories of the code you needed to understand the protocol to know which needed to be which, and I don't know that protocol. In addition whoever does it needs to remember that any structure that stores a thread poitner is probably in error, as threads are transient items and any stored thread pointer is probably a wild pointer within a few milliseconds of being stored. :-) I'll take a whack at it and send it out by tomorrow, working or not. Don't bother. 8-). The attached patch makes it compile, and takes a shot at doing the right thing. The threasd stuff is problematic; it's useful only for a blocking context. The process stuff is there to identify the connection, actually, which can mean huge latencies (hence the caching of procp). It helps to know that the protocol is exclusively request/response per session, the current code handles only a single session per process (not one per thread), and that lock requests are answered bith synchronously and asynchronously (request/response, then async message on timeout or success). -- Terry To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-current in the body of the message
Re: [PATCH] Searching for users of netncp and nwfs to help debug5.0 problems
First of all, the patch was just to get to the point of compilability, which other prople said they would take it from there. I don't have a NetWare server to test against in my apartment. I'd be just as happy to _let_ the other people who wanted to take it from there do do, now that I made it compile. That said... Julian Elischer wrote: some comments: firstly: ncp_conn_locklist(int flags, struct proc *p) { ! struct thread *td = TAILQ_FIRST(p-p_threads); /* XXX */ ! ! return lockmgr(listlock, flags | LK_CANRECURSE, 0, td); } can't you use unidifs? :-) Only if I want the code to be unreadable. 8-) 8-). Can't you apply the patch to a -current tree checked out for that purpose, and get whatever flavor of diffs you want, e.g. cvs diff -u? Unidiffs don't support fuzz, and uless you are committing the thing, I'd rather not have to recreate it interatively until it passes someone's filter. A context diff gives me that. ok there is a Macro to find the first thread in a process. FIRST_THREAD_IN_PROC(p) I didn't see this. It's definitely the way to go. Any chance of that being documented any time soon, so that no one else does the obvious thing, instead, like I did? I use it because it makes it easy to find the places that are DEFINITLY BROKEN. The marker for a KSE breakage is: XXXKSE, but places that use FIRST_THREAD_IN_PROC(p) are marked implicitly since nearly any time it is used, the code is trying to do something that doesn't make sense. (except in fork and maybe exit and exec, and in things like linux emulation where you KNOW there is only one thread). if you see TAILQ_FIRST(p-p_threads) (or FIRST_THREAD_IN_PROC(p)) you can pretty much guarantee that what is happenning there need to be completely rewritten, possibly to the extent of rerwiting it's callers and the caller's callers. That is the problem I hit when trying to convert it to start with... Actually, I really wanted a LAST_THREAD_IN_PROC(p). The reality is that I want the most recently inserted thread to use as a blocking context, on the theory that it would not be used until much later, and that all the pages it cares about are much more likely to be in core, particularly on a process with tons of threads. The only reason it's being used at all is that the lockmgr() class need a blocking context for their calls, and it's an explicit parameter (arguably, it should be curthread). I can edit the patch (since it's not a unidiff 8-) 8-) 8-)), or I can post a new one, if you want (it's only 9K, compressed). But realize that all it means is give me a thread that I can use as a blocking context while I'm waiting on lockmgr(). Since I don't know the way that process IDs come into the session control (you have to understand the protocol for that) I basically hit a wall on trying to work out what to rewrite, and how. The wire protocol is always request/response. Always. As I stated before, the only exception is a lock, with/without a timeout. In that case, you get the synchronous response to your synchornous request, which basically means request has been queue for servicing, and you later get a seperate notification. The notification is by connection. The connection is per process, because we are talking about a connection where the credentials are associated with the connection in question. The connection provides both a state context (waiting for request vs. request in progress). Making additional requests over the same VC will result in a request in progress, go away you dork response to the client. When an async response to a lock request comes in, in comes in on a seperate VC (each connection has 3 VCs, only 2 of which are normally used: the one for request/response, and the one for async notifications, which is overloaded to handle lock responses). The connection is mapped back to a process to map it back to the blocker. What this basically means is that NCP's can't be handled as multithreaded, without establishing a VC per thread. It just does not work. Therefore NCP request are going to serialize in the kernel, no matter what you do in happy-thread-town. If you are asking for the code to be thread safe, then you are basically talking about multithreading the whole stack: NWFS - NCP Client - IPX - IP Probably it would be a better idea if TCP/IP was multithreaded first? BTW the obnoxious FIRST_THREAD_IN_PROC will go away when we have got rid of most of the broken code and be replaced in places for which it is ok with p2td() which will be: __inline struct thread * p2td(struct proc *p) { KASSERT(((p-p_flag P_KSES) == 0), (Threaded program uses p2td())); return (TAILQ_FIRST(p-p_threads)); } Uh, how exactly is that less obnoxious, given it's the same code with a different name and an obnoxious inline instead of a macro? 8-). You can always get from a thread to a single process but the reverse always presents
Re: [PATCH] Searching for users of netncp and nwfs to help debug5.0 problems
Terry Lambert wrote: Did you want me to update the patch to use your FIRST_THREAD_IN_PROC macro and resend it? OK; here it is, whether you wanted it or not. -- Terry smbfs_thr.diff.gz Description: GNU Zip compressed data
Re: [PATCH] Searching for users of netncp and nwfs to help debug5.0problems
On Tue, 26 Nov 2002, Terry Lambert wrote: Uh, how exactly is that less obnoxious, given it's the same code with a different name and an obnoxious inline instead of a macro? 8-). it's shorter .. You can always get from a thread to a single process but the reverse always presents the question which thread?. This question can only be answered with external information or if you know there is only one thread at this moment. The answer is that the code doesn't care what thread; it would prefer to not have to think in terms of threads at all, but if you want to force it to, then it's going to think in terms of blocking contexts for the benefit of FreeBSD code it calls, and nothing else. Hense the confusion as to whether to use a thread or a proc.. Did you want me to update the patch to use your FIRST_THREAD_IN_PROC macro and resend it? you could but the fact that FIRST_THREAD_IN_PROC() is used indicates that the whole thing is broken anyway. Your edits are mostly mechanical and don't actually solve the problem. To do that you probably need to actually rewrite some of it I think. To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-current in the body of the message
Re: [PATCH] Searching for users of netncp and nwfs to help debug5.0problems
Julian Elischer wrote: The answer is that the code doesn't care what thread; it would prefer to not have to think in terms of threads at all, but if you want to force it to, then it's going to think in terms of blocking contexts for the benefit of FreeBSD code it calls, and nothing else. Hense the confusion as to whether to use a thread or a proc.. Not confusing at all. The only issue is references to the connection structure caches proc, which uses the first thread on the cached proc; otherwise, it uses the thread that was passed in. Did you want me to update the patch to use your FIRST_THREAD_IN_PROC macro and resend it? you could but the fact that FIRST_THREAD_IN_PROC() is used indicates that the whole thing is broken anyway. Your edits are mostly mechanical and don't actually solve the problem. To do that you probably need to actually rewrite some of it I think. They were _intended_ to be mechanical edits. It fixes the problem for the people who were willing to fix it, but didn't have any idea of how to do the edits. I can't really rewrite the code for you, without risking that Novell would claim that I did it with knowledge of the NUC implementation... you _do_ remember the last time Novell and BSD had an issue over code, right, back in 1994, after they bought USL? It's probably better that the patch I've done get to the people who volunteered to fix the code, once it could be compiled, and that the people who volunteered to help them with the threads issues do so. I've done as much as I can without legal risk. -- Terry To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-current in the body of the message
Re: [PATCH] Searching for users of netncp and nwfs to help
debug5.0problems In-Reply-To: [EMAIL PROTECTED] Message-ID: [EMAIL PROTECTED] MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII On Tue, 26 Nov 2002, Terry Lambert wrote: Julian Elischer wrote: The answer is that the code doesn't care what thread; it would prefer to not have to think in terms of threads at all, but if you want to force it to, then it's going to think in terms of blocking contexts for the benefit of FreeBSD code it calls, and nothing else. Hense the confusion as to whether to use a thread or a proc.. Not confusing at all. The only issue is references to the connection structure caches proc, which uses the first thread on the cached proc; otherwise, it uses the thread that was passed in. Where does the passed in thread come from? Generally don't use a thread pointer other than yourself unless you have a lock on the proc structure, or the schedlock. Certainly never store it anywhere.. Particularly anywhere that may persist while you sleep in any way. -exception.. kernel threads- .. they are persistant. To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-current in the body of the message