At 07:53 PM 8/5/2003 Tuesday, Matthew Gierlach wrote:
Hi Dave:
I have two LiS questions that are unrelated to the performance improvements thread.
These two questions are with regard to:
1) other Linux kernel components interfacing with LiS
2) LiS information in Oops back traces
The behaviors described below are being experienced with vanilla, off-the-shelf LiS; specifically, LiS-2.16.8.
I always encourage people having problems to use the latest version, not that it will make any difference in your case, but it is definitely a waste of brain power to troubleshoot problems that may already be fixed.
With respect to the topic of other Linux kernel components interfacing with LiS, this is our situation.
We have the Linux IP kernel subsystem directly invoking our driver (as a call back function) when it has IP data to deliver to our driver. The driver is a U/L multiplexor driver. The execution of the call back function is preempting the execution of the driver write service routine. The write service routine is holding a lock that is used to protect an internal driver queue.
The call back function allocates a STREAMS message and copies the data from the IP skblk into the STREAMS message. It then invokes putnext() to deliver the STREAMized message to the driver state machine. No locks are acquired by the call back function in the process of transforming the IP data or calling putnext(). The driver has a lower and an upper read queue put routine. The upper read queue put routine is "known" by the lower read put routine.
In the driver proper we use the irqsave/irqrestore versions of lis lock acquire/release to synchronize the write and read sides of the tandem queue pair. Our write service routine can manipulate an internal queue that also has the potential to be manipulated by the upper read put routine.
We're experiencing Oopses because the locking that is supposed to provide mutually exclusive access to the internal queue does not appear to be effective.
The question we have is: the read put routines are being executed on behalf of IP, and they attempt to acquire the same lock that is held by the preempted write service routine. Is it reasonable to expect that they will be denied the lock? Or, restating the question another way: since IP is executing the putnext() function, how does LiS enforce the lock that is already held by the preempted write service routine when the put routine requests it?
LiS cannot know what your private locks are. You need to treat the IP callback function as though it were an interrupt routine and have it acquire a lock that is common with the read/write put/service routines. One good practice is for such a routine not to call putnext(). Have it call putq() instead and let the service routine perform the STREAMS-related work.
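To make that concrete, here is a rough userspace model of the pattern, not LiS code. Everything in it is an assumption made up for illustration: model_lock_t stands in for the LiS lock you take with the irqsave/irqrestore calls, fifo_t stands in for a queue of messages, and ip_callback() and service() are hypothetical names. In a real driver putq() does its own queue synchronization; the model simply guards both queues with the one common lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Userspace stand-in for the driver's private spin lock (hypothetical). */
typedef struct { atomic_flag held; } model_lock_t;

static void model_lock(model_lock_t *l)
{
    while (atomic_flag_test_and_set(&l->held)) {
        /* spin until the current holder releases the lock */
    }
}

static void model_unlock(model_lock_t *l)
{
    atomic_flag_clear(&l->held);
}

#define QCAP 8

/* Fixed-size FIFO of message ids, standing in for a queue of messages. */
typedef struct { int items[QCAP]; size_t head, count; } fifo_t;

static int fifo_put(fifo_t *q, int msg)
{
    if (q->count == QCAP)
        return -1;                              /* queue full */
    q->items[(q->head + q->count) % QCAP] = msg;
    q->count++;
    return 0;
}

static int fifo_get(fifo_t *q, int *msg)
{
    if (q->count == 0)
        return -1;                              /* queue empty */
    *msg = q->items[q->head];
    q->head = (q->head + 1) % QCAP;
    q->count--;
    return 0;
}

typedef struct {
    model_lock_t lock;      /* the one lock common to every context */
    fifo_t rd_q;            /* models the queue the callback does putq() on */
    fifo_t internal_q;      /* models the driver's private internal queue */
} drv_t;

static void drv_init(drv_t *d)
{
    assert(d != NULL);
    atomic_flag_clear(&d->lock.held);
    d->rd_q.head = d->rd_q.count = 0;
    d->internal_q.head = d->internal_q.count = 0;
}

/* Models the IP callback, treated like an interrupt routine: it only
 * queues the message (the putq() step) and never touches internal_q. */
static int ip_callback(drv_t *d, int msg)
{
    model_lock(&d->lock);
    int rc = fifo_put(&d->rd_q, msg);
    model_unlock(&d->lock);
    return rc;
}

/* Models the service routine: it alone moves messages into internal_q,
 * and it does so under the same lock the callback uses. */
static int service(drv_t *d)
{
    int moved = 0, msg;

    model_lock(&d->lock);
    while (d->internal_q.count < QCAP && fifo_get(&d->rd_q, &msg) == 0) {
        fifo_put(&d->internal_q, msg);
        moved++;
    }
    model_unlock(&d->lock);
    return moved;
}
```

The point of the split is that the internal queue is only ever touched from service(), so the callback can arrive at any time without racing the code that holds your private lock.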
With regard to the topic of LiS information in Oops back traces, we're confused by what we're seeing (or not seeing) in the Oops back trace, especially with respect to LiS routines. We use ksymoops to obtain the symbolic back trace information.
Our write service routine explicitly calls a LiS lock acquire function and holds that lock for the duration of the execution of the service routine. In the Oops back trace, immediately after the trace entry for the write service routine, there is no evidence of the explicit call to acquire the lock. Instead, the write service routine entry is immediately followed by 2 entries, lis_sem_template.1039+560 and lis_spin_unlock_irqrestore_fcn+99, listed as being from [streams], before the next trace entry for our driver, which is well beyond the call for the lock. We're confused about what we see in these back traces and whether or not the entries are legitimate, given that we believe they do not represent the actual code execution. I can forward such an Oops if that would be of any help.
ksymoops output always looks to me as though the symbols are being associated with "addresses" in the stack using heuristics. They seem to have a lot of noise. Since the kernel guys like to compile the kernel with no stack frame pointers, ksymoops doesn't really know where the subroutine calling boundaries are in the stack. So it pretty much has to guess when it comes to relating stack cell values to symbolic addresses. That means that the reader of the output also has to guess at the correct interpretation.
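As an illustration of the kind of guessing I mean, here is a small sketch of a frame-pointer-less stack scan. It is not ksymoops source; sym_t, hit_t, nearest_symbol() and scan_stack() are names invented for the example, and the symbol table is assumed to be sorted by address. Any stack word that happens to fall inside the text range gets reported as "symbol+offset", whether it is a live return address or a stale value left behind by an earlier call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One symbol-table entry; the table is assumed sorted by start address. */
typedef struct { uintptr_t addr; const char *name; } sym_t;

/* One reported trace entry: the nearest preceding symbol plus an offset. */
typedef struct { const sym_t *sym; uintptr_t offset; } hit_t;

/* Return the last symbol starting at or below addr, or NULL when addr
 * lies outside the text range [syms[0].addr, text_end). */
static const sym_t *nearest_symbol(const sym_t *syms, size_t nsyms,
                                   uintptr_t text_end, uintptr_t addr)
{
    if (nsyms == 0 || addr < syms[0].addr || addr >= text_end)
        return NULL;

    const sym_t *best = &syms[0];
    for (size_t i = 1; i < nsyms && syms[i].addr <= addr; i++)
        best = &syms[i];
    return best;
}

/* Walk every word on the stack and record each one that looks like a
 * code address. Without frame pointers nothing distinguishes a true
 * return address from leftover data, so both kinds are reported.
 * Returns the number of entries written to hits. */
static size_t scan_stack(const uintptr_t *stack, size_t nwords,
                         const sym_t *syms, size_t nsyms,
                         uintptr_t text_end,
                         hit_t *hits, size_t max_hits)
{
    assert(hits != NULL || max_hits == 0);

    size_t n = 0;
    for (size_t i = 0; i < nwords && n < max_hits; i++) {
        const sym_t *s = nearest_symbol(syms, nsyms, text_end, stack[i]);
        if (s != NULL) {
            hits[n].sym = s;
            hits[n].offset = stack[i] - s->addr;
            n++;
        }
    }
    return n;
}
```

A scan like this has no notion of which function called which, so an unlock routine can show up in the output simply because its address was left on the stack by a call that returned long before the Oops.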
Others may have more detailed views on this.
-- Dave
Thanks, Matt
_______________________________________________
Linux-streams mailing list
[EMAIL PROTECTED]
http://gsyc.escet.urjc.es/mailman/listinfo/linux-streams
