Re: Potential problem with reading data from usb devices with ugen(4)
Hello. I've been making progress on this issue and have now run into another issue which folks may be able to shed some light on. First, where I am. I've modified the ugen(4) driver to read from devices asynchronously using a callout, allowing me to issue non-blocking read requests and fixing poll(9) so it actually works as expected. This gets me a lot closer to what I need, and I'm now able to do an initial exchange with several Apple devices to the point of having the libimobilidevice library try to pair with the units. However, see below. My new problem. Apparently, the data exchange protocol the iDevices use requires that data packets that happen to be a multiple of the transfer size be followed by zero length packets. Re-reading what Nick wrote on this thread about USB writes consisting of USB transactions on the bus, I think this means you just initiate a transfer to the device in question with no payload. I believe I know how to do this from within the ugen(4) driver itself, and I have a test patch that tries to do this in ugen_do_write(). However, if I call write(2) against the file descriptor for the bulk write endpoint of my device with a zero length write request, the call never gets down as far as the ugen(4) driver before the system decides the work is done. In general, that's probably a good thing and I'm not trying to change that behavior. However, I am wondering how folks feel the best way to generate zero length packets would be. My initial thought is to implement an additional ioctl that the usb libraries could call if they notice that they want to generate a zero length packet. Is there a better or different way to do this? Maybe someone has already solved this problem and I just don't see it? -thanks -Brian
Re: Potential problem with reading data from usb devices with ugen(4)
On Nov 5, 9:07pm, Nick Hudson wrote: } Subject: Re: Potential problem with reading data from usb devices with uge } On 05/11/2015 18:53, Brian Buhrow wrote: } > Hello. I've been making progress on this issue and have now run into } > another issue which folks may be able to shed some light on. } > } > I've modified the ugen(4) driver to read from devices asynchronously } > using a callout, allowing me to issue non-blocking read requests and fixing } > poll(9) so it actually works as expected. } } You shouldn't need this. Adjust the code to keep a read transfer alive } at all times. I'm not sure I understand what your saying. If I just keep perpetuating the transfer without the callout, don't I have to wait until the transfer is done before returning to the user process? Right now, with my patches, I'll return partial data, or no data to the user but then continue collecting data from the device, signaling the user with poll/select acknowledgement when the transfer is finally done. What I want is an almost immediate return from the read(2) call without any delay. Doesn't your suggestion ensure that doesn't happen? I'm not trying to argue, just trying to understand so I have something commitable at the end. } } See USBD_FORCE_SHORT_XFER - ugen needs to be taught about it. Ok, but how do I get a 0-length write request down to the ugen(4) driver in the first place? That's the part that doesn't seem to be making it. -thanks -Brian
Re: NFS related panics and hangs
On 05 Nov 2015, at 21:48, Rhialtowrote: > Looking into this: > > the occurrences of nfs_reqq are as follows: > > fs/nfs/client/nfs_clvnops.c: * nfs_reqq_mtx : Global lock, protects the > nfs_reqq list. > > Since there is no other mention of nfs_reqq_mtx in the whole syssrc > tarball, this looks wrong. It also immediately causes the suspicion > that the list isn't in fact protected at all. This file (fs/nfs/client/nfs_clvnops.c) is part of a second (dead) nfs implementation from FreeBSD. It is not part of any kernel. Our nfs lives in sys/nfs. -- J. Hannken-Illjes - hann...@eis.cs.tu-bs.de - TU Braunschweig (Germany) signature.asc Description: Message signed with OpenPGP using GPGMail
Re: NFS related panics and hangs
On Thu 05 Nov 2015 at 22:30:51 +0100, J. Hannken-Illjes wrote: > This file (fs/nfs/client/nfs_clvnops.c) is part of a second (dead) nfs > implementation from FreeBSD. It is not part of any kernel. > > Our nfs lives in sys/nfs. Ok, why is it included in syssrc.tgz then? I'd say it should not be there. My other observations still stand, it seems, since they concern files in sys/nfs. -Olaf. -- ___ Olaf 'Rhialto' Seibert -- The Doctor: No, 'eureka' is Greek for \X/ rhialto/at/xs4all.nl-- 'this bath is too hot.' signature.asc Description: PGP signature
Re: NFS related panics and hangs
[ Adding tech-kern. The relevant earlier mails start at http://mail-index.netbsd.org/current-users/2015/10/19/msg028233.html This is about a default-installed amd64 GENERIC 7.0 kernel. Replies are better in tech-kern, I think, so I set Reply-To accordingly. ] On Fri 23 Oct 2015 at 00:46:57 +0200, Rhialto wrote: > This problem is very repeatable, usually within a few hours, just now it > happened within half an hour. > > It seems to me that somehow the nfs_reqq list gets corrupted. Then > either there is a crash when traversing it in nfs_timer() (occurring in > nfs_sigintr() due to being called with a bogus pointer), or there is a > hang when one of the NFS requests gets lost and never retried. Looking into this: the occurrences of nfs_reqq are as follows: fs/nfs/client/nfs_clvnops.c: * nfs_reqq_mtx : Global lock, protects the nfs_reqq list. Since there is no other mention of nfs_reqq_mtx in the whole syssrc tarball, this looks wrong. It also immediately causes the suspicion that the list isn't in fact protected at all. nfs/nfs.h:extern TAILQ_HEAD(nfsreqhead, nfsreq) nfs_reqq; nfs/nfs_clntsocket.c: TAILQ_FOREACH(rep, _reqq, r_chain) { nfs/nfs_clntsocket.c: TAILQ_INSERT_TAIL(_reqq, rep, r_chain); nfs/nfs_clntsocket.c: TAILQ_REMOVE(_reqq, rep, r_chain); Protected with s = splsoftnet(); for match #2 and #3 but #1 seems not protected by anything I can see nearby. Maybe it is error = nfs_rcvlock(nmp, myrep); if that makes any sense. That function definitely does not use either splsoftnet() OR mutex_enter(softnet_lock). nfs/nfs_socket.c:struct nfsreqhead nfs_reqq; nfs/nfs_socket.c: TAILQ_FOREACH(rp, _reqq, r_chain) { nfs/nfs_socket.c: TAILQ_FOREACH(rep, _reqq, r_chain) { match #3 is protected with mutex_enter(softnet_lock); /* XXX PR 40491 */ but none of the others (visibly nearby). #2 is called from nfs_receive() which uses nfs_sndlock() which also doesn't use either splsoftnet() OR mutex_enter(softnet_lock). nfs/nfs_subs.c: TAILQ_INIT(_reqq); presumably doesn't need any extra protection. softnet_lock is allocated as ./kern/uipc_socket.c:kmutex_t *softnet_lock; ./kern/uipc_socket.c: softnet_lock = mutex_obj_alloc(MUTEX_DEFAULT, IPL_NONE); IPL_NONE seems inconsistent with splsoftnet(). I never studied the inner details of kernel locking, but the diversity of protections of this list doesn't inspire trust at first sight... -Olaf. -- ___ Olaf 'Rhialto' Seibert -- The Doctor: No, 'eureka' is Greek for \X/ rhialto/at/xs4all.nl-- 'this bath is too hot.' signature.asc Description: PGP signature
Re: NFS related panics and hangs
On Thu, Nov 05, 2015 at 10:46:17PM +0100, Rhialto wrote: > > This file (fs/nfs/client/nfs_clvnops.c) is part of a second (dead) nfs > > implementation from FreeBSD. It is not part of any kernel. > > > > Our nfs lives in sys/nfs. > > Ok, why is it included in syssrc.tgz then? > I'd say it should not be there. > > My other observations still stand, it seems, since they concern files in > sys/nfs. Because migrating to that nfs is the long-term plan because it'll get us nfsv4. (anyone who wants to help work on this, please help...) -- David A. Holland dholl...@netbsd.org