Re: Potential problem with reading data from usb devices with ugen(4)

2015-11-05 Thread Brian Buhrow
Hello.  I've been making progress on this issue and have now run into
another issue which folks may be able to shed some light on.

First, where I am.

I've modified the ugen(4) driver to read from devices asynchronously
using a callout, allowing me to issue non-blocking read requests and fixing
poll(9) so it actually works as expected. This gets me a lot closer to what
I need, and I'm now able to do an initial exchange with several Apple
devices to the point of having the libimobiledevice library try to pair
with the units.  However, see below.

My new problem.

Apparently, the data exchange protocol the iDevices use requires that
data packets that happen to be a multiple of the transfer size be followed
by zero length packets.  Re-reading what Nick wrote on this thread about
USB writes consisting of USB transactions on the bus, I think this means you
just initiate a transfer to the device in question with no payload.  I
believe I know how to do this from within the ugen(4) driver itself, and I
have a test patch that tries to do this in ugen_do_write().  However, if I
call write(2) against the file descriptor for the bulk write endpoint of my
device with a zero length write request, the call never gets down as far as
the ugen(4) driver before the system decides the work is done.  In general,
that's probably a good thing and I'm not trying to change that behavior.
However, I am wondering what folks think the best way to generate zero length
packets would be.  My initial thought is to implement an additional ioctl that
the usb libraries could call if they notice that they want to generate a
zero length packet.  Is there a better or different way to do this?  Maybe
someone has already solved this problem and I just don't see it?
-thanks
-Brian
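
To make the rule above concrete, here is a minimal sketch of the check a
library or driver would make before deciding to send a zero length packet.
It assumes that "transfer size" means the endpoint's maximum packet size;
needs_zlp() is a hypothetical helper, not part of ugen(4) or any USB library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: returns true when a bulk transfer of `len` bytes
 * should be followed by a zero length packet.  That is the case when the
 * payload is non-empty and fills its last packet exactly, so the receiver
 * never sees a short packet marking the end of the transfer.
 * `max_packet` is assumed to be the endpoint's maximum packet size.
 */
static bool
needs_zlp(size_t len, size_t max_packet)
{
	assert(max_packet > 0);		/* guard against division by zero */
	return len != 0 && len % max_packet == 0;
}
```

With a 512-byte maximum packet size, a 1024-byte write would need a trailing
zero length packet, while a 1000-byte write would not, because its last
packet is already short.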





Re: Potential problem with reading data from usb devices with ugen(4)

2015-11-05 Thread Brian Buhrow
On Nov 5,  9:07pm, Nick Hudson wrote:
} Subject: Re: Potential problem with reading data from usb devices with uge
} On 05/11/2015 18:53, Brian Buhrow wrote:
} > Hello.  I've been making progress on this issue and have now run into
} > another issue which folks may be able to shed some light on.
} >
} > I've modified the ugen(4) driver to read from devices asynchronously
} > using a callout, allowing me to issue non-blocking read requests and fixing
} > poll(9) so it actually works as expected. 
} 
} You shouldn't need this. Adjust the code to keep a read transfer alive 
} at all times.

I'm not sure I understand what you're saying.  If I just keep
perpetuating the transfer without the callout, don't I have to wait until
the transfer is done before returning to the user process? Right now, with
my patches, I'll return partial data, or no data to the user but then
continue collecting data from the device, signaling the user with
poll/select acknowledgement when the transfer is finally done.  What I want
is an almost immediate return from the read(2) call without any delay.
Doesn't your suggestion ensure that doesn't happen?  I'm not trying to
argue, just trying to understand so I have something committable at the end.
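
For reference, the user-side pattern described above (wait for poll to
signal readiness, then read without blocking) might look roughly like this.
It is only a sketch: read_when_ready() is a hypothetical name, and the
descriptor is assumed to have been opened with O_NONBLOCK.

```c
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait up to timeout_ms for fd to become readable,
 * then issue a single read.  Returns the number of bytes read, or -1 with
 * errno set; a timeout is reported as -1 with errno == EAGAIN.
 */
static ssize_t
read_when_ready(int fd, void *buf, size_t len, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int n = poll(&pfd, 1, timeout_ms);

	if (n < 0)
		return -1;		/* poll failed; errno is already set */
	if (n == 0) {
		errno = EAGAIN;		/* nothing arrived within the timeout */
		return -1;
	}
	return read(fd, buf, len);	/* may still return partial data */
}
```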


} 
} See USBD_FORCE_SHORT_XFER - ugen needs to be taught about it.

Ok, but how do I get a 0-length write request down to the ugen(4)
driver in the first place?  That's the part that doesn't seem to be making
it.

-thanks

-Brian



Re: NFS related panics and hangs

2015-11-05 Thread J. Hannken-Illjes
On 05 Nov 2015, at 21:48, Rhialto wrote:


> Looking into this:
> 
> the occurrences of nfs_reqq are as follows:
> 
> fs/nfs/client/nfs_clvnops.c: * nfs_reqq_mtx : Global lock, protects the 
> nfs_reqq list.
> 
> Since there is no other mention of nfs_reqq_mtx in the whole syssrc
> tarball, this looks wrong.  It also immediately causes the suspicion
> that the list isn't in fact protected at all.

This file (fs/nfs/client/nfs_clvnops.c) is part of a second (dead) nfs
implementation from FreeBSD.  It is not part of any kernel.

Our nfs lives in sys/nfs.

--
J. Hannken-Illjes - hann...@eis.cs.tu-bs.de - TU Braunschweig (Germany)





Re: NFS related panics and hangs

2015-11-05 Thread Rhialto
On Thu 05 Nov 2015 at 22:30:51 +0100, J. Hannken-Illjes wrote:
> This file (fs/nfs/client/nfs_clvnops.c) is part of a second (dead) nfs
> implementation from FreeBSD.  It is not part of any kernel.
> 
> Our nfs lives in sys/nfs.

Ok, why is it included in syssrc.tgz then?
I'd say it should not be there.

My other observations still stand, it seems, since they concern files in
sys/nfs.

-Olaf.
-- 
___ Olaf 'Rhialto' Seibert  -- The Doctor: No, 'eureka' is Greek for
\X/ rhialto/at/xs4all.nl-- 'this bath is too hot.'




Re: NFS related panics and hangs

2015-11-05 Thread Rhialto
[ Adding tech-kern. The relevant earlier mails start at
  http://mail-index.netbsd.org/current-users/2015/10/19/msg028233.html
  This is about a default-installed amd64 GENERIC 7.0 kernel.
  Replies are better in tech-kern, I think, so I set Reply-To
  accordingly.  ]


On Fri 23 Oct 2015 at 00:46:57 +0200, Rhialto wrote:
> This problem is very repeatable, usually within a few hours, just now it
> happened within half an hour.
> 
> It seems to me that somehow the nfs_reqq list gets corrupted. Then
> either there is a crash when traversing it in nfs_timer() (occurring in
> nfs_sigintr() due to being called with a bogus pointer), or there is a
> hang when one of the NFS requests gets lost and never retried.

Looking into this:

the occurrences of nfs_reqq are as follows:

fs/nfs/client/nfs_clvnops.c: * nfs_reqq_mtx : Global lock, protects the 
nfs_reqq list.

Since there is no other mention of nfs_reqq_mtx in the whole syssrc
tarball, this looks wrong.  It also immediately causes the suspicion
that the list isn't in fact protected at all.

nfs/nfs.h:extern TAILQ_HEAD(nfsreqhead, nfsreq) nfs_reqq;

nfs/nfs_clntsocket.c: TAILQ_FOREACH(rep, &nfs_reqq, r_chain) {
nfs/nfs_clntsocket.c: TAILQ_INSERT_TAIL(&nfs_reqq, rep, r_chain);
nfs/nfs_clntsocket.c: TAILQ_REMOVE(&nfs_reqq, rep, r_chain);

Protected with

s = splsoftnet();

for match #2 and #3 but #1 seems not protected by anything I can see
nearby. Maybe it is

error = nfs_rcvlock(nmp, myrep);

if that makes any sense.
That function definitely does not use either splsoftnet() OR
mutex_enter(softnet_lock).

nfs/nfs_socket.c:struct nfsreqhead nfs_reqq;
nfs/nfs_socket.c: TAILQ_FOREACH(rp, &nfs_reqq, r_chain) {
nfs/nfs_socket.c: TAILQ_FOREACH(rep, &nfs_reqq, r_chain) {

match #3 is protected with

mutex_enter(softnet_lock);  /* XXX PR 40491 */

but none of the others (visibly nearby).

#2 is called from nfs_receive() which uses nfs_sndlock() which also
doesn't use either splsoftnet() OR mutex_enter(softnet_lock).

nfs/nfs_subs.c:   TAILQ_INIT(&nfs_reqq);

presumably doesn't need any extra protection.

softnet_lock is allocated as

./kern/uipc_socket.c:kmutex_t   *softnet_lock;
./kern/uipc_socket.c:   softnet_lock = mutex_obj_alloc(MUTEX_DEFAULT, IPL_NONE);

IPL_NONE seems inconsistent with splsoftnet().

I never studied the inner details of kernel locking, but the diversity
of protections of this list doesn't inspire trust at first sight...
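
For contrast, this is roughly what consistent protection of such a list
looks like when every traversal, insertion and removal takes the same lock.
It is a user-space sketch only, with a pthread mutex standing in for a
kernel lock; struct req, reqq, reqq_lock and the req_*() functions are
made-up names and not the code in sys/nfs.

```c
#include <pthread.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical request record, loosely modelled on a queued request. */
struct req {
	int		 r_xid;
	TAILQ_ENTRY(req) r_chain;
};

static TAILQ_HEAD(reqhead, req) reqq = TAILQ_HEAD_INITIALIZER(reqq);
static pthread_mutex_t reqq_lock = PTHREAD_MUTEX_INITIALIZER;

/* Append a request; the list is only touched with reqq_lock held. */
static void
req_enqueue(struct req *rep)
{
	pthread_mutex_lock(&reqq_lock);
	TAILQ_INSERT_TAIL(&reqq, rep, r_chain);
	pthread_mutex_unlock(&reqq_lock);
}

/* Remove a request that is currently on the list. */
static void
req_dequeue(struct req *rep)
{
	pthread_mutex_lock(&reqq_lock);
	TAILQ_REMOVE(&reqq, rep, r_chain);
	pthread_mutex_unlock(&reqq_lock);
}

/* Traverse under the same lock; returns NULL when no entry matches. */
static struct req *
req_find(int xid)
{
	struct req *rep;

	pthread_mutex_lock(&reqq_lock);
	TAILQ_FOREACH(rep, &reqq, r_chain) {
		if (rep->r_xid == xid)
			break;
	}
	pthread_mutex_unlock(&reqq_lock);
	return rep;
}
```

The point is only that all three kinds of access go through one lock; a
reader that skips it can walk the list while another thread is unlinking an
entry.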

-Olaf.
-- 
___ Olaf 'Rhialto' Seibert  -- The Doctor: No, 'eureka' is Greek for
\X/ rhialto/at/xs4all.nl-- 'this bath is too hot.'




Re: NFS related panics and hangs

2015-11-05 Thread David Holland
On Thu, Nov 05, 2015 at 10:46:17PM +0100, Rhialto wrote:
 > > This file (fs/nfs/client/nfs_clvnops.c) is part of a second (dead) nfs
 > > implementation from FreeBSD.  It is not part of any kernel.
 > > 
 > > Our nfs lives in sys/nfs.
 > 
 > Ok, why is it included in syssrc.tgz then?
 > I'd say it should not be there.
 > 
 > My other observations still stand, it seems, since they concern files in
 > sys/nfs.

Because migrating to that nfs is the long-term plan, since it'll get
us nfsv4.

(anyone who wants to help work on this, please help...)


-- 
David A. Holland
dholl...@netbsd.org