Re: usb flashkey disk copy error

2003-09-13 Thread John-Mark Gurney
Barney Wolff wrote this message on Fri, Sep 12, 2003 at 15:52 -0400:
 Patch below had some problems.  Needed #ifdef USB_DEBUG around the
 ref to ohcidebug to compile, and either BROKEN_OHCI added to the
 list of valid options or (as I did) kludged to 1.  Worse, trying
 to mount_msdosfs my camera caused an instant panic:  Length went
 negative: -4096.  If that's not enough info, I imagine I can
 recreate the panic.

Yeah, I ran across this when testing on a system.  But you can
ignore this patch.  With this patch applied, the USB device would
stop working even after I fixed the #ifdef and -4096 problems.  (BTW,
I never intended for the patch to compile without USB_DEBUG, but since
the modules don't inherit the kernel config's make files, it breaks.)

 Just to restate my particular problem, I get the wrong data on read
 of an existing file from the memory stick on the camera.  I have
 not dared to try writing to it since reads don't work.

OK, I have a system that I'm going to be looking at tomorrow that
has a similar issue.  Could you file a follow-up to kern/54982 that
includes the dmesg output of your USB messages (ohci/uhci/umass/etc.)?
I tried using my 128meg CF in the same reader/machine that was having
problems reading, and it worked.  So it looks like reads are broken
for only some devices, not all. :(

 On Sun, Sep 07, 2003 at 01:39:08PM -0700, John-Mark Gurney wrote:
  Barney Wolff wrote this message on Sun, Sep 07, 2003 at 15:48 -0400:
   I can't do more detailed diagnosis right now, but could in a few days.
  
  When you get a chance (or anyone else who has this problem), try the
  attached patch, and add options BROKEN_OHCI to your kernel config file.
  Please set hw.usb.ohci.debug=1, and send me the dmesg output of the
  writes.  (When you copy the data to the media.)
  
  Hmmm. I just thought of something.  Is the corrupt data still corrupt
  on another system?  What I mean is, did the data get written properly but
  just isn't being read back from the media correctly?  Unless you are
  copying a file larger than memory size, the cmp just pulls it from memory,
  not from the media.  The umount/mount forces a flush of the cache, and so
  attempts to read from the media.

-- 
  John-Mark Gurney  Voice: +1 415 225 5579

 All that I will do, has been done, All that I have, has not.
___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: usb flashkey disk copy error

2003-09-12 Thread Barney Wolff
Patch below had some problems.  Needed #ifdef USB_DEBUG around the
ref to ohcidebug to compile, and either BROKEN_OHCI added to the
list of valid options or (as I did) kludged to 1.  Worse, trying
to mount_msdosfs my camera caused an instant panic:  Length went
negative: -4096.  If that's not enough info, I imagine I can
recreate the panic.

Just to restate my particular problem, I get the wrong data on read
of an existing file from the memory stick on the camera.  I have
not dared to try writing to it since reads don't work.
Thanks,
Barney

On Sun, Sep 07, 2003 at 01:39:08PM -0700, John-Mark Gurney wrote:
 Barney Wolff wrote this message on Sun, Sep 07, 2003 at 15:48 -0400:
  I can't do more detailed diagnosis right now, but could in a few days.
 
 When you get a chance (or anyone else who has this problem), try the
 attached patch, and add options BROKEN_OHCI to your kernel config file.
 Please set hw.usb.ohci.debug=1, and send me the dmesg output of the
 writes.  (When you copy the data to the media.)
 
 Hmmm. I just thought of something.  Is the corrupt data still corrupt
 on another system?  What I mean is, did the data get written properly but
 just isn't being read back from the media correctly?  Unless you are
 copying a file larger than memory size, the cmp just pulls it from memory,
 not from the media.  The umount/mount forces a flush of the cache, and so
 attempts to read from the media.
 
 Thanks.
 
 -- 
   John-Mark Gurney  Voice: +1 415 225 5579
 
  All that I will do, has been done, All that I have, has not.

 Index: ohci.c
 ===================================================================
 RCS file: /home/ncvs/src/sys/dev/usb/ohci.c,v
 retrieving revision 1.132
 diff -u -r1.132 ohci.c
 --- ohci.c  2003/08/24 17:55:54 1.132
 +++ ohci.c  2003/09/07 20:28:13
 @@ -513,6 +513,14 @@
  
      DPRINTFN(alen < 4096,("ohci_alloc_std_chain: start len=%d\n", alen));
  
 +    if (ohcidebug && alen > 4096) {
 +        printf("len: %d, pages: ", alen);
 +        for (len = 0; len < alen; len += OHCI_PAGE_SIZE) {
 +            printf("%s0x%x", len == 0 ? "" : ", ", DMAADDR(dma,
 +                len));
 +        }
 +    }
 +
      len = alen;
      cur = sp;
  
 @@ -546,9 +554,14 @@
           * We can describe the above using maxsegsz = 4k and nsegs = 2
           * in the future.
           */
 +#if BROKEN_OHCI
 +        if (len < OHCI_PAGE_SIZE - OHCI_PAGE_OFFSET(dataphys))
 +#else
          if (OHCI_PAGE(dataphys) == OHCI_PAGE(DMAADDR(dma, offset +
              len - 1)) || len - (OHCI_PAGE_SIZE -
 -            OHCI_PAGE_OFFSET(dataphys)) <= OHCI_PAGE_SIZE) {
 +            OHCI_PAGE_OFFSET(dataphys)) <= OHCI_PAGE_SIZE)
 +#endif
 +        {
              /* we can handle it in this TD */
              curlen = len;
          } else {


-- 
Barney Wolff http://www.databus.com/bwresume.pdf
I'm available by contract or FT, in the NYC metro area or via the 'Net.


Re: usb flashkey disk copy error

2003-09-08 Thread John-Mark Gurney
raoul.megelas wrote this message on Sun, Sep 07, 2003 at 11:33 +0200:
 You have found the trick, fsync after cp works fine.
 Thanks very much.
 
 But why is the fsync not done automatically by umount on umass?
 
 (Note: if you need to test against the flashkey, I can do that.)

Well, we still need to figure out why an fsync fixes it.  Does it still
cause the corruption now after not doing an fsync?  Can you alternate it
a few times, doing an fsync, and then not, and seeing if it is reliable?
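
The alternating test could be driven by a small program along these lines.
This is only a minimal sketch: copy_file() and its do_fsync flag are
hypothetical names made up for illustration, not anything in the tree.  It
copies a file and, if asked, calls fsync(2) on the destination before
closing it, so the same copy can be repeated with and without the fsync.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Copy src to dst.  When do_fsync is nonzero, fsync(2) the destination
 * before closing it.  Returns 0 on success, -1 on any error.
 * (Hypothetical helper for illustration only.)
 */
int
copy_file(const char *src, const char *dst, int do_fsync)
{
	char buf[65536];
	ssize_t n;
	int in, out, rc = -1;

	if ((in = open(src, O_RDONLY)) < 0)
		return (-1);
	if ((out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644)) < 0) {
		close(in);
		return (-1);
	}
	while ((n = read(in, buf, sizeof(buf))) > 0) {
		char *p = buf;

		/* write(2) may be short; loop until the chunk is out. */
		while (n > 0) {
			ssize_t w = write(out, p, (size_t)n);

			if (w < 0)
				goto done;
			p += w;
			n -= w;
		}
	}
	if (n < 0)
		goto done;
	if (do_fsync && fsync(out) < 0)
		goto done;
	rc = 0;
done:
	close(in);
	if (close(out) < 0)
		rc = -1;
	return (rc);
}
```

Calling it once with do_fsync set to 1 and once with 0, with an
umount/mount and a compare after each copy, would show whether the fsync
reliably makes the difference.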

If this is an fsync issue, then something elsewhere in the subsystem
might not be flushing the buffers before umount, but that seems a bit
weird since other filesystems would be having this problem too.

Don't rejoice quite yet, there still is something to track down.  Did
you see my recent patch I posted?  Could you try that on your system?

-- 
  John-Mark Gurney  Voice: +1 415 225 5579

 All that I will do, has been done, All that I have, has not.


Re: usb flashkey disk copy error

2003-09-08 Thread Bruce Evans
On Sun, 7 Sep 2003, raoul.megelas wrote:

 John-Mark Gurney wrote this message on Sun, Sep 07, 2003 at 08:45 +0200:
 raoul.megelas wrote:

  I have a copy error between hdd and a flashkey 1gig usb (easydisk)
  on Current dated August 28. Here is in short:
 
  mount -t msdos /dev/da2s1 /flashkey
  cp myfile /flashkey/
  diff myfile /flashkey/myfile
  (ok).

  could you try a fsync /flashkey/myfile before the umount?
  ...

 You have found the trick, fsync after cp works fine.
 Thanks very much.

 But why is the fsync not done automatically by umount on umass?

msdosfs_unmount() seems to be missing a VOP_FSYNC() of the vnode for
the device file.  This is needed to flush dirty metadata, if any.
msdosfs_sync() is not missing this VOP_FSYNC(), and according to my
debugging code it occasionally does something (unlike for ffs_sync(),
where there is almost always some dirty metadata).

Perhaps it is another bug that VOP_CLOSE() on the device file does not
do the sync, but ffs_unmount() does the VOP_FSYNC() explicitly (via
ffs_flushfiles()).  This may be just to get better error handling.  In
fact, there is another bug in ffs's not ignoring errors returned by
VOP_CLOSE(): they cause null pointer panics if VOP_CLOSE() actually
returns an error.  Quick fix for ffs_unmount():

%%%
Index: ffs_vfsops.c
===================================================================
RCS file: /home/ncvs/src/sys/ufs/ffs/ffs_vfsops.c,v
retrieving revision 1.216
diff -u -r1.216 ffs_vfsops.c
--- ffs_vfsops.c  15 Aug 2003 20:03:19 -  1.216
+++ ffs_vfsops.c  17 Aug 2003 09:24:11 -
@@ -993,6 +998,12 @@
     error = VOP_CLOSE(ump->um_devvp, FREAD|FWRITE, NOCRED, td);
 #endif
 
+    /*
+     * XXX don't fail if VOP_CLOSE() failed since we have destroyed
+     * our mount point and will soon destroy other resources.
+     */
+    error = 0;
+
     vrele(ump->um_devvp);
 
     free(fs->fs_csp, M_UFSMNT);
%%%

To see this bug, arrange for a disk driver to return nonzero from its
close routine.  Calling vflush() and VOP_FSYNC() before committing to
finishing the unmount should result in there being no dirty blocks for
VOP_CLOSE() to flush, so the correct error handling for failure of
VOP_CLOSE() in the above may be to panic.

Bruce


usb flashkey disk copy error

2003-09-07 Thread raoul.megelas
Hello all,

I have a copy error between hdd and a flashkey 1gig usb (easydisk)
on Current dated August 28. Here is in short:

mount -t msdos /dev/da2s1 /flashkey
cp myfile /flashkey/
diff myfile /flashkey/myfile
(ok).
umount /flashkey

mount (flashkey)
diff myfile /flashkey/myfile
(Binary files differ)!

It is not a flashkey disk error, it works on XP.
Note that this error occurs on FreeBSD 4.8 too.

Please, can you tell me how to deal with that?
On the other hand, can you tell me how to obtain the exact map of the flashkey
to verify the writing on the disk.
Thanks in advance for this newbie question.

[EMAIL PROTECTED]


Re: usb flashkey disk copy error

2003-09-07 Thread John-Mark Gurney
raoul.megelas wrote this message on Sun, Sep 07, 2003 at 08:45 +0200:
 I have a copy error between hdd and a flashkey 1gig usb (easydisk)
 on Current dated August 28. Here is in short:
 
 mount -t msdos /dev/da2s1 /flashkey
 cp myfile /flashkey/
 diff myfile /flashkey/myfile
   (ok).

could you try a fsync /flashkey/myfile before the umount?

 umount /flashkey
 
 mount (flashkey)
 diff myfile /flashkey/myfile
   (Binary files differ)!
 
 It is not a flashkey disk error, it works on XP.
 Note that this error occurs on FreeBSD 4.8 too.
 
 Please, can you tell me how to deal with that?
 On the other hand, can you tell me how to obtain the exact map of the flashkey
 to verify the writing on the disk.
 Thanks in advance for this newbie question.

You're the second person that has reported corruption with USB umass
devices.  I am interested in tracking down this problem, but it's a
bit difficult since I haven't seen it myself.

(I currently don't quite have a test bed box to play with, but I will
in the next week.)

Thanks.

-- 
  John-Mark Gurney  Voice: +1 415 225 5579

 All that I will do, has been done, All that I have, has not.


Re: usb flashkey disk copy error

2003-09-07 Thread Barney Wolff
On Sun, Sep 07, 2003 at 12:32:46AM -0700, John-Mark Gurney wrote:
 
 You're the second person that has reported corruption with USB umass
 devices.  I am interested in tracking down this problem, but it's a
 bit difficult since I haven't seen it myself.

Make me the third.  I have a Sony F707 camera, which I can use on
4-stable and with 5-current on a Dell I5000 laptop, but not on 5-current
on an Asus A7M266-D with world/kernel built 9/4/03.  The bad data starts
at byte 4096, always.  The problem has existed for at least a month,
perhaps longer; I can't remember if it ever worked on the Asus.  But
the same system runs fine with usb keyboard and mouse - which proves
nothing as the data rate and amount are so small.  So I can't rule
out hardware as the problem.  In a few days I can try booting from
a 4.8 live cd and see if the hardware works ok.  Old-quirks had no
effect on the problem.

-- 
Barney Wolff http://www.databus.com/bwresume.pdf
I'm available by contract or FT, in the NYC metro area or via the 'Net.


Re: usb flashkey disk copy error

2003-09-07 Thread John-Mark Gurney
Barney Wolff wrote this message on Sun, Sep 07, 2003 at 13:46 -0400:
 On Sun, Sep 07, 2003 at 12:32:46AM -0700, John-Mark Gurney wrote:
  
  You're the second person that has reported corruption with USB umass
  devices.  I am interested in tracking down this problem, but it's a
  bit difficult since I haven't seen it myself.
 
 Make me the third.  I have a Sony F707 camera, which I can use on
 4-stable and with 5-current on a Dell I5000 laptop, but not on 5-current
 on an Asus A7M266-D with world/kernel built 9/4/03.  The bad data starts
 at byte 4096, always.  The problem has existed for at least a month,

Ahh, this is useful information.  What are the controller types
of the two machines?  ohci? uhci?  I think it might be ohci, and
it may be a page miscalculation when dispatching the request.
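
To make the page arithmetic concrete, here is a simplified userland model
of how much data one OHCI transfer descriptor (TD) can cover.  It is a
sketch under assumptions, not the driver code: the macro names mirror the
ones used in ohci.c but are defined locally, td_max_len(),
td_max_len_one_page() and td_count() are hypothetical helpers, and the
buffer is treated as physically contiguous, which a real bus_dma mapping
does not guarantee.  It relies on the OHCI rule that a TD's buffer may
cross at most one 4k page boundary.

```c
#include <stdint.h>

/* Local stand-ins modeled on the names used in ohci.c. */
#define OHCI_PAGE_SIZE		0x1000u
#define OHCI_PAGE_OFFSET(x)	((x) & (OHCI_PAGE_SIZE - 1))

/*
 * Most bytes one TD can describe when it starts at phys: the rest of
 * the first page plus one more full page.
 */
uint32_t
td_max_len(uint32_t phys)
{
	return ((OHCI_PAGE_SIZE - OHCI_PAGE_OFFSET(phys)) + OHCI_PAGE_SIZE);
}

/* Conservative rule: never let a TD cross a page boundary at all. */
uint32_t
td_max_len_one_page(uint32_t phys)
{
	return (OHCI_PAGE_SIZE - OHCI_PAGE_OFFSET(phys));
}

/*
 * Count the TDs needed for len bytes starting at phys under the given
 * rule.  Assumes a contiguous buffer and ignores any other constraint
 * a real driver applies when splitting a transfer.
 */
unsigned
td_count(uint32_t phys, uint32_t len, uint32_t (*max_len)(uint32_t))
{
	unsigned n = 0;

	while (len > 0) {
		uint32_t cur = max_len(phys);

		if (cur > len)
			cur = len;
		phys += cur;
		len -= cur;
		n++;
	}
	return (n);
}
```

In this model a page-aligned 8k transfer fits in a single TD under the
two-page rule and needs two TDs under the conservative rule, so an error
in the second-page address would only show up from byte 4096 onward.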

Have you looked at the corrupted data?  Does it appear to be from
some other location, like the kernel?

 perhaps longer; I can't remember if it ever worked on the Asus.  But
 the same system runs fine with usb keyboard and mouse - which proves
 nothing as the data rate and amount are so small.  So I can't rule
 out hardware as the problem.  In a few days I can try booting from
 a 4.8 live cd and see if the hardware works ok.  Old-quirks had no
 effect on the problem.

It's been about two months since I updated USB to use bus_dma.  So
you could try checking out the usb code from July 4th or so and see
if it goes away (it probably will).

-- 
  John-Mark Gurney  Voice: +1 415 225 5579

 All that I will do, has been done, All that I have, has not.


Re: usb flashkey disk copy error

2003-09-07 Thread Barney Wolff
On Sun, Sep 07, 2003 at 10:55:24AM -0700, John-Mark Gurney wrote:
  
  Make me the third.  I have a Sony F707 camera, which I can use on
  4-stable and with 5-current on a Dell I5000 laptop, but not on 5-current
  on an Asus A7M266-D with world/kernel built 9/4/03.  The bad data starts
  at byte 4096, always.  The problem has existed for at least a month,
 
 Ahh, this is useful information.  What are the controller types
 of the two machines?  ohci? uhci?  I think it might be ohci, and
 it may be a page miscalculation when dispatching the request.

Indeed, failing system has ohci, working has uhci.

 Have you looked at the corrupted data?  Does it appear to be from
 some other location, like the kernel?

Can't tell offhand, but doesn't look like code - not enough zeroes.
As I recall, multiple tries produced the same bad data each time (after
umount and re-mount).
I can't do more detailed diagnosis right now, but could in a few days.

  perhaps longer; I can't remember if it ever worked on the Asus.  But
  the same system runs fine with usb keyboard and mouse - which proves
  nothing as the data rate and amount are so small.  So I can't rule
  out hardware as the problem.  In a few days I can try booting from
  a 4.8 live cd and see if the hardware works ok.  Old-quirks had no
  effect on the problem.
 
 It's been about two months since I updated USB to use bus_dma.  So
 you could try checking out the usb code from July 4th or so and see
 if it goes away (it probably will).

Will try that, again in a few days.

-- 
Barney Wolff http://www.databus.com/bwresume.pdf
I'm available by contract or FT, in the NYC metro area or via the 'Net.


Re: usb flashkey disk copy error

2003-09-07 Thread John-Mark Gurney
Barney Wolff wrote this message on Sun, Sep 07, 2003 at 15:48 -0400:
 I can't do more detailed diagnosis right now, but could in a few days.

When you get a chance (or anyone else who has this problem), try the
attached patch, and add options BROKEN_OHCI to your kernel config file.
Please set hw.usb.ohci.debug=1, and send me the dmesg output of the
writes.  (When you copy the data to the media.)

Hmmm. I just thought of something.  Is the corrupt data still corrupt
on another system?  What I mean is, did the data get written properly but
just isn't being read back from the media correctly?  Unless you are
copying a file larger than memory size, the cmp just pulls it from memory,
not from the media.  The umount/mount forces a flush of the cache, and so
attempts to read from the media.

Thanks.

-- 
  John-Mark Gurney  Voice: +1 415 225 5579

 All that I will do, has been done, All that I have, has not.
Index: ohci.c
===================================================================
RCS file: /home/ncvs/src/sys/dev/usb/ohci.c,v
retrieving revision 1.132
diff -u -r1.132 ohci.c
--- ohci.c  2003/08/24 17:55:54 1.132
+++ ohci.c  2003/09/07 20:28:13
@@ -513,6 +513,14 @@
 
     DPRINTFN(alen < 4096,("ohci_alloc_std_chain: start len=%d\n", alen));
 
+    if (ohcidebug && alen > 4096) {
+        printf("len: %d, pages: ", alen);
+        for (len = 0; len < alen; len += OHCI_PAGE_SIZE) {
+            printf("%s0x%x", len == 0 ? "" : ", ", DMAADDR(dma,
+                len));
+        }
+    }
+
     len = alen;
     cur = sp;
 
@@ -546,9 +554,14 @@
          * We can describe the above using maxsegsz = 4k and nsegs = 2
          * in the future.
          */
+#if BROKEN_OHCI
+        if (len < OHCI_PAGE_SIZE - OHCI_PAGE_OFFSET(dataphys))
+#else
         if (OHCI_PAGE(dataphys) == OHCI_PAGE(DMAADDR(dma, offset +
             len - 1)) || len - (OHCI_PAGE_SIZE -
-            OHCI_PAGE_OFFSET(dataphys)) <= OHCI_PAGE_SIZE) {
+            OHCI_PAGE_OFFSET(dataphys)) <= OHCI_PAGE_SIZE)
+#endif
+        {
             /* we can handle it in this TD */
             curlen = len;
         } else {


Re: usb flashkey disk copy error

2003-09-07 Thread Andrew Gordon

On Sun, 7 Sep 2003, John-Mark Gurney wrote:

 Hmmm. I just thought of something.  Is the corrupt data still corrupt
 on another system?  What I mean is, did the data get written properly but
 just isn't being read back from the media correctly?  Unless you are
 copying a file larger than memory size, the cmp just pulls it from memory,
 not from the media.  The umount/mount forces a flush of the cache, and so
 attempts to read from the media.

I'm also suffering (probably) the same problem.

In my case, the drive is a Sony memory stick slot on a PCG-U1 laptop;
connection to the system is via OHCI.

For my usage, it's definitely a _read_ phenomenon - I'm creating images on
the memory stick in my P800 phone/camera and trying to read them via an
msdosfs mount on the laptop.  Retrieving them via Windows demonstrates
that the images are good; reading them under FreeBSD shows them corrupt at
a _file_ offset of 4096 decimal.
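
For anyone wanting to confirm the offset on their own files, a small
comparison routine like the one below reports where two copies first
diverge.  It is a minimal sketch; first_diff_offset() is a hypothetical
helper, and the known-good copy is assumed to come from another system.

```c
#include <stdio.h>

/*
 * Return the offset of the first byte at which files a and b differ,
 * -1 if they are identical, or -2 if either file cannot be opened.
 * A length mismatch counts as a difference at the shorter length.
 * (Hypothetical helper for illustration only.)
 */
long
first_diff_offset(const char *a, const char *b)
{
	FILE *fa = fopen(a, "rb");
	FILE *fb = fopen(b, "rb");
	long off = 0, result = -1;
	int ca, cb;

	if (fa == NULL || fb == NULL) {
		if (fa != NULL)
			fclose(fa);
		if (fb != NULL)
			fclose(fb);
		return (-2);
	}
	for (;;) {
		ca = getc(fa);
		cb = getc(fb);
		if (ca != cb) {
			result = off;	/* mismatch, or one file ended */
			break;
		}
		if (ca == EOF)
			break;		/* both ended together */
		off++;
	}
	fclose(fa);
	fclose(fb);
	return (result);
}
```

A result of exactly 4096 on every affected file would match the symptom
described here.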

I tried copying the whole filesystem with 'dd', then using 'mdconfig' to
mount the resulting image, e.g.:

  dd if=/dev/da0s1 of=stickfile4 bs=32k
  mdconfig -a -t vnode -f stickfile4
  mount -t msdos /dev/md0 /mnt

With a 'dd' block size of 512 or 8k it worked fine (no corruption); with a
block size of 100k the files were corrupt (but in different places
compared to mounting the memory stick directly).  Using a block size of
32k, it copied for a minute or so and then the machine hung totally
(repeatable across two attempts).
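
The block-size comparison can also be done without keeping several image
files around, by checksumming the same device or file while reading it in
requests of a chosen size.  The following is a rough sketch only:
checksum_with_blocksize() is a hypothetical helper and the checksum is
32-bit FNV-1a, picked purely for brevity.  Since the checksum depends only
on the byte stream, it should be the same for every block size if the
reads are correct.

```c
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Read path using bs-byte read(2) requests and store a 32-bit FNV-1a
 * checksum of the data in *sum.  Returns 0 on success, -1 on error.
 * (Hypothetical helper for illustration only.)
 */
int
checksum_with_blocksize(const char *path, size_t bs, uint32_t *sum)
{
	unsigned char *buf;
	uint32_t h = 2166136261u;	/* FNV-1a offset basis */
	ssize_t n;
	int fd;

	if (bs == 0 || (buf = malloc(bs)) == NULL)
		return (-1);
	if ((fd = open(path, O_RDONLY)) < 0) {
		free(buf);
		return (-1);
	}
	while ((n = read(fd, buf, bs)) > 0) {
		for (ssize_t i = 0; i < n; i++) {
			h ^= buf[i];
			h *= 16777619u;	/* FNV prime */
		}
	}
	close(fd);
	free(buf);
	if (n < 0)
		return (-1);
	*sum = h;
	return (0);
}
```

Running it against the device with block sizes of 512, 8k and 100k and
comparing the results would show which request sizes return bad data.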

In terms of dates, I'm now running with -current of 4th September; this
problem was also present in a kernel built on August 22nd.  It was working
OK with whatever kernel I was running on 23rd May (based on timestamps on
some files I wrote on the PC).  In fact, up until around that time this
setup didn't work at all:  the internal OHCI port that connects to the
memory stick slot always reported 'device problem' and wouldn't find the
device (the second OHCI controller that is brought out to conventional
sockets worked OK).  One system update that I did suddenly made everything
work, then a month or two later this problem arrived.





Re: usb flashkey disk copy error

2003-09-07 Thread raoul.megelas
John-Mark Gurney wrote this message on Sun, Sep 07, 2003 at 08:45 +0200:
raoul.megelas wrote:

 I have a copy error between hdd and a flashkey 1gig usb (easydisk)
 on Current dated August 28. Here is in short:
 
 mount -t msdos /dev/da2s1 /flashkey
 cp myfile /flashkey/
 diff myfile /flashkey/myfile
   (ok).

 could you try a fsync /flashkey/myfile before the umount?

 umount /flashkey
 
 mount (flashkey)
 diff myfile /flashkey/myfile
   (Binary files differ)!
 
 It is not a flashkey disk error, it works on XP.
 Note that this error occurs on FreeBSD 4.8 too.
 
 Please, can you tell me how to deal with that?
 On the other hand, can you tell me how to obtain the exact map of the flashkey
 to verify the writing on the disk.
 Thanks in advance for this newbie question.

 You're the second person that has reported corruption with USB umass
 devices.  I am interested in tracking down this problem, but it's a
 bit difficult since I haven't seen it myself.

 (I currently don't quite have a test bed box to play with, but I will
 in the next week.)
  John-Mark Gurney Voice: +1 415 225 5579

 All that I will do, has been done, All that I have, has not.

You have found the trick, fsync after cp works fine.
Thanks very much.

But why is the fsync not done automatically by umount on umass?

(Note: if you need to test against the flashkey, I can do that.)

raoul
[EMAIL PROTECTED]
