Re: SCSI Tape corruption - update
On Fri, 20 Jul 2001, Geert Uytterhoeven wrote:

> On Sun, 8 Jul 2001, Geert Uytterhoeven wrote:
> > New findings:
> >   - The problem doesn't happen with kernels <= 2.2.17. It does happen with all
> >     kernels starting with 2.2.18-pre1.
> >   - The only related stuff that changed in 2.2.18-pre1 seems to be the
> >     Sym53c8xx driver itself. I'll do some more tests soon to isolate the
> >     problem.
> >   - The changes to the Sym53c8xx driver in 2.2.18-pre1 are _huge_. Are the
> >     individual changes between sym53c8xx-1.3g and sym53c8xx-1.7.0 available
> >     somewhere?

Not completely. The reason is that I used manual diffing/patching against
various kernel versions and it would be a PITA to resurrect all intermediate
driver versions using these patches. If we consider patches that went directly
to kernel main stream without changing the driver version, a double PITA it
would be.

Btw, for the sym-2.1.x series, I now use a CVS tree and each driver release is
tagged independently. For those, it will be much easier to isolate broken
changes.

> The problem is indeed introduced by the changes to the Sym53c8xx in 2.2.18-pre1.
> I managed to find some intermediate versions in the 2.3.x series, and here are the
> results:
>   - sym53c8xx-1.3g (from BK linuxppc_2_2): OK
>   - sym53c8xx-1.5e: crash in SCSI interrupt during driver init
>   - sym53c8xx-1.5f: lock up during driver init
>   - sym53c8xx-1.5g: random 32-byte error bursts when writing to tape

That's an interesting result. But the 1.5g - 1.3g diffs are probably very large.

Patches available from ftp.tux.org should allow you to resurrect driver
versions 1.4, 1.5, 1.5a, 1.5b, 1.5c, 1.5d.

   ftp://ftp.tux.org/pub/roudier/drivers/linux/sym53c8xx/README

You may, for example, apply incremental patches that address kernel 2.2.5 to a
fresh kernel 2.2.5 tree and extract driver files accordingly.

> Perhaps I can get 1.5e and 1.5g to work using some PPC-specific fixes from the
> 1.3g driver in the linuxppc_2_2 tree (it differed a bit from the 1.3g in
> Alan's 2.2.17). But even then the changes in 1.5f and 1.5g are rather small,
> compared to the changes between 1.3g and 1.5f.

Some PPC-specific changes are very probably not present in my driver sources.
I am unable to help on that point.

> So I'd be very happy if I could get my hands on more intermediate versions.
> Thanks for your help! I _really_ want to nail this one down!
>
> Gr{oetje,eeting}s,

Regards,
  Gérard.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: SCSI Tape corruption - update
On Sun, 8 Jul 2001, Geert Uytterhoeven wrote:

> New findings:
>   - The problem doesn't happen with kernels <= 2.2.17. It does happen with all
>     kernels starting with 2.2.18-pre1.
>   - The only related stuff that changed in 2.2.18-pre1 seems to be the
>     Sym53c8xx driver itself. I'll do some more tests soon to isolate the
>     problem.
>   - The changes to the Sym53c8xx driver in 2.2.18-pre1 are _huge_. Are the
>     individual changes between sym53c8xx-1.3g and sym53c8xx-1.7.0 available
>     somewhere?

The problem is indeed introduced by the changes to the Sym53c8xx in 2.2.18-pre1.
I managed to find some intermediate versions in the 2.3.x series, and here are the
results:
  - sym53c8xx-1.3g (from BK linuxppc_2_2): OK
  - sym53c8xx-1.5e: crash in SCSI interrupt during driver init
  - sym53c8xx-1.5f: lock up during driver init
  - sym53c8xx-1.5g: random 32-byte error bursts when writing to tape

Perhaps I can get 1.5e and 1.5g to work using some PPC-specific fixes from the
1.3g driver in the linuxppc_2_2 tree (it differed a bit from the 1.3g in
Alan's 2.2.17). But even then the changes in 1.5f and 1.5g are rather small,
compared to the changes between 1.3g and 1.5f.

So I'd be very happy if I could get my hands on more intermediate versions.
Thanks for your help! I _really_ want to nail this one down!

Gr{oetje,eeting}s,

						Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [EMAIL PROTECTED]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
							    -- Linus Torvalds
Re: SCSI Tape corruption - update
On Sun, 8 Jul 2001, Geert Uytterhoeven wrote:

> On Thu, 21 Jun 2001, Geert Uytterhoeven wrote:
> > On Tue, 8 May 2001, Geert Uytterhoeven wrote:
> > > In the mean time I down/upgraded to 2.2.17 on my PPC box (CHRP LongTrail,
> > > Sym53c875, HP C5136A DDS1) and I can confirm that the problem does not happen
> > > under 2.2.17 either.
> > >
> > > My experiences:
> > >   - reading works fine, writing doesn't
> > >   - 2.2.x works fine, 2.4.x doesn't (at least since 2.4.0-test1-ac10)
> > >   - hardware compression doesn't matter
> > >   - I have a sym53c875, Lorenzo has an Adaptec, so most likely it's not a
> > >     SCSI hardware driver bug
> > >   - I have a PPC, Lorenzo doesn't, so it's not CPU-specific
> > >   - corruption is always a block of 32 bytes being replaced by 32 bytes from
> > >     the previous tape block (depending on block size!) (approx. 6 errors per
> > >     256 MB)
> > >
> > > Lorenzo, can you please investigate the exact nature of the corruption on your
> > > system?
> > >   - How many successive bytes are corrupted?
> > >   - Where do the corrupted data come from?
> >
> > Yesterday I noticed the same corruption under 2.2.19 (yes, I run amverify after
> > backing up my system now, so it detects corruption through the gzip CRCs).
> >
> > I'll do some more tests (when I find time) to get a higher statistical
> > certainty that it really doesn't happen under earlier 2.2.x kernels.
>
> New findings:
>   - The problem doesn't happen with kernels <= 2.2.17. It does happen with all
>     kernels starting with 2.2.18-pre1.
>   - The only related stuff that changed in 2.2.18-pre1 seems to be the
>     Sym53c8xx driver itself. I'll do some more tests soon to isolate the
>     problem.
>   - The changes to the Sym53c8xx driver in 2.2.18-pre1 are _huge_. Are the
>     individual changes between sym53c8xx-1.3g and sym53c8xx-1.7.0 available
>     somewhere?

No.
But you can move the sym/ncr driver bundle from 2.2.18-pre1 to 2.2.17 and
vice-versa:

   sym53c8xx.h, sym53c8xx_defs.h, sym53c8xx.c,
   sym53c8xx_comm.h, ncr53c8xx.h, ncr53c8xx.c

You can also download either sym-1.7.3c-ncr-3.4.3b, or sym-2.1.11, or just
both, and play with all that stuff under 2.2.17 and later 2.2 kernels.

   ftp://ftp.tux.org/pub/roudier/README-drivers-linux

Btw, I am interested in results using sym-1.7.3c and sym-2.1.11 under kernel
2.2.17 and possibly 2.2.18.

> BTW, I wrote a small test program which tries to analyze error bursts. You can
> find it at http://home.tvd.be/cr26864/Download/genpseudorandom.c
>
> Sample test using 2 bytes of data:
>
>     genpseudorandom -o -l 2 > /dev/tape
>     genpseudorandom -i < /dev/tape

Unfortunately, I don't have any tape device.

> So far I always saw problems when writing even only 10 MB to tape: ca. 3-5
> bursts of 32 or 12 incorrect bytes, which are always a copy of the
> corresponding bytes in the previous block. Of course I used a much larger test
> stream to verify 2.2.17.
>
> Thanks!
>
> Gr{oetje,eeting}s,
>
> 						Geert

Thanks for your testing,

  Gérard.
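[Archive note] The driver swap suggested above can be scripted. What follows is only a minimal sketch in Python, not something posted in the thread. It assumes the six files live under drivers/scsi/ in both kernel trees, and the function name copy_driver_bundle and its parameters are hypothetical. Files that are absent from the source tree are reported rather than treated as errors, since not every driver version necessarily ships all six.

```python
import shutil
from pathlib import Path

# File list exactly as given in the message above.
DRIVER_FILES = [
    "sym53c8xx.h", "sym53c8xx_defs.h", "sym53c8xx.c",
    "sym53c8xx_comm.h", "ncr53c8xx.h", "ncr53c8xx.c",
]


def copy_driver_bundle(src_tree, dst_tree, subdir="drivers/scsi"):
    """Copy the sym/ncr driver bundle from one kernel tree to another.

    The subdir default is an assumption about the tree layout.
    Returns (copied, missing): names copied and names not found in src_tree.
    """
    src_dir = Path(src_tree) / subdir
    dst_dir = Path(dst_tree) / subdir
    dst_dir.mkdir(parents=True, exist_ok=True)
    copied, missing = [], []
    for name in DRIVER_FILES:
        src = src_dir / name
        if src.is_file():
            shutil.copy2(src, dst_dir / name)  # overwrite the target's copy
            copied.append(name)
        else:
            missing.append(name)
    return copied, missing
```

Keeping a pristine copy of the destination tree makes it easy to swap back, which the "vice-versa" direction of the experiment relies on.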
Re: SCSI Tape corruption - update
On Thu, 21 Jun 2001, Geert Uytterhoeven wrote:

> On Tue, 8 May 2001, Geert Uytterhoeven wrote:
> > In the mean time I down/upgraded to 2.2.17 on my PPC box (CHRP LongTrail,
> > Sym53c875, HP C5136A DDS1) and I can confirm that the problem does not happen
> > under 2.2.17 either.
> >
> > My experiences:
> >   - reading works fine, writing doesn't
> >   - 2.2.x works fine, 2.4.x doesn't (at least since 2.4.0-test1-ac10)
> >   - hardware compression doesn't matter
> >   - I have a sym53c875, Lorenzo has an Adaptec, so most likely it's not a
> >     SCSI hardware driver bug
> >   - I have a PPC, Lorenzo doesn't, so it's not CPU-specific
> >   - corruption is always a block of 32 bytes being replaced by 32 bytes from
> >     the previous tape block (depending on block size!) (approx. 6 errors per
> >     256 MB)
> >
> > Lorenzo, can you please investigate the exact nature of the corruption on your
> > system?
> >   - How many successive bytes are corrupted?
> >   - Where do the corrupted data come from?
>
> Yesterday I noticed the same corruption under 2.2.19 (yes, I run amverify after
> backing up my system now, so it detects corruption through the gzip CRCs).
>
> I'll do some more tests (when I find time) to get a higher statistical
> certainty that it really doesn't happen under earlier 2.2.x kernels.

New findings:
  - The problem doesn't happen with kernels <= 2.2.17. It does happen with all
    kernels starting with 2.2.18-pre1.
  - The only related stuff that changed in 2.2.18-pre1 seems to be the
    Sym53c8xx driver itself. I'll do some more tests soon to isolate the
    problem.
  - The changes to the Sym53c8xx driver in 2.2.18-pre1 are _huge_. Are the
    individual changes between sym53c8xx-1.3g and sym53c8xx-1.7.0 available
    somewhere?

BTW, I wrote a small test program which tries to analyze error bursts. You can
find it at http://home.tvd.be/cr26864/Download/genpseudorandom.c

Sample test using 2 bytes of data:

    genpseudorandom -o -l 2 > /dev/tape
    genpseudorandom -i < /dev/tape

So far I always saw problems when writing even only 10 MB to tape: ca. 3-5
bursts of 32 or 12 incorrect bytes, which are always a copy of the
corresponding bytes in the previous block. Of course I used a much larger test
stream to verify 2.2.17.

Thanks!

Gr{oetje,eeting}s,

						Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [EMAIL PROTECTED]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
							    -- Linus Torvalds
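[Archive note] The burst analysis described in this message can be illustrated as follows. This is a hedged sketch in Python, not the genpseudorandom.c program linked above, whose internals are not shown in the thread. The function name find_bursts and the block_size parameter are hypothetical. It compares the data that was meant to be written with the data read back, groups consecutive mismatching bytes into bursts, and checks whether each burst equals the bytes at the same position one block earlier.

```python
def find_bursts(expected: bytes, actual: bytes, block_size: int):
    """Return a list of (offset, length, from_previous_block) tuples.

    A burst is a maximal run of consecutive bytes where `actual` differs
    from `expected`. `from_previous_block` is True when the bytes read
    back equal the expected bytes exactly one block earlier.
    """
    bursts = []
    n = min(len(expected), len(actual))
    i = 0
    while i < n:
        if expected[i] == actual[i]:
            i += 1
            continue
        start = i
        while i < n and expected[i] != actual[i]:
            i += 1
        length = i - start
        prev = start - block_size
        from_prev = prev >= 0 and actual[start:i] == expected[prev:prev + length]
        bursts.append((start, length, from_prev))
    return bursts


if __name__ == "__main__":
    block = 512
    # Test pattern in which every byte differs from the byte one block earlier.
    good = bytes((i + i // block) % 256 for i in range(4096))
    bad = bytearray(good)
    bad[1000:1032] = good[1000 - block:1032 - block]  # simulate one 32-byte burst
    for offset, length, from_prev in find_bursts(good, bytes(bad), block):
        print(offset, length, from_prev)
```

One caveat: if a stale byte happens to equal the correct byte, a single burst is reported as two shorter runs, so a test stream whose blocks differ at every position gives the cleanest counts.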
Re: SCSI Tape corruption - update
On Tue, 8 May 2001, Geert Uytterhoeven wrote:

> On Mon, 7 May 2001, Lorenzo Marcantonio wrote:
> > On Mon, 7 May 2001, Rob Turk wrote:
> > > Have you ruled out hardware failures? There's been a few isolated reports
> >
> > That tape drive (Sony SDT-9000, less than 2 years of service) works
> > perfectly on Windows NT (where it was before) and even on Linux 2.2
> >
> > Also the cartridge was brand new.
>
> In the mean time I down/upgraded to 2.2.17 on my PPC box (CHRP LongTrail,
> Sym53c875, HP C5136A DDS1) and I can confirm that the problem does not happen
> under 2.2.17 either.
>
> My experiences:
>   - reading works fine, writing doesn't
>   - 2.2.x works fine, 2.4.x doesn't (at least since 2.4.0-test1-ac10)
>   - hardware compression doesn't matter
>   - I have a sym53c875, Lorenzo has an Adaptec, so most likely it's not a
>     SCSI hardware driver bug
>   - I have a PPC, Lorenzo doesn't, so it's not CPU-specific
>   - corruption is always a block of 32 bytes being replaced by 32 bytes from
>     the previous tape block (depending on block size!) (approx. 6 errors per
>     256 MB)
>
> Lorenzo, can you please investigate the exact nature of the corruption on your
> system?
>   - How many successive bytes are corrupted?
>   - Where do the corrupted data come from?

Yesterday I noticed the same corruption under 2.2.19 (yes, I run amverify after
backing up my system now, so it detects corruption through the gzip CRCs).

I'll do some more tests (when I find time) to get a higher statistical
certainty that it really doesn't happen under earlier 2.2.x kernels.

Gr{oetje,eeting}s,

						Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [EMAIL PROTECTED]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
							    -- Linus Torvalds
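[Archive note] The message above relies on gzip CRCs to expose corruption after a backup. The sketch below shows that idea in isolation, using Python's standard gzip module. It is not a description of how amverify works internally, and the function name gzip_stream_ok is hypothetical. The stream is decompressed to the end and discarded, because the CRC in the gzip trailer is only checked once the whole member has been read.

```python
import gzip
import io
import zlib


def gzip_stream_ok(fileobj, chunk_size=1 << 16) -> bool:
    """Return True if the gzip stream decompresses fully and its CRC matches."""
    try:
        with gzip.GzipFile(fileobj=fileobj, mode="rb") as gz:
            while gz.read(chunk_size):
                pass  # data is discarded; only integrity matters here
    except (OSError, EOFError, zlib.error):
        # Bad header, CRC or length mismatch, truncated stream, or a
        # corrupt deflate stream.
        return False
    return True


if __name__ == "__main__":
    blob = bytearray(gzip.compress(bytes(range(256)) * 64))
    print(gzip_stream_ok(io.BytesIO(bytes(blob))))
    blob[-8] ^= 0x01  # flip one bit in the stored CRC32 (gzip trailer)
    print(gzip_stream_ok(io.BytesIO(bytes(blob))))
```

A check like this only says that corruption exists somewhere in the stream. Finding the position and length of the damaged bytes needs a byte-for-byte comparison against known data.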
Re: SCSI Tape corruption - update
For comparison purposes, I use stock kernel 2.4.4, with SCSI tape support built
as a module. The tape drive is an HP c1539 (aka 1533a) DDS-2. This drive is on
the SCSI chain of a Tekram dc390, using the tmscsim driver (also as a module).
Hardware compression is enabled.

Under this setup,

    tar cvbf 20 /dev/st0 large_directory

works perfectly, and I can read it back without problem.

What software do you use for writing to tape? Or maybe the problem is in
the latest -ac tree only?

(HP has software that checks the hardware installation and drive health.
The software runs on Windows, and it supports firmware upgrade, simple drive
self-check, read/write check, etc. Highly recommended. Obviously, the software
is meant to help HP tech support: it generates a support ticket with the
internal state of the firmware, media recoverable error statistics, history
and the like. If the manufacturer of your tape drive has similar test
software, you might want to check the hardware using the vendor software.)
Re: SCSI Tape corruption - update
On Tue, 8 May 2001, Geert Uytterhoeven wrote:

> In the mean time I down/upgraded to 2.2.17 on my PPC box (CHRP LongTrail,
> Sym53c875, HP C5136A DDS1) and I can confirm that the problem does not happen
> under 2.2.17 either.
>
> My experiences:
>   - reading works fine, writing doesn't

Same here

>   - 2.2.x works fine, 2.4.x doesn't (at least since 2.4.0-test1-ac10)

SAME here

>   - hardware compression doesn't matter

SAME HERE

>   - I have a sym53c875, Lorenzo has an Adaptec, so most likely it's not a
>     SCSI hardware driver bug
>   - I have a PPC, Lorenzo doesn't, so it's not CPU-specific
>   - corruption is always a block of 32 bytes being replaced by 32 bytes from
>     the previous tape block (depending on block size!) (approx. 6 errors per
>     256 MB)

YESSS... EXACTLY 32 consecutive bytes are different. I'll bet we've got the
same problem

>   - How many successive bytes are corrupted?
>   - Where do the corrupted data come from?

Hmm. I'll set up some sort of binary pattern match. This afternoon I'll
pinpoint the source of the rogue bytes...

				-- Lorenzo Marcantonio
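[Archive note] The "binary pattern match" mentioned above could look something like the following. This is a sketch under assumptions, written in Python, and not the tool Lorenzo actually used, which the thread does not show. The function name locate_source and its parameters are hypothetical. Given the original data, the rogue bytes found on tape, and the offset where they turned up, it lists every place the rogue bytes occur in the original, expressed relative to the corrupted offset.

```python
def locate_source(original: bytes, rogue: bytes, corrupt_offset: int):
    """Find where `rogue` occurs in `original`.

    Returns a list of signed distances (source offset minus corrupt_offset).
    A distance of -block_size would mean the bytes came from the same
    position in the previous block.
    """
    distances = []
    pos = original.find(rogue)
    while pos != -1:
        distances.append(pos - corrupt_offset)
        pos = original.find(rogue, pos + 1)  # also catch overlapping matches
    return distances


if __name__ == "__main__":
    # 4096 bytes of 4-byte big-endian counters, so every aligned 32-byte
    # window is unique within the stream.
    original = b"".join(i.to_bytes(4, "big") for i in range(1024))
    rogue = original[488:520]  # pretend these bytes showed up at offset 1000
    print(locate_source(original, rogue, corrupt_offset=1000))
```

If the same distance keeps turning up across many bursts, that supports the "copy from the previous block" observation. Test data with repeated content, such as a stream of zeros, would produce many ambiguous matches, so a non-repeating pattern works better.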
Re: SCSI Tape corruption - update
On Mon, 7 May 2001, Lorenzo Marcantonio wrote:

> On Mon, 7 May 2001, Rob Turk wrote:
> > Have you ruled out hardware failures? There's been a few isolated reports
>
> That tape drive (Sony SDT-9000, less than 2 years of service) works
> perfectly on Windows NT (where it was before) and even on Linux 2.2
>
> Also the cartridge was brand new.

In the mean time I down/upgraded to 2.2.17 on my PPC box (CHRP LongTrail,
Sym53c875, HP C5136A DDS1) and I can confirm that the problem does not happen
under 2.2.17 either.

My experiences:
  - reading works fine, writing doesn't
  - 2.2.x works fine, 2.4.x doesn't (at least since 2.4.0-test1-ac10)
  - hardware compression doesn't matter
  - I have a sym53c875, Lorenzo has an Adaptec, so most likely it's not a
    SCSI hardware driver bug
  - I have a PPC, Lorenzo doesn't, so it's not CPU-specific
  - corruption is always a block of 32 bytes being replaced by 32 bytes from
    the previous tape block (depending on block size!) (approx. 6 errors per
    256 MB)

Lorenzo, can you please investigate the exact nature of the corruption on your
system?
  - How many successive bytes are corrupted?
  - Where do the corrupted data come from?

Gr{oetje,eeting}s,

						Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [EMAIL PROTECTED]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
							    -- Linus Torvalds
Re: SCSI Tape corruption - update
On Mon, 7 May 2001, Rob Turk wrote:
> Lorenzo,
>
> Have you ruled out hardware failures? There's been a few isolated reports

That tape drive (Sony SDT-9000, less than 2 years of service) works
perfectly on Windows NT (where it was before) and even on Linux 2.2

Also the cartridge was brand new.

(BTW, I've tried even with DC disabled. Well, it's REALLY fast dumping
/dev/zero on tape with DC enabled :) )

-- Lorenzo Marcantonio
Re: SCSI Tape corruption - update
"Lorenzo Marcantonio" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]...
>
> As of my latest build [2.4.5-pre1] I've STILL got the tape corruption
> problem. Some new facts:
>
> (1) It happens only writing the tape (tried exchanging tapes with a
> brand new Alpha Digital Tru64 box). I can read her tape, she can't read
> my tape. Tried with GNU tar and gzip.
>

Lorenzo,

Have you ruled out hardware failures? There's been a few isolated reports about
tape drives returning good status on write, where in fact they were writing
corrupt data. This can happen when the compression hardware is malfunctioning.
On many tape drives, read-back check isn't carried all the way back to the
original (uncompressed) data.

Rob
Re: SCSI Tape Corruption - update 2
In article <[EMAIL PROTECTED]> you write:
>On Fri, 13 Apr 2001, Nate Eldredge wrote:
>> (32 bytes is the size of a cache line.) A memory tester might be
>> something to try (I wrote a simple program that seemed to show the
>> error better than memtest86; can send it if desired.)
>
>Already tried that... this system has passed some 20 hours running
>memtest86...

I suggest you try Cerberus:

    https://sourceforge.net/projects/va-ctcs/

which will viciously beat your system to within an inch of its life.
If you have any motherboard problems, they're more likely to show up
with Cerberus than with a simple memtest.
--
Chip Salzenberg - a.k.a. - <[EMAIL PROTECTED]>
"We have no fuel on board, plus or minus 8 kilograms." -- NEAR tech
Re: SCSI Tape Corruption - update 2
On Fri, 13 Apr 2001, Nate Eldredge wrote:
> (32 bytes is the size of a cache line.) A memory tester might be
> something to try (I wrote a simple program that seemed to show the
> error better than memtest86; can send it if desired.)

Already tried that... this system has passed some 20 hours running
memtest86... Also I've got NO OTHER memory failure symptom (and the tape
fails only on writing)

-- Lorenzo Marcantonio
Re: SCSI Tape Corruption - update 2
On Fri, 13 Apr 2001, Nate Eldredge wrote:
> [EMAIL PROTECTED] wrote:
> > Well, the 2.2 distributed with Mandrake 7.2 works fine ... :)
> >
> > Hmmm... 32 CONSECUTIVE bytes are a very peculiar error. What can it be?
> >
> > Still experimenting...
>
> I once ran into a problem with 32-byte errors appearing in files, and
> later, in memory. I eventually traced it to buggy motherboard cache.
> (32 bytes is the size of a cache line.) A memory tester might be
> something to try (I wrote a simple program that seemed to show the
> error better than memtest86; can send it if desired.)

In that case I'd expect the problem to show up when doing whatever. So far I
could not find corrupted files on my hard disk, only when writing to tape, and
only with 2.3/2.4.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [EMAIL PROTECTED]
Re: SCSI Tape Corruption - update 2
[EMAIL PROTECTED] wrote:
> Well, the 2.2 distributed with Mandrake 7.2 works fine ... :)
>
> Hmmm... 32 CONSECUTIVE bytes are a very peculiar error. What can it be?
>
> Still experimenting...

I once ran into a problem with 32-byte errors appearing in files, and
later, in memory. I eventually traced it to buggy motherboard cache.
(32 bytes is the size of a cache line.) A memory tester might be
something to try (I wrote a simple program that seemed to show the
error better than memtest86; can send it if desired.)

--
Nate Eldredge
[EMAIL PROTECTED]
Re: SCSI Tape Corruption - update
On Fri, 13 Apr 2001, Geert Uytterhoeven wrote:
> On Thu, 12 Apr 2001 [EMAIL PROTECTED] wrote:
> > It seems that the tape is written incorrectly. I wrote some large file
> > (300MB) and read it back four times. The read copies are all the same.
> > They differ from the original only in 32 consecutive bytes (the replaced
> > values SEEM random). Of course, 32 bytes in 300MB tar.gz files are TOO
> > MUCH to be accepted :)
>
> In my case, the 32 bad bytes are always a copy of the 32 bytes 10K before
> (10K = blocksize of tar). Can you verify that's the case for you as well?
> For reference, I have approx. 6 sequences of corrupted data when writing
> 256 MB to tape. Reading gives no problems.

Forgot some things...

It also happens with dd, so it's not a bug in tar.

If I set the tar blocksize to 512 bytes, the offset changes to 512 bytes as
well. If I set the tar blocksize to 57*512 bytes, I didn't see a problem
(however, could have been `good luck').

The problem seems to be there since at least 2.4.0-test1-ac10, which means
quite some people may no longer have known good backups of their valuable data
(of course we should not run 2.[34].x kernels on our systems, right? :-)

Since you have a different SCSI host adapter, the problem is most likely in
st.c. I was thinking of writing `predictable' data (or checksummed blocks or
so) to tape and adding some data verification tests to st.c at the very last
moment before it sends a write command to the SCSI host adapter, but I haven't
found time for that yet.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [EMAIL PROTECTED]
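Editorial sketch, not part of the original posting: the `predictable' data
idea could look roughly like the user-space generator and checker below. The
block layout (a block number and a CRC32 in front of a payload that repeats
the block number) and the names `make_block`, `check_block` and `bad_blocks`
are assumptions made for illustration. This is not the st.c verification
Geert describes, only a way to produce test data whose corruption is easy to
spot and to trace back to the block it came from.

```python
import struct
import zlib

# Assumed header layout: 32-bit block number, 32-bit CRC32 of the payload.
HEADER = struct.Struct(">II")


def make_block(number: int, block_size: int) -> bytes:
    """Build one self-describing block of exactly block_size bytes."""
    payload_len = block_size - HEADER.size
    # The payload repeats the block number, so stale bytes from another
    # block are recognisable when the data is inspected by hand.
    payload = (struct.pack(">I", number) * (payload_len // 4 + 1))[:payload_len]
    return HEADER.pack(number, zlib.crc32(payload)) + payload


def check_block(block: bytes, expected_number: int) -> bool:
    """True if the block carries the expected number and a matching CRC."""
    number, crc = HEADER.unpack_from(block)
    return number == expected_number and zlib.crc32(block[HEADER.size:]) == crc


def bad_blocks(stream: bytes, block_size: int):
    """Return the indices of all blocks in `stream` that fail verification."""
    return [
        n
        for n in range(len(stream) // block_size)
        if not check_block(stream[n * block_size:(n + 1) * block_size], n)
    ]
```

A stream built with `make_block` could be written to the tape device with dd
at the matching block size, read back, and passed to `bad_blocks`.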
Re: SCSI Tape Corruption - update
On Thu, 12 Apr 2001 [EMAIL PROTECTED] wrote:
> Still experimenting with my SDT-9000... tried connecting it to another
> controller (2940AU in place of 2904, sorry but I've only Adaptec stuff :).
> Same problem. Tried with another tape (even with an old DDS-2 tape). Same.
> Even tried another cable/removing the CDWR drive from the bus.
>
> It seems that the tape is written incorrectly. I wrote some large file
> (300MB) and read it back four times. The read copies are all the same.
> They differ from the original only in 32 consecutive bytes (the replaced
> values SEEM random). Of course, 32 bytes in 300MB tar.gz files are TOO
> MUCH to be accepted :)

As Gérard already replied, I have the same problem on my PPC box (cf. my
postings last month) with a DDS-1 tape drive. It has 2 SCSI adapters (MESH and
Sym53c875), and it seems to happen with the '875 only (but the MESH sucks
anyway and has other problems making it unusable for my DDS-1).

In my case, the 32 bad bytes are always a copy of the 32 bytes 10K before (10K
= blocksize of tar). Can you verify that's the case for you as well? For
reference, I have approx. 6 sequences of corrupted data when writing 256 MB to
tape. Reading gives no problems.

The problem does not appear in 2.2.13 (yep, that's old, but so far the latest
2.2.x kernel that runs on my CHRP LongTrail). I have to fix later kernels
first.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [EMAIL PROTECTED]
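Editorial sketch, not part of the original posting: the check Geert asks for
(are the bad bytes a copy of the data one tape block earlier?) could be
scripted along these lines. The function name `stale_copies` and the default
run length of 32 are assumptions for illustration. `block_size` should be the
blocking factor actually used when writing, for example 10240 for tar's 10K
default mentioned above.

```python
def stale_copies(original: bytes, readback: bytes, block_size: int, run_len: int = 32):
    """For each corrupted run, report whether it equals the data one block earlier.

    Returns a list of (offset, is_stale) pairs, where offset is the first
    differing byte of a run and is_stale is True when the next run_len bytes
    read back match the original bytes located block_size earlier.
    """
    results = []
    n = min(len(original), len(readback))
    i = 0
    while i < n:
        if original[i] == readback[i]:
            i += 1
            continue
        src = i - block_size
        stale = src >= 0 and readback[i:i + run_len] == original[src:src + run_len]
        results.append((i, stale))
        i += run_len               # skip the rest of this run
    return results
```

If every entry comes back with `is_stale` set, the corruption matches the
pattern reported here. A mix of results would suggest a different source for
the bad bytes.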
Re: SCSI Tape Corruption - update 2
On Thu, 12 Apr 2001, Gérard Roudier wrote:
> using a sym53c875 controller. In this case, kernel 2.2 was fine.
>
> > Now I'll build some old 2.2 kernel to try...
>
> If 2.2 is ok with your tape, a software error in 2.4 gets very likely, in
> my opinion.

Well, the 2.2 distributed with Mandrake 7.2 works fine ... :)

Hmmm... 32 CONSECUTIVE bytes are a very peculiar error. What can it be?

Still experimenting...

-- Lorenzo Marcantonio
Re: SCSI Tape Corruption - update
On Thu, 12 Apr 2001 [EMAIL PROTECTED] wrote:
> Still experimenting with my SDT-9000... tried connecting it to another
> controller (2940AU in place of 2904, sorry but I've only Adaptec stuff :).
> Same problem. Tried with another tape (even with an old DDS-2 tape). Same.
> Even tried another cable/removing the CDWR drive from the bus.
>
> It seems that the tape is written incorrectly. I wrote some large file
> (300MB) and read it back four times. The read copies are all the same.
> They differ from the original only in 32 consecutive bytes (the replaced
> values SEEM random). Of course, 32 bytes in 300MB tar.gz files are TOO
> MUCH to be accepted :)

A similar problem has been reported under Linux/PPC a couple of weeks ago
using a sym53c875 controller. In this case, kernel 2.2 was fine.

> Now I'll build some old 2.2 kernel to try...

If 2.2 is ok with your tape, a software error in 2.4 gets very likely, in
my opinion.

Gérard.
Re: SCSI Tape Corruption - update
[EMAIL PROTECTED] wrote:
> It seems that the tape is written incorrectly. I wrote some large file
> (300MB) and read it back four times. The read copies are all the same.
> They differ from the original only in 32 consecutive bytes (the replaced
> values SEEM random). Of course, 32 bytes in 300MB tar.gz files are TOO
> MUCH to be accepted :)

Several years ago I ran into a problem with similar symptoms on an old Adaptec
AHA-154X controller. Files (and most certainly "file systems" if I had
persisted) on my hard disk were getting corrupted in random places with
constant length strings of garbage. This turned out to be an inappropriate
setting for the AHA1542_SCATTER constant: it *was* 16, and setting it to 8
fixed my problem.

I'd look for a similar "#define" in the header file for your SCSI device
driver and try cutting the value by half. Why "half"? No justification other
than it worked for me, and it's a power-of-two kind of thing that hardware
seems to like :-).

--Bob