Bug#354654: general: fat32 gets corrupted

2006-03-02 Thread Juan Piñeros
Hello,

Since I need to keep working with machine1, I finally did an apt-get
dist-upgrade on it and installed Debian kernel 2.6.15-1-686. I erased the
fat32 partition, replaced it with an ext2 one, and gave access to it from
Windows using fs-driver. I will see whether the problems are still present
(that is, whether I continue to lose directories).

On machine2 I will also upgrade the kernel, but without erasing the fat32
partition, to have a chance to see whether the problem comes from the kernel.
If the kernel is not the problem, I will also remove hdparm from machine2.

I will let you know the results in the coming weeks, because if it is a bug,
it is worth reporting.

All the best,
Juan.






Bug#354654: general: fat32 gets corrupted

2006-02-28 Thread Juan Piñeros
Hello Cesare,

On machine1, hdparm is not currently installed, but it was one year ago, when
machine1 had woody installed. I suppose hdparm does not change anything on
the disk itself, but only in the IDE modules of the kernel?

On machine2, hdparm is installed, but I do not remember having changed
anything in its configuration (the log file I was maintaining for machine2
was lost during the disk crash). Do I have to check something, or should I
simply uninstall it?

Thanks for your reply,
Juan.

On Tuesday 28 February 2006 01:56, Cesare Leonardi wrote:
 Juan Piñeros wrote:
  I do not find any logical explanation. No strange message in syslog, we
  used normal programs (konqueror, thunderbird, oowriter) when suddenly,
  when trying to save a file or read mail, an error appeared just saying
  that the directories did not exist any more.

 In the past I had a similar problem: sometimes, with no apparent
 regularity, some files simply got corrupted (the filesystem was ext3).
 I simply couldn't understand what the cause could be, since the hard disk
 seemed to be OK.
 Then I remembered having played with hdparm and put an optimized
 hdparm command line in a boot script.
 After I commented out that line, I had no more corruption.

 I don't know if this can be your case.
 Regards.

 Cesare.



Bug#354654: general: fat32 gets corrupted

2006-02-28 Thread Juan Piñeros
Hello Jacques,

I reply here below:

On Tuesday 28 February 2006 00:56, jacques Normand wrote:
 On Mon, Feb 27, 2006 at 11:33:19PM +0100, Juan Piñeros wrote:
    1 Raw_Read_Error_Rate     0x000d   100   100   050    Pre-fail  Offline      -       51
  195 Hardware_ECC_Recovered  0x001a   100   100   000    Old_age   Always       -       2
  199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       9

 This is where your issue seems to live. I have never seen the read
 error and ECC corrected numbers not matching. It would mean that an error
 occurred but there was no way to make it right, so I would expect the
 read to be garbage... Did you see any corruption in your files? I mean
 data corrupted instead of metadata?
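
The comparison described above (raw read errors against ECC-recovered
errors, plus the UDMA CRC count) can be sketched in a few lines of Python.
This is only an illustration, under the assumption that the rows follow the
ten-column smartctl -A layout quoted in this thread. The names
parse_attributes, leading_int and summarize are hypothetical helpers, not
part of smartmontools, and raw values are vendor-specific, so a mismatch is
a hint to look closer rather than a diagnosis.

```python
import re

# Sample rows copied from the smartctl -A output quoted in this thread.
SAMPLE = """\
  1 Raw_Read_Error_Rate     0x000d   100   100   050    Pre-fail  Offline      -       51
195 Hardware_ECC_Recovered  0x001a   100   100   000    Old_age   Always       -       2
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       9
"""

# ID, name, flag, value, worst, thresh, type, updated, when_failed, raw value.
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+(0x[0-9a-fA-F]+)\s+(\d+)\s+(\d+)\s+(\d+)\s+"
    r"(\S+)\s+(\S+)\s+(\S+)\s+(.*)$"
)


def parse_attributes(text):
    """Return {attribute_id: (name, raw_value_string)} for each table row."""
    attrs = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m:  # header and banner lines do not match and are skipped
            attrs[int(m.group(1))] = (m.group(2), m.group(10).strip())
    return attrs


def leading_int(raw):
    """Return the leading integer of a raw value, or None if there is none."""
    m = re.match(r"\d+", raw)
    return int(m.group()) if m else None


def summarize(attrs):
    """Return notes on the three attributes discussed in the thread."""
    notes = []
    read_err = leading_int(attrs[1][1]) if 1 in attrs else None
    ecc = leading_int(attrs[195][1]) if 195 in attrs else None
    crc = leading_int(attrs[199][1]) if 199 in attrs else None
    if read_err is not None and ecc is not None and read_err != ecc:
        notes.append(
            f"raw read errors ({read_err}) differ from ECC recovered ({ecc})"
        )
    if crc:
        notes.append(f"UDMA CRC error count is non-zero ({crc})")
    return notes


if __name__ == "__main__":
    for note in summarize(parse_attributes(SAMPLE)):
        print(note)
```

To check a live disk instead of the sample, the text passed to
parse_attributes would be the saved output of smartctl -A for that device.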

* IN MACHINE1:

On machine1, the files that were recovered after the crash had been converted
to 8-character DOS names, but the files I tried to open were OK (I cannot
open all of them, since there are 10GB of them). There were some
unrecoverable files (according to the recovery software), but I suppose these
were previously deleted files that had been partially overwritten before the
crash.

I never saw a file on machine1 that was corrupted but still existed: the
files were OK, then suddenly a whole directory was lost.

One thing I remember now is that I installed smartmontools on machine1 two
months before the crash (the disk had been in use for 2 years without
problems before that), but this may be unrelated to the crash.

* IN MACHINE2:

Here, the most recent files that were lost and then recovered were totally
corrupted. However, we avoided writing to the disk from Linux, and the disk
recovery function of Windows (the System Volume Information) was disabled.


 Also, you say that sata does not support smart. That is not true, with
 one of the very recent kernels (2.6.15.4), you can get them. I have not
 much experience with the kernels shipped with debian. I always recompile
 my own. But some problems I had with an nfs server (in an HPC system)
 vanished when I upgraded from 2.6.12 to 2.6.14. There was a bug with the
 futex, and I think that was the source of my problems (race conditions
 are always nasty).

I can try to install a more recent kernel to get SMART working, but the disk
is new, so is it useful? I have another question: why did the etch installer
choose the 2.6.12-1-386 kernel? Why not the 686 one? Maybe installing the 686
kernel can help? In any case, does this problem on machine2 remain classified
as a bug, and not as a problem related to user choices?


 As for the udma crc? That usually means that your controller/cable is
 going bad. Each time I have seen that, the whole system crashed,
 corrupting files everywhere... It is pretty odd that you see the thing
 on two different systems though.

So it seems that I have two different problems on the two machines? And it
seems that machine1's disk should be replaced?


 jacques

 PS: With development kernels, always try to use the latest. Especially
 when you see a problem. (And I still consider the 2.6 as being a
 development version)

So if you install a very recent kernel from kernel.org, do you have to apply
a lot of patches before it works? Would it be better to install a stock
Debian 2.4 kernel on machine2? It can be difficult to downgrade from 2.6 to
2.4 (udev, etc.).

Thanks,
Juan.



Bug#354654: general: fat32 gets corrupted

2006-02-27 Thread Juan Piñeros
Package: general
Severity: critical

Dear all,

I had two vfat crashes, rather similar, on two machines (laptops) running
Debian, dual-booted with Windows XP:

- machine1 (compaq nx9010 with celeron): ide disk 30GB (-- System Information: 
Debian Release: 3.1, Architecture: i386 (i686), Kernel: Linux 2.6.8-1-686, 
Locale: [EMAIL PROTECTED], [EMAIL PROTECTED] (charmap=ISO-8859-15), libc6 
Version: 2.3.2.ds1-22)

- machine2 (ibm thinkpad r52 with pentium M): etch, kernel: 2.6.12-1-386, sata 
disk 80GB, (no physical access at the moment to this machine so no way for 
more system information).

On machine1's vfat partition, 40% of the directories were lost (however, they
were recoverable under Windows using commercial crash recovery software). On
machine2, 100% was lost in a first crash, but it is random: the problem
repeats from time to time, and not everything is always lost. Sometimes we
lost what was used the day before, and sometimes everything except the files
we used. The lost directories are not always located at the same level.

I do not find any logical explanation. There was no strange message in
syslog; we were using normal programs (konqueror, thunderbird, oowriter) when
suddenly, while trying to save a file or read mail, an error appeared just
saying that the directories did not exist any more.

In both cases, the fat32 partition was formatted under Windows.

I had smartd running on machine1 and saw nothing strange (except maybe a high
disk temperature, around 53°C). On machine2, smartd is not installable since
it does not work with sata.

Below are the fstab files, plus a smart test on machine1. It was difficult to
install sata support on machine2; the strange procedure I used is also
described below. There is also an lspci for machine1 and machine2.
-

machine1:

cat /etc/fstab
# /etc/fstab: static file system information.
#
# file system   mount point     type    options                                  dump pass
proc            /proc           proc    defaults                                 0    0
#cd-rom
/dev/hdc        /media/cdrom0   iso9660 ro,user,noauto                           0    0
#hard disk
/dev/hda3       /               ext3    defaults,errors=remount-ro               0    1
/dev/hda2       none            swap    sw                                       0    0
/dev/hda6       /mnt/hda6       ext3    defaults                                 0    2
/dev/hda1       /mnt/hda1       ntfs    gid=1002,user,ro,umask=002,noexec,nosuid 0    0
/dev/hda5       /mnt/hda5       vfat    gid=1002,user,rw,umask=002,noexec,nosuid

***

#smartctl -A /dev/hda

smartctl version 5.32 Copyright (C) 2002-4 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000d   100   100   050    Pre-fail  Offline      -       51
  2 Throughput_Performance  0x0005   100   100   050    Pre-fail  Offline      -       3950
  3 Spin_Up_Time            0x0007   100   100   050    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   099   099   000    Old_age   Always       -       1912
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   100   100   050    Pre-fail  Always       -       932
  8 Seek_Time_Performance   0x0005   100   100   050    Pre-fail  Offline      -       1224
  9 Power_On_Minutes        0x0032   092   092   000    Old_age   Always       -       4066h+47m
 10 Spin_Retry_Count        0x0013   100   100   050    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   099   099   000    Old_age   Always       -       1876
191 G-Sense_Error_Rate      0x000a   100   100   000    Old_age   Always       -       7
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       285
193 Load_Cycle_Count        0x0032   077   077   000    Old_age   Always       -       143904/143618
194 Temperature_Celsius     0x0022   076   052   000    Old_age   Always       -       52 (Lifetime Min/Max 64/8)
195 Hardware_ECC_Recovered  0x001a   100   100   000    Old_age   Always       -       2
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       9
200 Multi_Zone_Error_Rate   0x0012   100   100   000    Old_age   Always       -       0
201 Soft_Read_Error_Rate    0x0012   100   100   000    Old_age   Always       -       0
223 Load_Retry_Count        0x0012   100   100   000    Old_age   Always       -       0
230 Head_Amplitude          0x0032   094   094   000    Old_age   Always       -       198978
250 Read_Error_Retry_Rate   0x000a   100   100