On Wednesday 20 June 2007 00:56:00 Darryl Gregorash wrote:
> You'll need to give us a lot more information about your system hardware
> (including the modules that are loaded for hard drive i/o), plus
> information from /var/log/messages about what is happening when the
> filesystem goes RO.

OK. I will give as much as I can, so this mail is a bit long ...

I have partly solved the problem by keeping to one FS per drive, as suggested by Carl Hartung. Thanks, Carl.

On Tuesday 19 June 2007 23:47:43 Carl Hartung wrote:
> On Tue June 19 2007 17:11, [EMAIL PROTECTED] wrote:
> <snip>
> >... Can using different FS's in one system cause such problems?
>
> Theoretically, no, but in actual fact there are circumstances where conflicts
> *can* arise.
>
> In my case... with this specific chipset and corresponding kernel IDE
> controller module... cache buffering is enabled or disabled on a per drive
> basis. Running disparate filesystem types in adjacent partitions on the same
> drive (i.e. reiserfs + ext3) triggered errors comparable to those you're
> experiencing now.
>
> I ultimately coaxed those errors away permanently by standardizing my
> installations to using only one journaling filesystem type per drive.
> The system is much more stable.

################################

Last night, however, it happened again!! I put my mobile phone on a USB port and left it there to charge the battery, thinking nothing of it. Only when I accessed it did something go wrong: the files disappeared after the listing. The USB device was detached automatically from the USB hub. Again thinking nothing of it, I attached it directly to a USB port on the motherboard. I then wanted to install from a DVD mounted as /dev/hdd and did a lot of disk access; the system went RO FS again. I went to bed ....

On Wednesday 20 June 2007 00:56:00 Darryl Gregorash wrote:
> I tend to doubt that the specific filesystem(s) in use have anything at
> all to do with this, but the high disk access probably does. There is a
> thread on Dell about problems with the MegaRAID sas driver (module name
> megasas) --
> http://lists.us.dell.com/pipermail/linux-poweredge/2007-March/029974.html
> -- but you have not given enough information for anyone to know if this
> is relevant to your problem.
> Grep /var/log/messages for "megasas".

sudo more /var/log/messages | grep "megasys"
reports nothing.

Looking at the logs again this morning, I noticed these entries for the SCSI device /dev/sda1:

[EMAIL PROTECTED]:~> sudo more /var/log/messages | grep "sda"
Jun 30 22:43:28 sico kernel: SCSI device sda: 3903488 512-byte hdwr sectors (1999 MB)
Jun 30 22:43:28 sico kernel: sda: Write Protect is off
Jun 30 22:43:28 sico kernel: sda: Mode Sense: 00 6a 00 00
Jun 30 22:43:28 sico kernel: sda: assuming drive cache: write through
Jun 30 22:43:28 sico kernel: SCSI device sda: 3903488 512-byte hdwr sectors (1999 MB)
Jun 30 22:43:28 sico kernel: sda: Write Protect is off
Jun 30 22:43:28 sico kernel: sda: Mode Sense: 00 6a 00 00
Jun 30 22:43:28 sico kernel: sda: assuming drive cache: write through
Jun 30 22:43:28 sico kernel: sda: sda1
Jun 30 22:43:28 sico kernel: sd 0:0:0:0: Attached scsi removable disk sda
Jun 30 22:43:30 sico hald: mounted /dev/sda1 on behalf of uid 1000
Jun 30 22:47:57 sico kernel: sda: Current: sense key: No Sense
... (repeated many times) ...
Jun 30 22:47:58 sico kernel: end_request: I/O error, dev sda, sector 14464
Jun 30 22:47:59 sico hald: unmounted /dev/sda1 from '/media/disk' on behalf of uid 0
Jun 30 22:47:59 sico kernel: SCSI device sda: 3903488 512-byte hdwr sectors (1999 MB)
Jun 30 22:47:59 sico kernel: sda: Write Protect is off
Jun 30 22:47:59 sico kernel: sda: Mode Sense: 00 6a 00 00
Jun 30 22:47:59 sico kernel: sda: assuming drive cache: write through
Jun 30 22:47:59 sico kernel: SCSI device sda: 3903488 512-byte hdwr sectors (1999 MB)
Jun 30 22:47:59 sico kernel: sda: Write Protect is off
Jun 30 22:47:59 sico kernel: sda: Mode Sense: 00 6a 00 00
Jun 30 22:47:59 sico kernel: sda: assuming drive cache: write through
Jun 30 22:47:59 sico kernel: sda: sda1
Jun 30 22:47:59 sico kernel: SCSI device sda: 3903488 512-byte hdwr sectors (1999 MB)
Jun 30 22:47:59 sico kernel: sda: Write Protect is off
Jun 30 22:47:59 sico kernel: sda: Mode Sense: 00 6a 00 00
Jun 30 22:47:59 sico kernel: sda: assuming drive cache: write through
Jun 30 22:47:59 sico kernel: sda: sda1
Jun 30 22:48:01 sico hald: mounted /dev/sda1 on behalf of uid 1000
Jun 30 22:48:26 sico kernel: sda: Current: sense key: No Sense
Jun 30 22:48:27 sico kernel: end_request: I/O error, dev sda, sector 9152
Jun 30 22:48:27 sico kernel: end_request: I/O error, dev sda, sector 9152
... (repeated many times) ...
Jun 30 22:48:27 sico kernel: end_request: I/O error, dev sda, sector 19328
Jun 30 22:48:27 sico hald: unmounted /dev/sda1 from '/media/disk' on behalf of uid 0
Jun 30 22:48:29 sico kernel: SCSI device sda: 3903488 512-byte hdwr sectors (1999 MB)
Jun 30 22:48:29 sico kernel: sda: Write Protect is off
Jun 30 22:48:29 sico kernel: sda: Mode Sense: 00 6a 00 00
Jun 30 22:48:29 sico kernel: sda: assuming drive cache: write through
Jun 30 22:48:29 sico kernel: SCSI device sda: 3903488 512-byte hdwr sectors (1999 MB)
Jun 30 22:48:29 sico kernel: sda: Write Protect is off
Jun 30 22:48:29 sico kernel: sda: Mode Sense: 00 6a 00 00
Jun 30 22:48:29 sico kernel: sda: assuming drive cache: write through
Jun 30 22:48:29 sico kernel: sda: sda1
Jun 30 22:48:31 sico hald: mounted /dev/sda1 on behalf of uid 1000
Jun 30 22:48:37 sico kernel: sda: Current: sense key: No Sense
Jun 30 22:48:37 sico kernel: end_request: I/O error, dev sda, sector 40320
... (repeated many times) ...
Jun 30 22:48:38 sico kernel: end_request: I/O error, dev sda, sector 46144
Jun 30 22:48:38 sico hald: unmounted /dev/sda1 from '/media/disk' on behalf of uid 0
Jun 30 22:48:40 sico kernel: SCSI device sda: 3903488 512-byte hdwr sectors (1999 MB)
Jun 30 22:48:40 sico kernel: sda: Write Protect is off
Jun 30 22:48:40 sico kernel: sda: Mode Sense: 00 6a 00 00
Jun 30 22:48:40 sico kernel: sda: assuming drive cache: write through
Jun 30 22:48:40 sico kernel: SCSI device sda: 3903488 512-byte hdwr sectors (1999 MB)
Jun 30 22:48:40 sico kernel: sda: Write Protect is off
Jun 30 22:48:40 sico kernel: sda: Mode Sense: 00 6a 00 00
Jun 30 22:48:40 sico kernel: sda: assuming drive cache: write through
Jun 30 22:48:40 sico kernel: sda: sda1
Jun 30 22:48:41 sico hald: mounted /dev/sda1 on behalf of uid 1000
Jun 30 22:48:45 sico kernel: sda: Current: sense key: No Sense
Jun 30 22:48:45 sico kernel: end_request: I/O error, dev sda, sector 41280
Jun 30 22:48:45 sico kernel: end_request: I/O error, dev sda, sector 41280
... (repeated many times) ...
Jun 30 22:48:46 sico kernel: end_request: I/O error, dev sda, sector 73344
Jun 30 22:48:47 sico hald: unmounted /dev/sda1 from '/media/disk' on behalf of uid 0
Jun 30 22:48:47 sico kernel: SCSI device sda: 3903488 512-byte hdwr sectors (1999 MB)
Jun 30 22:48:47 sico kernel: sda: Write Protect is off
Jun 30 22:48:47 sico kernel: sda: Mode Sense: 00 6a 00 00
Jun 30 22:48:47 sico kernel: sda: assuming drive cache: write through
Jun 30 22:48:47 sico kernel: SCSI device sda: 3903488 512-byte hdwr sectors (1999 MB)
Jun 30 22:48:47 sico kernel: sda: Write Protect is off
Jun 30 22:48:47 sico kernel: sda: Mode Sense: 00 6a 00 00
Jun 30 22:48:47 sico kernel: sda: assuming drive cache: write through
Jun 30 22:48:47 sico kernel: sda: sda1
Jun 30 22:48:48 sico hald: mounted /dev/sda1 on behalf of uid 1000
Jun 30 22:48:50 sico kernel: sda: Current: sense key: No Sense
Jun 30 22:48:50 sico kernel: end_request: I/O error, dev sda, sector 1985
Jun 30 22:48:50 sico kernel: end_request: I/O error, dev sda, sector 41088
... (repeated many times) ...
Jun 30 22:48:51 sico kernel: end_request: I/O error, dev sda, sector 50624
Jun 30 22:48:51 sico hald: unmounted /dev/sda1 from '/media/disk' on behalf of uid 0
Jun 30 22:48:53 sico kernel: SCSI device sda: 3903488 512-byte hdwr sectors (1999 MB)
Jun 30 22:48:53 sico kernel: sda: Write Protect is off
Jun 30 22:48:53 sico kernel: sda: Mode Sense: 00 6a 00 00
Jun 30 22:48:53 sico kernel: sda: assuming drive cache: write through
Jun 30 22:48:53 sico kernel: SCSI device sda: 3903488 512-byte hdwr sectors (1999 MB)
Jun 30 22:48:53 sico kernel: sda: Write Protect is off
Jun 30 22:48:53 sico kernel: sda: Mode Sense: 00 6a 00 00
Jun 30 22:48:53 sico kernel: sda: assuming drive cache: write through
Jun 30 22:48:53 sico kernel: sda: sda1
Jun 30 22:48:54 sico hald: mounted /dev/sda1 on behalf of uid 1000
Jun 30 22:48:59 sico kernel: sda: Current: sense key: No Sense
Jun 30 22:48:59 sico kernel: end_request: I/O error, dev sda, sector 41088
... (repeated many times) ...

>
> One writer in that thread (on Dell) writes "the problem is that the
> Linux kernel's SCSI layer insists on a single timeout for all SCSI
> requests, and doesn't tolerate high variances in command completion
> times. If any single command times out, it resets the whole bus, even if
> there is still significant activity." This suggests that the problem is
> more widespread than just a RAID issue. This is that writer's message --
> http://lists.us.dell.com/pipermail/linux-poweredge/2007-March/029982.html
> -- and it contains a suggestion that may be of use to you.
>

I found the mail of Joe Malicki (http://lists.us.dell.com/pipermail/linux-poweredge/2007-March/029982.html) about this topic and changed the SCSI timeout:

[EMAIL PROTECTED]:~> more /sys/block/sda/device/timeout
60
[EMAIL PROTECTED]:~> sudo echo 120 > /sys/block/sda/device/timeout
bash: /sys/block/sda/device/timeout: Permission denied
[EMAIL PROTECTED]:~> su -
Password:
sico:~ # echo 120 > /sys/block/sda/device/timeout

Current state:
...
Jul 1 14:47:35 sico kernel: SCSI device sda: 3903488 512-byte hdwr sectors (1999 MB)
Jul 1 14:47:35 sico kernel: sda: Write Protect is off
Jul 1 14:47:35 sico kernel: sda: Mode Sense: 00 6a 00 00
Jul 1 14:47:35 sico kernel: sda: assuming drive cache: write through
Jul 1 14:47:35 sico kernel: SCSI device sda: 3903488 512-byte hdwr sectors (1999 MB)
Jul 1 14:47:35 sico kernel: sda: Write Protect is off
Jul 1 14:47:35 sico kernel: sda: Mode Sense: 00 6a 00 00
Jul 1 14:47:35 sico kernel: sda: assuming drive cache: write through
Jul 1 14:47:35 sico kernel: sda: sda1
Jul 1 14:47:51 sico hald: mounted /dev/sda1 on behalf of uid 1000

The lines:

Jun 30 22:48:59 sico kernel: sda: Current: sense key: No Sense
Jun 30 22:48:59 sico kernel: end_request: I/O error, dev sda, sector 41088
... (repeated many times) ...

no longer seem to appear after extensive disk access, as they did before.

################################

I am not sure what to make of the read-only (RO) messages in the last lines of /var/log/messages. Could it be that they just report that the DVD is RO?

Jul 1 18:18:29 sico sudo: sico : TTY=pts/1 ; PWD=/home/sico ; USER=root ; COMMAND=/bin/more /var/log/messages
Jul 1 18:20:29 sico kernel: ISO 9660 Extensions: Microsoft Joliet Level 3
Jul 1 18:20:29 sico kernel: ISO 9660 Extensions: RRIP_1991A
Jul 1 18:20:29 sico hald: mounted /dev/hdd on behalf of uid 1000
Jul 1 18:21:53 sico gconfd (sico-5635): GConf server is not in use, shutting down.
Jul 1 18:21:53 sico gconfd (sico-5635): Exiting
Jul 1 18:26:43 sico gconfd (sico-18750): starting (version 2.14.0), pid 18750 user 'sico'
Jul 1 18:26:43 sico gconfd (sico-18750): Resolved address "xml:readonly:/etc/opt/gnome/gconf/gconf.xml.mandatory" to a read-only configuration source at position 0
Jul 1 18:26:43 sico gconfd (sico-18750): Resolved address "xml:readwrite:/home/sico/.gconf" to a writable configuration source at position 1
Jul 1 18:26:43 sico gconfd (sico-18750): Resolved address "xml:readonly:/etc/opt/gnome/gconf/gconf.xml.defaults" to a read-only configuration source at position 2
Jul 1 18:26:43 sico gconfd (sico-18750): Resolved address "xml:readonly:/etc/opt/gnome/gconf/gconf.xml.schemas" to a read-only configuration source at position 3
Jul 1 18:27:13 sico gconfd (sico-18750): GConf server is not in use, shutting down.

################################

Is it normal for USB to use the SCSI layer? Can the SCSI layer be avoided? Can it be changed to IDE like /dev/hde? :-)

Al
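
A note on the "Permission denied" from the "sudo echo 120 > /sys/block/sda/device/timeout" attempt in the transcript above: the redirection is opened by the calling (unprivileged) shell before sudo starts, so only echo itself runs as root. Piping into "sudo tee" is a common way around that without switching to a root shell. The following is only a sketch under assumptions: the function name set_scsi_timeout and the SYSFS_ROOT variable are hypothetical helpers invented here (SYSFS_ROOT exists only so the function can be tried against a scratch directory), and sudo is assumed to be configured for the user.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): set the SCSI command timeout,
# in seconds, for one block device such as "sda".
# SYSFS_ROOT is an assumption for dry runs; it defaults to the real /sys.
set_scsi_timeout() {
    dev=$1
    secs=$2
    file="${SYSFS_ROOT:-/sys}/block/$dev/device/timeout"

    # Bail out early if the device has no timeout attribute.
    if [ ! -e "$file" ]; then
        echo "no timeout file for device: $dev" >&2
        return 1
    fi

    if [ -w "$file" ]; then
        # Already privileged (or a scratch copy): write directly.
        printf '%s\n' "$secs" > "$file"
    else
        # tee runs under sudo, so it is tee (as root) that opens the file,
        # not the unprivileged shell.
        printf '%s\n' "$secs" | sudo tee "$file" > /dev/null
    fi

    # Show the value now in effect.
    cat "$file"
}
```

Called as "set_scsi_timeout sda 120", this should have the same effect as the "su -" followed by echo shown in the transcript. Values written under /sys are not stored anywhere, so the setting would have to be repeated after a reboot, and presumably also after the USB device is detached and attached again.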
