Re: [reiserfs-list] O/T but expert answer needed: MS says NTFS does full data journaling

2002-02-14 Thread Paul Robertson

 On Thu, 14 Feb 2002 02:58, [EMAIL PROTECTED] wrote:
  On Wed, 13 Feb 2002 12:26:59 +1300, Adam Warner [EMAIL PROTECTED] said:
  Does Windows journal the metadata, data or both?
  
  Answer:  Windows NT/2000 systems that utilize NTFS since NT3.1 have
  always journalled and logged metadata and data, so we've been doing
  this for close to a decade.
  
   I just want to confirm if this is in fact true. I can't find a
 
  Hint:  If they journal both, why do you ever hear of people getting
  corrupted filesystems when the box BSOD's?
 
  (No, I don't know if it does or not - but I've heard *too* many people say
  "It hosed the disk and I had to reinstall" for me to think that it's done
  correctly)

 When a machine gets an Oops or BSOD condition then the kernel is inherently
 doing improper and unpredictable things with memory.  Therefore regardless of
 what file system you use it could get trashed and data could get lost.

 Oops conditions are generally rare on Linux machines so this shouldn't be
 much of an issue.  BSOD on NT is quite common...

 --
 http://www.coker.com.au/bonnie++/ Bonnie++ hard drive benchmark
 http://www.coker.com.au/postal/   Postal SMTP/POP benchmark
 http://www.coker.com.au/projects.html Projects I am working on
 http://www.coker.com.au/~russell/ My home page

IMO oops and BSOD are quite different. There are many possible reasons why
an NT kernel component might decide to call KeBugCheck() which generates the
BSOD. I have a book which lists around 100 common bugcheck codes. In
particular, NT can be configured to dump the system state to a file on the
boot partition when a crash occurs.
--
Paul Robertson






Re: [reiserfs-list] O/T but expert answer needed: MS says NTFS does full data journaling

2002-02-14 Thread Russell Coker

On Thu, 14 Feb 2002 20:25, Paul Robertson wrote:
  When a machine gets an Oops or BSOD condition then the kernel is inherently
  doing improper and unpredictable things with memory.  Therefore regardless
  of what file system you use it could get trashed and data could get lost.
 
  Oops conditions are generally rare on Linux machines so this shouldn't be
  much of an issue.  BSOD on NT is quite common...

 IMO oops and BSOD are quite different. There are many possible reasons why
 an NT kernel component might decide to call KeBugCheck() which generates
 the BSOD. I have a book which lists around 100 common bugcheck codes. In
 particular, NT can be configured to dump the system state to a file on the
 boot partition when a crash occurs.

There are also a couple of Linux kernel patches to support dumping the memory 
to the swap partition on an Oops, and an Oops can be triggered by any 
condition that some kernel code considers Oops-worthy.

IMHO the biggest difference between an Oops and a BSOD is that a machine 
doesn't totally die after an Oops (which can be considered a good or a bad 
thing).

-- 
http://www.coker.com.au/bonnie++/ Bonnie++ hard drive benchmark
http://www.coker.com.au/postal/   Postal SMTP/POP benchmark
http://www.coker.com.au/projects.html Projects I am working on
http://www.coker.com.au/~russell/ My home page



[reiserfs-list] worried about error-messages in /var/log/messages

2002-02-14 Thread Mark Rosa

dear all,

i am new to reiserfs and not really a kernel or filesystem hacker.
therefore i am worried about some messages i find quite regularly in 
/var/log/messages .

my system is:
x386 with ide-harddisks running SUSE Linux 7.2 with ReiserFS on the 
data-disk-partitions (/ is running on ext2).

here is an excerpt out of the messages-log:
 START ###

Feb 11 14:20:49 server-linux kernel: hdc: timeout waiting for DMA
Feb 11 14:20:49 server-linux kernel: ide_dmaproc: chipset supported 
ide_dma_timeout func only: 14
Feb 11 14:20:49 server-linux kernel: hdc: irq timeout: status=0x58 { 
DriveReady SeekComplete DataRequest }
Feb 11 14:20:51 server-linux kernel: hdc: dma_intr: status=0x58 { 
DriveReady SeekComplete DataRequest }
Feb 11 14:20:51 server-linux kernel: hdc: status timeout: status=0xd0 { 
Busy }
Feb 11 14:20:51 server-linux kernel: hdc: DMA disabled
Feb 11 14:20:51 server-linux kernel: hdc: drive not ready for command
Feb 11 14:20:55 server-linux kernel: ide1: reset: success
Feb 11 14:23:33 server-linux kernel: vs-13060: reiserfs_update_sd: stat 
data of object [8044 8127 0x0 SD] (nlink == 1) not found (pos 42)
Feb 11 14:27:02 server-linux kernel: vs-13060: reiserfs_update_sd: stat 
data of object [5283 5382 0x0 SD] (nlink == 1) not found (pos 21)
Feb 11 14:27:02 server-linux kernel: vs-13060: reiserfs_update_sd: stat 
data of object [5283 5383 0x0 SD] (nlink == 1) not found (pos 21)
Feb 11 14:27:02 server-linux kernel: vs-13060: reiserfs_update_sd: stat 
data of object [5283 5385 0x0 SD] (nlink == 1) not found (pos 21)
Feb 11 14:27:03 server-linux kernel: vs-13060: reiserfs_update_sd: stat 
data of object [5283 5375 0x0 SD] (nlink == 1) not found (pos 21)
Feb 11 14:27:03 server-linux kernel: vs-13060: reiserfs_update_sd: stat 
data of object [5283 5375 0x0 SD] (nlink == 1) not found (pos 21)
Feb 11 14:27:04 server-linux kernel: vs-13060: reiserfs_update_sd: stat 
data of object [5283 5376 0x0 SD] (nlink == 1) not found (pos 21)
Feb 11 14:27:04 server-linux kernel: vs-13060: reiserfs_update_sd: stat 
data of object [5283 5376 0x0 SD] (nlink == 1) not found (pos 21)
Feb 11 14:27:04 server-linux kernel: vs-13060: reiserfs_update_sd: stat 
data of object [5283 5377 0x0 SD] (nlink == 1) not found (pos 21)
Feb 11 14:27:04 server-linux kernel: vs-13060: reiserfs_update_sd: stat 
data of object [5283 5378 0x0 SD] (nlink == 1) not found (pos 21)
Feb 11 14:27:04 server-linux kernel: vs-13060: reiserfs_update_sd: stat 
data of object [5283 5380 0x0 SD] (nlink == 1) not found (pos 21)
Feb 11 14:27:04 server-linux kernel: vs-13060: reiserfs_update_sd: stat 
data of object [5283 5381 0x0 SD] (nlink == 1) not found (pos 21)
Feb 11 14:27:04 server-linux kernel: vs-13060: reiserfs_update_sd: stat 
data of object [5283 5381 0x0 SD] (nlink == 1) not found (pos 21)
Feb 11 14:27:04 server-linux kernel: vs-13060: reiserfs_update_sd: stat 
data of object [5283 5379 0x0 SD] (nlink == 1) not found (pos 21)
Feb 11 14:27:05 server-linux kernel: vs-13060: reiserfs_update_sd: stat 
data of object [5283 5384 0x0 SD] (nlink == 1) not found (pos 21)
Feb 11 14:30:25 server-linux kernel: vs-13060: reiserfs_update_sd: stat 
data of object [6661 6975 0x0 SD] (nlink == 1) not found (pos 0)
Feb 13 16:42:55 server-linux kernel: vs-13060: reiserfs_update_sd: stat 
data of object [7643 7666 0x0 SD] (nlink == 1) not found (pos 63)
Feb 13 16:47:09 server-linux kernel: is_leaf: item location seems wrong 
(second one): *OLD*[9058 9147 0x12a9 DIRECT], item_len 1025, 
item_location 4095, free_space(entry_count) 65535
Feb 13 16:47:09 server-linux kernel: vs-5150: search_by_key: invalid 
format found in block 9062. Fsck?
Feb 13 16:47:09 server-linux kernel: is_leaf: item location seems wrong 
(second one): *OLD*[9058 9147 0x12a9 DIRECT], item_len 1025, 
item_location 4095, free_space(entry_count) 65535
Feb 13 16:47:09 server-linux kernel: vs-5150: search_by_key: invalid 
format found in block 9062. Fsck?
Feb 13 16:47:09 server-linux kernel: is_leaf: item location seems wrong 
(second one): *OLD*[9058 9147 0x12a9 DIRECT], item_len 1025, 
item_location 4095, free_space(entry_count) 65535
Feb 13 16:47:09 server-linux kernel: vs-5150: search_by_key: invalid 
format found in block 9062. Fsck?
Feb 13 16:47:09 server-linux kernel: is_leaf: item location seems wrong 
(second one): *OLD*[9058 9147 0x12a9 DIRECT], item_len 1025, 
item_location 4095, free_space(entry_count) 65535
Feb 13 16:47:09 server-linux kernel: vs-5150: search_by_key: invalid 
format found in block 9062. Fsck?
Feb 13 16:47:23 server-linux kernel: vs-13050: reiserfs_update_sd: i/o 
failure occurred trying to update [9058 9150 0x0 SD] stat 
data4is_leaf: item location seems wrong (second one): *OLD*[9058 9147 
0x12a9 DIRECT], item_len 1025, item_location 4095, 
free_space(entry_count) 65535
Feb 13 16:47:23 server-linux kernel: vs-5150: search_by_key: invalid 
format found in block 9062. Fsck?
Feb 13 16:47:23 server-linux kernel: is_leaf: item location 
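
For anyone who wants to summarise an excerpt like the one above, here is a minimal sketch in Python. It assumes the only damage is mail-client line wrapping, that every real record starts with a syslog timestamp, and that the interesting tag is the first token after "kernel: ". The function names are made up for illustration and are not part of any reiserfs tool.

```python
import re
from collections import Counter

# A syslog record in the excerpt starts like "Feb 11 14:20:49 ".
STAMP = re.compile(r"^[A-Z][a-z]{2} +\d+ \d\d:\d\d:\d\d ")
# First tag after "kernel: " (reiserfs message code, IDE device, or is_leaf).
TAG = re.compile(r"kernel: (vs-\d+|hd[a-z]|ide\d+|is_leaf)")


def join_wrapped(lines):
    """Re-join records that were wrapped across several lines in the mail."""
    records = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines between records
        if STAMP.match(line) or not records:
            records.append(line.strip())
        else:
            # Continuation line: glue it onto the previous record.
            records[-1] += " " + line.strip()
    return records


def count_tags(records):
    """Count how often each message tag appears."""
    counts = Counter()
    for rec in records:
        match = TAG.search(rec)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "Feb 11 14:20:49 server-linux kernel: hdc: timeout waiting for DMA",
        "Feb 11 14:23:33 server-linux kernel: vs-13060: reiserfs_update_sd: stat ",
        "data of object [8044 8127 0x0 SD] (nlink == 1) not found (pos 42)",
    ]
    for tag, n in count_tags(join_wrapped(sample)).items():
        print(tag, n)
```

Seeing the hdc/ide1 lines first and the vs-13060 lines only afterwards is the kind of ordering such a summary makes easy to spot.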

Re: [reiserfs-list] [PATCH] write barriers for 2.4.x

2002-02-14 Thread Chris Mason



On Thursday, February 14, 2002 02:18:40 AM +0100 Philippe Gramoullé 
[EMAIL PROTECTED] wrote:

 
 Hi,
 
 Chris Mason wrote:
 
 ...
 If you really want to experiment with this on scsi, but have a different
 adapter, let me know.
 ...
 
 I'd be very much interested to see how it behaves with the PERC3/QC from
 DELL
 (megaraid driver)
 

Hmmm, the megaraid driver seems to use scsi tags, but not support any of
the ordered ones.  I'll have to drop a message to the maintainer to
see if this is possible.

-chris




Re: [reiserfs-list] worried about error-messages in /var/log/messages

2002-02-14 Thread Oleg Drokin

Hello!

On Thu, Feb 14, 2002 at 03:28:28PM +0100, Mark Rosa wrote:
 Feb 11 14:20:49 server-linux kernel: hdc: timeout waiting for DMA
 Feb 11 14:20:49 server-linux kernel: ide_dmaproc: chipset supported 
 ide_dma_timeout func only: 14
 Feb 11 14:20:49 server-linux kernel: hdc: irq timeout: status=0x58 { 
 DriveReady SeekComplete DataRequest }
 Feb 11 14:20:51 server-linux kernel: hdc: dma_intr: status=0x58 { 
 DriveReady SeekComplete DataRequest }
 Feb 11 14:20:51 server-linux kernel: hdc: status timeout: status=0xd0 { 
 Busy }
 Feb 11 14:20:51 server-linux kernel: hdc: DMA disabled
 Feb 11 14:20:51 server-linux kernel: hdc: drive not ready for command
 Feb 11 14:20:55 server-linux kernel: ide1: reset: success
This is a hard drive error.
Hardware errors are dealt with under the terms at http://namesys.com/support.html.

Also you might want to upgrade your kernel.
2.4.4 is quite old.

Bye,
Oleg



[reiserfs-list] Re: Reiserfs Corruption with 2.5.5-pre1

2002-02-14 Thread Oleg Drokin

Hello!

On Thu, Feb 14, 2002 at 05:24:21PM +0100, Sebastian Dröge wrote:

 reiserfsck --check said I have to do --rebuild-tree because of critical
 corruption (many bad_leaf: block x has wrong order of items)...

these are 2.5.3 signs.

 after that I booted into 2.4.17. Everything works okay.
 Then I booted 2.5.5-pre1 and the mysterious files are there again after
 starting GNOME. I've copied one file to another location but when I reboot
 into 2.4.17 the files and the copy are gone again...

But GNOME is working, right?

 If you need one or two file names or the content of them just ask (They
 begin with an ^)... then I'll handcopy them ;)

I have a better approach.
Just recreate them (by running GNOME in 2.5.5-pre1?) and then tar them up ;)
Send the resulting tar file to me.

 The format of the partition is 3.6 and another partition with 3.5 format
 had no errors... Maybe this helps

So now the only problem is that there are strange files after GNOME starts, right?
Do these files disappear after you quit GNOME?

 I could build 2.5.5-pre1 without your patch from the last mail but for this
 try I have built the kernel with it
I just found this patch is only needed on SMP ;)

Bye,
Oleg



Re: [reiserfs-list] Serious ReiserFS errors when updating from 2.4.18pre9 to rc1

2002-02-14 Thread Chris Mason


[ marcelo, you're bcc'd as an FYI, I'll forward details when we figure
this out ]

On Thursday, February 14, 2002 11:46:35 PM +0100 Jens Benecke [EMAIL PROTECTED] 
wrote:

 Hi,
 
 I compiled the 2.4.18rc1 kernel now (because Marcelo wrote ReiserFS
 fixes in the changelog) and with that kernel I cannot access half my
 harddisk any more, and syslog complains it cannot find inode stat data
 (or something like that) a thousand times.
 
 What happened?  I can access the files normally with 2.4.18pre9 and
 below.
 
 Do I need to worry?

Yes, that would be something to worry about.  Is this disk 3.5.x or
3.6.x, mounted as root?  What kind of hardware is the disk on?

The reiserfs change in rc1 was pretty minor, are there any other messages
in your logfile?

-chris





[reiserfs-list] Re: [PATCH] write barriers for 2.4.x

2002-02-14 Thread Alan Cox

 sure we only try to use tag commands when they are turned on for the
 target, otherwise we can safely assume the drive won't do
 other writes first.

Is this guaranteed by the SCSI standards or do you need to issue some
kind of cache flush as with IDE ?

 With -o barrier, this is now:
 
 write X log blocks
 write 1 commit block
 wait.

That will work nicely with the I2O controllers, and possibly (if it's
in the firmware as well as the .h file) the aacraid cards. In those
cases I can often commit to battery backed ram rather than physical
media.

Do you have any idea whether driving the cache write through rather than
write back is likely to help here by evening out the commit wait for a flush?



[reiserfs-list] Re: [PATCH] write barriers for 2.4.x

2002-02-14 Thread Chris Mason



On Friday, February 15, 2002 01:21:20 AM + Alan Cox [EMAIL PROTECTED] 
wrote:

 sure we only try to use tag commands when they are turned on for the
 target, otherwise we can safely assume the drive won't do
 other writes first.
 
 Is this guaranteed by the SCSI standards or do you need to issue some
 kind of cache flush as with IDE ?

We're sending the scsi ordered queue tag command, which the spec
says will be written after anything already received by the target,
and before anything it receives later on.  I have no data
at all on how well the drives follow the spec ;-)

The IDE changes issue cache flushes before the barrier write,
and then another flush after it, which gives us similar semantics.
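
To make the ordering rule concrete, here is a small Python model (not kernel code, and not part of the patch). It assumes a drive may reorder simple-tagged writes freely among themselves but may not move anything across an ordered-tagged write, which is how the rule is described above. The function names and the block names ("log0", "commit", "later") are invented for the illustration.

```python
from itertools import permutations, product


def possible_media_orders(commands):
    """Yield every order in which the modelled drive may write to media.

    commands is a list of (name, ordered) pairs in submission order.
    Writes between two ordered commands may be permuted freely;
    an ordered command stays exactly where it was submitted.
    """
    segments = []
    batch = []
    for name, ordered in commands:
        if ordered:
            segments.append(list(permutations(batch)))
            segments.append([(name,)])
            batch = []
        else:
            batch.append(name)
    segments.append(list(permutations(batch)))
    for choice in product(*segments):
        yield [name for part in choice for name in part]


def commit_is_safe(media_order, log_blocks, commit="commit"):
    """True if every log block reaches media before the commit block."""
    commit_pos = media_order.index(commit)
    return all(media_order.index(b) < commit_pos for b in log_blocks)


if __name__ == "__main__":
    log = ["log0", "log1", "log2"]
    tagged = [(b, False) for b in log] + [("commit", True), ("later", False)]
    untagged = [(b, False) for b in log] + [("commit", False), ("later", False)]
    print(all(commit_is_safe(o, log) for o in possible_media_orders(tagged)))
    print(all(commit_is_safe(o, log) for o in possible_media_orders(untagged)))
```

In this model the flush / write / flush sequence used on IDE pins the commit block in the same place, which is why the two are described as having similar semantics. Whether real drives behave like the model is exactly the open question.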

 
 With -o barrier, this is now:
 
 write X log blocks
 write 1 commit block
 wait.
 
 That will work nicely with the I2O controllers, and possibly (if it's
 in the firmware as well as the .h file) the aacraid cards. In those
 cases I can often commit to battery backed ram rather than physical
 media.
 
 Do you have any idea whether driving the cache write through rather than
 write back is likely to help here by evening out the commit wait for a flush?
 
Controllers that do write back caching should be helped by the reiserfs 
usage changes.  If we pretend they immediately tell the OS a write is 
completed, unpatched reiserfs does this:

write X log blocks
wait on X log blocks (all already complete, so just a CPU loop)
write 1 commit
wait on 1 commit.

With the new code, the controller is more likely to get the commit in 
time to merge the requests.  Hopefully someone who knows more about scsi 
can correct me, but I think the write back controller can ignore the 
ordering rules (since battery backup should promise the request does hit 
media eventually).
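
The difference between the two sequences can be sketched in a few lines of Python. This is only a toy model: it assumes a "wait" drains everything queued so far, and it uses queue depth as a rough stand-in for how many requests the controller could merge. The operation names are invented for the illustration and do not correspond to kernel functions.

```python
def unpatched_commit(n_log_blocks):
    """Operation sequence for an unpatched commit: two separate waits."""
    ops = [("write", f"log{i}") for i in range(n_log_blocks)]
    ops.append(("wait", "log blocks"))
    ops.append(("write", "commit"))
    ops.append(("wait", "commit"))
    return ops


def barrier_commit(n_log_blocks):
    """Operation sequence with -o barrier: commit is queued behind the log."""
    ops = [("write", f"log{i}") for i in range(n_log_blocks)]
    ops.append(("write_barrier", "commit"))
    ops.append(("wait", "commit"))
    return ops


def wait_count(ops):
    """Number of points where the filesystem blocks on the device."""
    return sum(1 for kind, _ in ops if kind == "wait")


def max_queue_depth(ops):
    """Largest number of requests outstanding at once (a wait drains the queue)."""
    depth = deepest = 0
    for kind, _ in ops:
        if kind == "wait":
            depth = 0
        else:
            depth += 1
            deepest = max(deepest, depth)
    return deepest


if __name__ == "__main__":
    for build in (unpatched_commit, barrier_commit):
        ops = build(4)
        print(build.__name__, wait_count(ops), max_queue_depth(ops))
```

With four log blocks the unpatched sequence blocks twice and never has the commit queued alongside the log blocks, while the barrier sequence blocks once with all five requests outstanding together.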

I think write through caches should be helped too, as long as they
are smart about how they do the write ordering.  My scsi drive doesn't
seem to be very smart at all; it has been hard to find usage patterns
that show improvement.  So far, only O_SYNC writes really show it.

I think that's what you were asking, sorry if I misunderstood the Q.

-chris