Re: Filesystem corruption

2007-06-06 Thread Ingo Bormuth

On 2007-06-06 11:10, Xu CanHao wrote:
 So maybe I'd suggest anybody take the _official_ reiser4 patch-set and
 _vanilla_ kernel source, these things should provide the maximum
 stability. My root filesystem with reiser4 never loses data.

I fully agree, as long as there _exists_ a current official patch.
That was not always the case in the recent past. No wonder people 
started to get their own hands dirty from time to time. 

Btw: It's also fun to read / mess with the code ...


-- 
Ingo Bormuth, voicebox  fax: +49-(0)-12125-10226517
public key 86326EC9, http://ibormuth.efil.de/contact

Re: Filesystem corruption

2007-06-05 Thread Ingo Bormuth

On 2007-06-04 13:41, Edward Shishkin wrote:

 When performing mapping read (needed for execution, etc) reiser4
 converts small files from tails to extents and back (your /bin/sleep
 is less than 4 * blocksize, right?)

Yes, it's 15k. 

The conversion is done on disk, even when mounted read only?  I'd like
to see the logic in the code. In case you just know by heart, it would
be nice if you could give me a little hint where to start.

 Please, rebuild your kernel with the official patch
 [...]
 Please, report, if such data loss still takes place after upgrade.

I'll keep you informed ...

Thanks.


-- 
Ingo Bormuth, voicebox  fax: +49-(0)-12125-10226517
public key 86326EC9, http://ibormuth.efil.de/contact

Re: Filesystem corruption

2007-06-05 Thread Xu CanHao


So maybe I'd suggest anybody take the _official_ reiser4 patch-set and
_vanilla_ kernel source, these things should provide the maximum
stability. My root filesystem with reiser4 never loses data.

Re: Filesystem corruption

2007-06-04 Thread Edward Shishkin


Ingo Bormuth wrote:


On 2007-06-03 03:10, Edward Shishkin wrote:
 


Ingo Bormuth wrote:
   


Hm, same here. I lost /bin/sleep several times.
 



 


Would you please describe the problem in more details?
What kernel version? What does I lost /bin/sleep mean?
Does it mean that:
1. /bin/sleep was truncated to 0 bytes, i.e. ls -l /bin/sleep shows  
something like

-rwxr-xr-x  1 root root 0 2005-04-20 18:32 /bin/sleep
2. /bin/sleep disappeared (ls -l /bin doesn't show this file)
3. /bin/sleep exists, but filled by zeros
etc...
   



The file was removed by 'fsck.reiser4 --fix' which emitted a
message about deleting a corrupted file. (Case 2 in your list).

This always happened after a system freeze or power loss.
The machine freezes quite frequently - I think it has a DMA problem.
Nevertheless I don't see how a file that was not written to can
get corrupted.

 



When performing mapping read (needed for execution, etc) reiser4
converts small files from tails to extents and back (your /bin/sleep
is less than 4 * blocksize, right?)
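
As a rough illustration of the size rule mentioned above (files smaller than 4 * blocksize are the ones subject to tail conversion), here is a minimal Python sketch. The helper names and the 4096-byte default block size are assumptions made for illustration; this is not reiser4 code, and the real decision is made inside the kernel.

```python
import os


def is_tail_candidate(size_bytes, blocksize=4096, factor=4):
    """Return True if size_bytes is below factor * blocksize.

    Mirrors only the rule of thumb quoted in the thread. The default
    block size and this helper are illustrative assumptions.
    """
    return size_bytes < factor * blocksize


def check_path(path, blocksize=4096):
    """Apply the same rule to an existing file, using its logical size."""
    return is_tail_candidate(os.stat(path).st_size, blocksize)


if __name__ == "__main__":
    # A 15 KiB file measured against an assumed 4 KiB block size.
    print(is_tail_candidate(15 * 1024))
```

With a 4 KiB block size the threshold is 16 KiB, so a 15 KiB binary falls just under it.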


Current kernel is 2.6.20.5 (the reiser4 patch I submitted to this 
list on may 2nd).
 



Please, rebuild your kernel with the official patch
http://ftp.namesys.com/pub/reiser4-for-2.6/2.6.20/
It contains a bugfix related to tail conversion (races when acquiring 
exclusive access).


Please, report, if such data loss still takes place after upgrade.

Thanks,
Edward.


Root is mounted rw,noatime,nodiratime,onerror=remount-ro,tmgr.atom_max_age=60

Hope that helps.

Re: Filesystem corruption

2007-06-03 Thread Ingo Bormuth

On 2007-06-03 03:10, Edward Shishkin wrote:
 Ingo Bormuth wrote:
 Hm, same here. I lost /bin/sleep several times.

 Would you please describe the problem in more details?
 What kernel version? What does I lost /bin/sleep mean?
 Does it mean that:
 1. /bin/sleep was truncated to 0 bytes, i.e. ls -l /bin/sleep shows  
 something like
 -rwxr-xr-x  1 root root 0 2005-04-20 18:32 /bin/sleep
 2. /bin/sleep disappeared (ls -l /bin doesn't show this file)
 3. /bin/sleep exists, but filled by zeros
 etc...

The file was removed by 'fsck.reiser4 --fix' which emitted a
message about deleting a corrupted file. (Case 2 in your list).

This always happened after a system freeze or power loss.
The machine freezes quite frequently - I think it has a DMA problem.
Nevertheless I don't see how a file that was not written to can
get corrupted.

Current kernel is 2.6.20.5 (the reiser4 patch I submitted to this 
list on may 2nd).

Root is mounted rw,noatime,nodiratime,onerror=remount-ro,tmgr.atom_max_age=60

Hope that helps.



-- 
Ingo Bormuth, voicebox  fax: +49-(0)-12125-10226517
public key 86326EC9, http://ibormuth.efil.de/contact

Re: Filesystem corruption

2007-06-02 Thread Ingo Bormuth

On 2007-05-30 15:03, David Masover wrote:

 Only, recently, these fsck-a-thons started happening more and more often, and 
 I started to lose random files. They'd just be silently truncated to 0 bytes. 
 And not files I was writing a lot -- I'm talking about things 
 like /bin/mount.

Hm, same here. I lost /bin/sleep several times. I have a little script
printing status messages to the screen, sleeping two seconds and printing
again - you name it. The probability that /bin/sleep is accessed at the
same time the system crashes is quite high (this is _no_ write access,
the system is even mounted noatime).

How could pure execution of a file cause corruption of the file itself?
Any idea ?

Apart from that single file, I never had any serious problems with
reiser4 on three busy systems for years - fsck.reiser4 works like a charm.


-- 
Ingo Bormuth, voicebox  fax: +49-(0)-12125-10226517
public key 86326EC9, http://ibormuth.efil.de/contact

Re: Filesystem corruption

2007-06-02 Thread Edward Shishkin


Ingo Bormuth wrote:


On 2007-05-30 15:03, David Masover wrote:

 

Only, recently, these fsck-a-thons started happening more and more often, and 
I started to lose random files. They'd just be silently truncated to 0 bytes. 
And not files I was writing a lot -- I'm talking about things 
like /bin/mount.
   



Hm, same here. I lost /bin/sleep several times.



Would you please describe the problem in more detail?
What kernel version? What does "I lost /bin/sleep" mean?
Does it mean that:
1. /bin/sleep was truncated to 0 bytes, i.e. ls -l /bin/sleep shows  
something like

-rwxr-xr-x  1 root root 0 2005-04-20 18:32 /bin/sleep
2. /bin/sleep disappeared (ls -l /bin doesn't show this file)
3. /bin/sleep exists, but filled by zeros
etc...

Thanks,
Edward.


I have a little script
printing status messages to the screen, sleeping two seconds and print
again - you name it. The probability that /bin/sleep is accessed at the
same time the system crashes is quite high (this is _no_ write access,
the system is even mounted noatime).

How could pure execution of a file cause corruption of the file itself?
Any idea ?

Apart from that single file, I never had any serious problems with
reiser4 on three busy systems for years - fsck.reiser4 works like a charm.

Re: Filesystem corruption

2007-05-30 Thread David Masover

On Tuesday 29 May 2007 07:36:13 Toby Thain wrote:

  but you can't
  mention using reiserfs in mixed company without someone accusing
  you of
  throwing your data away.

 People who repeat this rarely have any direct experience of Reiser;
 they repeat what they've heard; like all myths and legends they are
 transmitted orally rather than based on scientific observation.

Well, there is one problem I vaguely remember that I don't think has been 
addressed, I think it was one of those lets-put-it-off-till-v4 things. It was 
the fact that there are a limited number of inodes (or keys, or whatever you 
call a unique file), and no way of knowing how many you have left until your 
FS will suddenly, one day refuse to create another file.

(For comparison, ext3 seems to support not only telling you how many inodes 
you have left, but tuning that on the fly.)
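
For filesystems that do report inode counts, the numbers can be read from userspace with statvfs. A minimal Python sketch follows; the function name is hypothetical, and some filesystems report 0 for these fields, in which case the result says nothing about remaining capacity.

```python
import os


def inode_usage(path="/"):
    """Return total, free and unprivileged-available inode counts for the
    filesystem containing path, as reported by statvfs."""
    st = os.statvfs(path)
    return {
        "total": st.f_files,       # total number of inodes
        "free": st.f_ffree,        # free inodes
        "available": st.f_favail,  # free inodes for unprivileged users
    }


if __name__ == "__main__":
    usage = inode_usage("/")
    print("inodes total=%(total)d free=%(free)d available=%(available)d" % usage)
```

This reads the same counters that `df -i` displays.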

But, I haven't run into that, and the only problem I've had lately has been 
Reiser4 losing data, and crashing occasionally. I switched most of my data 
off of Reiser4 and onto XFS for that reason. I've also been using ext3 in 
some places, and Reiser3 in others (one place in particular where space is 
limited, but I will have tons of small files).

I later learned that XFS does out-of-order writes by default, making me think 
I should give up and invest in UPS hardware. But, switching away from Reiser4 
means I no longer see random files (including stuff in, for example, /sbin, 
that I hadn't touched in months) go up in smoke.

Ordinarily I like to help debug things, but not at the risk of my data. Maybe 
I'll try again later, and see if I can reproduce it in a VM or somewhere 
safe...

I do still follow the list, though, in case something interesting happens. It 
was fun while it lasted!



Re: Filesystem corruption

2007-05-30 Thread Vladimir V. Saveliev

Hello

On Wednesday 30 May 2007 17:25, David Masover wrote:
 On Tuesday 29 May 2007 07:36:13 Toby Thain wrote:
 
   but you can't
   mention using reiserfs in mixed company without someone accusing
   you of
   throwing your data away.
 
  People who repeat this rarely have any direct experience of Reiser;
  they repeat what they've heard; like all myths and legends they are
  transmitted orally rather than based on scientific observation.
 
 Well, there is one problem I vaguely remember that I don't think has been 
 addressed, I think it was one of those lets-put-it-off-till-v4 things. It was 
 the fact that there are a limited number of inodes (or keys, or whatever you 
 call a unique file), and no way of knowing how many you have left until your 
 FS will suddenly, one day refuse to create another file.
 

reiserfs is limited to ~2^32 file creations. It is possible to exhaust but I do 
not remember any reports about that.
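
To put the ~2^32 figure in perspective, here is a small back-of-the-envelope calculation in Python. The creation rates are hypothetical inputs, and the sketch assumes, as a simplification, that every file creation permanently consumes one of the 2^32 identifiers.

```python
LIMIT = 2 ** 32          # approximate number of file creations, per the text above
SECONDS_PER_DAY = 86400


def days_to_exhaust(creations_per_second, limit=LIMIT):
    """Days of sustained file creation needed to reach the limit."""
    return limit / creations_per_second / SECONDS_PER_DAY


if __name__ == "__main__":
    for rate in (1, 100, 1000):  # hypothetical sustained creation rates
        print("%5d files/s -> %.1f days" % (rate, days_to_exhaust(rate)))
```

At a sustained 1000 creations per second the limit is reached in about 50 days; at one creation per second it takes more than a century.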

 (For comparison, ext3 seems to support not only telling you how many inodes 
 you have left, but tuning that on the fly.)
 
 But, I haven't run into that, and the only problem I've had lately has been 
 Reiser4 losing data, and crashing occasionally. I switched most of my data 
 off of Reiser4 and onto XFS for that reason. I've also been using ext3 in 
 some places, and Reiser3 in others (one place in particular where space is 
 limited, but I will have tons of small files).
 
 I later learned that XFS does out-of-order writes by default, making me think 
 I should give up and invest in UPS hardware. But, switching away from Reiser4 
 means I no longer see random files (including stuff in, for example, /sbin, 
 that I hadn't touched in months) go up in smoke.
 
 Ordinarily I like to help debug things, but not at the risk of my data. Maybe 
 I'll try again later, and see if I can reproduce it in a VM or somewhere 
 safe...
 
that would be great, thanks

 I do still follow the list, though, in case something interesting happens. It 
 was fun while it lasted!

Re: Filesystem corruption

2007-05-30 Thread Vladimir V. Saveliev

Hello

On Tuesday 29 May 2007 16:36, Toby Thain wrote:
   I have always found reiser3 to be rock solid
 
 My experience too, over many server years.
 
  but you can't
  mention using reiserfs in mixed company without someone accusing  
  you of
  throwing your data away.
 
 People who repeat this rarely have any direct experience of Reiser;  
 they repeat what they've heard; like all myths and legends they are  
 transmitted orally rather than based on scientific observation.
 
Well, there were several bad stories in the past where reiserfsck was unable
to restore a filesystem because it could not find the reiserfs metadata.
Later we found that sometimes the partition table changes (for an unknown
reason, but not likely due to a reiserfs problem) so that the beginning of a
partition gets shifted by a few sectors. So now, when a user reports that
reiserfs metadata has disappeared from a device completely, recovering the
partition table to its original state makes the data available again.
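
A shifted partition start of this kind can be looked for by searching for the superblock magic string at sector-sized offsets around its expected position. The Python sketch below works on a byte string read from the start of a partition. It assumes the reiserfs 3.x layout as I understand it (superblock 64 KiB into the partition, magic string 52 bytes into the superblock); the constants, the function name and the search window are assumptions for illustration, and this is not how reiserfsck itself does it.

```python
SECTOR = 512
SUPERBLOCK_OFFSET = 64 * 1024  # assumed: superblock starts 64 KiB into the partition
MAGIC_OFFSET = 52              # assumed: offset of the magic string in the superblock
MAGICS = (b"ReIsErFs", b"ReIsEr2Fs", b"ReIsEr3Fs")


def find_shift(image, max_sectors=64):
    """Return the shift, in sectors, at which a reiserfs magic string is
    found relative to its expected position, or None if none is found.

    image is a bytes object read from the beginning of the partition.
    """
    for shift in range(-max_sectors, max_sectors + 1):
        pos = SUPERBLOCK_OFFSET + MAGIC_OFFSET + shift * SECTOR
        if pos < 0:
            continue
        chunk = image[pos:pos + 9]
        if any(chunk.startswith(magic) for magic in MAGICS):
            return shift
    return None
```

A result of 0 means the magic is where it is expected; any other value hints at how far the partition start has moved. The sketch only reads data and does not touch the partition table.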

  You would think the developers would be doing
  more to counter this but I have been following reiserfs for years and
  nobody seems to really care all that much.
 
 
 Can't do much about human nature. MySQL suffers from the same  
 baseless poisoned folk wisdom.
 
 --Toby

Re: Filesystem corruption

2007-05-30 Thread Toby Thain



On 30-May-07, at 10:25 AM, David Masover wrote:


On Tuesday 29 May 2007 07:36:13 Toby Thain wrote:


but you can't
mention using reiserfs in mixed company without someone accusing
you of
throwing your data away.


People who repeat this rarely have any direct experience of Reiser;
they repeat what they've heard; like all myths and legends they are
transmitted orally rather than based on scientific observation.


Well, there is one problem I vaguely remember that I don't think  
has been
addressed, I think it was one of those lets-put-it-off-till-v4  
things. It was
the fact that there are a limited number of inodes (or keys, or  
whatever you

call a unique file),


But does it cause data loss? One usually sees claims that reiserfs  
ate my data, or I heard reiserfs ate somebody's data, but without  
supplying a root cause - bad memory? powerfail? bad disk? etc.



and no way of knowing how many you have left until your
FS will suddenly, one day refuse to create another file.




... switching away from Reiser4
means I no longer see random files (including stuff in, for  
example, /sbin,

that I hadn't touched in months) go up in smoke.


I only wish sanity had prevailed over  kernel inclusion, then we'd  
see it shaken down a lot quicker, like R3 was.




Ordinarily I like to help debug things, but not at the risk of my  
data. Maybe
I'll try again later, and see if I can reproduce it in a VM or  
somewhere

safe...

I do still follow the list, though, in case something interesting  
happens.


Yeah, R4 is something interesting. :) I still hope it gets finished...

--Toby


It
was fun while it lasted!

Re: Filesystem corruption

2007-05-30 Thread devsk

I think people just like to spread FUD without doing any analysis of what 
really caused the FS corruption. It can be anything from a bad 3rd party driver 
to bad hardware ('bad blocks', does anybody check for them before mkfs these 
days? I do). People also like to try those untested patchsets, containing every 
blah that's thrown out by so called 'kernel hackers' which makes your system 
10x faster. Reiser4 seems like an easy candidate to vent their anger on 
afterwards.

I have used R4 for a year now and I have had to reset my PC, troubleshooting 
problems with vmware/mythtv/cisco vpn client/nvidia, so many times that it's not 
even funny! And R4 didn't give me any problems even once. It boots right up, 
without any files lost and with a consistent FS, as a subsequent livecd boot and 
fsck proved every time. If I did that to ext or xfs, I would have lost big time. 
The only files I have ever lost were on ext3 during a sudden power failure. I don't 
trust the safety of my data on any FS but ReiserFS. I hope people don't leave this 
good piece of code to rot!!

-devsk

- Original Message 
From: Toby Thain [EMAIL PROTECTED]
To: David Masover [EMAIL PROTECTED]
Cc: ReiserFS List reiserfs-list@namesys.com
Sent: Wednesday, May 30, 2007 9:42:01 AM
Subject: Re: Filesystem corruption


On 30-May-07, at 10:25 AM, David Masover wrote:

 On Tuesday 29 May 2007 07:36:13 Toby Thain wrote:

 but you can't
 mention using reiserfs in mixed company without someone accusing
 you of
 throwing your data away.

 People who repeat this rarely have any direct experience of Reiser;
 they repeat what they've heard; like all myths and legends they are
 transmitted orally rather than based on scientific observation.

 Well, there is one problem I vaguely remember that I don't think  
 has been
 addressed, I think it was one of those lets-put-it-off-till-v4  
 things. It was
 the fact that there are a limited number of inodes (or keys, or  
 whatever you
 call a unique file),

But does it cause data loss? One usually sees claims that reiserfs  
ate my data, or I heard reiserfs ate somebody's data, but without  
supplying a root cause - bad memory? powerfail? bad disk? etc.

 and no way of knowing how many you have left until your
 FS will suddenly, one day refuse to create another file.


 ... switching away from Reiser4
 means I no longer see random files (including stuff in, for  
 example, /sbin,
 that I hadn't touched in months) go up in smoke.

I only wish sanity had prevailed over  kernel inclusion, then we'd  
see it shaken down a lot quicker, like R3 was.


 Ordinarily I like to help debug things, but not at the risk of my  
 data. Maybe
 I'll try again later, and see if I can reproduce it in a VM or  
 somewhere
 safe...

 I do still follow the list, though, in case something interesting  
 happens.

Yeah, R4 is something interesting. :) I still hope it gets finished...

--Toby

 It
 was fun while it lasted!








   

Re: Filesystem corruption

2007-05-30 Thread Toby Thain



On 30-May-07, at 2:22 PM, devsk wrote:

I think people just like to spread FUD without doing any analysis  
of what really caused the FS corruption.


I fear you're right. OTOH, filesystem developers on this list (and  
others including ZFS list) tend to be extremely meticulous.


--Toby

Re: Filesystem corruption

2007-05-30 Thread David Masover

On Wednesday 30 May 2007 11:42:01 Toby Thain wrote:

 But does it cause data loss? One usually sees claims that reiserfs
 ate my data, or I heard reiserfs ate somebody's data, but without
 supplying a root cause - bad memory? powerfail? bad disk? etc.

Power failure shouldn't kill a filesystem, and generally shouldn't eat data 
that was written to disk before the failure. (Although I could complain all 
day here about why corruption happens anyway when you do any kind of 
out-of-order operations...  I am looking forward to that Reiser4 transaction 
API, so we can finally get rid of the tmpfile+rename hack.)
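
For readers who have not met it, the tmpfile+rename hack mentioned above looks roughly like the following Python sketch: write the new contents to a temporary file in the same directory, flush them to disk, then rename over the target. The function name is hypothetical, and how much protection this gives after a crash depends on the filesystem and its mount options.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace the file at path with data (bytes) via a temporary file
    and rename, so readers see either the old or the new contents."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())   # push file contents to disk first
        os.replace(tmp_path, path)   # atomic rename on POSIX filesystems
    except BaseException:
        # Clean up the temporary file if anything went wrong before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    # Also sync the directory so the rename itself is recorded on disk.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

The temporary file is created in the target's own directory because a rename is only atomic within a single filesystem.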

But in any case, there were some kernels -- 2.4.16, I think? -- in which 
reiserfs was unstable and did corrupt easily. I believe that was tracked down 
to kernel bugs outside of reiserfs.



Re: Filesystem corruption

2007-05-30 Thread David Masover

On Wednesday 30 May 2007 12:22:17 devsk wrote:

 I have used R4 for a year now and I have had to reset my PC,
 troubleshooting problems with vmware/mythtv/cisco vpn client/nvidia, so
 many times that its not even funny! And R4 didn't give me any problems even
 once. It boots right up, without any files lost and consistent FS as a
 subsequent livecd boot and fsck proved it everytime.

That happened to me for maybe a year or so, I'm not sure. Then, slowly, I 
started to get problems. The machine crashing due to some nvidia bug -- or 
even a reiser-specific oops or something -- then I'd have to fsck it, which 
would take an hour or more, then I'd boot, and apparently no problems.

Only, recently, these fsck-a-thons started happening more and more often, and 
I started to lose random files. They'd just be silently truncated to 0 bytes. 
And not files I was writing a lot -- I'm talking about things 
like /bin/mount.
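
Silent truncation of this kind is easy to miss until a command fails. A minimal Python sketch for spotting it: scan a set of directories for regular files of length zero. The function name and the directory list are assumptions for illustration, and some zero-length files are legitimate, so the output is only a starting point.

```python
import os
import stat


def find_empty_files(directories):
    """Return paths of zero-length regular files directly inside the
    given directories (no recursion), skipping anything unreadable."""
    empty = []
    for directory in directories:
        try:
            names = sorted(os.listdir(directory))
        except OSError:
            continue  # directory missing or not readable
        for name in names:
            path = os.path.join(directory, name)
            try:
                info = os.lstat(path)
            except OSError:
                continue  # entry cannot be examined at all
            if stat.S_ISREG(info.st_mode) and info.st_size == 0:
                empty.append(path)
    return empty


if __name__ == "__main__":
    for path in find_empty_files(["/bin", "/sbin", "/usr/bin", "/usr/sbin"]):
        print(path)
```

Running it after each fsck and comparing the output with an earlier run would show which binaries were lost.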

Now, maybe it's an amd64-specific bug. Or (somehow) a dmraid-specific bug, or 
a dont_load_bitmap bug. (Who can blame me; without dont_load_bitmap, it takes 
at least 30 seconds, maybe a minute to mount.) Could even be, somehow, a 
Gentoo-specific bug. Could be a 350-gig-partition bug, or even a bug of the 
it-hates-me variety. (My server ran Reiser4 for awhile longer, with no 
problems, but I wasn't about to take chances there.)

But, I switched a friend over to Ubuntu, and he had the same kind of problems. 
In fact, he had them first (I thought it was his computer, for awhile).

Finally, we switched to stock Ubuntu kernels and XFS, me on dmraid, him on 
normal linux raid5 (md), and we now have no problems. It's even faster -- the 
biggest gain for Reiser4 was /usr/portage, which doesn't exist on Ubuntu.

 If I did that to ext 
 or xfs, I would have lost big time.

Well, I'm on XFS on my desktop now, and ext3 on my server. No problems at all 
so far. Also much faster, because my desktop now has a repacker (xfs_fsr).

 I hope people don't leave this good piece of code to rot!!

Me too, but you know, I can no longer afford to spend a few hours running fsck 
for no apparent reason. I no longer have a machine that can do anything but 
just work.

The killer feature of Reiser4, as implemented, is small file performance that 
makes ReiserFSv3 weep, and v3 makes XFS weep. All the other stuff we were 
promised is either planned for a later release (repacker, pseudofiles, 
transaction API) or barely working (cryptocompress).

And on just about any setup I work on today, small file performance is a small 
enough priority that even the slightest hint of instability is a 
deal-breaker. Enough people feel the same way that ext3 is still widely used. 
And if it's ever really crucial, there's reiserfs3.

So, you can blame it on my hardware, or on not getting kernel inclusion, or 
anything you want, but the only place I still use Reiser4 is on the 
gameserver at our LAN party, and we're thinking of moving that to something 
like ext3 or xfs, just so we don't need custom kernels. And after all, that's 
a gameserver, it's not like the filesystem is the bottleneck anyway.



Re: Filesystem corruption

2007-05-30 Thread David Masover

On Wednesday 30 May 2007 11:02:26 Vladimir V. Saveliev wrote:

  Ordinarily I like to help debug things, but not at the risk of my data.
  Maybe I'll try again later, and see if I can reproduce it in a VM or
  somewhere safe...

 that would be great, thanks

Keep in mind, it's unlikely, given I don't have much resembling my original 
setup left around. And it was fairly random, under fairly normal usage 
patterns -- just I'd suddenly notice my movie had stopped playing, and I'd 
hit ctrl+alt+f8 and find a bunch of reiser4 error messages.

Is it at all likely that this is an amd64 bug? (The only two places I've seen 
it are on my box and my friend's, both amd64 on some sort of RAID.) If you 
don't have enough testers or hardware for amd64, I can try (again) to setup a 
working x86_64 VM for you to test on.



Re: Filesystem corruption

2007-05-30 Thread devsk

David, it's funny how my setup is very similar to yours: gentoo, amd64, nvraid 
using dmraid. mount/mkfs is VERY fast (less than a second) here, and I don't 
use any specific mount options except noatime. My partition is about 16GB 
though, hosting '/' and /home.

What sources do you use? I use gentoo-sources (currently using 2.6.21-r2) with 
the latest stable patch (currently 2.6.21) from namesys, applied manually. 
Nothing else. I use suspend-to-ram (with a UPS) and the whole system is rock 
solid.

-devsk

- Original Message 
From: David Masover [EMAIL PROTECTED]
To: devsk [EMAIL PROTECTED]
Cc: Toby Thain [EMAIL PROTECTED]; ReiserFS List reiserfs-list@namesys.com
Sent: Wednesday, May 30, 2007 1:03:14 PM
Subject: Re: Filesystem corruption

On Wednesday 30 May 2007 12:22:17 devsk wrote:

 I have used R4 for a year now and I have had to reset my PC,
 troubleshooting problems with vmware/mythtv/cisco vpn client/nvidia, so
 many times that its not even funny! And R4 didn't give me any problems even
 once. It boots right up, without any files lost and consistent FS as a
 subsequent livecd boot and fsck proved it everytime.

That happened to me for maybe a year or so, I'm not sure. Then, slowly, I 
started to get problems. The machine crashing due to some nvidia bug -- or 
even a reiser-specific oops or something -- then I'd have to fsck it, which 
would take an hour or more, then I'd boot, and apparently no problems.

Only, recently, these fsck-a-thons started happening more and more often, and 
I started to lose random files. They'd just be silently truncated to 0 bytes. 
And not files I was writing a lot -- I'm talking about things 
like /bin/mount.

Now, maybe it's an amd64-specific bug. Or (somehow) a dmraid-specific bug, or 
a dont_load_bitmap bug. (Who can blame me; without dont_load_bitmap, it takes 
at least 30 seconds, maybe a minute to mount.) Could even be, somehow, a 
Gentoo-specific bug. Could be a 350-gig-partition bug, or even a bug of the 
it-hates-me variety. (My server ran Reiser4 for awhile longer, with no 
problems, but I wasn't about to take chances there.)

But, I switched a friend over to Ubuntu, and he had the same kind of problems. 
In fact, he had them first (I thought it was his computer, for awhile).

Finally, we switched to stock Ubuntu kernels and XFS, me on dmraid, him on 
normal linux raid5 (md), and we now have no problems. It's even faster -- the 
biggest gain for Reiser4 was /usr/portage, which doesn't exist on Ubuntu.

 If I did that to ext 
 or xfs, I would have lost big time.

Well, I'm on XFS on my desktop now, and ext3 on my server. No problems at all 
so far. Also much faster, because my desktop now has a repacker (xfs_fsr).

 I hope people don't leave this good piece of code to rot!!

Me too, but you know, I can no longer afford to spend a few hours running fsck 
for no apparent reason. I no longer have a machine that can do anything but 
just work.

The killer feature of Reiser4, as implemented, is small file performance that 
makes ReiserFSv3 weep, and v3 makes XFS weep. All the other stuff we were 
promised is either planned for a later release (repacker, pseudofiles, 
transaction API) or barely working (cryptocompress).

And on just about any setup I work on today, small file performance is a small 
enough priority that even the slightest hint of instability is a 
deal-breaker. Enough people feel the same way that ext3 is still widely used. 
And if it's ever really crucial, there's reiserfs3.

So, you can blame it on my hardware, or on not getting kernel inclusion, or 
anything you want, but the only place I still use Reiser4 is on the 
gameserver at our LAN party, and we're thinking of moving that to something 
like ext3 or xfs, just so we don't need custom kernels. And after all, that's 
a gameserver, it's not like the filesystem is the bottleneck anyway.







   

Re: Filesystem corruption

2007-05-29 Thread Vladimir V. Saveliev

Hello

On Tuesday 29 May 2007 08:18, Tracy R Reed wrote:
 Laurent CARON wrote:
  Seems to me it is a filesystem corruption.
 
 Did I miss it or did not a single person ask you if this happened with
 reiserfs 3 or 4?
 

Laurent mentioned rebuild-tree mode of reiserfsck. So the problem happened  
with reiserfs 3.

 I would be quite surprised if this were reiser 3 and not so surprised if
 it were reiser 4 which is still beta afaik.
 
 Reiser has a nasty reputation for filesystem corruption more than any
 other fs. I have always found reiser3 to be rock solid but you can't
 mention using reiserfs in mixed company without someone accusing you of
 throwing your data away. You would think the developers would be doing
 more to counter this but I have been following reiserfs for years and
 nobody seems to really care all that much.

Re: Filesystem corruption

2007-05-29 Thread Vladimir V. Saveliev

Hello

On Monday 28 May 2007 22:16, Laurent CARON wrote:
 Christian Kujau a écrit :
  Please try to check the fs with a current version of reiserfsprogs 
  first. As the manpage advises, try --check first and use 
  --rebuild-tree only if you know what you're doing, IOW: have a current 
  backup.
 
 Over the past few years, i experienced a few reiser corruption on 
 various hardware (dell, hp, asus, sata, scsi, ide...) with the same 
 symptoms (unredable file/dir).
 Always ran check which told me to run fix-fixable or rebuild-tree, which 
 I did after ensuring of backup reliability, and the error was corrected 
 (after eventually losing a few files i fortunately had in the backups).
 

Would you run reiserfsck --check -l log and let us see the log?
That may give a hint about what kind of corruption you have.
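
For anyone scripting this, a minimal Python sketch that only assembles the command line requested above. The device name /dev/md0 is a hypothetical placeholder and the helper name is an assumption; the sketch prints the command instead of running it, since reiserfsck needs root and is normally run on a filesystem that is not mounted read-write.

```python
import shlex


def reiserfsck_check_command(device, logfile="log"):
    """Build the argument list for a read-only reiserfsck check that
    writes its report to logfile."""
    return ["reiserfsck", "--check", "-l", logfile, device]


if __name__ == "__main__":
    # /dev/md0 is a placeholder; substitute the real device.
    print(shlex.join(reiserfsck_check_command("/dev/md0")))
```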

 
  Also, which kernel/machine is this running on? Do you know *why* this 
  corruption may have occured? Any recent hardware issues? Is ther 
  anything in the logs regarding fs/device errors?
 
 Kernel is 2.6.19.
 The machine does not seem to have any HW issue, nothing strange in the 
 logs. :$
 This is just a plain Dell 2650 server with a bunch of SCSI HDD, software 
 raid5 array, reiserfs on top of it.
 
 Laurent

Re: Filesystem corruption

2007-05-29 Thread Toby Thain


 I have always found reiser3 to be rock solid


My experience too, over many server years.


but you can't
mention using reiserfs in mixed company without someone accusing  
you of

throwing your data away.


People who repeat this rarely have any direct experience of Reiser;  
they repeat what they've heard; like all myths and legends they are  
transmitted orally rather than based on scientific observation.



You would think the developers would be doing
more to counter this but I have been following reiserfs for years and
nobody seems to really care all that much.



Can't do much about human nature. MySQL suffers from the same  
baseless poisoned folk wisdom.


--Toby

Re: Filesystem corruption

2007-05-28 Thread Vladimir V. Saveliev

Hello

On Sunday 27 May 2007 17:18, Laurent CARON wrote:
 Hi,
 
 A few days ago, one of my procmail recipes suddenly stopped working.
 
 I didn't care much since this only was for 1 or 2 mails.
 
 Yesterday, i took time to dig it a bit further and looked at the
 filesystem on my mail server
 
 Here is the output of ls -al in the Maildir where my mails are stored
 
 total 1341
 drwx--   6 lcaron mail   256 2007-05-24 10:35 ./
 drwx-- 363 lcaron mail 12184 2007-05-25 21:52 ../
 -rw-r--r--   1 lcaron mail17 2004-05-25 09:19 courierimapacl
 drwx--   2 lcaron mail48 2004-05-25 09:20 courierimapkeywords/
 -rw-r--r--   1 lcaron lcaron  169365 2007-05-24 10:35 courierimapuiddb
 drwx--   2 lcaron mail   1185016 2007-05-24 10:26 cur/
 -rw---   1 lcaron mail 0 2004-05-25 09:19 maildirfolder
 ?-   ? ?  ??? new
 drwx--   2 lcaron mail48 2007-05-24 19:16 tmp/
 
 
 The entry that scares me is
 ?-   ? ?  ??? new
 
 Seems to me it is a filesystem corruption.
 
 Any other solution than rebuild-tree ?
 

Did you try rm -rf new?


 Thanks
 
 Laurent

Re: Filesystem corruption

2007-05-28 Thread Laurent CARON


Vladimir V. Saveliev a écrit :

Did you try rm -rf new?


$ rm -rf new
rm: cannot lstat `new': Permission denied
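
The question marks in the ls output and the lstat error from rm are the same symptom: the directory lists a name, but the entry behind it cannot be examined. A minimal Python sketch that lists such entries in a directory follows; the function name is hypothetical, and the sketch only reads metadata and changes nothing.

```python
import os


def unreadable_entries(directory):
    """Return (name, error message) pairs for directory entries that are
    listed but fail lstat(), as with the 'new' entry above."""
    bad = []
    for name in sorted(os.listdir(directory)):
        try:
            os.lstat(os.path.join(directory, name))
        except OSError as exc:
            bad.append((name, exc.strerror))
    return bad


if __name__ == "__main__":
    for name, reason in unreadable_entries("."):
        print("%s: %s" % (name, reason))
```

On a healthy directory the function returns an empty list.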

Re: Filesystem corruption

2007-05-28 Thread Vladimir V. Saveliev

Hello

On Monday 28 May 2007 18:10, Laurent CARON wrote:
 Vladimir V. Saveliev a écrit :
  Did you try rm -rf new?
 
 $ rm -rf new
 rm: cannot lstat `new': Permission denied
 
 
Is there anything from reiserfs in system logs?

Re: Filesystem corruption

2007-05-28 Thread Laurent CARON


Vladimir V. Saveliev a écrit :

Is there anything from reiserfs in system logs?



Nothing from reiserfs/kernel in

I did experience a similar bug on another computer a while ago (this bug 
was fixed by rebuilding the tree).

Re: Filesystem corruption

2007-05-28 Thread Christian Kujau


[resending, because lncsa.com bounced my mail]

On Mon, 28 May 2007, Christian Kujau wrote:

On Sun, 27 May 2007, Laurent CARON wrote:

The entry that scares me is
?-   ? ?  ??? new

Seems to me it is a filesystem corruption.
Any other solution than rebuild-tree ?


Please try to check the fs with a current version of reiserfsprogs first. As 
the manpage advises, try --check first and use --rebuild-tree only if you 
know what you're doing, IOW: have a current backup.


Also, which kernel/machine is this running on? Do you know *why* this 
corruption may have occurred? Any recent hardware issues? Is there anything in 
the logs regarding fs/device errors?


C.
--
BOFH excuse #448:

vi needs to be upgraded to vii

Re: Filesystem corruption

2007-05-28 Thread Laurent CARON


Christian Kujau a écrit :
Please try to check the fs with a current version of reiserfsprogs 
first. As the manpage advises, try --check first and use 
--rebuild-tree only if you know what you're doing, IOW: have a current 
backup.


Over the past few years, I have experienced a few reiserfs corruptions on 
various hardware (Dell, HP, Asus, SATA, SCSI, IDE...) with the same 
symptoms (unreadable file/dir).
I always ran --check, which told me to run --fix-fixable or --rebuild-tree, which 
I did after making sure my backups were reliable, and the error was corrected 
(possibly after losing a few files that I fortunately had in the backups).




Also, which kernel/machine is this running on? Do you know *why* this 
corruption may have occurred? Any recent hardware issues? Is there 
anything in the logs regarding fs/device errors?


Kernel is 2.6.19.
The machine does not seem to have any HW issue, nothing strange in the 
logs. :$
This is just a plain Dell 2650 server with a bunch of SCSI HDD, software 
raid5 array, reiserfs on top of it.


Laurent

Re: Filesystem corruption

2007-05-28 Thread Christian Kujau



On Mon, 28 May 2007, Laurent CARON wrote:
Always ran check which told me to run fix-fixable or rebuild-tree, which I 
did after ensuring of backup reliability, and the error was corrected (after 
eventually losing a few files i fortunately had in the backups).


Well, lucky you :)

The machine does not seem to have any HW issue, nothing strange in the 
logs. :$
This is just a plain Dell 2650 server with a bunch of SCSI HDD, software 
raid5 array, reiserfs on top of it.


...and no power failures or bad memory whatsoever?
Hm, too bad, since now it's unclear 
what *caused* the corruption in the first place. You'll probably 
(hopefully) be able to correct this corruption with --rebuild-tree, but 
I'd keep a close eye on this filesystem for further corruption.


Christian.
-- 
BOFH excuse #118:


the router thinks its a printer.

Filesystem corruption

2007-05-27 Thread Laurent CARON

Hi,

A few days ago, one of my procmail recipes suddenly stopped working.

I didn't care much since this only affected 1 or 2 mails.

Yesterday, I took the time to dig a bit further and looked at the
filesystem on my mail server.

Here is the output of ls -al in the Maildir where my mails are stored:

total 1341
drwx------   6 lcaron mail       256 2007-05-24 10:35 ./
drwx------ 363 lcaron mail     12184 2007-05-25 21:52 ../
-rw-r--r--   1 lcaron mail        17 2004-05-25 09:19 courierimapacl
drwx------   2 lcaron mail        48 2004-05-25 09:20 courierimapkeywords/
-rw-r--r--   1 lcaron lcaron  169365 2007-05-24 10:35 courierimapuiddb
drwx------   2 lcaron mail   1185016 2007-05-24 10:26 cur/
-rw-------   1 lcaron mail         0 2004-05-25 09:19 maildirfolder
?---------   ? ?      ?            ?                ? new
drwx------   2 lcaron mail        48 2007-05-24 19:16 tmp/


The entry that scares me is
?---------   ? ?      ?            ?                ? new

Seems to me it is a filesystem corruption.

Any other solution than rebuild-tree ?

Thanks

Laurent

Re: Filesystem corruption

2003-08-14 Thread Oleg Drokin

Hello!

On Thu, Aug 14, 2003 at 12:05:28AM +0800, Locke wrote:
 the files. I'm guessing the reason why it recovered so little was 
 because I was running a 7.8GB+40GB LVM and the 40GB 
 physical volume wasn't working, which left it with only 7.8GB.

Yes of course.

 is_tree_node: node level 0 does not match to the expected one 1
 vs-5150: search_by_key: invalid format found in block 8838461. Fsck?

So LVM substitutes zero-filled blocks for the data if a physical volume
is unavailable.
Of course reiserfsck happily threw all of those blocks out of the tree.

 And also when rebooting after the corruption I saw several error 
 messages for all drives, hda, hdb and hdg
 **
 hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
 hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
 hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
 hda: dma_intr: error=0x84 { DriveStatusError BadCRC }

Also, you should consider replacing the noisy IDE cable on your primary IDE
controller with a less noisy one, or just run in a lower UDMA mode.
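
For readers wondering how the hex values in those dma_intr lines map to the names in braces, here is a small sketch. The labels are taken from the log lines quoted above; the bit positions are my reading of the ATA status and error register layout and are an assumption to be checked against the driver source. Only the bits that appear in this log are listed.

```python
# Bit -> label tables. Labels come from the quoted log; bit positions are an
# assumption based on the ATA status/error register layout.
STATUS_BITS = {0x40: "DriveReady", 0x10: "SeekComplete", 0x01: "Error"}
ERROR_BITS = {0x80: "BadCRC", 0x04: "DriveStatusError"}


def decode(value: int, names: dict) -> list:
    """Return the labels of all set bits, plus a marker for unlisted bits."""
    known = [label for bit, label in names.items() if value & bit]
    leftover = value & ~sum(names)  # bits not covered by the table
    if leftover:
        known.append(f"unknown(0x{leftover:02x})")
    return known


print(decode(0x51, STATUS_BITS))  # ['DriveReady', 'SeekComplete', 'Error']
print(decode(0x84, ERROR_BITS))   # ['BadCRC', 'DriveStatusError']
```

A BadCRC flag on a DMA transfer points at the link between controller and drive rather than at the platters, which is why the cable and the UDMA mode are the first things to look at.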

 **The messages are copied from the FAQ in namesys.com because they 
 looked similar so I'm not sure if they're the exactly same.

Well, if they are not the same, you'd better write them down on paper.

 Is there anything I can try to recover more data?

You might try to get LVM up again and run reiserfsck --rebuild-tree.
Some more stuff will be restored.
Still, the content of lots of files will be lost, and there is no way
to restore it anymore.
Also, use reiserfsck 3.6.11.

Bye,
Oleg

Re: filesystem corruption ?

2003-03-22 Thread Bernd Schubert

Hello,

 Though this machine will be replaced by a real server in a few months, I'm
 still rather worried about what happened. Even if it's 'only' a hardware memory
 problem this means lots of trouble for us -- on the one hand it seems not
 to be memtest86 detectable and on the other hand our programs really do
 need working memory, but of course this is not your concern.

Update: Yesterday I started our fall-back server and ran another memtest86 on 
the suspected machine. A colleague just told me that memtest86 reported 3 
errors in test 8; well, let's see what comes in test 11.
So this either means that the physicists have run some experiments today or 
that the memory became damaged within 2 weeks.

Thanks a lot for your help to identify this as a hardware problem.

Best regards,
Bernd

Re: filesystem corruption ?

2003-03-21 Thread Bernd Schubert

On Friday 21 March 2003 08:32, you wrote:
 Hello!

 On Thu, Mar 20, 2003 at 07:23:48PM +0100, Bernd Schubert wrote:
   Hm, interesting.
   And what are the differences? How big are they?
 
  Since they are binary files, a colleague had the idea to use hexdump and
  diff, so the command for the attached file was:
  diff <(hexdump /worka/gdb) <(hexdump /usr/bin/gdb)|sort -k 2 > gdb.diff
  So the lines beginning with '<' are from the working gdb and lines beginning
  with '>' are from the corrupted gdb. When you look into the diff file you
  will see that only some bits per line have changed.

 I see.
 Basically you have two pages of data corrupted.
 And the corruption indeed looks like bit corruption.
 How about rebooting that box and checking if corruption pattern changes?
 Also I'd recommend you to run memtest86 for some time as this looks like
 a bad memory pattern.
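
The "two pages, a few bits each" observation can also be checked without reading a hexdump diff by eye. The sketch below is only an illustration under assumptions: the 4096-byte page size and the function name are mine, not from the thread, and it expects the good and the corrupted copy to have the same length.

```python
PAGE_SIZE = 4096  # assumed page size; not stated in the thread


def flipped_bits_per_page(good: bytes, bad: bytes, page_size: int = PAGE_SIZE) -> dict:
    """Return {page index: number of differing bits} for two equal-length buffers."""
    if len(good) != len(bad):
        raise ValueError("buffers differ in length; compare sizes first")
    pages = {}
    for offset, (a, b) in enumerate(zip(good, bad)):
        if a != b:
            # XOR leaves a 1 in every bit position where the two bytes differ.
            page = offset // page_size
            pages[page] = pages.get(page, 0) + bin(a ^ b).count("1")
    return pages


if __name__ == "__main__":
    good = bytes(3 * PAGE_SIZE)
    bad = bytearray(good)
    bad[10] ^= 0x04              # one flipped bit in page 0
    bad[PAGE_SIZE + 7] ^= 0x81   # two flipped bits in page 1
    print(flipped_bits_per_page(good, bytes(bad)))  # {0: 1, 1: 2}
```

A handful of flipped bits confined to one or two pages looks like memory trouble; whole blocks of zeros or of foreign data would point elsewhere.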

All of our machines have to pass a full memtest86 checking before we intend to 
use them - this machine is about 3 weeks old, of course it also had to run 
this test and furthermore it has ECC-memory.


   Any events happening between morning backup and time of problem
   discovery?
 
  Except that I recompiled a kernel and we installed some programs using
  aptitude (it's a Debian system), nothing happened to the filesystem. There
  was also no reboot, no crash, etc.
  Update: The corruption probably happened at 15:48, since at this time
  an xchat on one of the clients also crashed, and this was what we noticed
  first. The xchat binary was also affected by the corruption.

 So, the beam of X-rays run through the memory module corrupting some bits?

There is the 'Environmental Physics Institut' on the floor below us, and since 
we currently have an extremely high hardware failure rate, I have been joking 
for some time that they might be causing it (I believe they are indeed using 
x-ray beams). I should really ask them if their setups are shielded 
properly ;-)

 ;) This stuff should not have been written to disk, so probably
 plain reboot should fix everything? Can you test that?

Yes of course, if something goes wrong we still have our fall back machine :-)

I will report in the afternoon if it worked.

Best regards,
Bernd

Re: filesystem corruption ?

2003-03-21 Thread Bernd Schubert

Hi,

 So, the beam of X-rays run through the memory module corrupting some bits?
 ;) This stuff should not have been written to disk, so probably
 plain reboot should fix everything? Can you test that?


Indeed, after rebooting everything is fine again. We will run another memtest86 
during the weekend, though I really don't believe we will find a problem.

Though this machine will be replaced by a real server in a few months, I'm 
still rather worried about what happened. Even if it's 'only' a hardware memory 
problem this means lots of trouble for us -- on the one hand it seems not to 
be memtest86 detectable and on the other hand our programs really do need 
working memory, but of course this is not your concern.


Thanks for your help,
Bernd

Re: filesystem corruption ?

2003-03-21 Thread Oleg Drokin

Hello!

On Fri, Mar 21, 2003 at 02:01:38PM +0100, Bernd Schubert wrote:
  So, the beam of X-rays run through the memory module corrupting some bits?
  ;) This stuff should not have been written to disk, so probably
  plain reboot should fix everything? Can you test that?
 indeed after rebooting everything is fine again. We will run another memtest86 

So on-disk corruption is out of the question.

 during the weekend, though I really don't believe we will find a problem.

Ask those physics guys to run some X-ray experiments while you are running memtest86 ;)

 Though this machine will be replaced by a real server in a few months, I'm 
 still rather worried about what happened. Even if it's 'only' a hardware memory 
 problem this means lots of trouble for us -- on the one hand it seems not to 
 be memtest86 detectable and on the other hand our programs really do need 

Well, it may not be detectable because no high-energy beams are running around at
the time of the test.

 working memory, but of course this is not your concern.

I learned in school that if you put a good amount of plumbum in between
some area and a source of radiation, chances are that the radiation reaching the
protected area will be of much lesser strength.
In fact you might go to those guys and ask them what material (and how much of it)
is best suited to shield against the stuff they generate.

Bye,
Oleg

Re: filesystem corruption ?

2003-03-21 Thread Bernd Schubert

 I learned in school that if you put a good amount of plumbum in
 between some area and a source of radiation, chances are that the radiation
 reaching the protected area will be of much lesser strength.
 In fact you might go to those guys and ask them what material (and how
 much of it) is best suited to shield against the stuff they generate.

We already discussed over lunch ordering something like this for our 
systems ;-) (would be a rather strange order for a usual computer company, 
wouldn't it?)
But in fact, I'm now really going to contact those guys and ask if they 
have some stuff to detect their beams.

Have a nice weekend,
Bernd

Re: filesystem corruption ?

2003-03-21 Thread Russell Coker

On Fri, 21 Mar 2003 14:07, Oleg Drokin wrote:
 I learned in school that if you put a good amount of plumbum in

It's better known in English as lead.

The problem with lead is that it's poisonous and soft.  Having to wash your 
hands after touching your computer could get annoying.

Other metals such as copper and steel will reduce the radiation and can also 
be used for protection against mechanical damage.

The best way to reduce radiation is by distance.  The inverse-square law 
applies, so moving the computer further away from the experiment will reduce 
the radiation more easily than anything else you may do.  One thing to 
consider is disk-less X-term machines if you need to operate a computer 
near the experiment, so that if the X-term crashes from radiation your 
server with the data continues running correctly.
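
To put a number on the inverse-square law mentioned above, here is a tiny sketch. It assumes an idealised, unshielded point source; the distances are made-up metres, and walls, floors and beam direction are ignored.

```python
def relative_intensity(d_old: float, d_new: float) -> float:
    """Intensity at distance d_new relative to d_old for a point source
    (inverse-square law, no shielding)."""
    if d_old <= 0 or d_new <= 0:
        raise ValueError("distances must be positive")
    return (d_old / d_new) ** 2


print(relative_intensity(2.0, 4.0))  # 0.25   (twice as far: a quarter of the dose)
print(relative_intensity(2.0, 8.0))  # 0.0625 (four times as far: one sixteenth)
```

So doubling the distance already cuts the exposure to a quarter, before any shielding is added.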

-- 
http://www.coker.com.au/selinux/   My NSA Security Enhanced Linux packages
http://www.coker.com.au/bonnie++/  Bonnie++ hard drive benchmark
http://www.coker.com.au/postal/    Postal SMTP/POP benchmark
http://www.coker.com.au/~russell/  My home page

Re: [reiserfs-list] Filesystem corruption after resize

2002-06-12 Thread Baldur Norddahl


Quoting Vitaly Fertman ([EMAIL PROTECTED]):
 Hi, 
 
  Hello,
  The exact commands used are:
 
  resize_reiserfs -s 400G /dev/vg01/stuff
  lvreduce -l 16693 /dev/vg01/stuff
  pvmove -v /dev/md1
  vgreduce -v vg01 /dev/md1
  resize_reiserfs /dev/vg01/stuff
  reiserfsck --check /dev/vg01/stuff
 
  This all worked like a charm, until I noticed that a nightly script that
  scans all files, no longer was able to access about 20 files (access denied
  even though the script is running as root).
 
 Do you mean reiserfsck finished without any error/warning message? 

Yes, it did not detect any errors after the resize. The errors turned up a
day after. So it is not 100% certain that those two events are linked. But
since nothing else was done that could explain the corruption, that is the
theory I am working on.

 The progs I sent you are what is going to be the next release. 
 Please run --check and tell me what is in fsck.log. You can run 
 --fix-fixable if it says so, but it would be better to run 
 --rebuild-tree on a copy (it is not a release). Or you can do the following:
 
 debugreiserfs/debugreiserfs -p /dev/vg01/stuff | gzip -p > stuff.gz
 
 it will pack metadata (without filebodies), I will download it and test 
 locally.

I will send you those two files in a separate mail.

I copied all the data over to the other raid device, so I am not so much
concerned about rescuing the filesystem - I could just reformat the whole
thing and copy the files back.

But I would very much like to find out what happened so I can take actions
to prevent it from happening again. Particularly I need to know if resizing
on lvm devices is working properly, since I will need to resize again
shortly when the replacement disk arrives.

Baldur

Re: [reiserfs-list] Filesystem Corruption

2002-06-11 Thread Kurt


Thanks Oleg,
Sorry for the late response (I was out of the office). You may find the
following information on the last crash useful:
+++
Jun  3 04:32:37 devo kernel: vs-13075: reiserfs_read_inode2: dead inode read from 
disk [854 1695654 0x0 SD]. This is likely to be race with knfsd. Ignore
Jun  3 04:32:39 devo kernel: vs-13060: reiserfs_update_sd: stat data of object 
[854 1695654 0x0 SD] (nlink == 1) not found (pos 1)
Jun  3 04:41:38 devo kernel: vs-13060: reiserfs_update_sd: stat data of object 
[854 1695654 0x0 SD] (nlink == 1) not found (pos 1)
Jun  3 04:41:43 devo kernel: vs-13060: reiserfs_update_sd: stat data of object 
[854 1695654 0x0 SD] (nlink == 1) not found (pos 1)

I will upgrade the kernel and reiserfs tools this week and inform you of the 
result after a fsck.
-Kurt

On Friday 07 June 2002 3:15 am, Oleg Drokin wrote:
 Hello!

 On Thu, Jun 06, 2002 at 02:00:01PM -0400, Kurt wrote:
  error stating the file pointed to nowhere.
  I was unable to complete a reiserfsck --fix-fixable because of the length
  of time that this (fsck) process took since this was an unscheduled
  downtime. During the weekend i will attempt to do the fsck again, however
  i really needed to know if this problem has been observed by anyone else,
  and what steps they took to fix the problem.

 We recommend that you upgrade your kernel to 2.4.18.
 To find out what the exact problem is, it would be very useful if you posted
 excerpts from the kernel logs with the actual errors.
 Thank you.

 Bye,
 Oleg

-- 

Kurt Palmer  SysAdmin
[EMAIL PROTECTED]Advance Internet
201-459-2846

[reiserfs-list] Filesystem corruption after resize

2002-06-11 Thread Baldur Norddahl


Hello,

First something about my setup:

md0: 8x80 GB in a RAID5 configuration
md1: 4x160 GB in a RAID5 configuration
/dev/vg01/stuff: the union of md0 and md1 done with lvm.

dark:/mnt# reiserfsck -V

-reiserfsck, 2002-
reiserfsprogs 3.x.1a

dark:/mnt# resize_reiserfs -v

-resize_reiserfs, 2002-
reiserfsprogs 3.x.1a

Usage: resize_reiserfs  [-s[+|-]#[G|M|K]] [-fqv] device

dark:/mnt# cat /proc/version 
Linux version 2.4.18 (root@dark) (gcc version 2.95.4 20011006 (Debian
prerelease)) #1 SMP Fri Apr 12 13:40:03 CEST 2002

The system is a dual AMD Athlon(tm) MP 1800+ (1533 MHz), with 1 GB memory.

Now recently one of the 160 GB disks died. Since I still had enough free
space and I wanted to preserve the redundancy, I used resize_reiserfs to
shrink the filesystem. Then I used lvm to move it away from the
non-redundant md1 device.

The exact commands used are:

resize_reiserfs -s 400G /dev/vg01/stuff
lvreduce -l 16693 /dev/vg01/stuff
pvmove -v /dev/md1
vgreduce -v vg01 /dev/md1
resize_reiserfs /dev/vg01/stuff
reiserfsck --check /dev/vg01/stuff
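
As a sanity check on the order of operations (shrink the filesystem first, then the volume), the sizes can be compared before running lvreduce. This is only a sketch under assumptions: the helper names are hypothetical, "400G" is treated as 400 GiB, and the 32 MiB extent size is a guess, since the message does not state it.

```python
GIB = 2**30
MIB = 2**20


def lv_bytes(extents: int, extent_size: int) -> int:
    """Size of a logical volume made of `extents` extents."""
    return extents * extent_size


def shrink_is_safe(fs_bytes: int, extents: int, extent_size: int) -> bool:
    """True if the already-shrunk filesystem still fits in the reduced volume."""
    return fs_bytes <= lv_bytes(extents, extent_size)


# 400G comes from `resize_reiserfs -s 400G`, 16693 from `lvreduce -l 16693`.
# The extent size is an assumption; check the real value with vgdisplay.
print(shrink_is_safe(400 * GIB, 16693, 32 * MIB))  # True
print(shrink_is_safe(400 * GIB, 16693, 4 * MIB))   # False
```

With a 32 MiB extent the reduced volume is larger than the shrunk filesystem, so the final resize_reiserfs without -s only has to grow it to fill the volume.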

This all worked like a charm, until I noticed that a nightly script that
scans all files, no longer was able to access about 20 files (access denied
even though the script is running as root).

Dmesg is full of this:

vs-5150: search_by_key: invalid format found in block 66153. Fsck?
vs-13070: reiserfs_read_inode2: i/o failure occurred trying to find stat
data of [163330 163334 0x0 SD]
is_leaf: free space seems wrong: level=1, nr_items=1, free_space=3040 rdkey 
vs-5150: search_by_key: invalid format found in block 72879. Fsck?
vs-13070: reiserfs_read_inode2: i/o failure occurred trying to find stat
data of [168724 168732 0x0 SD]
is_tree_node: node level 29122 does not match to the expected one 1
vs-5150: search_by_key: invalid format found in block 70647. Fsck?
vs-13070: reiserfs_read_inode2: i/o failure occurred trying to find stat
data of [167220 167223 0x0 SD]
is_tree_node: node level 2 does not match to the expected one 1
vs-5150: search_by_key: invalid format found in block 66153. Fsck?
vs-13070: reiserfs_read_inode2: i/o failure occurred trying to find stat
data of [163330 163334 0x0 SD]

and so on; there is a lot of this stuff repeating.
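
Since the same few messages repeat, it may help to reduce the dmesg output to the set of distinct blocks being complained about. A minimal sketch follows; the regular expression is fitted to the vs-5150 lines quoted above and the function name is hypothetical.

```python
import re
from collections import Counter

# Matches the vs-5150 lines quoted above and captures the block number.
BLOCK_RE = re.compile(r"vs-5150: search_by_key: invalid format found in block (\d+)")


def count_bad_blocks(log_text: str) -> Counter:
    """Count how often each block number appears in vs-5150 messages."""
    return Counter(int(m.group(1)) for m in BLOCK_RE.finditer(log_text))


sample = """\
vs-5150: search_by_key: invalid format found in block 66153. Fsck?
vs-5150: search_by_key: invalid format found in block 72879. Fsck?
vs-5150: search_by_key: invalid format found in block 70647. Fsck?
vs-5150: search_by_key: invalid format found in block 66153. Fsck?
"""

for block, hits in count_bad_blocks(sample).most_common():
    print(block, hits)
```

Feeding it the real output of dmesg would show whether only a handful of blocks are affected or whether the damage is spread widely.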

reiserfsck --fix-fixable /dev/vg01/stuff crashes.

Btw, a separate problem: I am never able to unmount this filesystem
properly. I always get this error:

dark:/mnt# umount stuff
umount: /mnt/stuff: device is busy
dark:/mnt# fuser -v stuff

                     USER        PID ACCESS COMMAND
stuff                root     kernel mount  /mnt/stuff

So without rebooting I can't quote the exact output from --fix-fixable, but
it is approximately the same as when I just run it plain:

dark:/mnt# reiserfsck -l /root/reiserfsck.log /dev/vg01/stuff

-reiserfsck, 2002-
reiserfsprogs 3.x.1a

Will read-only check consistency of the filesystem on /dev/vg01/stuff
Will put log info to '/root/reiserfsck.log'

Do you want to run this program?[N/Yes] (note need to type Yes):Yes
###
reiserfsck --check started at Tue Jun 11 16:36:38 2002
###
Filesystem seems mounted read-only. Skipping journal replay..
Checking S+tree../  4 (of   6)/ 27 (of 132)/ 44 (of 152)bit 1359513587,
bitsize 136749056
reiserfsck: bitmap.c:168: reiserfs_bitmap_test_bit: Assertion `bit_number <
bm->bm_bit_size' failed.
Aborted
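
The two numbers printed just before the assertion already say what went wrong: the checker was asked about a block number far beyond the end of the block bitmap. A small sketch of that comparison follows; the 4096-byte block size and the function name are assumptions, not taken from the report.

```python
BLOCK_SIZE = 4096  # assumed reiserfs block size; not stated in the report


def describe_bit(bit_number: int, bm_bit_size: int, block_size: int = BLOCK_SIZE) -> str:
    """Say whether a block number falls inside a bitmap of bm_bit_size bits."""
    fs_gib = bm_bit_size * block_size / 2**30
    if 0 <= bit_number < bm_bit_size:
        return f"block {bit_number} is inside the {fs_gib:.1f} GiB filesystem"
    return (f"block {bit_number} lies beyond the {fs_gib:.1f} GiB filesystem "
            f"({bm_bit_size} blocks)")


# Values from the "bit 1359513587, bitsize 136749056" line above.
print(describe_bit(1359513587, 136749056))
# block 1359513587 lies beyond the 521.7 GiB filesystem (136749056 blocks)
```

A pointer that far out of range suggests a tree node with garbage in it rather than a stale reference to the old, larger filesystem.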


What can I do to resolve this?

Thanks,
  Baldur

Re: [reiserfs-list] Filesystem Corruption

2002-06-07 Thread Oleg Drokin


Hello!

On Thu, Jun 06, 2002 at 02:00:01PM -0400, Kurt wrote:

 error stating the file pointed to nowhere.
 I was unable to complete a reiserfsck --fix-fixable because of the length of 
 time that this (fsck) process took since this was an unscheduled downtime.
 During the weekend i will attempt to do the fsck again, however i really 
 needed to know if this problem has been observed by anyone else, and what 
 steps they took to fix the problem.

We recommend that you upgrade your kernel to 2.4.18.
To find out what the exact problem is, it would be very useful if you posted excerpts
from the kernel logs with the actual errors.
Thank you.

Bye,
Oleg

[reiserfs-list] Filesystem Corruption

2002-06-06 Thread Kurt


  
Hello all,
 I currently have a system configured as follows :-
1) LVM version 1.0.1-rc4(ish)(03/10/2001)
2) /dev/PROJ/proj on /proj type reiserfs (rw,noatime,notail)
3) /dev/PROJ/proj   239G  142G   97G  60% /proj
4) 2.4.17 with reiserfs tools 3.x.0k
5) Reiserfs compiled in (CONFIG_REISERFS_CHECK set to NO)
6) 256 MB RAM (sar -r shows memory usage is not abnormal for this box)
7) Tons of very small files based on log processing
I am told by my co-worker that the system became unresponsive and showed reiserfs
related errors on the console.
Upon restart they noticed that the file
/proj/webtrends/receive/bama/www3/access.01Jun.r.gz was unreadable by root
(permission denied).
I ran reiserfsck on the drive and noticed that access.01Jun.r.gz returned an
error stating the file pointed to nowhere.
I was unable to complete a reiserfsck --fix-fixable because of the length of
time that this (fsck) process took since this was an unscheduled downtime.
During the weekend i will attempt to do the fsck again, however i really
needed to know if this problem has been observed by anyone else, and what
steps they took to fix the problem.
-Kurt



--

Kurt Palmer  SysAdmin
[EMAIL PROTECTED]Advance Internet
201-459-2846
