Re: Unable to mount hammer file system Undo failed

2012-07-20 Thread Wojciech Puchar

   People who use HAMMER also tend to back up their filesystems using
   the streaming mirroring feature.  You need a backup anyway, regardless.


Definitely. Backups are a different thing.

But I do not consider HAMMER's online mirroring a backup feature, but
something more like sophisticated mirroring.


As long as the backed-up data is available online for writing, I don't
consider it a backup.


I use rsync for backing up everything, with the backup machine located in a
different place and not accessible from the outside internet.
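
For illustration, such a pull-style backup might look like this (host name
and paths are hypothetical):

    # run on the backup machine: it pulls, nothing can push to it
    rsync -aH --delete server:/home/ /backup/server/home/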



   the fact that the copies are all being managed from the same machine).

and that these copies are never regenerated.


   failures over the years, mostly blamed on failed writes to disks or
   people not having UPS's (since UFS was never designed to work with
   a disk synchronization command, crashes from e.g. power failures could


Seems you are quite out of date with FreeBSD. FreeBSD UFS does perform disk
cache flushes at the right time.




Re: Unable to mount hammer file system Undo failed

2012-07-20 Thread elekktretterr
 I am NOT talking about background fsck, which is implemented in FreeBSD and
 which I turn off.

 I am talking about simply not doing fsck of every filesystem after a crash,
 and doing it later the same day, when a pause is not a problem.

 This is a legitimate method with UFS+softupdates.


Then explain this: every now and then, when the OS crashed, it would leave
truncated files that wouldn't be recovered until fsck ran.

Background fsck is another abomination. I remember once I had a truncated
file that got written to before fsck repaired it, and it caused data loss
on that file.



Re: Unable to mount hammer file system Undo failed

2012-07-19 Thread Wojciech Puchar

HAMMER(ROOT) recovery check seqno=8ca97e62
HAMMER(ROOT) recovery range 36877528-36892fa0
HAMMER(ROOT) recovery nexto 36892fa0 endseqno=8ca98015
HAMMER(ROOT) recovery undo  36877528-36892fa0 (113272 bytes)(RW)
ad4: FAILURE - READ_DMA48 status=51<READY,DSC,ERROR>
error=40<UNCORRECTABLE> LBA=483752928
HAMMER: UNDO record, cannot access buffer 203436e35ca8
HAMMER(ROOT) UNDO record at 36891a30 failed
HAMMER(ROOT) recovery complete
Failed to recover HAMMER filesystem on mount

This is an example of what I fear about HAMMER: one error results in
inability to mount.


If it is just a software error in HAMMER, that's great. If it is the design...
not great.


Re: Unable to mount hammer file system Undo failed

2012-07-19 Thread Alex Hornung
On 19/07/12 09:25, Wojciech Puchar wrote:
 This is an example of what I fear about HAMMER: one error results in
 inability to mount.
 
 If it is just a software error in HAMMER, that's great. If it is the design...
 not great.

This is not a hammer problem but a problem with the underlying disk. It
couldn't read from the disk - that is pretty much a file-system
independent problem; UFS would fail equally miserably.

Cheers,
Alex


Re: Unable to mount hammer file system Undo failed

2012-07-19 Thread Siju George
On Thu, Jul 19, 2012 at 1:59 PM, Alex Hornung ahorn...@gmail.com wrote:

 This is not a hammer problem but a problem with the underlying disk. It
 couldn't read from the disk - that is pretty much a file-system
 independent problem; UFS would fail equally miserably.


I have PFS slaves on a second disk.
I have already fitted a new disk and the OS installation is complete.
I will upgrade the slaves to masters and then configure slaves for
them, so there is no problem.

But I have lost the snapshot symlinks :-(
In the PFSes I snapshotted every 5 minutes, I have a lot of symlinks.

Is there any easy way to recreate those symlinks from the snapshot IDs?

Thanks

Siju


Re: Unable to mount hammer file system Undo failed

2012-07-19 Thread Wojciech Puchar

not great.


This is not a hammer problem but a problem with the underlying disk. It
couldn't read from the disk - that is pretty much a file-system
independent problem; UFS would fail equally miserably.

Not true.
It is a very unlikely case that you will not be able to mount; you will just
not be able to read everything.


Copying to a new disk while skipping errors (dd conv=sync,noerror) and then
running fsck_ffs would basically recover everything that can be recovered.
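
For example, something along these lines (device names are hypothetical;
bs=512 keeps the damage to single sectors, since conv=sync pads each failed
read to the block size):

    # image the dying disk onto a fresh one, zero-padding unreadable sectors
    dd if=/dev/ad4 of=/dev/ad6 bs=512 conv=sync,noerror
    # then repair the filesystem on the copy
    fsck_ffs -y /dev/ad6s1a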


UFS uses a flat on-disk structure; inodes are at known places.

I don't know how HAMMER data is placed, but it seems everything is dynamic.

Any link to a description of the HAMMER on-disk layout?



Re: Unable to mount hammer file system Undo failed

2012-07-19 Thread Francis GUDIN

Wojciech Puchar writes:


not great.


This is not a hammer problem but a problem with the underlying disk. It
couldn't read from the disk - that is pretty much a file-system
independent problem; UFS would fail equally miserably.

Not true.
It is a very unlikely case that you will not be able to mount; you will just
not be able to read everything.


Copying to a new disk while skipping errors (dd conv=sync,noerror) and then
running fsck_ffs would basically recover everything that can be recovered.


UFS uses a flat on-disk structure; inodes are at known places.

I don't know how HAMMER data is placed, but it seems everything is dynamic.

Any link to a description of the HAMMER on-disk layout?


Please, read hammer(8) (at the subcommand recover).

--
Francis



Re: Unable to mount hammer file system Undo failed

2012-07-19 Thread Wojciech Puchar

UFS uses a flat on-disk structure; inodes are at known places.

I don't know how HAMMER data is placed, but it seems everything is dynamic.

Any link to a description of the HAMMER on-disk layout?


Please, read hammer(8) (at the subcommand recover).

Thank you very much.

While such recovery is painfully slow (it scans the entire image, not just
selected predefined areas like fsck_ffs does), it DOES exist.


Seems I have to run some more tests with intentionally broken hardware,
which I don't have at the moment.




Re: Unable to mount hammer file system Undo failed

2012-07-19 Thread Markus Pfeiffer
On Thu, Jul 19, 2012 at 12:42:02PM +0200, Wojciech Puchar wrote:
  UFS uses a flat on-disk structure; inodes are at known places.
  
  I don't know how HAMMER data is placed, but it seems everything is dynamic.
  
  Any link to a description of the HAMMER on-disk layout?
 
  Please, read hammer(8) (at the subcommand recover).
 Thank you very much.
 
 While such recovery is painfully slow (it scans the entire image, not just
 selected predefined areas like fsck_ffs does), it DOES exist.
 
 Seems I have to run some more tests with intentionally broken hardware,
 which I don't have at the moment.
 


Just dd from /dev/random and overwrite a few sectors?
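
For instance (destructive, so only on a scratch disk or a spare image; the
device name is hypothetical):

    # clobber 8 sectors at an arbitrary offset with random bytes
    dd if=/dev/random of=/dev/ad6 bs=512 seek=123456 count=8 conv=notrunc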

Markus
-- 
Markus Pfeiffer, University of St Andrews
email: markus.pfeif...@morphism.de | xmpp: markus.pfeif...@jabber.morphism.de




Re: Unable to mount hammer file system Undo failed

2012-07-19 Thread Wojciech Puchar

which I don't have at the moment.




Just dd from /dev/random and overwrite a few sectors?


Good, but... real failures are always worse than that.

In my tests ZFS, for example (which for me is a plain example of bad design
and bad implementation), failed within less than an hour to the point where it
was mountable but anything in the /usr subdirectory was unreadable, resulting
in a crash.


I willingly used a machine with a failed chipset, which produced bad data in
RAM every now and then. When it hit userspace, it resulted in signal 11 or
weird behavior of software.


When it hit the kernel, it resulted in what I described.

I've saved this HDD image for later tests. Newer ZFS versions fixed the crash,
replacing it with an error message, but still with no data access.


The memory corruption resulted in bad metadata being written, in all copies of
course. ZFS is tree-structured, so the result was clear.


No offline fsck exists for ZFS because it is not needed.

I actually agree: it is not needed, and neither is ZFS ;)

As for me, what I really need is plain filesystem functionality.

I don't really need snapshots etc.

UFS is really fine for me, but it seems swapcache isn't well suited to
caching it properly, as Matthew Dillon said.


What I will need within a month is a service doing lots of I/O to some
not-that-huge part of the dataset, while keeping large files too.


swapcache would be a perfect fit for that workload if it worked.

All my fears don't come from nothing.
One should be picky about replacing something (UFS) that is close to
perfect.


The fsck time is an overstated problem compared to the other problems.



On the same machine I was unable to destroy UFS after a whole day of trying.
Of course it crashed. Of course it produced SMALL data loss (a few files), but
fsck_ffs always fixed things properly.


This was under FreeBSD, but DragonFly UFS is no different.


I really appreciate the LOTS OF HARD WORK of Matthew Dillon and others, but
simply dismissing trusted UFS, with over 20 years of history, by saying
just use hammer isn't good IMHO.




Re: Unable to mount hammer file system Undo failed

2012-07-19 Thread elekktretterr
 This was under FreeBSD, but DragonFly UFS is no different.


My main problem had been with fsck_ffs. At one point my machine was
randomly crashing due to a bad power supply. Every time I started up, I did
an hour of work, then a crash, then 30-40 minutes for fsck to run, and an
hour later did it all over again.

I'd rather use Linux/ext3 than any UFS ever again.



Re: Unable to mount hammer file system Undo failed

2012-07-19 Thread Wojciech Puchar




My main problem had been with fsck_ffs. At one point my machine was
randomly crashing due to a bad power supply. Every time I started up, I did
an hour of work, then a crash, then 30-40 minutes for fsck to run, and an


You may postpone fsck when using softupdates. It is clearly stated in the
softupdates documents you can find (McKusick was one of the authors).

That's what I do.

Of course I've had hardware failures like that, and got quite a few crashes
before I was certain that it was a hardware rather than a software failure and
requested new hardware (the old was past warranty).


I did fsck once a day after work time. With the new hardware I did fsck as
well as (just to be sure) rebuilt all index files, e.g. dovecot indexes.


But EVEN if I needed to wait 30 minutes for fsck, I would prefer it over
solutions that say fsck is not needed at all.


I will say more after real stress tests of the HAMMER filesystem, including
an actual run of hammer recover at last.


As someone proposed doing tests by writing random disk blocks, I would
rather write a program that flips a random memory bit every few
minutes.




hour later did it all over again.

I'd rather use Linux/ext3 than any UFS ever again.


I assume that last sentence is a joke.

Linux extwhatever, and Linux as a whole, is the most dangerous system I have
ever used as far as filesystems are concerned.


More than once I had to recover everything from backup because of the amount
of damage.


I wish you more such luck in the future, but remember that luck never lasts,
even for you :)


Re: Unable to mount hammer file system Undo failed

2012-07-19 Thread Francis GUDIN

Wojciech Puchar writes:


My main problem had been with fsck_ffs. At one point my machine was
randomly crashing due to a bad power supply. Every time I started up, I did
an hour of work, then a crash, then 30-40 minutes for fsck to run, and an


You may postpone fsck when using softupdates. It is clearly stated in the
softupdates documents you can find (McKusick was one of the authors).

That's what I do.


Then you suffer a performance hit when fsck'ing in the background. No such
thing with hammer, FWIW.


As someone proposed doing tests by writing random disk blocks, I would
rather write a program that flips a random memory bit every few
minutes.


If you're assuming that even the computer itself ain't reliable, how the 
hell could any FS be trustworthy then ??? IMHO, that's nonsense.


--
Francis



Re: Unable to mount hammer file system Undo failed

2012-07-19 Thread Wojciech Puchar


You may postpone fsck when using softupdates. It is clearly stated in the
softupdates documents you can find (McKusick was one of the authors).

That's what I do.


Then you suffer a performance hit when fsck'ing in the background.

Once again - read more carefully :)

I am NOT talking about background fsck, which is implemented in FreeBSD and
which I turn off.


I am talking about simply not doing fsck of every filesystem after a crash,
and doing it later the same day, when a pause is not a problem.


This is a legitimate method with UFS+softupdates.

As someone proposed doing tests by writing random disk blocks, I would
rather write a program that flips a random memory bit every few
minutes.


If you're assuming that even the computer itself ain't reliable, how the hell


Assuming hardware never fails is certainly wrong.


could any FS be trustworthy then ??? IMHO, that's nonsense.

No, it isn't. Sorry if I wasn't clear enough in explaining it.



Re: Unable to mount hammer file system Undo failed

2012-07-19 Thread Francis GUDIN

Wojciech Puchar writes:



You may postpone fsck when using softupdates. It is clearly stated in the
softupdates documents you can find (McKusick was one of the authors).

That's what I do.


Then you suffer a performance hit when fsck'ing in the background.

Once again - read more carefully :)

I am NOT talking about background fsck, which is implemented in FreeBSD and
which I turn off.


I am talking about simply not doing fsck of every filesystem after a crash,
and doing it later the same day, when a pause is not a problem.


This is a legitimate method with UFS+softupdates.


OK, understood now, I think: you accept temporarily losing a bit of
unreclaimed free space on disk until time permits cleaning things up
properly, AFAIU softupdates (+journalling? not really clear).


As someone proposed doing tests by writing random disk blocks, I would
rather write a program that flips a random memory bit every few
minutes.


If you're assuming that even the computer itself ain't reliable, how the hell


Assuming hardware never fails is certainly wrong.


And there's no practical point in assuming it *always* fails, is there?


could any FS be trustworthy then ??? IMHO, that's nonsense.

No, it isn't. Sorry if I wasn't clear enough in explaining it.


Well, if the thing that you try to defend against is plain hardware failure
(memory bits flipping, CPU going mad, whatever), I just doubt that any kind
of software layer could definitely solve it (checksums of checksums of… I/O
buffers, to be safe? seriously? do you trust your DMA chip, also?): here,
one answer is ECC RAM. Any practical FS will have to use RAM as a cache, no
matter what. If your cache can't be trusted, you're screwed. That's it. You
could build whatever clever mechanism into your on-disk layout to
(only) improve robustness, but you will have to trust your executing
environment.


Well, that's just my view of the matter. I've been a happy hammer user for
years, and I felt you made strong arguments against it without much
experience with it.
Please try it, break it, and if you can report anything that can help enhance
it, be welcome to post a bug report.


--
Francis



Re: Unable to mount hammer file system Undo failed

2012-07-19 Thread Wojciech Puchar


OK, understood now, I think: you accept temporarily losing a bit of
unreclaimed free space on disk until time permits cleaning things up
properly, AFAIU softupdates (+journalling? not really clear).


That's it. And that's how the original softupdates paper describes it.
You may run quite safely without fsck, just don't abuse that feature for too
long!


No journalling. I am currently a FreeBSD user; FreeBSD 9 added softupdates
journalling, but REALLY it doesn't change much except extra writes to disk.


I found that you actually have to run a full fsck now and then even with
journalling. In theory it shouldn't find any inconsistencies; in practice
it always finds minor ones.


To end this topic, my practices are:

- do not make huge filesystems or create large RAID arrays:
two disks, one mirror of them, one filesystem.
- it takes 30 minutes or less to fsck it, and the same time for 10
such filesystems, as the checks can run in parallel.


In case of a crash I do the fsck manually, when a pause isn't a problem.
At reboot I check only the root filesystem, and (if it's separate) /usr, so I
can execute all other checks remotely without rebooting.
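
For example (mount points are hypothetical, and assumed to be listed in
/etc/fstab so fsck can resolve them), the deferred checks can go by hand,
in parallel:

    # one check per mirror pair, so ten filesystems take about as long as one
    for fs in /data1 /data2 /data3; do
        ( umount $fs && fsck -y $fs && mount $fs ) &
    done
    wait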



Assuming hardware never fails is certainly wrong.


And there's no practical point in assuming it *always* fails, is there?


Just that it fails sometimes is enough to assume it can.


could any FS be trustworthy then ??? IMHO, that's nonsense.

No, it isn't. Sorry if I wasn't clear enough in explaining it.


Well, if the thing that you try to defend against is plain hardware failure
(memory bits flipping, CPU going mad, whatever), I just doubt that any kind
of software layer could definitely solve it (checksums of checksums of… I/O


You are completely right.

What I point out is that a flat data layout makes the chance of recovery far
higher and the chance of bad destruction far lower.


Any tree-like structure creates a huge risk of losing much more data than
was corrupted in the first place.



That rule has already proven true for the UFS filesystem, as well as for e.g.
fixed-size database tables like the .DBF format, which I still use, rather
than modern ones.


Using DBF files as an example - you have indexes in separate files, but the
indexes are not crucial and can be rebuilt.


So if any tree-like structure (or hash type or whatever) were invented to
speed up filesystem access - great. But only as an extra index, with the
crucial data (INODES!!) written as a flat fixed-record table at a known place.


I don't say HAMMER is bad - contrary to ZFS, which is 100% PURE SHIT(R) -
but I don't agree that it is (or will be) a total replacement for the older UFS.


Hammer pseudo-filesystems, snapshots and online replication are useful
features, but they are not actually needed by everyone, and they don't come
without the cost of extra complexity. No matter how smart Matthew Dillon is,
it will still be far more complex, and more risky.


That's why it's not good that swapcache doesn't support efficient caching
of UFS, as there is no vfs.ufs.double_buffer feature like hammer's.




-
Disclaimer ;): none of my practices, my ideas about safe filesystems,
mirroring or anything else are a replacement for a proper backup
strategy!!! Please do not interpret anything I write as ideas against
backups.


Re: Unable to mount hammer file system Undo failed

2012-07-19 Thread Francis GUDIN

Wojciech Puchar writes:

What I point out is that a flat data layout makes the chance of recovery far
higher and the chance of bad destruction far lower.


Any tree-like structure creates a huge risk of losing much more data than
was corrupted in the first place.


Not so sure about that statement, but well, let's agree we might disagree :)

You asked for a little documentation about its layout and workings; this may be
a good fit: http://www.dragonflybsd.org/presentations/nycbsdcon08/


Regards,
--
Francis



Re: Unable to mount hammer file system Undo failed

2012-07-19 Thread Wojciech Puchar


Any tree-like structure creates a huge risk of losing much more data than
was corrupted in the first place.


Not so sure about that statement, but well, let's agree we might disagree :)

Disagreement is a source of all good ideas, but you should explain why.

My explanation is below.



You asked for a little documentation about its layout and workings; this may be
a good fit: http://www.dragonflybsd.org/presentations/nycbsdcon08/

This is about an older hammer revision.

Matthew claimed some time ago that the new hammer is completely different.

But after reading it, I understood that everything is in the B-Tree - exactly
what I call dangerous. The B-Tree is used to store everything: directory
entries, inodes etc.


B-Trees are dangerous if they are used as the only way to access data.
A corrupted B-Tree means no access to anything below it!!



What I see as the main differences between HAMMER and ZFS are:

1) practical - hammer is very fast and doesn't use gigabytes of RAM and lots
of CPU. Not that I did a lot of tests, but it seems like UFS speed,
sometimes even more, rarely less.


It is actually USEFUL, which cannot be said of ZFS ;)

2) the basic way of storing data is similar, the details are different, the
danger is similar


3) HAMMER has a recovery program. It will need to read the whole medium:
assume a 2TB disk at 100MB/s - 20,000 seconds == about 6 hours.
ZFS doesn't have one; there are a few businesses that recover ZFS data for
money. For sure they don't feel it's a crisis ;)



Assume that I store my client's data on a hammer filesystem and it crashed
completely, but the disks are fine. Assume it's Tuesday 16:00, the last copy
was done automatically Monday 17:30, the failure was found at 17:00, and I am
on site at 18:00.


I ask my client - what do you prefer:

- wait 6 hours, with a good deal of chance that most of your data
will be recovered? If so, the little that is missing would be found out and
recovered from backup. If not, we will start recovery from backup, which would
take another 6 hours.


- or just clear things out and start recovery from backup, where everything
would for sure be recovered as it was yesterday after work?



The answer?


THE ANSWER:
---
1) divide the disk space into metadata space and data space, with the amount
of metadata space defined at filesystem creation, say 3% of the whole drive.


2) data records stored only in B-Tree leaves, and all B-Tree leaves stored in
the metadata space; a few critical filesystem blocks stored there too, at a
predefined place.


3) everything else stored in the data space: B-Tree blocks excluding the
leaves, the undo log, the actual data.



4) everything else as it already is, with a modification to make sure every
B-Tree leaf block carries data describing it properly: inodes having their
inode number inside, directories having their inode number inside too. AFAIK
it is already like that.


5) hammer recover modified to scan this 3% of space and then rebuild the
B-Tree. It would work faster than or similarly to fsck_ffs this way, in spite
of being a last-resort tool.

---

THE RESULT: a fast and featureful filesystem that can always be quickly
recovered, even in last-resort cases.


Re: Unable to mount hammer file system Undo failed

2012-07-19 Thread Francis GUDIN

Wojciech Puchar writes:



Any tree-like structure creates a huge risk of losing much more data than
was corrupted in the first place.


Not so sure about that statement, but well, let's agree we might disagree :)

Disagreement is a source of all good ideas, but you should explain why.


Well, arguing can be fun at times, but my free time is rather limited; I
wished that thread could die peacefully.

I think I made my points clear. As I'm far from being qualified to discuss
these topics, I'll just add a bit but won't repeat my statements about
prerequisites regarding where the line must be drawn and which running
conditions are expected from the FS's point of view.


My explanation is below.



You asked for a little documentation about its layout and workings; this may be
a good fit: http://www.dragonflybsd.org/presentations/nycbsdcon08/

This is about an older hammer revision.

Matthew claimed some time ago that the new hammer is completely different.

But after reading it, I understood that everything is in the B-Tree - exactly
what I call dangerous. The B-Tree is used to store everything: directory
entries, inodes etc.


I won't speak on behalf of Matt, but IIUC HAMMER2 will use structures other
than a B-Tree, with the goal of reducing complexity.
The presentation at the link I gave you is rather outdated, and targets HAMMER
1 (the first subversions of the FS in its first design).

B-Trees are dangerous if they are used as the only way to access data.
A corrupted B-Tree means no access to anything below it!!



What I see as the main differences between HAMMER and ZFS are:

1) practical - hammer is very fast and doesn't use gigabytes of RAM and lots
of CPU. Not that I did a lot of tests, but it seems like UFS speed,
sometimes even more, rarely less.


It is actually USEFUL, which cannot be said of ZFS ;)


Sorry, I also just love ZFS for the business case I rely on it for. It has
some clearly nice features.

2) the basic way of storing data is similar, the details are different, the
danger is similar


No: this is wrong. I won't make a digest of papers on both of them for you.
Read about it.

3) HAMMER has a recovery program. It will need to read the whole medium:
assume a 2TB disk at 100MB/s - 20,000 seconds == about 6 hours.
ZFS doesn't have one; there are a few businesses that recover ZFS data for
money. For sure they don't feel it's a crisis ;)


I never had to use that recovery program. If you search the archives, only a
handful of people really had a need for it. Don't anticipate you'll need to use
it routinely.
The truth is: whatever happens (crash, lost power supply, sick HDD), you'll
just mount it, maybe some transactions will complete/be rolled back, and that's
it. A matter of 10 seconds.

Using recover on a 2TB medium will be slow, of course. But you're then trying
to recover a full filesystem, including history, for as much as was there
before the crash.

Assume that I store my client's data on a hammer filesystem and it crashed
completely, but the disks are fine. Assume it's Tuesday 16:00, the last copy
was done automatically Monday 17:30, the failure was found at 17:00, and I am
on site at 18:00.


I ask my client - what do you prefer:

- wait 6 hours, with a good deal of chance that most of your data
will be recovered? If so, the little that is missing would be found out and
recovered from backup. If not, we will start recovery from backup, which would
take another 6 hours.


Moot point. See above.

- or just clear things out and start recovery from backup, where everything
would for sure be recovered as it was yesterday after work?



The answer?


The answer using hammer: use mirror-stream and have your data on another
disk, connected to a different host, with state as of 1 minute ago in the
worst case.
Dead hardware? Just swap them, switch the slave PFS to master, and you're done.
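
Something along these lines (paths, PFS names and the host are hypothetical):

    # on the live box: continuously stream a PFS to a slave on another host
    hammer mirror-stream /pfs/data backuphost:/pfs/data-slave
    # after the master dies, on the backup host: promote the slave
    hammer pfs-upgrade /pfs/data-slave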


THE ANSWER:
---
1) divide the disk space into metadata space and data space, with the amount
of metadata space defined at filesystem creation, say 3% of the whole drive.


And then you're into the "gosh! I never thought it'd store so many small
files! I'm screwed" scenario.

2) data records stored only in B-Tree leaves, and all B-Tree leaves stored in
the metadata space; a few critical filesystem blocks stored there too, at a
predefined place.


3) everything else stored in the data space: B-Tree blocks excluding the
leaves, the undo log, the actual data.



4) everything else as it already is, with a modification to make sure every
B-Tree leaf block carries data describing it properly: inodes having their
inode number inside, directories having their inode number inside too. AFAIK
it is already like that.


5) hammer recover modified to scan this 3% of space and then rebuild the
B-Tree. It would work faster than or similarly to fsck_ffs this way, in spite
of being a last-resort tool.

---

THE RESULT: a fast and featureful filesystem that can always be quickly
recovered, even in last-resort cases.


I just don't follow what you meant, honestly. But well, show us the code if you
feel brave.

That'll be the last reply to this thread for me.

Good night,
--
Francis



Re: Unable to mount hammer file system Undo failed

2012-07-19 Thread Wojciech Puchar


Sorry, I also just love ZFS for the business case I rely on it for. It has
some clearly nice features.


Sorry, but if your reasoning about software is based on love, not logic, then
it's a good idea to end the topic.


Probably your business is more about deploying as much as possible, and
that's all.


Re: Unable to mount hammer file system Undo failed

2012-07-19 Thread Matthew Dillon
People who use HAMMER also tend to back up their filesystems using
the streaming mirroring feature.  You need a backup anyway, regardless.
HAMMER makes it easy, and this is the recommended method for dealing
with media faults on HDDs not backed by hardware RAID (and even if
they are).  You need to back up your data anyway, after all, regardless
of the filesystem (even ZFS's 'copies' feature has its limits due to
the fact that the copies are all being managed from the same machine).

FreeBSD's background fsck and mounting without an fsck (depending on
softupdates) has NEVER been well vetted to ensure that it works in
all situations.  There have been lots of complaints about related
failures over the years, mostly blamed on failed writes to disks or
people not having UPS's (since UFS was never designed to work with
a disk synchronization command, crashes from e.g. power failures could
seriously corrupt disks above and beyond lost sectors).  They can
claim it works better now, but I would never trust it.  Background fsck
itself can render a server unusable due to lost performance.

HAMMER has a 'hammer recover' command meant to be used when all else
fails.  It can be used directly with the bad/corrupted disk as the source
and a new disk as the destination.  It scans the disk, yes.  A full
fsck on a very large (2TB+) filled filesystem is almost as bad when it
starts having to seek around.

I have had numerous failed disks over the years and have never had to
actually use the recover command.  I always initialize a replacement
from one of the several live backups I keep.

HAMMER2 will have some more interesting features that flesh out the
live backup mechanic a bit better, making it possible to e.g. initialize
a replacement disk locally and leave the filesystem live using a remotely
served backup as the replacement is reloaded from the backup.  But it
isn't possible with HAMMER1, sorry.

-Matt
Matthew Dillon 
dil...@backplane.com


Re: Unable to mount hammer file system Undo failed

2012-07-19 Thread Matthew Dillon

:I have PFS slaves on a second disk.
:I have already fitted a new disk and the OS installation is complete.
: I will upgrade the Slaves to Master and then configure slaves for
:them so there is no problem.
:
:But I have lost the snapshot symlinks :-(
:In the PFSes I snapshotted every 5 minutes I have a lot of symlinks.
:
:Is there any easy way to recreate those symlinks from the snapshot IDs ?
:
:Thanks
:
:Siju

Try 'hammer snapls <mountpt>'.  The snapshots are recorded in meta-data
so if they're still there you can get them back.  You may have to
write a script to recreate the softlinks from the output.
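
For example, a rough sketch (untested; it assumes the transaction ids
appear as the first column of 'hammer snapls' output lines beginning with
"0x", and that the links use the usual <fs>@@<transid> form; paths are
hypothetical):

    fs=/pfs/mydata
    dest=$fs/snapshots
    mkdir -p "$dest"
    hammer snapls "$fs" | awk '/^0x/ { print $1 }' |
    while read tid; do
        ln -s "${fs}@@${tid}" "${dest}/snap-${tid}"
    done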

-Matt
Matthew Dillon 
dil...@backplane.com


Re: Unable to mount hammer file system Undo failed

2012-07-19 Thread Siju George
On Fri, Jul 20, 2012 at 9:35 AM, Matthew Dillon
dil...@apollo.backplane.com wrote:

 Try 'hammer snapls <mountpt>'.  The snapshots are recorded in meta-data
 so if they're still there you can get them back.  You may have to
 write a script to recreate the softlinks from the output.


Yes, the snapshots are all there.
The scripting is the problem ;-)
Don't worry I will look into it.

It would have been great if there was a command

hammer snaplinks <pfs> <targetdirectory>

to create links to a specific directory.

Thanks

Siju