FYI, I'm working on a workaround for broken devices.
As you note, some disks flat-out lie: you issue the synchronize-cache
command, they say "got it, boss", yet the data is still not on stable
storage. Why do they do this? Because it performs better. Well, duh --
you can make stuff
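If you suspect a particular drive is ignoring the flush, one blunt mitigation is to disable its volatile write cache outright, at an obvious performance cost. On Solaris this can usually be done from format in expert mode; a rough sketch, with c1t0d0 as a made-up disk name (the cache menu is not offered for every drive or controller):
# format -e c1t0d0
format> cache
cache> write_cache
write_cache> display      (show whether the write cache is currently enabled)
write_cache> disable      (turn it off; enable reverses it)
Whether the setting survives a power cycle depends on the drive, so treat it as a workaround rather than a fix.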
It would be extremely helpful to know which brands/models of disks lie and
which don't. This information could be provided diplomatically, simply as
threads documenting the problems you are working on and stating the facts.
Using a specific string of words would make searching for them easy. There
[EMAIL PROTECTED] wrote on 10/11/2008 09:36:02 PM:
On Oct 10, 2008, at 7:55 PM, David Magda wrote:
If someone finds themselves in this position, what advice can be
followed to minimize risks?
Can you ask for two LUNs on different physical SAN devices and have
an expectation
On Thu, Oct 9, 2008 at 10:33 PM, Mike Gerdts [EMAIL PROTECTED] wrote:
On Thu, Oct 9, 2008 at 10:18 AM, Mike Gerdts [EMAIL PROTECTED] wrote:
On Thu, Oct 9, 2008 at 10:10 AM, Greg Shaw [EMAIL PROTECTED] wrote:
Nevada isn't production code. For real ZFS testing, you must use a
production
On Oct 10, 2008, at 7:55 PM, David Magda wrote:
If someone finds themselves in this position, what advice can be
followed to minimize risks?
Can you ask for two LUNs on different physical SAN devices and have
an expectation of getting it?
--
Keith H. Bierman [EMAIL
2008/10/9 Bob Friesenhahn [EMAIL PROTECTED]:
On Thu, 9 Oct 2008, Miles Nordin wrote:
catastrophically. If this is really the situation, then ZFS needs to
give the sysadmin a way to isolate and fix the problems
deterministically before filling the pool with data, not just blame
the sysadmin
Hello all,
I think the problem here is ZFS's capacity for recovery from a failure.
Forgive me, but in the effort to create code without failures, maybe the
hackers forgot that other people can make mistakes (even if they can't).
- ZFS does not need fsck.
Ok, that's a great statement,
The circumstances where I have lost data have been when ZFS has not
handled a layer of redundancy. However, I am not terribly optimistic
about the prospects of ZFS on any device that hasn't committed writes
that ZFS thinks are committed.
FYI, I'm working on a workaround for broken devices. As
Hi Jeff,
On Fri, 2008-10-10 at 01:26 -0700, Jeff Bonwick wrote:
The circumstances where I have lost data have been when ZFS has not
handled a layer of redundancy. However, I am not terribly optimistic
about the prospects of ZFS on any device that hasn't committed writes
that ZFS thinks are
That sounds like a great idea for a tool, Jeff. Would it be possible to build
it in as a zpool recover command?
Being able to run a tool like that, see just how bad the corruption is, and
know it's possible to recover an older version would be great. Is there any
chance of outputting
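For what it's worth, later ZFS builds grew something fairly close to this idea: a recovery mode for zpool import that rolls a damaged pool back by a few transaction groups to the last state it can verify. A rough sketch, with tank as a stand-in pool name and assuming a build new enough to have the option:
# zpool import -F -n tank    (dry run: report what recovery would do without changing anything)
# zpool import -F tank       (actually rewind and import)
The rewind discards the last few seconds of writes, which is why the dry-run form is worth running first.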
jb == Jeff Bonwick [EMAIL PROTECTED] writes:
rmc == Ricardo M Correia [EMAIL PROTECTED] writes:
jb We need a little more Code of Hammurabi in the storage
jb industry.
It seems like most of the work people have to do now is cleaning up
after the sloppiness of others. At least it takes
On Fri, Oct 10, 2008 at 06:15:16AM -0700, Marcelo Leal wrote:
- ZFS does not need fsck.
Ok, that's a great statement, but I think ZFS needs one. Really does.
And in my opinion an enhanced zdb would be the solution. Flexibility.
Options.
About 99% of the problems reported as "I need ZFS fsck"
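For context on what an "enhanced zdb" would be building on, the existing zdb already knows how to walk a pool read-only and verify what it finds; a rough sketch, with tank as a stand-in pool name and with the caveat that exact flags vary between builds:
# zdb -bc tank    (traverse the pool, print block statistics, verify metadata checksums)
# zdb -u tank     (print the active uberblock)
What zdb does not do is write anything back, and that controlled write-back, e.g. discarding the last few damaged txgs, is the part an fsck-like tool would have to add.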
Eric Schrock wrote:
On Fri, Oct 10, 2008 at 06:15:16AM -0700, Marcelo Leal wrote:
- ZFS does not need fsck.
Ok, that's a great statement, but I think ZFS needs one. Really does.
And in my opinion an enhanced zdb would be the solution. Flexibility.
Options.
About 99% of the problems
2008/10/10 Richard Elling [EMAIL PROTECTED]:
Timh Bergström wrote:
2008/10/9 Bob Friesenhahn [EMAIL PROTECTED]:
On Thu, 9 Oct 2008, Miles Nordin wrote:
catastrophically. If this is really the situation, then ZFS needs to
give the sysadmin a way to isolate and fix the problems
On Fri, Oct 10, 2008 at 06:15:16AM -0700, Marcelo Leal wrote:
- ZFS does not need fsck.
Ok, that's a great statement, but I think ZFS needs one. Really does.
And in my opinion an enhanced zdb would be the solution. Flexibility.
Options.
About 99% of the problems reported as I
On Sex, 2008-10-10 at 11:23 -0700, Eric Schrock wrote:
But I haven't actually heard a reasonable proposal for what an
fsck-like tool (i.e. one that could repair things automatically) would
actually *do*, let alone how it would work in the variety of situations
it needs to (compressed RAID-Z?)
Timh Bergström wrote:
2008/10/10 Richard Elling [EMAIL PROTECTED]:
Timh Bergström wrote:
2008/10/9 Bob Friesenhahn [EMAIL PROTECTED]:
On Thu, 9 Oct 2008, Miles Nordin wrote:
catastrophically. If this is really the situation, then ZFS needs to
give the sysadmin
On Oct 10, 2008, at 15:48, Victor Latushkin wrote:
I've mostly seen (2), because despite all the best practices out
there,
single vdev pools are quite common. In all such cases that I had my
hands on, it was possible to recover the pool by going back one or two
txgs.
For better or worse
Or is there a way to mitigate a checksum error on a non-redundant zpool?
It's just like the difference between non-parity, parity, and ECC memory.
Most filesystems don't have checksums (non-parity), so they don't even
know when they're returning corrupt data. ZFS without any replication
can
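On the detection side, even a pool with no redundancy will surface checksum errors during a scrub; it just cannot repair the affected user data. A rough sketch, with tank standing in for the pool name:
# zpool scrub tank
# zpool status -v tank    (shows per-device checksum error counts and, where known, the affected files)
Metadata usually survives such errors anyway because of the ditto blocks described below.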
On Fri, Oct 10, 2008 at 9:14 PM, Jeff Bonwick [EMAIL PROTECTED] wrote:
Note: even in a single-device pool, ZFS metadata is replicated via
ditto blocks at two or three different places on the device, so that
a localized media failure can be both detected and corrected.
If you have two or more
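The ditto blocks above cover metadata by default. If you want the same treatment for user data on a single-device pool, the copies property extends it, at the cost of the extra space, and it only applies to data written after the property is set. A rough sketch, with tank/home as a stand-in dataset name:
# zfs set copies=2 tank/home
# zfs get copies tank/home
This protects against localized media errors, not against losing the whole device.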
His explanation: he invalidated the incorrect uberblocks and forced ZFS
to revert to an earlier state that was consistent.
Would someone be willing to document the steps required in order to do this
please?
I have a disk in a similar state:
# zpool import
pool: tank
id:
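Before attempting that sort of surgery, it is worth at least dumping the on-disk labels so you can see which txg and configuration each copy of the label records; a rough sketch, with /dev/rdsk/c1t0d0s0 as a made-up device path and tank as the pool (the -e form, for pools that are not imported, may not be in every build):
# zdb -l /dev/rdsk/c1t0d0s0    (print the four vdev labels, including pool config and txg)
# zdb -e tank                  (examine a pool that is exported or cannot be imported)
Both are read-only, so they are safe to run; the uberblock invalidation itself is the risky step and is best done with help, as in this case.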
On Thu, Oct 9, 2008 at 4:53 AM, . [EMAIL PROTECTED] wrote:
While it's clearly my own fault for taking the risks I did, it's
still pretty frustrating knowing that all my data is likely still
intact and nicely checksummed on the disk but that none of it is
accessible due to some tiny filesystem
0n Thu, Oct 09, 2008 at 06:37:23AM -0500, Mike Gerdts wrote:
FWIW, I believe that I have hit the same type of bug as the OP in the
following combinations:
- T2000, LDoms 1.0, various builds of Nevada in control and guest
domains.
- Laptop, VirtualBox 1.6.2, Windows
On Thu, Oct 9, 2008 at 7:44 AM, Ahmed Kamal
[EMAIL PROTECTED] wrote:
In the past year I've lost more ZFS file systems than I have any other
type of file system in the past 5 years. With other file systems I
can almost always get some data back. With ZFS I can't get any back.
Unfortunately I can only agree with the doubts about running ZFS in
production environments; I've lost ditto blocks, I've gotten
corrupted pools and a bunch of other failures, even in
mirror/raidz/raidz2 setups with or without hardware mirrors/RAID 5/6.
Plus the insecurity of a sudden crash/reboot will
Perhaps I misunderstand, but the issues below are all based on Nevada,
not Solaris 10.
Nevada isn't production code. For real ZFS testing, you must use a
production release, currently Solaris 10 (update 5, soon to be update 6).
In the last 2 years, I've stored everything in my environment
On Thu, Oct 9, 2008 at 10:10 AM, Greg Shaw [EMAIL PROTECTED] wrote:
Nevada isn't production code. For real ZFS testing, you must use a
production release, currently Solaris 10 (update 5, soon to be update 6).
I misstated before in my LDoms case. The corrupted pool was on
Solaris 10, with
gs == Greg Shaw [EMAIL PROTECTED] writes:
gs Nevada isn't production code. For real ZFS testing, you must
gs use a production release, currently Solaris 10 (update 5, soon
gs to be update 6).
based on list feedback, my impression is that the results of a
``test'' confined to s10,
On Thu, 9 Oct 2008, Miles Nordin wrote:
catastrophically. If this is really the situation, then ZFS needs to
give the sysadmin a way to isolate and fix the problems
deterministically before filling the pool with data, not just blame
the sysadmin based on nebulous speculatory hindsight
On Thu, Oct 9, 2008 at 10:18 AM, Mike Gerdts [EMAIL PROTECTED] wrote:
On Thu, Oct 9, 2008 at 10:10 AM, Greg Shaw [EMAIL PROTECTED] wrote:
Nevada isn't production code. For real ZFS testing, you must use a
production release, currently Solaris 10 (update 5, soon to be update 6).
I misstated
Fajar A. Nugraha wrote:
On Fri, Oct 3, 2008 at 10:37 PM, Vasile Dumitrescu
[EMAIL PROTECTED] wrote:
VMWare 6.0.4 running on Debian unstable,
Linux bigsrv 2.6.26-1-amd64 #1 SMP Wed Sep 24 13:59:41 UTC 2008 x86_64
GNU/Linux
Solaris is vanilla snv_90 installed with no GUI.
in summary:
On Fri, Oct 3, 2008 at 10:37 PM, Vasile Dumitrescu
[EMAIL PROTECTED] wrote:
VMWare 6.0.4 running on Debian unstable,
Linux bigsrv 2.6.26-1-amd64 #1 SMP Wed Sep 24 13:59:41 UTC 2008 x86_64
GNU/Linux
Solaris is vanilla snv_90 installed with no GUI.
in summary: physical disks, assigned
Hi folks,
I just wanted to share the end of my adventure here and especially take the
time to thank Victor for helping me out of this mess.
I will let him explain the technical details (I am out of my depth here) but
bottom line he spent a couple of hours with me on the machine and sorted me
Vasile Dumitrescu wrote:
Hi folks,
I just wanted to share the end of my adventure here and especially take the
time to thank Victor for helping me out of this mess.
I will let him explain the technical details (I am out of my depth here) but
bottom line he spent a couple of hours with
Which VM solution was this? VMware, VirtualBox, Xen, other? How were
the disks presented to the guest? What are the disks in the host:
real disks, files, something else?
--
Darren J Moffat