Re: [zfs-discuss] Archiving and Restoring Snapshots

2009-09-03 Thread Ross
I agree, mailing that to all Sun customers is something I think is likely to 
turn around and bite you.

A lot of people are now going to use that to archive their data, and some of 
them are not going to be happy when months or years down the line they try to 
restore it and find that the 'zfs receive' just fails, with no hope of 
recovering their data.

And they're not going to blame the device that corrupted a bit; they're going 
to come here blaming Sun and ZFS.  Archiving to a format with such a high risk 
of losing everything sounds like incredibly bad advice, especially when the 
article actually uses the phrase "create archives for long-term storage".

Long-term storage doesn't mean storage that could break with the next 
upgrade, or storage that will fail if even a single bit has been corrupted 
anywhere in the process.  Sun really needs to decide what it is doing with 
zfs send/receive, because we're getting very mixed messages now.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Problem with RAID-Z in builds snv_120 - snv_123

2009-09-03 Thread Adam Leventhal

Hey folks,

There are two problems with RAID-Z in builds snv_120 through snv_123 that
will both be resolved in build snv_124. The problems are as follows:

1. Data corruption on a RAID-Z system of any sort (raidz1, raidz2, raidz3)
can lead to spurious checksum errors being reported on devices that were
not used as part of the reconstruction.

These errors are harmless and can be cleared safely (zpool clear pool).



2. There is a far more serious problem with single-parity RAID-Z that can
lead to data corruption. This data corruption is recoverable as long as no
additional data corruption or drive failure occurs. That is to say, data
is fine provided there is not an additional problem. The problem is present
on all raidz1 configurations that use an odd number of children (disks),
e.g. 4+1 or 6+1. Note that raidz1 configurations with an even number of
children (e.g. 3+1), raidz2, and raidz3 are unaffected.

The recommended course of action is to roll back to build snv_119 or
earlier. If for some reason this is impossible, please email me PRIVATELY,
and we can discuss the best course of action for you. After rolling back,
initiate a scrub. ZFS will identify and correct these errors, but if enough
accumulate it will (incorrectly) identify drives as faulty (which they
likely aren't). You can clear these failures (zpool clear pool).

Without rolling back, repeated scrubs will eventually remove all traces of
the data corruption. You may need to clear checksum failures as they're
identified to ensure that enough drives remain online.
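
A minimal sketch of that sequence after rolling back (the pool name "tank"
is just a placeholder):

  zpool scrub tank        # let ZFS find and repair the affected blocks
  zpool status -v tank    # watch progress and any checksum counts reported
  zpool clear tank        # after the scrub completes, clear the spurious error counters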


For reference, here's the bug:
  6869090 filebench on thumper with ZFS (snv_120) raidz causes checksum
  errors from all drives


Apologies for the bug and for any inconvenience this caused.

Below is a technical description of the two issues. This is for interest
only and does not contain additional discussion of symptoms or prescriptive
action.

Adam

---8---

1. In situations where a block read from a RAID-Z vdev fails to checksum
but there were no errors from any of the child vdevs (e.g. hard drives), we
must enter combinatorial reconstruction in which we attempt every
combination of data and parity until we find the correct data. The logic
was modified to scale to triple-parity RAID-Z, and in doing so I introduced
a bug in which spurious error reports may in some circumstances be
generated for vdevs that were not used as part of the data reconstruction.
These do not represent actual corruption or problems with the underlying
devices and can be ignored and cleared.


2. This one is far subtler and requires an understanding of how RAID-Z
writes work. For that I strongly recommend the following blog post from
Jeff Bonwick:

  http://blogs.sun.com/bonwick/entry/raid_z

Basically, RAID-Z writes full stripes every time; note that without careful
accounting it would be possible to effectively fragment the vdev such that
single sectors were free but useless, since single-parity RAID-Z requires
two adjacent sectors to store data (one for data, one for parity). To
address this, RAID-Z rounds up its allocation to the next multiple of
(nparity + 1). This ensures that all space is accounted for. RAID-Z will
thus skip sectors that are unused based on this rounding. For example,
under raidz1 a write of 1024 bytes would result in 512 bytes of parity,
512 bytes of data on two devices, and 512 bytes skipped.
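
The round-up arithmetic can be sketched in shell (an illustration of the
description above, not ZFS source; it assumes 512-byte sectors):

  psize=1024                                 # logical write size in bytes
  nparity=1                                  # raidz1
  sectors=$(( psize / 512 + nparity ))       # 2 data sectors + 1 parity sector = 3
  alloc=$(( (sectors + nparity) / (nparity + 1) * (nparity + 1) ))  # round up to a multiple of 2 -> 4
  echo "allocated=$alloc sectors, skipped=$(( alloc - sectors ))"   # 1 sector skipped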

To improve performance, ZFS aggregates multiple adjacent IOs into a single
large IO. Further, hard drives themselves can perform aggregation of
adjacent IOs. We noted that these skipped sectors were inhibiting
performance, so we added optional IOs that could be used to improve
aggregation. This yielded a significant performance boost for all RAID-Z
configurations.

Another nuance of single-parity RAID-Z is that while it normally lays down
stripes as P D D (parity, data, data, ...), it will switch every megabyte
to move the parity into the second position (data, parity, data, ...).
This was ostensibly to effect the same improvement as between RAID-4 and
RAID-5 -- distributed parity. However, to implement RAID-5 actually
requires full distribution of parity, AND RAID-Z already distributes parity
by virtue of the skipped sectors and variable-width stripes. In other
words, this was not a particularly valid optimization. It was accordingly
discarded for double- and triple-parity RAID-Z; they contain no such
swapping.

The implementation of this swapping was not taken into account for the
optional IOs so rather than writing the optional IO into the skipped
sector, the optional IO overwrote the first sector of the subsequent
stripe with zeros.

The aggregation does not always happen, so the corruption is usually
not pervasive. Further, raidz1 vdevs with odd numbers of children are more
likely to encounter the problem. Let's say we have a raidz1 vdev with
three children. Two writes of 1K each would look like this:

  [stripe-layout diagram across disks 0, 1, 2; truncated in the digest]

Re: [zfs-discuss] zfs performance cliff when over 80% util, still occuring when pool in 6

2009-09-03 Thread John-Paul Drawneek
So I have poked and prodded the disks and they both seem fine.

And yet my rpool is still slow.

Any ideas on what to do now?
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Motherboard for home zfs/solaris file server

2009-09-03 Thread Karel Gardas
Hello,
your impression that (Open)Solaris ECC support was dropped in 2009.06 is a 
misunderstanding. OS 2009.06 supports ECC just as 2005 did. Just 
install it and use my updated ecccheck.pl script to get informed about errors. 
You might also verify that Solaris' memory scrubber is really running, if you 
are that curious: 
http://developmentonsolaris.wordpress.com/2009/03/06/how-to-make-sure-memory-scrubber-is-running/
Karel
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Read about ZFS backup - Still confused

2009-09-03 Thread Cork Smith
I am just a simple home user. When I was using Linux, I backed up my home 
directory (which contained all my critical data) using tar, and I backed up my 
Linux partition using partimage. These backups were put on DVDs. That way I 
could restore (and have) even if the hard drive completely went belly up.

I would like to duplicate this scheme using zfs commands. I know I can copy a 
snapshot to a DVD, but can I recover using just the snapshot, or does it rely on 
the zfs file system on my hard drive being OK?
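
(For illustration, a minimal sketch of the send-to-file approach being
discussed, with the caveats from the archiving thread above; dataset and
file names are placeholders:)

  zfs snapshot rpool/export/home@backup
  zfs send rpool/export/home@backup > /var/tmp/home.zfs   # full stream; burn this file to DVD
  # later, on a rebuilt system with a new pool, copy the file back and:
  zfs receive newpool/home < /var/tmp/home.zfs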

Cork
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Problem with RAID-Z in builds snv_120 - snv_123

2009-09-03 Thread Gary Gendel
Alan,

Thanks for the detailed explanation.  The rollback successfully fixed my 5-disk 
RAID-Z errors.  I'll hold off another upgrade attempt until 124 rolls out.  
Fortunately, I didn't do a zfs upgrade right away after installing 121.  For 
those that did, this could be very painful.
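
(For anyone in the same position, a quick sketch of how to check whether a
pool or filesystem has already been moved to a newer on-disk version, which
is what would block a rollback:)

  zpool upgrade    # with no arguments: only lists pool versions, changes nothing
  zfs upgrade      # with no arguments: only lists filesystem versions, changes nothing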

Gary
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Problem with RAID-Z in builds snv_120 - snv_123

2009-09-03 Thread Roman Naumenko
 Hey folk,
 
 There are two problems with RAID-Z in builds snv_120
 through snv_123 that
 will both be resolved in build snv_124. The problems
 are as follows:

Thanks for letting us know.

Is there a way to get prompt updates on such issues for OpenSolaris 
(other than reading a discussion list)?

Maybe paid support is the answer? Is there any? 

--
Roman
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] How ZFS prefetch issues impact the use of Zvols are remote LUNs

2009-09-03 Thread eneal
Hi. As the subject indicates, I'm trying to understand the impact of  
the ZFS prefetch issues and if they only impact a local zfs  
filesystem versus say a zvol that remotely using the lun via iSCSI or  
Fibrechannel.

Can anyone comment on this?

The specific issues are consolidated here:

http://mail.opensolaris.org/pipermail/zfs-discuss/2009-July/030169.html

Thanks in advance.





This email and any files transmitted with it are confidential and are  
intended solely for the use of the individual or entity to whom they  
are addressed. This communication may contain material protected by  
the attorney-client privilege. If you are not the intended recipient,  
be advised that any use, dissemination, forwarding, printing or  
copying is strictly prohibited. If you have received this email in  
error, please contact the sender and delete all copies.




___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] [storage-discuss] How ZFS prefetch issues impact the use of Zvols are remote LUNs

2009-09-03 Thread eneal

Quoting en...@businessgrade.com:


Hi. As the subject indicates, I'm trying to understand the impact of
the ZFS prefetch issues and if they only impact a local zfs
filesystem versus say a zvol that remotely using the lun via iSCSI or
Fibrechannel.
Can anyone comment on this?

The specific issues are consolidated here:

http://mail.opensolaris.org/pipermail/zfs-discuss/2009-July/030169.html

Thanks in advance.


Sorry, the question was not formatted properly. I meant to say: ...  
versus say a zvol that a remote host is using as a LUN via iSCSI or FC




This email and any files transmitted with it are confidential and are  
intended solely for the use of the individual or entity to whom they  
are addressed. This communication may contain material protected by  
the attorney-client privilege. If you are not the intended recipient,  
be advised that any use, dissemination, forwarding, printing or  
copying is strictly prohibited. If you have received this email in  
error, please contact the sender and delete all copies.




___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Archiving and Restoring Snapshots

2009-09-03 Thread Lori Alt


I agree and Cindy Swearingen and I are talking to marketing to get this 
fixed.  Thanks to all for bringing this to our attention.


lori

On 09/03/09 00:55, Ross wrote:

I agree, mailing that to all Sun customers is something I think is likely to 
turn around and bite you.

A lot of people are now going to use that to archive their data, and some of 
them are not going to be happy when months or years down the line they try to 
restore it and find that the 'zfs receive' just fails, with no hope of 
recovering their data.

And they're not going to blame the device that's corrupted a bit, they're going to come 
here blaming Sun and ZFS.  Archiving to a format with such a high risk of losing 
everything sounds like incredibly bad advice.  Especially when the article actually uses 
the term create archives for long-term storage.

Long term storage doesn't mean storage that could break with the next upgrade, or 
storage that will fail if even a single bit has been corrupted anywhere in the process. 
 Sun really need to decide what they are doing with zfs send/receive because we're getting very 
mixed messages now.
  


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Problem with RAID-Z in builds snv_120 - snv_123

2009-09-03 Thread Roman Naumenko
And a question here how to control number of dev version to install?

--
Roman
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Archiving and Restoring Snapshots

2009-09-03 Thread Richard Elling

On Sep 2, 2009, at 11:55 PM, Ross wrote:

I agree, mailing that to all Sun customers is something I think is  
likely to turn around and bite you.


Some points to help clarify the situation:

1. There is no other way to archive a dataset than using a snapshot

2. You cannot build a zpool on a tape

3. The stability of the protocol is only a problem if it becomes impossible
   to run some version of OpenSolaris on hardware that is needed to
   receive the snapshot. Given the ubiquity of virtualization and the x86
   legacy, I don't think this is a problem for at least the expected
   lifetime of the storage medium.

A lot of people are now going to use that to archive their data, and  
some of them are not going to be happy when months or years down the  
line they try to restore it and find that the 'zfs receive' just  
fails, with no hope of recovering their data.


And they're not going to blame the device that's corrupted a bit,  
they're going to come here blaming Sun and ZFS.  Archiving to a  
format with such a high risk of losing everything sounds like  
incredibly bad advice.  Especially when the article actually uses  
the term create archives for long-term storage.


Long term storage doesn't mean storage that could break with the  
next upgrade, or storage that will fail if even a single bit has  
been corrupted anywhere in the process.  Sun really need to decide  
what they are doing with zfs send/receive because we're getting very  
mixed messages now.


Data integrity is a problem for all archiving systems, not just ZFS.
 -- richard

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Problem with RAID-Z in builds snv_120 - snv_123

2009-09-03 Thread Jeff Victor

Roman Naumenko wrote:

Hey folk,

There are two problems with RAID-Z in builds snv_120 through snv_123 that
will both be resolved in build snv_124. The problems are as follows:


Thanks for letting us know.

Is there a way to get prompt updates on such issues for OpenSolaris 
(other than reading a discussion list)?
  
Maybe paid support is the answer? Is there any? 
  
You can learn about support options for OpenSolaris 2009.06 at 
http://www.sun.com/service/opensolaris/index.jsp?intcmp=2166 . However, 
AFAIK OpenSolaris 2009.06 does not have the problem being discussed.


The snv_ builds are developer builds, and support contracts are not 
available for them.


So if you want the newest supportable features, choose OpenSolaris (the 
distro). If you want to test out new features fresh out of the oven, 
you can test the snv_ builds.


--JeffV

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Archiving and Restoring Snapshots

2009-09-03 Thread Lori Alt


yes to all the comments below.  Those are all mitigating factors.  But I 
also agree with Ross and Mike and others that we should be more clear 
about when send/recv is appropriate and when it's not the best choice.  
We're looking into it.


Lori


On 09/03/09 10:06, Richard Elling wrote:

On Sep 2, 2009, at 11:55 PM, Ross wrote:

I agree, mailing that to all Sun customers is something I think is 
likely to turn around and bite you.


Some points to help clarify the situation:

1. There is no other way to archive a dataset than using a snapshot

2. You cannot build a zpool on a tape

3. The stability of the protocol is only a problem if it becomes impossible
   to run some version of OpenSolaris on hardware that is needed to
   receive the snapshot. Given the ubiquity of virtualization and the x86
   legacy, I don't think this is a problem for at least the expected
   lifetime of the storage medium.

A lot of people are now going to use that to archive their data, and 
some of them are not going to be happy when months or years down the 
line they try to restore it and find that the 'zfs receive' just 
fails, with no hope of recovering their data.


And they're not going to blame the device that's corrupted a bit, 
they're going to come here blaming Sun and ZFS.  Archiving to a 
 format with such a high risk of losing everything sounds like 
incredibly bad advice.  Especially when the article actually uses the 
term create archives for long-term storage.


Long term storage doesn't mean storage that could break with the 
next upgrade, or storage that will fail if even a single bit has 
been corrupted anywhere in the process.  Sun really need to decide 
what they are doing with zfs send/receive because we're getting very 
mixed messages now.


Data integrity is a problem for all archiving systems, not just ZFS.
 -- richard

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Problem with RAID-Z in builds snv_120 - snv_123

2009-09-03 Thread Collier Minerich
Please unsubscribe me

COLLIER


-Original Message-
From: zfs-discuss-boun...@opensolaris.org
[mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Adam Leventhal
Sent: Thursday, September 03, 2009 2:08 AM
To: zfs-discuss@opensolaris.org discuss
Subject: [zfs-discuss] Problem with RAID-Z in builds snv_120 - snv_123

Hey folk,

There are two problems with RAID-Z in builds snv_120 through snv_123 that
will both be resolved in build snv_124. The problems are as follows:

1. Data corruption on a RAID-Z system of any sort (raidz1, raidz2,  
raidz3)
can lead to spurious checksum errors being reported on devices that were
not used as part of the reconstruction.

These errors are harmless and can be cleared safely (zpool clear  
pool).


2. There is a far more serious problem with single-parity RAID-Z that  
can
lead to data corruption. This data corruption is recoverable as long  
as no
additional data corruption or drive failure occurs. That is to say, data
is fine provided there is not an additional problem. The problem is  
present
on all raidz1 configurations that use an odd number of children (disks)
e.g. 4+1, or 6+1. Note that raidz1 configurations with an even number of
children (e.g. 3+1), raidz2, and raidz3 are unaffected.

The recommended course of action is to roll back to build snv_119 or
earlier. If for some reason this is impossible, please email me  
PRIVATELY,
and we can discuss the best course of action for you. After rolling back
initiate a scrub. ZFS will identify and correct these errors, but if  
enough
accumulate it will (incorrectly) identify drives as faulty (which they  
likely
aren't). You can clear these failures (zpool clear pool).

Without rolling back, repeated scrubs will eventually remove all  
traces of
the data corruption. You may need to clear checksum failures as they're
identified to ensure that enough drives remain online.


For reference here's the bug:
   6869090 filebench on thumper with ZFS (snv_120) raidz causes  
checksum errors from all drives

Apologies for the bug and for any inconvenience this caused.

Below is a technical description of the two issues. This is for interest
only and does not contain additional discussion of symptoms or  
prescriptive
action.

Adam

---8---

1. In situations where a block read from a RAID-Z vdev fails to checksum
but there were no errors from any of the child vdevs (e.g. hard  
drives) we
must enter combinatorial reconstruction in which we attempt every
combination of data and parity until we find the correct data. The logic
was modified to scale to triple-parity RAID-Z and in doing so I  
introduced
a bug in which spurious error reports may in some circumstances be
generated for vdevs that were not used as part of the data  
reconstruction.
These do not represent actual corruption or problems with the underlying
devices and can be ignored and cleared.


2. This one is far subtler and requires an understanding of how RAID-Z
writes work. For that I strongly recommend the following blog post from
Jeff Bonwick:

   http://blogs.sun.com/bonwick/entry/raid_z

Basically, RAID-Z writes full stripes every time; note that without  
careful
accounting it would be possible to effectively fragment the vdev such  
that
single sectors were free but useless since single-parity RAID-Z requires
two adjacent sectors to store data (one for data, one for parity). To
address this, RAID-Z rounds up its allocation to the next multiple of (nparity + 1).
This ensures that all space is accounted for. RAID-Z will thus skip
sectors that are unused based on this rounding. For example, under  
raidz1
a write of 1024 bytes would result in 512 bytes of parity, 512 bytes of
data on two devices and 512 bytes skipped.

To improve performance, ZFS aggregates multiple adjacent IOs into a  
single
large IO. Further, hard drives themselves can perform aggregation of
adjacent IOs. We noted that these skipped sectors were inhibiting
performance so added optional IOs that could be used to improve
aggregation. This yielded a significant performance boost for all RAID-Z
configurations.

Another nuance of single-parity RAID-Z is that while it normally lays  
down
stripes as P D D (parity, data, data, ...), it will switch every  
megabyte
to move the parity into the second position (data, parity, data, ...).
This was ostensibly to effect the same improvement as between RAID-4 and
RAID-5 -- distributed parity. However, to implement RAID-5 actually
requires full distribution of parity AND RAID-Z already distributes  
parity
by virtue of the skipped sectors and variable width stripes. In other
words, this was not a particularly valid optimization. It was  
accordingly
discarded for double- and triple-parity RAID-Z. They contain no such
swapping.

The implementation of this swapping was not taken into account for the
optional IOs so rather than writing the optional IO into the skipped
sector, the optional IO overwrote the first sector of the subsequent
stripe with zeros.

The 

Re: [zfs-discuss] zfs performance cliff when over 80% util, still occuring when pool in 6

2009-09-03 Thread Collier Minerich
Please unsubscribe me

COLLIER


-Original Message-
From: zfs-discuss-boun...@opensolaris.org
[mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of John-Paul Drawneek
Sent: Thursday, September 03, 2009 2:13 AM
To: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] zfs performance cliff when over 80% util, still
occuring when pool in 6

So I have poked and prodded the disks and they both seem fine.

And yet my rpool is still slow.

Any ideas on what to do now?
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] [storage-discuss] How ZFS prefetch issues impact the use of Zvols are remote LUNs

2009-09-03 Thread Collier Minerich
Please unsubscribe me

COLLIER


-Original Message-
From: storage-discuss-boun...@opensolaris.org
[mailto:storage-discuss-boun...@opensolaris.org] On Behalf Of
en...@businessgrade.com
Sent: Thursday, September 03, 2009 8:52 AM
To: en...@businessgrade.com
Cc: zfs-discuss@opensolaris.org; storage-disc...@opensolaris.org
Subject: Re: [storage-discuss] How ZFS prefetch issues impact the use of
Zvols are remote LUNs

Quoting en...@businessgrade.com:

 Hi. As the subject indicates, I'm trying to understand the impact of
 the ZFS prefetch issues and if they only impact a local zfs
 filesystem versus say a zvol that remotely using the lun via iSCSI or
 Fibrechannel.
 Can anyone comment on this?

 The specific issues are consolidated here:

 http://mail.opensolaris.org/pipermail/zfs-discuss/2009-July/030169.html

 Thanks in advance.


Sorry, the question was not formatted properly. I meant to say: ...  
versus say a zvol that a remote host is using as a LUN via iSCSI or FC




This email and any files transmitted with it are confidential and are  
intended solely for the use of the individual or entity to whom they  
are addressed. This communication may contain material protected by  
the attorney-client privilege. If you are not the intended recipient,  
be advised that any use, dissemination, forwarding, printing or  
copying is strictly prohibited. If you have received this email in  
error, please contact the sender and delete all copies.



___
storage-discuss mailing list
storage-disc...@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/storage-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Read about ZFS backup - Still confused

2009-09-03 Thread Collier Minerich
Please unsubscribe me

COLLIER


-Original Message-
From: zfs-discuss-boun...@opensolaris.org
[mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Cork Smith
Sent: Thursday, September 03, 2009 4:43 AM
To: zfs-discuss@opensolaris.org
Subject: [zfs-discuss] Read about ZFS backup - Still confused

I am just a simple home user. When I was using linux, I backed up my home
directory (which contained all my critical data) using tar. I backed up my
linux partition using partimage. These backups were put on dvd's. That way I
could restore (and have) even if the hard drive completely went belly up.

I would like to duplicate this scheme using zfs commands. I know I can copy
a snapshot to a dvd but can I recover using just the snapshot or does it
rely on the zfs file system on my hard drive being ok?

Cork
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Archiving and Restoring Snapshots

2009-09-03 Thread Collier Minerich
Please unsubscribe me

COLLIER


-Original Message-
From: zfs-discuss-boun...@opensolaris.org
[mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Lori Alt
Sent: Thursday, September 03, 2009 8:59 AM
To: Ross
Cc: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] Archiving and Restoring Snapshots


I agree and Cindy Swearingen and I are talking to marketing to get this 
fixed.  Thanks to all for bringing this to our attention.

lori

On 09/03/09 00:55, Ross wrote:
 I agree, mailing that to all Sun customers is something I think is likely
to turn around and bite you.

 A lot of people are now going to use that to archive their data, and some
of them are not going to be happy when months or years down the line they
try to restore it and find that the 'zfs receive' just fails, with no hope
of recovering their data.

 And they're not going to blame the device that's corrupted a bit, they're
going to come here blaming Sun and ZFS.  Archiving to a format with such a
high risk of losing everything sounds like incredibly bad advice.
Especially when the article actually uses the term create archives for
long-term storage.

 Long term storage doesn't mean storage that could break with the next
upgrade, or storage that will fail if even a single bit has been corrupted
anywhere in the process.  Sun really need to decide what they are doing
with zfs send/receive because we're getting very mixed messages now.
   

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Problem with RAID-Z in builds snv_120 - snv_123

2009-09-03 Thread Collier Minerich
Please unsubscribe me

COLLIER


-Original Message-
From: zfs-discuss-boun...@opensolaris.org
[mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Gary Gendel
Sent: Thursday, September 03, 2009 7:45 AM
To: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] Problem with RAID-Z in builds snv_120 - snv_123

Alan,

Thanks for the detailed explanation.  The rollback successfully fixed my
5-disk RAID-Z errors.  I'll hold off another upgrade attempt until 124 rolls
out.  Fortunately, I didn't do a zfs upgrade right away after installing
121.  For those that did, this could be very painful.

Gary
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Archiving and Restoring Snapshots

2009-09-03 Thread Collier Minerich
Please unsubscribe me

COLLIER


-Original Message-
From: zfs-discuss-boun...@opensolaris.org
[mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Richard Elling
Sent: Thursday, September 03, 2009 9:06 AM
To: Ross
Cc: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] Archiving and Restoring Snapshots

On Sep 2, 2009, at 11:55 PM, Ross wrote:

 I agree, mailing that to all Sun customers is something I think is  
 likely to turn around and bite you.

Some points to help clarify the situation:

1. There is no other way to archive a dataset than using a snapshot

2. You cannot build a zpool on a tape

3. The stability of the protocol is only a problem if it becomes  
impossible
   to run some version of OpenSolaris on hardware that is needed to
   receive the snapshot. Given the ubiquity of virtualization and
the  
x86
   legacy, I don't think this is a problem for at least the expected

lifetime
   of the storage medium.

 A lot of people are now going to use that to archive their data, and  
 some of them are not going to be happy when months or years down the  
 line they try to restore it and find that the 'zfs receive' just  
 fails, with no hope of recovering their data.

 And they're not going to blame the device that's corrupted a bit,  
 they're going to come here blaming Sun and ZFS.  Archiving to a  
 format with such a high risk of losing everything sounds like  
 incredibly bad advice.  Especially when the article actually uses  
 the term create archives for long-term storage.

 Long term storage doesn't mean storage that could break with the  
 next upgrade, or storage that will fail if even a single bit has  
 been corrupted anywhere in the process.  Sun really need to decide  
 what they are doing with zfs send/receive because we're getting very  
 mixed messages now.

Data integrity is a problem for all archiving systems, not just ZFS.
  -- richard

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Problem with RAID-Z in builds snv_120 - snv_123

2009-09-03 Thread Collier Minerich
Please unsubscribe me

COLLIER


-Original Message-
From: zfs-discuss-boun...@opensolaris.org
[mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Roman Naumenko
Sent: Thursday, September 03, 2009 9:03 AM
To: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] Problem with RAID-Z in builds snv_120 - snv_123

And a question here how to control number of dev version to install?

--
Roman
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Problem with RAID-Z in builds snv_120 - snv_123

2009-09-03 Thread Collier Minerich
Please unsubscribe me

COLLIER


-Original Message-
From: zfs-discuss-boun...@opensolaris.org
[mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Jeff Victor
Sent: Thursday, September 03, 2009 9:10 AM
To: Roman Naumenko
Cc: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] Problem with RAID-Z in builds snv_120 - snv_123

Roman Naumenko wrote:
 Hey folk,

 There are two problems with RAID-Z in builds snv_120 through snv_123 that
 will both be resolved in build snv_124. The problems are as follows:
 
 Thanks for letting us know.

 Is there a way to get prompt updates on such issues for OpenSolaris 
 (other than reading a discussion list)?
   
 Maybe paid support is the answer? Is there any? 
   
You can learn about support options for OpenSolaris 2009.06 at 
http://www.sun.com/service/opensolaris/index.jsp?intcmp=2166 . However, 
AFAIK OpenSolaris 2009.06 does not have the problem being discussed.

The snv_ builds are developer builds, and support contracts are not 
available for them.

So if you want the newest supportable features, choose OpenSolaris (the 
distro). If you want to test out new features fresh out of the oven 
you can test the snv_ builds.

--JeffV

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Problem with RAID-Z in builds snv_120 - snv_123

2009-09-03 Thread Collier Minerich
Please unsubscribe me

COLLIER


-Original Message-
From: zfs-discuss-boun...@opensolaris.org
[mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Roman Naumenko
Sent: Thursday, September 03, 2009 8:39 AM
To: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] Problem with RAID-Z in builds snv_120 - snv_123

 Hey folk,
 
 There are two problems with RAID-Z in builds snv_120
 through snv_123 that
 will both be resolved in build snv_124. The problems
 are as follows:

Thanks for letting us know.

Is there a way to get prompt updates on such issues for OpenSolaris 
(other than reading a discussion list)?

Maybe paid support is the answer? Is there any? 

--
Roman
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] [storage-discuss] How ZFS prefetch issues impact the use of Zvols are remote LUNs

2009-09-03 Thread Collier Minerich
Please unsubscribe me

COLLIER


-Original Message-
From: storage-discuss-boun...@opensolaris.org
[mailto:storage-discuss-boun...@opensolaris.org] On Behalf Of
en...@businessgrade.com
Sent: Thursday, September 03, 2009 8:49 AM
To: zfs-discuss@opensolaris.org; storage-disc...@opensolaris.org
Subject: [storage-discuss] How ZFS prefetch issues impact the use of Zvols
are remote LUNs

Hi. As the subject indicates, I'm trying to understand the impact of  
the ZFS prefetch issues and if they only impact a local zfs  
filesystem versus say a zvol that remotely using the lun via iSCSI or  
Fibrechannel.
Can anyone comment on this?

The specific issues are consolidated here:

http://mail.opensolaris.org/pipermail/zfs-discuss/2009-July/030169.html

Thanks in advance.






This email and any files transmitted with it are confidential and are  
intended solely for the use of the individual or entity to whom they  
are addressed. This communication may contain material protected by  
the attorney-client privilege. If you are not the intended recipient,  
be advised that any use, dissemination, forwarding, printing or  
copying is strictly prohibited. If you have received this email in  
error, please contact the sender and delete all copies.



___
storage-discuss mailing list
storage-disc...@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/storage-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Archiving and Restoring Snapshots

2009-09-03 Thread Collier Minerich
Please unsubscribe me

COLLIER

-Original Message-
From: zfs-discuss-boun...@opensolaris.org
[mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Lori Alt
Sent: Thursday, September 03, 2009 9:14 AM
To: Richard Elling
Cc: zfs-discuss@opensolaris.org; Ross
Subject: Re: [zfs-discuss] Archiving and Restoring Snapshots


yes to all the comments below.  Those are all mitigating factors.  But I 
also agree with Ross and Mike and others that we should be more clear 
about when send/recv is appropriate and when it's not the best choice.  
We're looking into it.

Lori


On 09/03/09 10:06, Richard Elling wrote:
 On Sep 2, 2009, at 11:55 PM, Ross wrote:

 I agree, mailing that to all Sun customers is something I think is 
 likely to turn around and bite you.

 Some points to help clarify the situation:

 1. There is no other way to archive a dataset than using a snapshot

 2. You cannot build a zpool on a tape

 3. The stability of the protocol is only a problem if it becomes 
 impossible
to run some version of OpenSolaris on hardware that is needed to
receive the snapshot. Given the ubiquity of virtualization and 
 the x86
legacy, I don't think this is a problem for at least the 
 expected lifetime
of the storage medium.

 A lot of people are now going to use that to archive their data, and 
 some of them are not going to be happy when months or years down the 
 line they try to restore it and find that the 'zfs receive' just 
 fails, with no hope of recovering their data.

 And they're not going to blame the device that's corrupted a bit, 
 they're going to come here blaming Sun and ZFS.  Archiving to a 
 format with such a high risk of losing everything sounds like 
 incredibly bad advice.  Especially when the article actually uses the 
 term create archives for long-term storage.

 Long term storage doesn't mean storage that could break with the 
 next upgrade, or storage that will fail if even a single bit has 
 been corrupted anywhere in the process.  Sun really need to decide 
 what they are doing with zfs send/receive because we're getting very 
 mixed messages now.

 Data integrity is a problem for all archiving systems, not just ZFS.
  -- richard

 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Archiving and Restoring Snapshots

2009-09-03 Thread Ross
 Some points to help clarify the situation:
 
 1. There is no other way to archive a dataset than
 using a snapshot

Other than tar, star, and numerous other archive utilities.  The fact that ZFS 
doesn't have any alternative built in doesn't mean that dumping a send to a 
file suddenly becomes a good idea.  Especially when there is no step in those 
instructions to verify that the archived file is actually a valid zfs send 
stream.

   2. You cannot build a zpool on a tape

So you need a different solution for tape, possibly streaming the files to 
tape, with the zpool configuration also documented and backed up somehow if 
necessary.
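
One rough sketch of capturing the pool configuration alongside a conventional 
file archive (paths and pool name are placeholders; this isn't a complete 
backup plan):

  zpool status tank > /backup/tank-layout.txt              # record the vdev layout
  zfs get -r -s local all tank > /backup/tank-props.txt    # record locally-set dataset properties
  tar cf /backup/tank-files.tar /tank                      # archive the files themselves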
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Problem with RAID-Z in builds snv_120 - snv_123

2009-09-03 Thread Ross
Yeah, I wouldn't mind knowing that too.  With the old snv builds I just 
downloaded the appropriate image; with OpenSolaris and the development 
repository, is there any way to pick a particular build?
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] nuke lots of snapshots

2009-09-03 Thread Jacob Ritorto
Sorry if this is a faq, but I just got a time-sensitive dictum from the 
higher-ups to disable and remove all remnants of rolling snapshots on our 
DR filer.  Is there a way for me to nuke all snapshots with a single 
command, or do I have to manually destroy all 600+ snapshots with zfs 
destroy?


osol 2008.11


thx
jake
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] nuke lots of snapshots

2009-09-03 Thread Gaëtan Lehmann


Le 3 sept. 09 à 19:57, Jacob Ritorto a écrit :

Sorry if this is a faq, but I just got a time-sensitive dictum from  
the higher-ups to disable and remove all remnants of rolling  
snapshots on our DR filer.  Is there a way for me to nuke all  
snapshots with a single command, or do I have to manually destroy  
all 600+ snapshots with zfs destroy?



  zfs list -r -t snapshot -o name -H pool | xargs -tl zfs destroy

should destroy all the snapshots in a pool

Gaëtan


--
Gaëtan Lehmann
Biologie du Développement et de la Reproduction
INRA de Jouy-en-Josas (France)
tel: +33 1 34 65 29 66fax: 01 34 65 29 09
http://voxel.jouy.inra.fr  http://www.itk.org
http://www.mandriva.org  http://www.bepo.fr



PGP.sig
Description: Ceci est une signature électronique PGP
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] nuke lots of snapshots

2009-09-03 Thread Jacob Ritorto

Gaëtan Lehmann wrote:


  zfs list -r -t snapshot -o name -H pool | xargs -tl zfs destroy

should destroy all the snapshots in a pool



Thanks Gaëtan.  I added 'grep auto' to filter on just the rolling snaps 
and found that xargs wouldn't let me put both flags on the same dash, so:


zfs list -r -t snapshot -o name -H poolName | grep auto | xargs -t -l zfs destroy



worked for me.
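
A cautious variant is to preview the destroy list first by putting echo in 
front of the destructive command:

  zfs list -r -t snapshot -o name -H poolName | grep auto | xargs -l echo zfs destroy
  # prints the zfs destroy commands without running them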

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Problem with RAID-Z in builds snv_120 - snv_123

2009-09-03 Thread Ross Walker

On Sep 3, 2009, at 1:25 PM, Ross myxi...@googlemail.com wrote:

Yeah, I wouldn't mind knowing that too.  With the old snv builds I  
just downloaded the appropriate image, with OpenSolaris and the  
development repository, is there any way to pick a particular build?


I just do a 'pkg list entire' and then install the build I want with a  
'pkg install entire-build'
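
Roughly like this, assuming the dev repository is already the configured 
publisher (the build number below is only an example):

  pkg refresh
  pkg list -a entire                        # show all available versions, not just the installed one
  pfexec pkg install entire@0.5.11-0.124    # move the image to a specific build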


-Ross

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] (no subject)

2009-09-03 Thread Peter Schow

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Problem with RAID-Z in builds snv_120 - snv_123

2009-09-03 Thread Roman Naumenko
 On Sep 3, 2009, at 1:25 PM, Ross
 myxi...@googlemail.com wrote:
 
  Yeah, I wouldn't mind knowing that too.  With the
 old snv builds I  
  just downloaded the appropriate image, with
 OpenSolaris and the  
  development repository, is there any way to pick a
 particular build?
 
 I just do a 'pkg list entire' and then install the
 build I want with a  'pkg install entire-build'

Ross, can you provide details?  Doesn't it show the latest? 

uname -a
SunOS zsan00 5.11 snv_118 i86pc i386 i86pc Solaris

r...@zsan00:~# pkg list entire
NAME (PUBLISHER)  VERSION STATE  UFIX
entire0.5.11-0.118installed  u---

r...@zsan00:~# pkg refresh

r...@zsan00:~# pkg list entire
NAME (PUBLISHER)  VERSION STATE  UFIX
entire0.5.11-0.118installed  u---

--
Roman Naumenko
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Problem with RAID-Z in builds snv_120 - snv_123

2009-09-03 Thread Roman Naumenko
Hey, web admins, you see what happens when a mailing list is screwed up from the 
beginning?

--
Roman
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Problem with RAID-Z in builds snv_120 - snv_123

2009-09-03 Thread Simon Breden
Hi Adam,

Thanks for the info on this. Some people, including myself, reported seeing 
checksum errors within mirrors too. Is it considered that these checksum errors 
within mirrors could also be related to this bug, or is there another bug 
related to checksum errors within mirrors that I should take a look at?
Search for 'mirror' here:
http://opensolaris.org/jive/thread.jspa?threadID=111316&tstart=0

Cheers,
Simon

And good luck with the fix for build 124. Are we talking days or weeks for the fix 
to be available, do you think? :)
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] snv_110 - snv_121 produces checksum errors on Raid-Z pool

2009-09-03 Thread Simon Breden
So what's the consensus on checksum errors appearing within mirror vdevs?
Is it caused by the same bug announced by Adam, or is something else causing it?
If so, what's the bug id?

Cheers,
Simon
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] check a zfs rcvd file

2009-09-03 Thread dick hoogendijk
On Wed, 2 Sep 2009 13:06:35 -0500 (CDT)
Bob Friesenhahn bfrie...@simple.dallas.tx.us wrote:
 Nothing prevents validating the self-verifying archive file via this
 zfs recv -vn  technique.

Does this verify the ZFS format/integrity of the stream?
Or is the only way to do that to zfs recv the stream into ZFS?

-- 
Dick Hoogendijk -- PGP/GnuPG key: 01D2433D
+ http://nagual.nl/ | SunOS 10u7 5/09 | OpenSolaris 2010.02 B121
+ All that's really worth doing is what we do for others (Lewis Carrol)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Problem with RAID-Z in builds snv_120 - snv_123

2009-09-03 Thread Adam Leventhal
Hey Simon,

 Thanks for the info on this. Some people, including myself, reported seeing
 checksum errors within mirrors too. Is it considered that these checksum
 errors within mirrors could also be related to this bug, or is there another
 bug related to checksum errors within mirrors that I should take a look at?

Absolutely not. That is an unrelated issue. This problem is isolated to
RAID-Z.

 And good luck with the fix for build 124. Are talking days or weeks for the
 fix to be available, do you think? :) -- 

Days or hours.

Adam

-- 
Adam Leventhal, Fishworks http://blogs.sun.com/ahl
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] check a zfs rcvd file

2009-09-03 Thread Lori Alt

On 09/03/09 14:21, dick hoogendijk wrote:

On Wed, 2 Sep 2009 13:06:35 -0500 (CDT)
Bob Friesenhahn bfrie...@simple.dallas.tx.us wrote:
  

Nothing prevents validating the self-verifying archive file via this
zfs recv -vn  technique.



Does this verify the ZFS format/integrity of the stream?
Or is the only way to do that to zfs recv the stream into ZFS?

  
The -n option does some verification.  It verifies that the record 
headers distributed throughout the stream are syntactically valid.  
Since each record header contains a length field which allows the next 
header to be found, one bad header will cause the processing of the 
stream to abort.  But it doesn't verify the content of the data 
associated with each record. 

We might want to implement an option to enhance zfs recv -n to calculate 
a checksum of each dataset's records as it's reading the stream and then 
verify the checksum when the dataset's END record is seen.   I'm 
looking at integrating a utility which allows the metadata in a stream 
to be dumped for debugging purposes (zstreamdump).   It also verifies 
that the data in the stream agrees with the checksum.
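
As a concrete sketch of that header-level check on a saved stream (names are 
placeholders):

  zfs send tank/data@archive > /backup/data.zfs
  zfs receive -vn tank/verify < /backup/data.zfs   # -n parses and validates record headers, writes nothing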


lori
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Problem with RAID-Z in builds snv_120 - snv_123

2009-09-03 Thread Simon Breden
OK, thanks Adam.
I'll look elsewhere for the mirror checksum error issue. In fact there's 
already a response here, which I shall check up on:
http://opensolaris.org/jive/thread.jspa?messageID=413169#413169

Thanks again, and I look forward to grabbing 124 soon.

Cheers,
Simon
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] [osol-discuss] zfs data loss after crash

2009-09-03 Thread Shawn Walker

[cc'ing zfs-discuss]

Scott Feldstein wrote:

Hi,
I had an opensolaris instance (b118) completely crash the other day 
(entirely my fault) and I took one of my mirrored zfs drives and 
imported into another system in order to retrieve the data.  I imported 
the data via zpool import -f id pool and I just noticed that the 
data is way out of date.  Most of the files look like I have lost 2 
weeks worth of data.


So, my questions are:

1) Is this expected?

2) Is there anyway to get my recent data back?


Someone on zfs-discuss is likely to be able to help you better.

Cheers,
--
Shawn Walker
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Read about ZFS backup - Still confused

2009-09-03 Thread Cork Smith
Let me try rephrasing this. I would like the ability to restore so that my system 
mirrors its state at the time when I backed it up, given that the old hard drive is 
now a doorstop.

Cork
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] snv_110 - snv_121 produces checksum errors on Raid-Z pool

2009-09-03 Thread Simon Breden
Thanks Gaëtan.

What's the bug id for this iommu bug on Intel platforms?

In my case, I have an AMD processor with ECC RAM, so probably not related to 
the Intel iommu bug.

I'm seeing the checksum errors in a mirrored rpool using SSDs so maybe it could 
be something like cosmic rays causing occasional random bits to flip? After 
clearing the errors and scrubbing the pool a couple of times until the errors 
were fixed, I have not seen any new checksum errors, and I'm using 121 at the 
moment, though I should probably drop back to 117 to avoid the RAID-Z bug, 
although I have a RAID-Z2 vdev and not a RAID-Z1 vdev, so I should not 
encounter the more serious problem mentioned.

After the errors reported during the scrub on snv 121, I run a scrub on snv 
118 and find the same
 amount of error, all on rpool/dump. I dropped that zvol, rerun the scrub 
 again still on snv 118 
without any error. After a reboot on snv 121 and a new scrub, no checksum 
error are reported.

You did #zfs destroy rpool/dump ?
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] snv_110 - snv_121 produces checksum errors on Raid-Z pool

2009-09-03 Thread Frank Middleton

It was someone from Sun that recently asked me to repost here
about the checksum problem on mirrored drives. I was reluctant
to do so because you and Bob might start flames again, and you
did! You both sound very defensive, but of course I would never
make an unsubstantiated speculation that you might have vulnerable
hardware :-). But in case you do, please don't shoot the
messenger...

Instead of being negative, how about some conjectures of your
own about this? Here's a summary of what is happening:

An old machine with mirrored drives and a suspect mobo (maybe
not checking PCI parity) gets checksum errors on reboot and scrub.
With copies=1 it fails to repair them. With copies=2 it apparently
fixes them, but zcksummon shows quite clearly that on a scrub,
zfs finds and repairs them again on every scrub, even though
scrub shows no errors. Typically these files are system
libraries and unless you actually replace them, they are
never truly repaired.

Although I really don't think this is caused by cosmic rays,
are you also saying that PCs without ECC on memory and/or buses
will *never* experience a glitch? You obviously don't play the
lottery :-) [ZFS errors due to memory hits seem far more likely
than winning a 6 ball lottery for typical retail consumer loads]

On 09/02/09 06:54 PM, Tim Cook wrote:


Define more systems.  How many people do you think are on 121?  And of


Absolutely no idea. Enough, though.
 

those, how many are on the zfs mailing list?  And of those, how many


Probably - all of them (yes, this is an unsubstantiated speculation).


have done a scrub recently to see the checksum errors?  Do you have some
proof to validate your beliefs?


If you had read the thread carefully, you would note that a scrub actually
clears the errors (but zcksummon shows that they really aren't cleared). And
doesn't the guide tell us to run scrubs frequently? I am sure we all dutifully
do so :-). I'd be quite happy to send you the proof.


REGARDLESS, had you read all the posts to this thread, you'd know you've
already been proven wrong:


Wrong about what? Reading posts before they are posted?

I have read every post most carefully. Having experienced checksum
failures on mirrored drives for 4 months now (and there's a CR
against snv115 for a similar problem), what exactly do you think I
am trying to prove, or what beliefs? After 4 months of hearing the
hardware being blamed for the checksum problem (which is easy to
reproduce against snv111b), all I'm doing is agreeing that it is
likely triggered by some kind of soft hardware glitch, we just
don't know what the glitch might be. The SPoFs on this machine
are the disk controller, the PCI bus, and memory, (and cpu, of
course). Take your pick.

FWIW it always picks on SUNWcsl (libdlpi.so.1) - 3 or 4 times now,
and more recently, /usr/share/doc/SUNWmusicbrainz/COPYING.bz2.
I am skeptical that the disk controller is picking on certain
files, so that leaves memory and the bus. Take your pick. New
files get added to the list quite infrequently. But it could also
be a pure software bug - some kind of race condition, perhaps.


On Wed, Sep 2, 2009 at 11:15 AM, Brent Jones br...@servuhome.net
mailto:br...@servuhome.net wrote:
I see this issue on each of my X4540's, 64GB of ECC memory, 1TB drives.
Rolling back to snv_118 does not reveal any checksum errors, only
snv_121

So, the commodity hardware here doesn't hold up, unless Sun isn't
validating their equipment (not likely, as these servers have had no
hardware issues prior to this build)


Exactly. My whole point. Glad to hear that Sun hardware is as reliable as
ever!  I hope Richard's new and improved zcksummon will shed more light
on this...

Cheers -- Frank
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Read about ZFS backup - Still confused

2009-09-03 Thread Trevor Pretty




Cork

To answer your question, just use tar for everything. It's about the
best we've got. :-(

When the disk turns into a doorstop, re-install OpenSolaris/Solaris and
then tar back all your data. I keep a complete list of EVERY change I
make on any OS (including the Redmond one) so I can re-create the
machine.

And, IMHO (and I know I will get shot at for saying it):

One reason why I would not use ZFS root in a real live production
environment is not having the equivalent of ufsdump/ufsrestore, so I
can do a bare-metal restore. ZFS root works great on my laptop, but I
know lots who still rely on ufsdump to a local tape drive for quick
bare-metal restores. The only good news is that UNIX is much more tidy than
Windows, and there is very little that is not in /home (or /export/home)
that gets changed throughout the OS's life.

Unless somebody knows better...


Cork Smith wrote:

  Let me try rephrasing this. I would like the ability to restore so my system mirrors its state at the time when I backed it up given the old hard drive is now a door stop.

Cork
  

www.eagle.co.nz
This email is confidential and may be legally 
privileged. If received in error please destroy and immediately notify 
us.


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Archiving and Restoring Snapshots

2009-09-03 Thread Tim Cook
On Fri, Sep 4, 2009 at 12:17 AM, Ross myxi...@googlemail.com wrote:

 Hi Richard,

 Actually, reading your reply has made me realise I was overlooking
 something when I talked about tar, star, etc...  How do you backup a ZFS
 volume?  That's something traditional tools can't do.  Are snapshots the
 only way to create a backup or archive of those?

 Personally I'm quite happy with snapshots - we have a ZFS system at work
 that's replicating all of its data to an offsite ZFS store using snapshots.
  Using ZFS as a backup store is something I'm quite happy with, it's just
 storing just a snapshot file that makes me nervous.


The correct answer is NDMP.  Whether Sun will ever add it to OpenSolaris is
another subject entirely, though.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss