Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2010-02-15 Thread Orvar Korvar
Yes, if you value your data you should change from USB drives to normal drives. I have heard that USB does some strange things. A normal connection such as SATA is more reliable.

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2010-02-14 Thread Bruno Damour
Hello, I'm now thinking there is some _real_ bug in the way zfs handles file systems created with the pool itself (i.e. the tank filesystem when the zpool is tank, usually mounted as /tank). My own experience shows that zfs is unable to send/receive recursively (snapshots, child fs) properly when the
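For readers following along, a minimal sketch of the kind of recursive replication being described, assuming a source pool named tank and a target pool named backup (the snapshot name is illustrative only):

   zfs snapshot -r tank@migrate                          # recursive snapshot of the pool root and all child filesystems
   zfs send -R tank@migrate | zfs receive -duF backup    # -R replicates the whole dataset tree, including the top-level tank dataset

Whether the top-level (pool root) filesystem comes across cleanly is exactly the behaviour being questioned here.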

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2010-02-13 Thread Andy Stenger
I had a very similar problem: 8 external USB drives running OpenSolaris native. When I moved the machine into a different room and powered it back up (there were a couple of reboots and a couple of broken USB cables and drive shutdowns in between), I got the same error. Losing that much data is

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2010-02-13 Thread Remco Lengers
I just have to say this, and I don't mean it in a bad way... If you really care about your data, why use USB drives with loose cables and (apparently) no backup? USB-connected drives are okay for data backup, and for playing around and getting to know ZFS they also seem okay. Using it for

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-08-04 Thread Roch Bourbonnais
On 26 Jul 2009, at 01:34, Toby Thain wrote: On 25-Jul-09, at 3:32 PM, Frank Middleton wrote: On 07/25/09 02:50 PM, David Magda wrote: Yes, it can be affected. If the snapshot's data structure / record is underneath the corrupted data in the tree then it won't be able to be reached.

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-08-04 Thread Toby Thain
On 4-Aug-09, at 9:28 AM, Roch Bourbonnais wrote: On 26 Jul 2009, at 01:34, Toby Thain wrote: On 25-Jul-09, at 3:32 PM, Frank Middleton wrote: On 07/25/09 02:50 PM, David Magda wrote: Yes, it can be affected. If the snapshot's data structure / record is underneath the corrupted data

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-08-02 Thread Germano Caronni
Have you considered this? *Maybe* a little time travel to an old uberblock could help you? http://www.opensolaris.org/jive/thread.jspa?threadID=85794
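For anyone curious what "time travel to an old uberblock" actually involves, a hedged sketch of how the on-disk uberblocks can at least be inspected (the device path is illustrative, output formats vary between builds, and this by itself does not roll anything back):

   zdb -l  /dev/rdsk/c0t0d0s0    # print the vdev labels on one of the pool's disks
   zdb -ul /dev/rdsk/c0t0d0s0    # also dump the uberblock array, with txg numbers and timestamps

The thread linked above discusses how an older, still-consistent uberblock might then be used to recover a damaged pool.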

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-29 Thread Richard Elling
On Jul 28, 2009, at 6:34 PM, Eric D. Mudama wrote: On Mon, Jul 27 at 13:50, Richard Elling wrote: On Jul 27, 2009, at 10:27 AM, Eric D. Mudama wrote: Can *someone* please name a single drive+firmware or RAID controller+firmware that ignores FLUSH CACHE / FLUSH CACHE EXT commands? Or worse,

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-28 Thread Ross
I think people can understand the concept of missing flushes. The big conceptual problem is how this manages to hose an entire filesystem, which is assumed to have rather a lot of data which ZFS has already verified to be ok. Hardware ignoring flushes and losing recent data is understandable,

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-28 Thread Rennie Allen
Can *someone* please name a single drive+firmware or RAID controller+firmware that ignores FLUSH CACHE / FLUSH CACHE EXT commands? Or worse, responds ok when the flush hasn't occurred? I think it would be a shorter list if one were to name the drives/controllers that actually implement a

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-28 Thread Rennie Allen
This is also (theoretically) why a drive purchased from Sun is more expensive than a drive purchased from your neighbourhood computer shop: It's more significant than that. Drives aimed at the consumer market are at a competitive disadvantage if they do handle cache flush

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-28 Thread Eric D. Mudama
On Mon, Jul 27 at 13:50, Richard Elling wrote: On Jul 27, 2009, at 10:27 AM, Eric D. Mudama wrote: Can *someone* please name a single drive+firmware or RAID controller+firmware that ignores FLUSH CACHE / FLUSH CACHE EXT commands? Or worse, responds ok when the flush hasn't occurred? two

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-27 Thread Marcelo Leal
That's only one element of it Bob. ZFS also needs devices to fail quickly and in a predictable manner. A consumer grade hard disk could lock up your entire pool as it fails. The kit Sun supply is more likely to fail in a manner ZFS can cope with. I agree 100%. Hardware, firmware,

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-27 Thread Ross
Heh, I'd kill for failures to be handled in 2 or 3 seconds. I saw the failure of a mirrored iSCSI disk lock the entire pool for 3 minutes. That has been addressed now, but device hangs have the potential to be *very* disruptive.
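As an aside, how the pool as a whole reacts to a failed device is partly tunable; a minimal sketch using the failmode pool property (present on reasonably recent pool versions; note it only changes how I/O is handled once a failure is detected, it does not make a hung disk time out faster):

   zpool get failmode tank            # default is 'wait': all I/O blocks until the device comes back
   zpool set failmode=continue tank   # return EIO to new writes instead of blocking the whole pool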

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-27 Thread Eric D. Mudama
On Sun, Jul 26 at 1:47, David Magda wrote: On Jul 25, 2009, at 16:30, Carson Gaspar wrote: Frank Middleton wrote: Doesn't this mean /any/ hardware might have this problem, albeit with much lower probability? No. You'll lose unwritten data, but won't corrupt the pool, because the on-disk

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-27 Thread Thomas Burgess
I was under the impression it was VirtualBox and its default setting that ignored the command, not the hard drive. On Mon, Jul 27, 2009 at 1:27 PM, Eric D. Mudama edmud...@bounceswoosh.org wrote: On Sun, Jul 26 at 1:47, David Magda wrote: On Jul 25, 2009, at 16:30, Carson Gaspar wrote:

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-27 Thread Chris Ridd
On 27 Jul 2009, at 18:49, Thomas Burgess wrote: I was under the impression it was VirtualBox and its default setting that ignored the command, not the hard drive. Do other virtualization products (e.g. VMware, Parallels, Virtual PC) have the same default behaviour as VirtualBox? I've a

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-27 Thread Adam Sherman
On 27-Jul-09, at 13:54, Chris Ridd wrote: I was under the impression it was VirtualBox and its default setting that ignored the command, not the hard drive. Do other virtualization products (e.g. VMware, Parallels, Virtual PC) have the same default behaviour as VirtualBox? I've a suspicion

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-27 Thread Mike Gerdts
On Mon, Jul 27, 2009 at 12:54 PM, Chris Ridd chrisr...@mac.com wrote: On 27 Jul 2009, at 18:49, Thomas Burgess wrote: I was under the impression it was VirtualBox and its default setting that ignored the command, not the hard drive. Do other virtualization products (e.g. VMware, Parallels,

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-27 Thread David Magda
On Mon, July 27, 2009 13:59, Adam Sherman wrote: Also, I think it may have already been posted, but I haven't found the option to disable VirtualBox's disk cache. Anyone have the incantation handy? http://forums.virtualbox.org/viewtopic.php?f=8&t=13661&start=0 It tells VB not to ignore the
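For those who just want the incantation: as far as I know the setting in question is the per-disk IgnoreFlush flag, set through VBoxManage; a hedged example for the first disk on the emulated IDE controller (the VM name here is made up, and the controller/LUN path depends on how the VM is configured):

   VBoxManage setextradata "MyOpenSolarisVM" \
     "VBoxInternal/Devices/piix3ide/0/LUN#0/Config/IgnoreFlush" 0   # 0 = honour the guest's flush requests

Setting the value back to 1 restores the default behaviour of ignoring flushes.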

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-27 Thread Frank Middleton
On 07/27/09 01:27 PM, Eric D. Mudama wrote: Everyone on this list seems to blame lying hardware for ignoring commands, but disks are relatively mature and I can't believe that major OEMs would qualify disks or other hardware that willingly ignore commands. You are absolutely correct, but if

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-27 Thread Richard Elling
On Jul 27, 2009, at 10:27 AM, Eric D. Mudama wrote: On Sun, Jul 26 at 1:47, David Magda wrote: On Jul 25, 2009, at 16:30, Carson Gaspar wrote: Frank Middleton wrote: Doesn't this mean /any/ hardware might have this problem, albeit with much lower probability? No. You'll lose

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-27 Thread Adam Sherman
On 27-Jul-09, at 15:14, David Magda wrote: Also, I think it may have already been posted, but I haven't found the option to disable VirtualBox's disk cache. Anyone have the incantation handy? http://forums.virtualbox.org/viewtopic.php?f=8&t=13661&start=0 It tells VB not to ignore the

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-27 Thread Nigel Smith
David Magda wrote: This is also (theoretically) why a drive purchased from Sun is more expensive than a drive purchased from your neighbourhood computer shop: Sun (and presumably other manufacturers) takes the time and effort to test things to make sure that when a drive says I've

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-27 Thread Toby Thain
On 27-Jul-09, at 3:44 PM, Frank Middleton wrote: On 07/27/09 01:27 PM, Eric D. Mudama wrote: Everyone on this list seems to blame lying hardware for ignoring commands, but disks are relatively mature and I can't believe that major OEMs would qualify disks or other hardware that willingly

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-26 Thread David Magda
On Jul 25, 2009, at 16:30, Carson Gaspar wrote: Frank Middleton wrote: Doesn't this mean /any/ hardware might have this problem, albeit with much lower probability? No. You'll lose unwritten data, but won't corrupt the pool, because the on-disk state will be sane, as long as your iSCSI

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-26 Thread David Magda
On Jul 25, 2009, at 15:32, Frank Middleton wrote: Can you comment on if/how mirroring or raidz mitigates this, or tree corruption in general? I have yet to lose a pool even on a machine with fairly pathological problems, but it is mirrored (and copies=2). Presumably at least one of the drives

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-26 Thread Frank Middleton
On 07/25/09 04:30 PM, Carson Gaspar wrote: No. You'll lose unwritten data, but won't corrupt the pool, because the on-disk state will be sane, as long as your iSCSI stack doesn't lie about data commits or ignore cache flush commands. Why is this so difficult for people to understand? Let me

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-26 Thread Bob Friesenhahn
On Sun, 26 Jul 2009, David Magda wrote: That's the whole point of this thread: what should happen, or what should the file system do, when the drive (real or virtual) lies about the syncing? It's just as much a problem with any other POSIX file system (which have to deal with fsync(2))--ZFS

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-26 Thread Toby Thain
On 26-Jul-09, at 11:08 AM, Frank Middleton wrote: On 07/25/09 04:30 PM, Carson Gaspar wrote: No. You'll lose unwritten data, but won't corrupt the pool, because the on-disk state will be sane, as long as your iSCSI stack doesn't lie about data commits or ignore cache flush commands. Why is

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-25 Thread roland
Running this kind of setup absolutely can give you NO guarantees at all. Virtualisation, OSOL/zfs on WinXP. It's nice to play with and see it working but would I TRUST precious data to it? No way! Why not? If I write some data through the virtualization layer which goes straight through to the raw disk -

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-25 Thread Bob Friesenhahn
On Sat, 25 Jul 2009, roland wrote: When that happens, ZFS believes the data is safely written, but a power cut or crash can cause severe problems with the pool. didn't I read a million times that zfs ensures an always consistent state and is self healing, too? so, if new blocks are always

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-25 Thread roland
As soon as you have more than one disk in the equation, it is vital that the disks commit their data when requested, since otherwise the data on disk will not be in a consistent state. ok, but doesn't that refer only to the most recent data? why can I lose a whole 10TB pool including all the

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-25 Thread David Magda
On Jul 25, 2009, at 12:24, roland wrote: why can I lose a whole 10TB pool including all the snapshots with the logging/transactional nature of zfs? Because ZFS does not (yet) have an (easy) way to go back to a previous state. That's what this bug is about: need a way to rollback to an

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-25 Thread roland
thanks for the explanation! one more question: there are situations where the disks doing strange things (like lying) have caused the ZFS data structures to become wonky. The 'broken' data structure will cause all branches underneath it to be lost--and if it's near the top of the tree, it

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-25 Thread David Magda
On Jul 25, 2009, at 14:17, roland wrote: thanks for the explanation! one more question: there are situations where the disks doing strange things (like lying) have caused the ZFS data structures to become wonky. The 'broken' data structure will cause all branches underneath it to be

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-25 Thread Frank Middleton
On 07/25/09 02:50 PM, David Magda wrote: Yes, it can be affected. If the snapshot's data structure / record is underneath the corrupted data in the tree then it won't be able to be reached. Can you comment on if/how mirroring or raidz mitigates this, or tree corruption in general? I have yet
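For reference, a minimal sketch of the kind of setup being described, with redundancy at both the vdev and the dataset level (pool, disk, and dataset names are illustrative):

   zpool create tank mirror c1t0d0 c1t1d0   # two-way mirror: every block lives on both disks
   zfs set copies=2 tank/important          # additionally keep two copies of each block within this dataset

Neither layer protects against corruption introduced before the data reaches the pool, which is the tree-corruption case being asked about.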

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-25 Thread Carson Gaspar
Frank Middleton wrote: Finally, a number of posters blamed VB for ignoring a flush, but according to the evil tuning guide, without any application syncs, ZFS may wait up to 5 seconds before issuing a synch, and there must be all kinds of failure modes even on bare hardware where it never gets a
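For what it's worth, the "up to 5 seconds" appears to refer to the transaction group sync interval; a hedged sketch of how it could be inspected or pinned on OpenSolaris builds of that era (the default has varied between releases, roughly 5 to 30 seconds, so treat the number as illustrative):

   echo "zfs_txg_timeout/D" | mdb -k    # read the current txg timeout, in seconds, from the live kernel
   # or persistently, via /etc/system:
   # set zfs:zfs_txg_timeout = 5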

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-25 Thread Toby Thain
On 25-Jul-09, at 3:32 PM, Frank Middleton wrote: On 07/25/09 02:50 PM, David Magda wrote: Yes, it can be affected. If the snapshot's data structure / record is underneath the corrupted data in the tree then it won't be able to be reached. Can you comment on if/how mirroring or raidz

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-23 Thread Frank Middleton
On 07/21/09 01:21 PM, Richard Elling wrote: I never win the lottery either :-) Let's see. Your chance of winning a 49 ball lottery is apparently around 1 in 14*10^6, although it's much better than that because of submatches (smaller payoffs for matches on less than 6 balls). There are about

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-22 Thread Russel
Thanks for the feedback George. I hope we get the tools soon. At home I have now blown the ZFS pool away and am creating a HW RAID-5 set :-( Hopefully in the future when the tools are there I will return to ZFS. To All: The ECC discussion was very interesting as I had never considered it that

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-22 Thread Alexander Skwar
Hi. Good to know! But how do we deal with that on older systems, which don't have the patch applied, once it is out? Thanks, Alexander On Tuesday, July 21, 2009, George Wilson george.wil...@sun.com wrote: Russel wrote: OK. So do we have a zpool import --xtg 56574 mypoolname or help to do

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-22 Thread George Wilson
Once these bits are available in OpenSolaris, users will be able to upgrade rather easily. This would allow you to take a liveCD running these bits and recover older pools. Do you currently have a pool which needs recovery? Thanks, George Alexander Skwar wrote: Hi. Good to know! But

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-22 Thread Mario Goebbels
To All: The ECC discussion was very interesting as I had never considered it that way! I will be buying ECC memory for my home machine!! You have to make sure your mainboard, chipset and/or CPU support it, otherwise any ECC modules will just work like regular modules. The mainboard needs

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-21 Thread Ross
My understanding of the root cause of these issues is that the vast majority are happening with consumer grade hardware that is reporting to ZFS that writes have succeeded, when in fact they are still in the cache. When that happens, ZFS believes the data is safely written, but a power cut or

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-21 Thread George Wilson
Russel wrote: OK. So do we have a zpool import --xtg 56574 mypoolname or help to do it (script?) Russel We are working on the pool rollback mechanism and hope to have that soon. The ZFS team recognizes that not all hardware is created equal and thus the need for this mechanism. We are

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-21 Thread Richard Elling
On Jul 20, 2009, at 12:48 PM, Frank Middleton wrote: On 07/19/09 06:10 PM, Richard Elling wrote: Not that bad. Uncommitted ZFS data in memory does not tend to live that long. Writes are generally out to media in 30 seconds. Yes, but memory hits are instantaneous. On a reasonably busy system

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-20 Thread Russel
Well I did have a UPS on the machine :-) but the machine hung and I had to power it off... (yep, it was virtual, but that happens on direct HW too, and virtualisation is the happening thing at Sun and elsewhere! I have a version of the data backed up, but it will take ages (10 days) to restore).

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-20 Thread Rob Logan
the machine hung and I had to power it off. Kinda getting off the zpool import --tgx -3 request, but hangs are exceptionally rare and usually a RAM or other hardware issue; Solaris usually abends on software faults. r...@pdm # uptime 9:33am up 1116 day(s), 21:12, 1 user, load average:

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-20 Thread Russel
OK. So do we have a zpool import --xtg 56574 mypoolname, or help to do it (script?) Russel
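The rollback mechanism discussed later in this thread eventually shipped as an import-time rewind rather than an explicit txg flag; a hedged sketch of its usual spelling on builds that have it (the -X and -T forms are undocumented on some releases, the txg number below simply reuses the one from the question, and none of these are guaranteed to succeed):

   zpool import -F mypoolname         # discard the last few transactions and try an earlier consistent txg
   zpool import -FX mypoolname        # extreme rewind: search further back, potentially losing more recent data
   zpool import -T 56574 mypoolname   # attempt import at a specific txg, roughly what --xtg asked for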

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-20 Thread Frank Middleton
On 07/19/09 06:10 PM, Richard Elling wrote: Not that bad. Uncommitted ZFS data in memory does not tend to live that long. Writes are generally out to media in 30 seconds. Yes, but memory hits are instantaneous. On a reasonably busy system there may be buffers in queue all the time. You may

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-19 Thread Brent Jones
On Sat, Jul 18, 2009 at 7:39 PM, Russel no-re...@opensolaris.org wrote: Yes, you'll find my name all over VB at the moment, but I have found it to be stable (don't install the addons disk for Solaris!!, use 3.0.2, and for me WinXP 32-bit and OpenSolaris 2009.6 have been rock solid, it was

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-19 Thread Markus Kovero
bj == Brent Jones br...@servuhome.net writes: bj many levels of fail here, pft. Virtualbox isn't unstable in any of my experience. It doesn't

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-19 Thread dick hoogendijk
On Sun, 19 Jul 2009 00:00:06 -0700 Brent Jones br...@servuhome.net wrote: No offense, but you trusted 10TB of important data, running in OpenSolaris from inside Virtualbox (not stable) on top of Windows XP (arguably not stable, especially for production) on probably consumer grade hardware

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-19 Thread Ross
While I agree with Brent, I think this is something that should be stressed in the ZFS documentation. Those of us with long term experience of ZFS know that it's really designed to work with hardware meeting quite specific requirements. Unfortunately, that isn't documented anywhere, and more

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-19 Thread dick hoogendijk
On Sun, 19 Jul 2009 01:48:40 PDT, Ross no-re...@opensolaris.org wrote: As far as I can see, the ZFS Administrator Guide is sorely lacking in any warning that you are risking data loss if you run on consumer grade hardware. And yet, ZFS is not only for NON-consumer grade hardware, is it? the

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-19 Thread Russel
Guys, guys, please chill... First, thanks for the info about the VirtualBox option to bypass the cache (I don't suppose you can give me a reference for that info? I'll search the VB site :-)), as this was not clear to me. I use VB like others use VMware etc. to run Solaris because it's the ONLY way I can,

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-19 Thread Ross
From the experience myself and others have had, and Sun's approach with their Amber Road storage (FISHWORKS - fully integrated *hardware* and software), my feeling is very much that ZFS was designed by Sun to run on Sun's own hardware, and as such, they were able to make certain assumptions

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-19 Thread Ross
Heh, yes, I assumed similar things Russel. I also assumed that a faulty disk in a raid-z set wouldn't hang my entire pool indefinitely, that hot plugging a drive wouldn't reboot Solaris, and that my pool would continue working after I disconnected one half of an iscsi mirror. I also like

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-19 Thread Bob Friesenhahn
On Sun, 19 Jul 2009, Ross wrote: The success of any ZFS implementation is *very* dependent on the hardware you choose to run it on. To clarify: The success of any filesystem implementation is *very* dependent on the hardware you choose to run it on. ZFS requires that the hardware cache

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-19 Thread Toby Thain
On 19-Jul-09, at 7:12 AM, Russel wrote: Guys guys please chill... First thanks to the info about virtualbox option to bypass the cache (I don't suppose you can give me a reference for that info? (I'll search the VB site :-)) I posted about that insane default, six months ago. Obviously ZFS

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-19 Thread Ross
That's only one element of it Bob. ZFS also needs devices to fail quickly and in a predictable manner. A consumer grade hard disk could lock up your entire pool as it fails. The kit Sun supply is more likely to fail in a manner ZFS can cope with.

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-19 Thread Frank Middleton
On 07/19/09 05:00 AM, dick hoogendijk wrote: (i.e. non ECC memory should work fine!) / mirroring is a -must- ! Yes, mirroring is a must, although it doesn't help much if you have memory errors (see several other threads on this topic):

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-19 Thread Bob Friesenhahn
On Sun, 19 Jul 2009, Frank Middleton wrote: Yes, mirroring is a must, although it doesn't help much if you have memory errors (see several other threads on this topic): http://en.wikipedia.org/wiki/Dynamic_random_access_memory#Errors_and_error_correction Tests [ecc] give widely varying error

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-19 Thread Miles Nordin
r == Ross no-re...@opensolaris.org writes: tt == Toby Thain t...@telegraphics.com.au writes: r ZFS was never designed to run on consumer hardware, this is markedroid garbage, as well as post-facto apologetics. Don't lower the bar. Don't blame the victim. tt I posted about that

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-19 Thread Richard Elling
Frank Middleton wrote: On 07/19/09 05:00 AM, dick hoogendijk wrote: (i.e. non ECC memory should work fine!) / mirroring is a -must- ! Yes, mirroring is a must, although it doesn't help much if you have memory errors (see several other threads on this topic):

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-19 Thread Bob Friesenhahn
On Sun, 19 Jul 2009, Miles Nordin wrote: r == Ross no-re...@opensolaris.org writes: tt == Toby Thain t...@telegraphics.com.au writes: r ZFS was never designed to run on consumer hardware, this is markedroid garbage, as well as post-facto apologetics. Don't lower the bar. Don't blame

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-19 Thread Gavin Maltby
dick hoogendijk wrote: true. Furthermore, much so-called consumer hardware is very good these days. My guess is ZFS should work quite reliably on that hardware. (i.e. non ECC memory should work fine!) / mirroring is a -must- ! No, ECC memory is a must too. ZFS checksumming verifies and

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-19 Thread David Magda
On Jul 19, 2009, at 20:13, Gavin Maltby wrote: No, ECC memory is a must too. ZFS checksumming verifies and corrects data read back from a disk, but once it is read from disk it is stashed in memory for your application to use - without ECC you erode confidence that what you read from

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-19 Thread Bob Friesenhahn
On Sun, 19 Jul 2009, David Magda wrote: Right, because once (say) Apple incorporates ZFS into Mac OS X they'll also start shipping MacBooks and iMacs with ECC. If it's so necessary we might as well have any kernel that has ZFS in it only allow 'zpool create' to be run if the kernel detects

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-19 Thread Gavin Maltby
Hi, David Magda wrote: On Jul 19, 2009, at 20:13, Gavin Maltby wrote: No, ECC memory is a must too. ZFS checksumming verifies and corrects data read back from a disk, but once it is read from disk it is stashed in memory for your application to use - without ECC you erode confidence that

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-19 Thread Richard Elling
Gavin Maltby wrote: Hi, David Magda wrote: On Jul 19, 2009, at 20:13, Gavin Maltby wrote: No, ECC memory is a must too. ZFS checksumming verifies and corrects data read back from a disk, but once it is read from disk it is stashed in memory for your application to use - without ECC you

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-19 Thread Andre van Eyssen
On Sun, 19 Jul 2009, Richard Elling wrote: I do, even though I have a small business. Neither InDesign nor Illustrator will be ported to Linux or OpenSolaris in my lifetime... besides, iTunes rocks and it is the best iPhone developer's environment on the planet. Richard, I think the point

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-18 Thread Orvar Korvar
Sorry to hear that, but you do know that VirtualBox is not really stable? VirtualBox does show some instability from time to time. You haven't read the VirtualBox forums? I would advise against VirtualBox for saving all your data in ZFS. I would use OpenSolaris without virtualization. I hope

Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-18 Thread Russel
Yes, you'll find my name all over VB at the moment, but I have found it to be stable (don't install the addons disk for Solaris!!, use 3.0.2, and for me WinXP 32-bit and OpenSolaris 2009.6 have been rock solid; it was (or seems to be) OpenSolaris that failed with extract_boot_list doesn't belong to 101,