Re: [zfs-discuss] Another user loses his pool (10TB) in this case and 40
> wow, talk about a knee jerk reaction...

Not at all. A long thread is started where the user lost his pool, and the discussion shows it's a known problem. I love ZFS, and I'm still very nervous about the risk of losing an entire pool.

> As has been described many times over the past few
> years, there is a manual procedure.

Yes, but there are a few issues with this:

1. The OP doesn't seem to have been able to get anybody to help him recover his pool. The natural assumption reading a thread like this is that ZFS pool corruption happens, and you lose your data.
2. While the procedure may have been mentioned, I've never seen a link to official documentation on it.
3. My understanding from reading Victor's threads (although I may be wrong) is that this recovery takes a significant amount of time.

> You probably won't lose all of your data. Statistically speaking, there
> are very few people who have seen this. There are many more cases
> where ZFS detected and repaired corruption.

Yes, but statistics don't matter when emotions come into play, and I'm afraid something like this is going to scare off a lot of people who read about it. It might be rare, but people don't think like that. Why do you think so many play the lottery ;-)

The other point is that system admins like to have control over their own data. It's their job on the line if things go wrong, and if they see a major problem like this without an obvious solution, and with very little control over it if it happens, they're going to get very nervous about implementing it.

From a psychological point of view, this issue is very damaging to ZFS. On the flip side, once the recovery tool is available, this will turn into a good positive for ZFS. I don't believe I've heard of any other bug that causes complete loss of the pool, so with a recovery tool, ZFS should have an enviable ability to safeguard data.
Re: [zfs-discuss] How Virtual Box handles the IO
On Fri, Jul 31, 2009 at 7:58 PM, Frank Middleton wrote:
> Has anyone ever actually lost a pool on Sun hardware other than
> by losing too many replicas or operator error? As you have so [...]

Yes, I have lost a pool when running on Sun hardware.

http://mail.opensolaris.org/pipermail/zfs-discuss/2007-September/013233.html

Quite likely related to:

http://bugs.opensolaris.org/view_bug.do?bug_id=6684721

In other words, it was a buggy Sun component that didn't do the right thing with cache flushes.

--
Mike Gerdts
http://mgerdts.blogspot.com/
Re: [zfs-discuss] Another user loses his pool (10TB) in this case and 40
On 2009-07-31 17:00:54, Jason A. Hoffman wrote:
| I have thousands and thousands and thousands of zpools. I started
| collecting such zpools back in 2005. None have been lost.

I don't have thousands and thousands of zpools, but I do have more than would fit in a breadbox. And bigger, too.

ZFS: Verifying, cuddling and wrangling my employer's business-critical data since 2007.

(No bits were harmed in the production of this storage network.)
(No, really. We validated their checksums.)

--
bda
cyberpunk is dead. long live cyberpunk.
Re: [zfs-discuss] How Virtual Box handles the IO
> I understand that the ZILs are allocated out of the general pool.

There is one intent log chain per dataset (file system or zvol). The head of each log chain is kept in the main pool. Without slog(s) we allocate (and chain) blocks from the main pool. If separate intent log(s) exist then blocks are allocated and chained there. If we fail to allocate from the slog(s) then we revert to allocating from the main pool.

> Is there a ZIL for the ZILs, or does this make no sense?

There is no ZIL for the ZILs. Note the ZIL is not a journal (like ext3 or ufs logging). It simply contains records of system calls (including data) that need to be replayed if the system crashes and those records have not been committed in a transaction group.

Hope that helps:

Neil.
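For anyone who wants to see the slog case in practice, a separate intent log is attached with "zpool add"; a minimal sketch (the pool and device names here are invented):

  # attach a dedicated slog device to the pool "tank"
  zpool add tank log c4t2d0

  # the log vdev then shows up in the pool layout
  zpool status tank

If the slog later fails or fills, allocation falls back to the main pool as Neil describes.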
Re: [zfs-discuss] Lundman home NAS
Finding a SATA card that would work with Solaris, and be hot-swap, and have more than 4 ports, sure took a while. Oh, and be reasonably priced ;) Double the price of the dual-core Atom did not seem right.

The SATA card was a close fit to the jumper where the power-switch cable attaches, as you can see in one of the photos. This is because the MV8 card is quite long, and has the big plastic SATA sockets. It does fit, but it was the tightest spot.

I also picked the 5-in-3 drive cage that had the "shortest" depth listed, 190mm. For example, the Supermicro M35T is 245mm, another 5cm. Not sure that would fit.

Lund

Nathan Fiedler wrote:
> Yes, please write more about this. The photos are terrific and I appreciate the many useful observations you've made. For my home NAS I chose the Chenbro ES34069, and the biggest problem was finding a SATA/PCI card that would work with OpenSolaris and fit in the case (technically impossible without a ribbon cable PCI adapter). After seeing this, I may reconsider my choice.
>
> For the SATA card, you mentioned that it was a close fit with the case power switch. Would removing the backplane on the card have helped?
>
> n
>
> On Fri, Jul 31, 2009 at 5:22 AM, Jorgen Lundman wrote:
>> I have assembled my home RAID finally, and I think it looks rather good.
>>
>> http://www.lundman.net/gallery/v/lraid5/p1150547.jpg.html
>>
>> Feedback is welcome. I have yet to do proper speed tests, I will do so in the coming week should people be interested.
>>
>> Even though I have tried to use only existing, and cheap, parts the end sum became higher than I expected. Final price is somewhere in the 47,000 yen range. (Without hard disks)
>>
>> If I were to make and sell these, they would be 57,000 or so, so I do not really know if anyone would be interested. Especially since SOHO NAS devices seem to start around 80,000.
>>
>> Anyway, sure has been fun.
>>
>> Lund

--
Jorgen Lundman |
Unix Administrator | +81 (0)3-5456-2687 ext 1017 (work)
Shibuya-ku, Tokyo | +81 (0)90-5578-8500 (cell)
Japan | +81 (0)3-3375-1767 (home)
Re: [zfs-discuss] Zfs deduplication
I don't think Sun is at liberty to discuss ZFS deduplication at this point in time:

http://www.itworld.com/storage/71307/sun-tussles-de-duplication-startup

Hopefully, the matter is resolved and discussions can proceed openly.

"Send lawyers, guns and money." - Warren Zevon
Re: [zfs-discuss] Another user loses his pool (10TB) in this case and 40
On 31-Jul-09, at 20:00, Jason A. Hoffman wrote:
> I have thousands and thousands and thousands of zpools. I started collecting such zpools back in 2005. None have been lost.
>
> Best regards, Jason
>
> Jason A. Hoffman, PhD | Founder, CTO, Joyent Inc.

I believe I have about a TB of data on at least one of Jason's pools and it seems to still be around. ;)

A.

--
Adam Sherman
CTO, Versature Corp.
Tel: +1.877.498.3772 x113
[zfs-discuss] I Still Have My Data
My test setup of 8 x 2G virtual disks under Virtual Box on top of Mac OS X is running nicely! I haven't lost a *single* byte of data. ;)

A.

--
Adam Sherman
CTO, Versature Corp.
Tel: +1.877.498.3772 x113
Re: [zfs-discuss] How Virtual Box handles the IO
Great to hear a few success stories! We have been experimentally running ZFS on really crappy hardware and it has never lost a pool. Running on VB with ZFS/iSCSI raw disks we have yet to see any errors at all. On sun4u with LSI sas/sata it is really rock solid. And we've been going out of our way to break it, because of bad experiences with NTFS, ext2 and UFS, as well as many disk failures (ever had fsck run amok?).

On 07/31/09 12:11 PM, Richard Elling wrote:
> Making flush be a nop destroys the ability to check for errors thus breaking the trust between ZFS and the data on medium.
> -- richard

Can you comment on the issue that the underlying disks were, as far as we know, never powered down? My understanding is that disks usually try to flush their caches as quickly as possible to make room for more data, so in this scenario things were probably quiet after the guest crash, and whatever was in the cache would likely have been flushed anyway, certainly by the time the OP restarted VB and the guest.

Could you also comment on CR 6667683, which I believe is proposed as a solution for recovery in this very rare case?

I understand that the ZILs are allocated out of the general pool. Is there a ZIL for the ZILs, or does this make no sense?

As the one who started the whole ECC discussion, I don't think anyone has ever claimed that lack of ECC caused this loss of a pool, or that it could. AFAIK lack of ECC can't be a problem at all on RAIDZ vdevs, only with single drives or plain mirrors. I've suggested an RFE for the mirrored case to double-buffer the writes, but disabling checksums pretty much fixes the problem if you don't have ECC, so it isn't worth pursuing. You can disable checksums per file system, so this is an elegant solution if you don't have ECC memory but you do mirror. No mirror IMO is suicidal with any file system.

Has anyone ever actually lost a pool on Sun hardware other than by losing too many replicas or operator error? As you have so eloquently pointed out, building a reliable storage system is an engineering problem. There are a lot of folks out there who are very happy with ZFS on decent hardware. On crappy hardware you get what you pay for...

Cheers -- Frank (happy ZFS evangelist)
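For reference, the per-file-system switch Frank mentions looks like this (dataset name invented; keep in mind that turning checksums off forfeits ZFS's corruption detection, so it is generally discouraged):

  zfs set checksum=off tank/scratch
  zfs get checksum tank/scratch

Because checksum is a per-dataset property, the rest of the pool keeps checksumming as usual.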
Re: [zfs-discuss] Another user loses his pool (10TB) in this case and 40
Brian wrote:
> I must say this thread has also damaged the view I have of ZFS. I've been considering just getting a RAID-5 controller and going the Linux route I had planned on.

That'll be your loss. I've never managed to lose a pool, and I've got all sorts of unreliable media and all sorts of nasty ways to break them!

Whatever you choose, don't forget to back up your data.

--
Ian.
Re: [zfs-discuss] Another user loses his pool (10TB) in this case and 40
On 31-Jul-09, at 7:15 PM, Richard Elling wrote:
> wow, talk about a knee jerk reaction...
>
> On Jul 31, 2009, at 3:23 PM, Dave Stubbs wrote:
>>> I don't mean to be offensive Russel, but if you do ever return to ZFS, please promise me that you will never, ever, EVER run it virtualized on top of NTFS (a.k.a. worst file system ever) in a production environment. Microsoft Windows is a horribly unreliable operating system in situations where things like protecting against data corruption are important. Microsoft knows this
>>
>> Oh WOW! Whether or not our friend Russel virtualized on top of NTFS (he didn't - he used raw disk access) this point is amazing!

This point doesn't matter. VB sits between the guest OS and the raw disk and drops cache flush requests.

>> System5 - based on this thread I'd say you can't really make this claim at all. Solaris suffered a crash and the ZFS filesystem lost EVERYTHING! And there aren't even any recovery tools?
>
> As has been described many times over the past few years, there is a manual procedure.
>
>> HANG YOUR HEADS!!! Recovery from the same situation is EASY on NTFS. There are piles of tools out there that will recover the file system, and failing that, locate and extract data. The key parts of the file system are stored in multiple locations on the disk just in case. It's been this way for over 10 years.
>
> ZFS also has redundant metadata written at different places on the disk. ZFS, like NTFS, issues cache flush requests with the expectation that the disk honors that request.

Can anyone name a widely used transactional or journaled filesystem or RDBMS that *doesn't* need working barriers?

>> I'd say it seems from this thread that my data is a lot safer on NTFS than it is on ZFS!
>
> Nope. NTFS doesn't know when data is corrupted. Until it does, it is blissfully ignorant.

People still choose systems that don't even know which side of a mirror is good. Do they ever wonder what happens when you turn off a busy RAID-1? Or why checksumming and COW make a difference?

This thread hasn't shaken my preference for ZFS at all; just about everything else out there relies on nothing more than dumb luck to maintain integrity.

--Toby

>> I can't believe my eyes as I read all these responses blaming system engineering and hiding behind ECC memory excuses and "well, you know, ZFS is intended for more Professional systems and not consumer devices, etc etc." My goodness! You DO realize that Sun has this website called opensolaris.org which actually proposes to have people use ZFS on commodity hardware, don't you? I don't see a huge warning on that site saying "ATTENTION: YOU PROBABLY WILL LOSE ALL YOUR DATA".
>
> You probably won't lose all of your data. Statistically speaking, there are very few people who have seen this. There are many more cases where ZFS detected and repaired corruption.
>
> ...
> -- richard
Re: [zfs-discuss] Another user loses his pool (10TB) in this case and 40
On Jul 31, 2009, at 20:00, Jason A. Hoffman wrote:
> On Jul 31, 2009, at 4:54 PM, Bob Friesenhahn wrote:
>> On Fri, 31 Jul 2009, Brian wrote:
>>> I must say this thread has also damaged the view I have of ZFS. I've been considering just getting a RAID-5 controller and going the Linux route I had planned on.
>>
>> Thankfully, the zfs users who have never lost a pool do not spend much time posting about their excitement at never losing a pool. Otherwise this list would be even more overwhelming. I have not yet lost a pool, and this includes the one built on USB drives which might be ignoring cache sync requests.
>
> I have thousands and thousands and thousands of zpools. I started collecting such zpools back in 2005. None have been lost.

Also a reminder that on-disk redundancy (RAID-5, 6, Z, etc.) is no substitute for backups. Your controller (or software RAID) can hose data in many circumstances as well. CERN's study revealed a bug in the WD disk firmware (fixed in a later version), interacting with their 3Ware controllers, that caused 80% of the errors they experienced.
Re: [zfs-discuss] Another user loses his pool (10TB) in this case and 40
On Jul 31, 2009, at 4:54 PM, Bob Friesenhahn wrote:
> On Fri, 31 Jul 2009, Brian wrote:
>> I must say this thread has also damaged the view I have of ZFS. I've been considering just getting a RAID-5 controller and going the Linux route I had planned on.
>
> Thankfully, the zfs users who have never lost a pool do not spend much time posting about their excitement at never losing a pool. Otherwise this list would be even more overwhelming. I have not yet lost a pool, and this includes the one built on USB drives which might be ignoring cache sync requests.

I have thousands and thousands and thousands of zpools. I started collecting such zpools back in 2005. None have been lost.

Best regards, Jason

Jason A. Hoffman, PhD | Founder, CTO, Joyent Inc.
ja...@joyent.com
http://joyent.com/
mobile: +1-415-279-6196
Re: [zfs-discuss] Another user loses his pool (10TB) in this case and 40
On Fri, 31 Jul 2009, Brian wrote:
> I must say this thread has also damaged the view I have of ZFS. I've been considering just getting a RAID-5 controller and going the Linux route I had planned on.

Thankfully, the zfs users who have never lost a pool do not spend much time posting about their excitement at never losing a pool. Otherwise this list would be even more overwhelming.

I have not yet lost a pool, and this includes the one built on USB drives which might be ignoring cache sync requests.

Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
Re: [zfs-discuss] Another user loses his pool (10TB) in this case and 40
On Jul 31, 2009, at 19:26, Brian wrote:
> I must say this thread has also damaged the view I have of ZFS. I've been considering just getting a RAID-5 controller and going the Linux route I had planned on.

It's your data, and you are responsible for it. So this thread, if nothing else, allows you to make an informed decision.

I think that where most other file systems don't detect, or simply ignore, the corner cases that have always existed (cf. CERN's data integrity study), ZFS brings them to light. To some extent it's a matter of updating the available tools so that ZFS can recover from some of these cases in a more graceful fashion.

It should also be noted, though, that nobody notices when things go right. :) There are people who have been running ZFS on humongous pools for a while. It's just that we always have the worst-case scenarios showing up on the list. :)
Re: [zfs-discuss] Another user loses his pool (10TB) in this case and 40
I must say this thread has also damaged the view I have of ZFS. I've been considering just getting a RAID-5 controller and going the Linux route I had planned on.
Re: [zfs-discuss] Another user loses his pool (10TB) in this case and 40
wow, talk about a knee jerk reaction...

On Jul 31, 2009, at 3:23 PM, Dave Stubbs wrote:
>> I don't mean to be offensive Russel, but if you do ever return to ZFS, please promise me that you will never, ever, EVER run it virtualized on top of NTFS (a.k.a. worst file system ever) in a production environment. Microsoft Windows is a horribly unreliable operating system in situations where things like protecting against data corruption are important. Microsoft knows this
>
> Oh WOW! Whether or not our friend Russel virtualized on top of NTFS (he didn't - he used raw disk access) this point is amazing!
>
> System5 - based on this thread I'd say you can't really make this claim at all. Solaris suffered a crash and the ZFS filesystem lost EVERYTHING! And there aren't even any recovery tools?

As has been described many times over the past few years, there is a manual procedure.

> HANG YOUR HEADS!!! Recovery from the same situation is EASY on NTFS. There are piles of tools out there that will recover the file system, and failing that, locate and extract data. The key parts of the file system are stored in multiple locations on the disk just in case. It's been this way for over 10 years.

ZFS also has redundant metadata written at different places on the disk. ZFS, like NTFS, issues cache flush requests with the expectation that the disk honors that request.

> I'd say it seems from this thread that my data is a lot safer on NTFS than it is on ZFS!

Nope. NTFS doesn't know when data is corrupted. Until it does, it is blissfully ignorant.

> I can't believe my eyes as I read all these responses blaming system engineering and hiding behind ECC memory excuses and "well, you know, ZFS is intended for more Professional systems and not consumer devices, etc etc." My goodness! You DO realize that Sun has this website called opensolaris.org which actually proposes to have people use ZFS on commodity hardware, don't you? I don't see a huge warning on that site saying "ATTENTION: YOU PROBABLY WILL LOSE ALL YOUR DATA".

You probably won't lose all of your data. Statistically speaking, there are very few people who have seen this. There are many more cases where ZFS detected and repaired corruption.

> I recently flirted with putting several large Unified Storage 7000 systems on our corporate network. The hype about ZFS is quite compelling and I had positive experience in my lab setting. But because of not having Solaris capability on our staff we went in another direction instead.

Interesting. The 7000 systems completely shield you from the underlying OS. You administer the system via a web browser interface. There is no OS to learn with these systems, just like you don't go around requiring Darwin knowledge to use your iPhone.

> Reading this thread, I'm SO glad we didn't put ZFS in production in ANY way. Guys, this is the real world. Stuff happens. It doesn't matter what the reason is - hardware lying about cache commits, out-of-order commits, failure to use ECC memory, whatever. It is ABSOLUTELY unacceptable for the filesystem to be entirely lost. No excuse or rationalization of any type can be justified. There MUST be at least the base suite of tools to deal with this stuff. Without it, ZFS simply isn't ready yet.

At the risk of being redundant: there is a procedure. The fine folks at Sun, like Victor Latushkin, have helped people recover such pools, as has been pointed out in this thread several times. This is not the sort of procedure easily done over an open forum; it is more efficient to recover via a service call.

Microsoft talks about NTFS in Windows 2008[*] as, "Self-healing NTFS preserves as much data as possible, based on the type of corruption detected." Regarding catastrophic failures they note, "Self-healing NTFS accepts the mount request, but if the volume is known to have some form of corruption, a repair is initiated immediately. The exception to this would be a catastrophic failure that requires an offline recovery method—such as manual recovery—to minimize the loss of data."

Do you consider that any different than the current state of ZFS?

[*] http://technet.microsoft.com/en-us/library/cc771388(WS.10).aspx

-- richard
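Since the thread keeps contrasting ZFS's corruption detection with NTFS's silence, it may help to show how that detection is actually exercised; a sketch (pool name invented):

  zpool scrub tank      # walk every block and verify its checksum
  zpool status -v tank  # report any checksum errors and affected files

This is the mechanism behind "ZFS detected and repaired corruption": with redundancy, a failed checksum is repaired from the good copy and counted in the status output.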
Re: [zfs-discuss] Another user loses his pool (10TB) in this case and 40
> I don't mean to be offensive Russel, but if you do
> ever return to ZFS, please promise me that you will
> never, ever, EVER run it virtualized on top of NTFS
> (a.k.a. worst file system ever) in a production
> environment. Microsoft Windows is a horribly
> unreliable operating system in situations where
> things like protecting against data corruption are
> important. Microsoft knows this

Oh WOW! Whether or not our friend Russel virtualized on top of NTFS (he didn't - he used raw disk access) this point is amazing!

System5 - based on this thread I'd say you can't really make this claim at all. Solaris suffered a crash and the ZFS filesystem lost EVERYTHING! And there aren't even any recovery tools? HANG YOUR HEADS!!! Recovery from the same situation is EASY on NTFS. There are piles of tools out there that will recover the file system, and failing that, locate and extract data. The key parts of the file system are stored in multiple locations on the disk just in case. It's been this way for over 10 years.

I'd say it seems from this thread that my data is a lot safer on NTFS than it is on ZFS!

I can't believe my eyes as I read all these responses blaming system engineering and hiding behind ECC memory excuses and "well, you know, ZFS is intended for more Professional systems and not consumer devices, etc etc." My goodness! You DO realize that Sun has this website called opensolaris.org which actually proposes to have people use ZFS on commodity hardware, don't you? I don't see a huge warning on that site saying "ATTENTION: YOU PROBABLY WILL LOSE ALL YOUR DATA".

I recently flirted with putting several large Unified Storage 7000 systems on our corporate network. The hype about ZFS is quite compelling and I had positive experience in my lab setting. But because of not having Solaris capability on our staff we went in another direction instead. Reading this thread, I'm SO glad we didn't put ZFS in production in ANY way.

Guys, this is the real world. Stuff happens. It doesn't matter what the reason is - hardware lying about cache commits, out-of-order commits, failure to use ECC memory, whatever. It is ABSOLUTELY unacceptable for the filesystem to be entirely lost. No excuse or rationalization of any type can be justified. There MUST be at least the base suite of tools to deal with this stuff. Without it, ZFS simply isn't ready yet.

I am saving a copy of this thread to show my colleagues and also those Sun Microsystems sales people that keep calling.
Re: [zfs-discuss] feature proposal
dick hoogendijk wrote:
> On Fri, 31 Jul 2009 18:38:16 +1000 Tristan Ball wrote:
>> Because it means you can create zfs snapshots from a non solaris/non local client... Like a linux nfs client, or a windows cifs client.
>
> So if I want a snapshot of i.e. "rpool/export/home/dick" I can do a "zfs snapshot rpool/export/home/dick",

But your command requires that it be run on the NFS/CIFS *server* directly. The 'mkdir' command version can be run on the server or on any NFS or CIFS client. It's possible (likely even) that regular users would not be allowed to log in to server machines, but if given the right access, they can still use the mkdir version to create their own snapshots from a client.

> but what is the exact syntax for the same snapshot using this other method?

As I understand it, if rpool/export/home/dick is mounted on /home/dick, then the syntax would be:

  cd /home/dick/.zfs/snapshot
  mkdir mysnapshot

-Kyle
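One wrinkle worth noting: creating a snapshot this way still requires that the user hold snapshot permission on the dataset. On releases that support delegated administration, something along these lines (user name as in the example above) should grant it:

  # on the server, delegate snapshot (and mount) rights to the user
  zfs allow dick snapshot,mount rpool/export/home/dick

After that, the mkdir inside .zfs/snapshot should succeed from an NFS or CIFS client for that user.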
Re: [zfs-discuss] feature proposal
On Fri, 31 Jul 2009 18:38:16 +1000 Tristan Ball wrote:
> Because it means you can create zfs snapshots from a non solaris/non
> local client...
>
> Like a linux nfs client, or a windows cifs client.

So if I want a snapshot of i.e. "rpool/export/home/dick" I can do a "zfs snapshot rpool/export/home/dick", but what is the exact syntax for the same snapshot using this other method?

--
Dick Hoogendijk -- PGP/GnuPG key: 01D2433D
+ http://nagual.nl/ | SunOS 10u7 5/09 | OpenSolaris 2009.06 rel
+ All that's really worth doing is what we do for others (Lewis Carroll)
Re: [zfs-discuss] Managing ZFS Replication
> If I recall correctly, modifiable snapshot properties aren't supported
> in older versions of ZFS :(
> I wrote the script on Opensolaris 2008.11, which did have modifiable
> snapshot properties.
> Can you upgrade your pool versions possibly?

I could, I just don't know if S10 would allow it on that side? I figured looking at the script it was for Solaris and not OpenSolaris, my bad. I can just ignore those errors on the send side; it's working just fine.

Thanks!
jlc
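If you do go the upgrade route, checking what each side supports first is cheap; a sketch using the pool name from the log above (keep in mind upgrades are one-way: an older system cannot import a pool upgraded past the version it understands):

  zpool upgrade            # list pool versions vs. what this system supports
  zpool upgrade mypool2    # upgrade one pool to the current version
  zfs upgrade -r mypool2   # upgrade the file systems in it as well

Since the sender here is Solaris 10 and the receiver OpenSolaris, it is the S10 side's maximum supported version that matters.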
Re: [zfs-discuss] The importance of ECC RAM for ZFS
On 31.07.09 22:04, Kurt Olsen wrote:
> I'm curious as to why people think rolling back txgs don't come with additional costs beyond losing recent transactions. What are the odds that the data blocks that were replaced by the discarded transactions haven't been overwritten?

Odds depend on lots of factors - activity in the pool, free space, block selection policy, metaslab cursor positions, etc. I have seen examples of successful recovery to a point in time which is around 9 hours before the last synced txg. Sometimes it is enough to roll one txg back; sometimes it requires going back and trying a few older ones.

> Without a snapshot to hold the references aren't those blocks considered free and available for reuse?

As soon as a transaction group is synced, blocks freed during that transaction group time are released back to the pool, and potentially can be overwritten during the next txg.

> Don't get me wrong, I do think that rolling back to previous uberblocks should be an option v. total pool loss, but it doesn't seem like one can reliably say that their data is in some known good state.

In fact, thanks to everything being checksummed, one can say that the pool is in good shape as reliably as the checksum in use allows.

victor
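For the curious, the active uberblock and its txg can be inspected read-only with zdb; a sketch (pool name invented):

  zdb -u tank   # print the active uberblock, including its txg and timestamp

The manual recovery procedures discussed in this thread work by deliberately selecting one of the older uberblocks instead of the most recent one.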
Re: [zfs-discuss] Managing ZFS Replication
On Fri, Jul 31, 2009 at 10:25 AM, Joseph L. Casale wrote:
>> I came up with a somewhat custom script, using some pre-existing
>> scripts I found about the land.
>>
>> http://www.brentrjones.com/?p=45
>
> Brent,
> That was super helpful. I had to make some simple changes to the ssh
> syntax as I use a specific user and identity file going from Solaris
> 10 to OpenSolaris 0906 but I am getting this message:
>
> The Source snapshot does exist on the Destination, clear to send a new one!
> Taking snapshot: /sbin/zfs snapshot mypool2/back...@2009-07-31t16:34:54Z
> receiving incremental stream of mypool2/back...@2009-07-31t16:34:54Z into mypool/back...@2009-07-31t16:34:54Z
> received 39.7GB stream in 2244 seconds (18.1MB/sec)
> cannot set property for 'mypool2/back...@2009-07-31t16:34:54Z': snapshot properties cannot be modified
> cannot set property for 'mypool2/back...@2009-58-30t21:58:15Z': snapshot properties cannot be modified
> cannot set property for 'mypool2/back...@2009-07-31t16:34:54Z': snapshot properties cannot be modified
>
> Is that intended to modify the properties of a snapshot? Does that work
> in some other version of Solaris other than 10u7?
>
> Thanks so much for that pointer!
> jlc

If I recall correctly, modifiable snapshot properties aren't supported in older versions of ZFS :(

I wrote the script on OpenSolaris 2008.11, which did have modifiable snapshot properties.

Can you upgrade your pool versions possibly?

--
Brent Jones
br...@servuhome.net
Re: [zfs-discuss] The importance of ECC RAM for ZFS
> On Jul 24, 2009, at 22:17, Bob Friesenhahn wrote:
>
> Most of the issues that I've read on this list would have been
> "solved" if there was a mechanism where the user / sysadmin could tell
> ZFS to simply go back until it found a TXG that worked.
>
> The trade off is that any transactions (and their data) after the
> working one would be lost. But at least you're not left with an
> un-importable pool.

I'm curious as to why people think rolling back txgs doesn't come with additional costs beyond losing recent transactions. What are the odds that the data blocks that were replaced by the discarded transactions haven't been overwritten? Without a snapshot to hold the references, aren't those blocks considered free and available for reuse?

Don't get me wrong, I do think that rolling back to previous uberblocks should be an option v. total pool loss, but it doesn't seem like one can reliably say that their data is in some known good state.
Re: [zfs-discuss] Managing ZFS Replication
> I came up with a somewhat custom script, using some pre-existing
> scripts I found about the land.
>
> http://www.brentrjones.com/?p=45

Brent,
That was super helpful. I had to make some simple changes to the ssh syntax, as I use a specific user and identity file going from Solaris 10 to OpenSolaris 0906, but I am getting this message:

The Source snapshot does exist on the Destination, clear to send a new one!
Taking snapshot: /sbin/zfs snapshot mypool2/back...@2009-07-31t16:34:54Z
receiving incremental stream of mypool2/back...@2009-07-31t16:34:54Z into mypool/back...@2009-07-31t16:34:54Z
received 39.7GB stream in 2244 seconds (18.1MB/sec)
cannot set property for 'mypool2/back...@2009-07-31t16:34:54Z': snapshot properties cannot be modified
cannot set property for 'mypool2/back...@2009-58-30t21:58:15Z': snapshot properties cannot be modified
cannot set property for 'mypool2/back...@2009-07-31t16:34:54Z': snapshot properties cannot be modified

Is that intended to modify the properties of a snapshot? Does that work in some other version of Solaris other than 10u7?

Thanks so much for that pointer!
jlc
Re: [zfs-discuss] Install and boot from USB stick?
> How can i implement that change, after installing the
> OS? Or do I need to build my own livecd?

Boot from the livecd, attach the usb stick, and open a terminal window. "pfexec bash" starts a root shell, and "zpool import -f rpool" should find and import the zpool from the usb stick. Mount the root filesystem from the usb stick:

  zfs set mountpoint=legacy rpool/ROOT/opensolaris
  mount -F zfs rpool/ROOT/opensolaris /mnt

And edit /mnt/kernel/drv/scsa2usb.conf. E.g. try:

  attribute-override-list = "vid=* reduced-cmd-support=true";

Try to boot from the usb stick, using the "reboot" command.
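One step the recipe above leaves implicit: after editing the file, unmount and restore the dataset's mountpoint before rebooting (assuming its original mountpoint was /, which is typical for an OpenSolaris boot environment):

  umount /mnt
  zfs set mountpoint=/ rpool/ROOT/opensolaris
  reboot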
Re: [zfs-discuss] Best ways to contribute WAS: Fed up with ZFS causing data loss
On Jul 31, 2009, at 7:24 AM, m...@bruningsystems.com wrote:
> Hi Ross,
>
> Ross wrote:
>>> #3 zfs unlike other things like the build system are extremely well documented. There are books on it, code to read and even instructors (Max Bruning) who can teach you about the internals. My project even organized a free online training for this
>>
>> Again, brilliant if you're a programmer.
>
> I think it is a misconception that a course about internals is meant only for programmers. An internals course should teach how the system works. If you are a programmer, this should help you to do programming on the system. If you are an admin, it should help you in your admin work by giving you a better understanding of what the system is doing. If you are a user, it should help you to make better use of the system. In short, I think anyone who is working with Solaris/OpenSolaris can benefit.

I agree with Max, 110%. As an example, for the USENIX Technical Conference I put together a full-day tutorial on ZFS. It was really 2.5 days of tutorial crammed into one day, but hey, you get more than you pay for sometimes :-). I kept the level above the source code, but touched on the structure of the system, the on-disk format, why nvlists are used, and a few of the acronyms seen in various messages or references. To get into the source code level, you are looking at a week or more of lecture (and growing).

I am planning a sysadmin-oriented version of this tutorial for the USENIX LISA conference in November. I intend to move away from the technical (how this is done) and more towards the operational (practical, sane implementations). If anyone has suggestions for topics to be covered, please drop me a line. Also, if anyone wants to schedule some time at their site for training, I'm more than happy to travel :-)

-- richard
Re: [zfs-discuss] Fed up with ZFS causing data loss
On Thu, 30 Jul 2009, Ross wrote:
> Yes, I did miss that one, but could you remind me what exactly are the sd and ssd drivers? I can find lots of details about configuring them, but no basic documentation telling me what they are.

Is your system lacking manual pages? I find excruciating details on my Solaris 10 system.

Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
Re: [zfs-discuss] How Virtual Box handles the IO
Thanks for following up with this, Russel.

On Jul 31, 2009, at 7:11 AM, Russel wrote:
> After all the discussion here about VB, and all the finger pointing, I raised a bug on VB about flushing. Remember I am using RAW disks via the SATA emulation in VB; the disks are WD 2TB drives. Also remember the HOST machine NEVER crashed or stopped, BUT the guest OS (OpenSolaris) was hung, and so I powered off the VIRTUAL host.
>
> OK, this is what the VB engineer had to say after reading this and another thread I had pointed him to. (He missed the fact I was using RAW - not surprising, as it's a rather long thread now!)
>
> ===
> Just looked at those two threads, and from what I saw all vital information is missing - no hint whatsoever on how the user set up his disks, nothing about what errors should be dealt with and so on. So hard to say anything sensible, especially as people seem most interested in assigning blame to some product. ZFS doesn't deserve this, and VirtualBox doesn't deserve this either.
>
> In the first place, there is absolutely no difference in how the IDE and SATA devices handle the flush command. The documentation just wasn't updated to talk about the SATA controller. Thanks for pointing this out, it will be fixed in the next major release. If you want to get the information straight away: just replace "piix3ide" with "ahci", and all other flushing behavior settings apply as well. See a bit further below for what it buys you (or not).
>
> What I haven't mentioned is the rationale behind the current behavior. The reason for ignoring flushes is simple: the biggest competitor does it by default as well, and one gets beaten up by every reviewer if VirtualBox is just a few percent slower than you know what. Forget about arguing with reviewers.
>
> That said, a bit about what flushing can achieve - or not. Just keep in mind that VirtualBox doesn't really buffer anything. In the IDE case every read and write request gets handed more or less straight (depending on the image format complexity) to the host OS. So there is absolutely nothing which can be lost if one assumes the host OS doesn't crash. In the SATA case things are slightly more complicated. If you're using anything but raw disks or flat file VMDKs, the behavior is 100% identical to IDE. If you use raw disks or flat file VMDKs, we activate NCQ support in the SATA device code, which means that the guest can push through a number of commands at once, and they get handled on the host via async I/O. Again - if the host OS works reliably there is nothing to lose.

The problem with this thought process is that since the data is not on the medium, a fault that occurs between the flush request and the bogus ack goes undetected. The OS trusts that when the disk said "the data is on the medium," the data is on the medium with no errors.

This problem also affects "hardware" RAID arrays which provide nonvolatile caches. If the array acks a write and flush, but the data is not yet committed to the medium, then if the disk fails, the data must remain in nonvolatile cache until it can be committed to the medium. A use case may help: suppose the power goes out. Most arrays have enough battery to last for some time. But if power isn't restored prior to the batteries discharging, then there is a risk of data loss.

For ZFS, cache flush requests are not gratuitous. One critical case is the uberblock or label update. ZFS does:

1. update labels 0 and 2
2. flush
3. check for errors
4. update labels 1 and 3
5. flush
6. check for errors

Making flush be a nop destroys the ability to check for errors, thus breaking the trust between ZFS and the data on the medium.

-- richard

> The only thing flushing can potentially improve is the behavior when the host OS crashes. But that depends on many assumptions on what the respective OS does, the filesystems do, etc etc.
>
> Hope those facts can be the basis of a real discussion. Feel free to raise any issue you have in this context, as long as it's not purely hypothetical.
> ===
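Incidentally, the four redundant labels Richard describes can be inspected directly with zdb; a read-only sketch (device name invented):

  zdb -l /dev/rdsk/c0t0d0s0   # dump labels 0-3 from one vdev

Each label carries the pool configuration and an array of uberblocks, which is what the manual recovery procedures mentioned elsewhere in this thread walk through.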
Re: [zfs-discuss] Install and boot from USB stick?
How can I implement that change after installing the OS? Or do I need to build my own livecd?
[zfs-discuss] Zfs deduplication
Will the material ever be posted? It looks like there are some huge bugs with ZFS deduplication that the organizers do not want to post about, and there is no indication on Sun's website that there will be a deduplication feature. I think it's best they concentrate on improving ZFS performance and speed with compression enabled.
Re: [zfs-discuss] Install and boot from USB stick?
> Well, here is the error:
>
> ... usb stick reports(?) scsi error: medium may have changed ...

That's strange. The media in a flash memory stick can't be changed - although most sticks report that they do have removable media.

Maybe this stick needs one of the workarounds that can be enabled in /kernel/drv/scsa2usb.conf?
Re: [zfs-discuss] sam-fs on zfs-pool
No, I just did "zfs create -V", and I didn't change the size of the zpool or zvol at any time.

regards,
Tobias

Jim Klimov schrieb:
> Concerning the reservations, here's a snip from "man zfs":
> [...]
> Did you do anything like this?
Re: [zfs-discuss] Install and boot from USB stick?
> I've found it only works for USB sticks up to 4GB :(
> If I tried a USB stick bigger than that, it didn't boot.

Works for me on 8GB USB sticks. It is possible that the stick you've tried has some issues with the Solaris USB drivers, and needs to have one of the workarounds from the scsa2usb.conf file enabled.
Re: [zfs-discuss] Install and boot from USB stick?
Well, here is the error:

Can't seem to find anything on Google. The only thing I found was some source code where it seems this error occurs:

http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/io/scsi/impl/scsi_subr.c

Suggestions?
Re: [zfs-discuss] Best ways to contribute WAS: Fed up with ZFS causing data loss
Hi Ross,

Ross wrote:
>> #3 zfs unlike other things like the build system are extremely well documented. There are books on it, code to read and even instructors (Max Bruning) who can teach you about the internals. My project even organized a free online training for this
>
> Again, brilliant if you're a programmer.

I think it is a misconception that a course about internals is meant only for programmers. An internals course should teach how the system works. If you are a programmer, this should help you to do programming on the system. If you are an admin, it should help you in your admin work by giving you a better understanding of what the system is doing. If you are a user, it should help you to make better use of the system. In short, I think anyone who is working with Solaris/OpenSolaris can benefit.

max
Re: [zfs-discuss] sam-fs on zfs-pool
Concerning the reservations, here's a snip from "man zfs":

  The reservation is kept equal to the volume's logical size to prevent
  unexpected behavior for consumers. Without the reservation, the volume
  could run out of space, resulting in undefined behavior or data
  corruption, depending on how the volume is used. These effects can
  also occur when the volume size is changed while it is in use
  (particularly when shrinking the size). Extreme care should be used
  when adjusting the volume size.

  Though not recommended, a "sparse volume" (also known as "thin
  provisioning") can be created by specifying the -s option to the
  zfs create -V command, or by changing the reservation after the volume
  has been created. A "sparse volume" is a volume where the reservation
  is less than the volume size. Consequently, writes to a sparse volume
  can fail with ENOSPC when the pool is low on space. For a sparse
  volume, changes to volsize are not reflected in the reservation.

Did you do anything like this?

HTH,
//Jim
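To make the distinction concrete, here is a sketch of a regular versus a sparse zvol and the properties to compare (pool and volume names invented; depending on the release, the space guarantee shows up as reservation or refreservation):

  zfs create -V 405g tank/vol         # regular: reservation matches volsize
  zfs create -s -V 405g tank/sparse   # sparse: no reservation
  zfs get volsize,reservation,refreservation,used tank/vol tank/sparse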
Re: [zfs-discuss] sam-fs on zfs-pool
> If I understand you right it is as you said.
> Here's an example and you can see what happened.
> The sam-fs is filled to only 6% and the zvol is full.

I'm afraid I was not clear with my question, so I'll elaborate. It remains standing as: during this situation, can you write new data into SAMFS? That is, can you fill it up from these 6% used? Or does the system complain that it can't write more data?

The way I see this discussion (and maybe I'm wrong), it's thus:

* Your zvol starts sparse (not using much space from the pool, but with a "quota" of 405Gb). That is, you don't have a "reservation" for these 405Gb to grab them as soon as you create the zvol and not let any other datasets use this space.
* Your zvol allocates blocks from the pool to keep the data written by SAMFS, and the disk space consumed from the pool grows until the zvol hits the quota (405Gb of allocated blocks = 100% of quota).
* SAMFS writes data to the zvol and never tells the zvol that you deleted some files, so these blocks could be unallocated.
* The zvol could release unused blocks - if it ever knew they were unused.

If this is all true, then your zvol now consumes 405Gb from the pool, and your SAMFS thinks it uses 6% of the block device with its 25Gb of saved files. However (and this is the salt of my question), the situation does not prevent you from writing the other 380Gb into the SAMFS without errors and complaints, and probably without changing the amount of space "used" in the ZFS pool and in the zvol dataset either. Is this assumption correct?

If it is, then I'd see the situation as a big inconvenience and a way to improve interaction between SAMFS and ZFS as its storage (and/or fix a regression if this worked better in previous releases). But it's not a bug per se. However, if you can't write much data into the SAMFS now, it is definitely a bad bug.
Re: [zfs-discuss] Fed up with ZFS causing data loss
sd is the older SCSI disk driver; ssd is the newer SCSI disk driver (part of the Leadville driver package) that allowed for more than 256 LUNs per target. We've had systems that used the sd driver until we upgraded to newer, Sun-provided drivers for QLogic / Emulex cards, which then were using the ssd driver. Both handle SCSI (or emulated-SCSI from the different device driver layers) devices.
Re: [zfs-discuss] Fed up with ZFS causing data loss
You might check the hardware compatibility list at Sun's site. It might list the driver that will be used for the card you're looking at... I'm not sure, it's been a while since I've looked at it.
Re: [zfs-discuss] Fed up with ZFS causing data loss
Heh, that's one thing I love about Linux & Solaris - the amount of info you can find if you know what you're doing is scary. However, while that will work for the Marvell SATA card I do have fitted in a server, it's not going to help for the others - they are all items I'm researching for our next system, and I don't have them to hand right now.

But from what you're saying, is the sd driver used for more than just SCSI hard disk and CD-ROM devices? The man page for sd says nothing about it being used for other devices, although googling the SATA driver does reveal a link there. Is the sd driver used as the framework for all of these (SATA, SAS, iSER, Adaptec)? If so, these tunables really could be a godsend!

Ross

PS. I'm all for digging, but when you're in over your head it's time to shout for help ;-)
Re: [zfs-discuss] Install and boot from USB stick?
Hi:

On Fri, Jul 31, 2009 at 07:54, Jürgen Keil wrote:
>> The GRUB menu is presented, no problem there, and
>> then the opensolaris progress bar. But im unable to
>> find a way to view any details on whats happening
>> there. The progress bar just keep scrolling and
>> scrolling.
>
> Press the ESC key; this should switch back from
> graphics to text mode and most likely you'll see
> that the OS is waiting for some console user input.

I've found it only works for USB sticks up to 4GB :( If I tried a USB stick bigger than that, it didn't boot.

Cheers.

--
Pablo Méndez Hernández
[zfs-discuss] How Virtual Box handles the IO
After all the discussion here about VB, and all the finger pointing, I raised a bug on VB about flushing. Remember I am using RAW disks via the SATA emulation in VB; the disks are WD 2TB drives. Also remember the HOST machine NEVER crashed or stopped, BUT the guest OS (OpenSolaris) was hung, and so I powered off the VIRTUAL host.

OK, this is what the VB engineer had to say after reading this and another thread I had pointed him to. (He missed the fact I was using RAW - not surprising, as it's a rather long thread now!)

===
Just looked at those two threads, and from what I saw all vital information is missing - no hint whatsoever on how the user set up his disks, nothing about what errors should be dealt with and so on. So hard to say anything sensible, especially as people seem most interested in assigning blame to some product. ZFS doesn't deserve this, and VirtualBox doesn't deserve this either.

In the first place, there is absolutely no difference in how the IDE and SATA devices handle the flush command. The documentation just wasn't updated to talk about the SATA controller. Thanks for pointing this out, it will be fixed in the next major release. If you want to get the information straight away: just replace "piix3ide" with "ahci", and all other flushing behavior settings apply as well. See a bit further below for what it buys you (or not).

What I haven't mentioned is the rationale behind the current behavior. The reason for ignoring flushes is simple: the biggest competitor does it by default as well, and one gets beaten up by every reviewer if VirtualBox is just a few percent slower than you know what. Forget about arguing with reviewers.

That said, a bit about what flushing can achieve - or not. Just keep in mind that VirtualBox doesn't really buffer anything. In the IDE case every read and write request gets handed more or less straight (depending on the image format complexity) to the host OS. So there is absolutely nothing which can be lost if one assumes the host OS doesn't crash. In the SATA case things are slightly more complicated. If you're using anything but raw disks or flat file VMDKs, the behavior is 100% identical to IDE. If you use raw disks or flat file VMDKs, we activate NCQ support in the SATA device code, which means that the guest can push through a number of commands at once, and they get handled on the host via async I/O. Again - if the host OS works reliably there is nothing to lose.

The only thing flushing can potentially improve is the behavior when the host OS crashes. But that depends on many assumptions on what the respective OS does, the filesystems do, etc etc.

Hope those facts can be the basis of a real discussion. Feel free to raise any issue you have in this context, as long as it's not purely hypothetical.
===
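For readers who want to flip the knob the engineer is referring to, the flush behavior is a per-VM extradata setting; a sketch based on his "replace piix3ide with ahci" hint (the VM name and LUN number are invented; 0 means "do not ignore flushes"):

  VBoxManage setextradata "osol-vm" \
    "VBoxInternal/Devices/ahci/0/LUN#0/Config/IgnoreFlush" 0

The piix3ide form documented in the VirtualBox manual is the same path with "ahci" replaced by "piix3ide".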
Re: [zfs-discuss] Fed up with ZFS causing data loss
> you could have tried "man sd" and "man ssd"

D'oh. I'm far too used to downloading documentation online... when you come from a Windows background, having driver manuals on your system is rather unexpected :)

Thanks James.
Re: [zfs-discuss] Fed up with ZFS causing data loss
Where to find information? Do some searching...

First off, run through the drivers loaded via modinfo - look to see if anything there is specific to your card.

prtconf -v | pg - again, looking for your controller card... once you find it, look at the driver listed or tied to it, or failing that, the vendor-id string if it's available.

If that fails, look at the device in /dev/dsk - follow the symlink to its /devices entry, and see what the major number is. Refer to /etc/name_to_major - find the major device number, and see the driver (or alias) associated with it. You can also refer to /etc/path_to_inst to glean some info on the disk devices (looking for partial /devices pathing as you find in the above note, then see which driver is used for the disk device). Regardless of the driver used for the controller, the disks *should* normally use either the sd or ssd driver.

Once you figure out which driver is used for your card, look in /kernel/drv for the driver name, and see if there's a .conf file with the same name prefix. Look for tunables for the card driver (using the driver.conf name), as well as the sd or ssd driver (via sd.conf, ssd.conf) - spend a few minutes searching via your favorite web search engine.

A little time spent digging can resolve a lot of problems...
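Condensed into commands, the digging described above looks roughly like this (the card name and disk device are invented examples):

  modinfo | grep -i marvell        # is a driver for the card loaded?
  prtconf -v | pg                  # find the controller and the driver bound to it
  ls -l /dev/dsk/c1t0d0s0          # follow the symlink to the /devices entry
  grep sd /etc/name_to_major       # map a major number back to a driver name
  grep '"sd"' /etc/path_to_inst    # instances bound to the sd driver
  ls /kernel/drv/*.conf            # look for a matching .conf file with tunables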
Re: [zfs-discuss] Lundman home NAS
Yes, please write more about this. The photos are terrific and I appreciate the many useful observations you've made.

For my home NAS I chose the Chenbro ES34069, and the biggest problem was finding a SATA/PCI card that would work with OpenSolaris and fit in the case (technically impossible without a ribbon cable PCI adapter). After seeing this, I may reconsider my choice.

For the SATA card, you mentioned that it was a close fit with the case power switch. Would removing the backplane on the card have helped?

Thanks
n

On Fri, Jul 31, 2009 at 5:22 AM, Jorgen Lundman wrote:
> I have assembled my home RAID finally, and I think it looks rather good.
>
> http://www.lundman.net/gallery/v/lraid5/p1150547.jpg.html
>
> Feedback is welcome.
>
> I have yet to do proper speed tests, I will do so in the coming week should people be interested.
>
> Even though I have tried to use only existing, and cheap, parts the end sum became higher than I expected. Final price is somewhere in the 47,000 yen range. (Without hard disks)
>
> If I were to make and sell these, they would be 57,000 or so, so I do not really know if anyone would be interested. Especially since SOHO NAS devices seem to start around 80,000.
>
> Anyway, sure has been fun.
>
> Lund
Re: [zfs-discuss] surprisingly poor performance
The things I'd pay most attention to would be all single-threaded 4K, 32K, and 128K writes to the raw device. Make sure the SSD has a capacitor, and enable the write cache on the device.

-r

On 5 Jul 09, at 12:06, James Lever wrote:
> On 04/07/2009, at 3:08 AM, Bob Friesenhahn wrote:
> > It seems like you may have selected the wrong SSD product to use. There seems to be a huge variation in performance (and cost) with so-called "enterprise" SSDs. SSDs with capacitor-backed write caches seem to be fastest.
>
> Do you have any methods to "correctly" measure the performance of an SSD for the purpose of a slog and any information on others (other than anecdotal evidence)?
>
> cheers, James
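One rough way to run those single-threaded write tests, assuming the SSD is the whole of c2t1d0 and holds no data yet (the device name is assumed; writing to the raw /dev/rdsk character device bypasses the filesystem, though the drive's own write cache will still influence the numbers):

  # single-threaded 4K writes to the raw device;
  # repeat with bs=32k and bs=128k and compare the elapsed times
  ptime dd if=/dev/zero of=/dev/rdsk/c2t1d0s0 bs=4k count=100000

Dividing count by the elapsed real time gives a crude write IOPS figure for each block size.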
Re: [zfs-discuss] article on btrfs, comparison with zfs
An introduction to btrfs, from somebody who used to work on ZFS:

http://www.osnews.com/story/21920/A_Short_History_of_btrfs

*very* interesting article.. Not sure why James didn't directly link to it, but courtesy of Valerie Aurora (formerly Henson):

http://lwn.net/Articles/342892/

I'm trying to understand the argument against the SLAB allocator approach. If I understood correctly how btrfs allocates space, changing and deleting files may just punch randomly sized holes into the disk layout. How's that better?

Regards, -mg
Re: [zfs-discuss] Lundman home NAS
I used a 4U case for mine, it's MASSIVE... I used this case here:

http://members.multiweb.nl/nan1/img/norco05.jpg
http://www.newegg.com/Product/Product.aspx?Item=N82E16811219021

It's an awesome case for the money... I plan to build another one soon.

On Fri, Jul 31, 2009 at 8:22 AM, Jorgen Lundman wrote:
> I have assembled my home RAID finally, and I think it looks rather good.
>
> http://www.lundman.net/gallery/v/lraid5/p1150547.jpg.html
>
> Feedback is welcome.
>
> I have yet to do proper speed tests, I will do so in the coming week should people be interested.
>
> Even though I have tried to use only existing, and cheap, parts the end sum became higher than I expected. Final price is somewhere in the 47,000 yen range. (Without hard disks)
>
> If I were to make and sell these, they would be 57,000 or so, so I do not really know if anyone would be interested. Especially since SOHO NAS devices seem to start around 80,000.
>
> Anyway, sure has been fun.
>
> Lund
>
> --
> Jorgen Lundman | Unix Administrator | +81 (0)3-5456-2687 ext 1017 (work)
> Shibuya-ku, Tokyo | +81 (0)90-5578-8500 (cell)
> Japan | +81 (0)3-3375-1767 (home)
[zfs-discuss] Lundman home NAS
I have assembled my home RAID finally, and I think it looks rather good.

http://www.lundman.net/gallery/v/lraid5/p1150547.jpg.html

Feedback is welcome.

I have yet to do proper speed tests, I will do so in the coming week should people be interested.

Even though I have tried to use only existing, and cheap, parts the end sum became higher than I expected. Final price is somewhere in the 47,000 yen range. (Without hard disks)

If I were to make and sell these, they would be 57,000 or so, so I do not really know if anyone would be interested. Especially since SOHO NAS devices seem to start around 80,000.

Anyway, sure has been fun.

Lund

--
Jorgen Lundman | Unix Administrator | +81 (0)3-5456-2687 ext 1017 (work)
Shibuya-ku, Tokyo | +81 (0)90-5578-8500 (cell)
Japan | +81 (0)3-3375-1767 (home)
Re: [zfs-discuss] sam-fs on zfs-pool
Hi Jim, first of all I'm sure this behaviour is a bug or has been changed sometime in the past, because I've used this configuration a lot of times. If I understand you right, it is as you said. Here's an example and you can see what happened. The sam-fs is filled to only 6% and the zvol is full.

archiv1:~ # zfs list
NAME              USED  AVAIL  REFER  MOUNTPOINT
sampool           405G  2.49G    18K  /sampool
sampool/samdev1   405G     0K   405G  -

archiv1:~ # samcmd f
File systems samcmd 4.6.85 11:18:32 Jul 28 2009
samcmd on archiv1
ty  eq  state  device_name                    status  high  low  mountpoint  server
ms   1  on     samfs1                         m2d     80%   70%  /samfs
md  11  on     /dev/zvol/dsk/sampool/samdev1

archiv1:~ # samcmd m
Mass storage status samcmd 4.6.85 11:19:09 Jul 28 2009
samcmd on archiv1
ty  eq  status  use  state  ord  capacity  free      ra  part  high  low
ms   1  m2d     6%   on          405.000G  380.469G  1M  16    80%   70%
md  11          6%   on     0    405.000G  380.469G

Jim Klimov schrieb:
> Hello tobex, While the original question may have been answered by posts above, I'm interested: when you say "according to zfs list the zvol is 100% full", does it only mean that it uses all 20Gb on the pool (like a non-sparse uncompressed file), or does it also imply that you can't write into the samfs although its structures are only 20% used? If by any chance the latter - I think it would count as a bug. If the former - see the posts above for explanations and workarounds :)
>
> Thanks in advance for such detail, Jim
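One thing worth checking in this situation: by default a zvol carries a refreservation equal to its volsize, so zfs list will show it as fully used no matter how empty the samfs on top of it is. A hedged sketch (dataset names follow the example above; whether sam-fs behaves well on a sparse backing zvol is a separate question):

  # how much of the zvol's USED is just the reservation?
  zfs get volsize,refreservation,usedbyrefreservation sampool/samdev1

  # a sparse (no-reservation) zvol can be created with -s
  zfs create -s -V 405G sampool/samdev2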
Re: [zfs-discuss] Fed up with ZFS causing data loss
On Thu, 30 Jul 2009 23:55:19 -0700 (PDT) Ross wrote:
> Yes, I did miss that one, but could you remind me what exactly are the sd and ssd drivers? I can find lots of details about configuring them, but no basic documentation telling me what they are.

you could have tried "man sd" and "man ssd"

Devices                                    sd(7D)
NAME
  sd - SCSI disk and ATAPI/SCSI CD-ROM device driver
SYNOPSIS
  sd@target,lun:partition

Devices                                    ssd(7D)
NAME
  ssd - Fibre Channel Arbitrated Loop disk device driver
SYNOPSIS
  ssd@port,target:partition

You won't see an ssd instance on x86, only on sparc.

James C. McPherson
--
Senior Kernel Software Engineer, Solaris
Sun Microsystems
http://blogs.sun.com/jmcp http://www.jmcp.homeunix.com/blog
Kernel Conference Australia - http://au.sun.com/sunnews/events/2009/kernel
Re: [zfs-discuss] sam-fs on zfs-pool
Hello tobex,

While the original question may have been answered by posts above, I'm interested: when you say "according to zfs list the zvol is 100% full", does it only mean that it uses all 20Gb on the pool (like a non-sparse uncompressed file), or does it also imply that you can't write into the samfs although its structures are only 20% used?

If by any chance the latter - I think it would count as a bug. If the former - see the posts above for explanations and workarounds :)

Thanks in advance for such detail,
Jim
Re: [zfs-discuss] Fed up with ZFS causing data loss
Interesting, thanks Miles. Up to this week I've never heard that any of this was tunable, but I'm more than happy to go in that direction if that's the way to do it. :-)

Can anybody point me in the direction of where I find documentation for tunables for the Marvell SATA driver, the LSI SAS driver (for the 1064E), the Adaptec RAID driver, and the iSER driver?
Re: [zfs-discuss] [n/zfs-discuss] Strange speeds with x4500, Solaris 10 10/08
> Ross wrote:
> > Great idea, much neater than most of my suggestions too :-)
>
> What is? Please keep some context for those of us on email!

X25-E drives as a mirrored boot volume on an x4500, partitioning off some of the space for the slog.
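In zpool terms the suggestion would look something like this minimal sketch (the pool name "tank", the controller targets, and the slice layout, s0 for boot and s1 for the log, are assumptions, not from the thread):

  # add the spare slice of each X25-E as a mirrored log device
  zpool add tank log mirror c5t0d0s1 c5t1d0s1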
Re: [zfs-discuss] Fed up with ZFS causing data loss
Hi,

Most of the time while waiting on a disk to fail is spent in disk drivers and not ZFS itself. If you want to lower the timeouts then you can do so by configuring different timeouts for sd, ssd, or any other driver you are using. See http://wikis.sun.com/display/StorageDev/Retry-Reset+Parameters

So if you are using the sd driver it will keep trying for sd_io_time * sd_retry_count, where by default sd_io_time is 60s and sd_retry_count is 5 or 3 depending on device type (FC or not), IIRC. Try to lower the timeout or the number of retries, or both.

--
Robert Milkowski
http://milek.blogspot.com
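Applied via /etc/system, that tuning would look roughly like this (the values are illustrative only, not recommendations; verify the exact tunable names for your release against the wiki page above, and note a reboot is required):

  * example sd driver tunables - shorter per-command timeout, fewer retries
  set sd:sd_io_time = 10
  set sd:sd_retry_count = 3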
Re: [zfs-discuss] feature proposal
Because it means you can create ZFS snapshots from a non-Solaris/non-local client... like a Linux NFS client, or a Windows CIFS client.

T

dick hoogendijk wrote:
> On Wed, 29 Jul 2009 17:34:53 -0700 Roman V Shaposhnik wrote:
> > On the read-write front: wouldn't it be cool to be able to snapshot things by:
> > $ mkdir .zfs/snapshot/<name>
>
> I've followed this thread but I fail to see the advantages of this. I guess I miss something here. Can you explain to me why the above would be better (nice to have) than "zfs snapshot whatever@now"?
Re: [zfs-discuss] feature proposal
On 31 Jul 09, at 10:24, dick hoogendijk wrote:
> On Wed, 29 Jul 2009 17:34:53 -0700 Roman V Shaposhnik wrote:
> > On the read-write front: wouldn't it be cool to be able to snapshot things by:
> > $ mkdir .zfs/snapshot/<name>
>
> I've followed this thread but I fail to see the advantages of this. I guess I miss something here. Can you explain to me why the above would be better (nice to have) than "zfs snapshot whatever@now"?

Because it can be done on any host mounting this file system through a network protocol like NFS or CIFS. A nice feature for a NAS.

Gaëtan

--
Gaëtan Lehmann
Biologie du Développement et de la Reproduction
INRA de Jouy-en-Josas (France)
tel: +33 1 34 65 29 66  fax: 01 34 65 29 09
http://voxel.jouy.inra.fr http://www.itk.org http://www.mandriva.org http://www.bepo.fr
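To illustrate the proposal: assuming the dataset is NFS-mounted on a Linux client at /mnt/data (the mount point and snapshot name here are made up), snapshot management would reduce to ordinary directory operations:

  # from the NFS client - no zfs binary and no ssh to the server needed
  mkdir /mnt/data/.zfs/snapshot/before-upgrade

  # destroying the snapshot would be the mirror operation
  rmdir /mnt/data/.zfs/snapshot/before-upgrade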
Re: [zfs-discuss] feature proposal
dick hoogendijk wrote:
> On Wed, 29 Jul 2009 17:34:53 -0700 Roman V Shaposhnik wrote:
> > On the read-write front: wouldn't it be cool to be able to snapshot things by:
> > $ mkdir .zfs/snapshot/<name>
>
> I've followed this thread but I fail to see the advantages of this. I guess I miss something here. Can you explain to me why the above would be better (nice to have) than "zfs snapshot whatever@now"?

Many more systems have a mkdir (or equivalent) command than have a zfs command.

--
Andrew
Re: [zfs-discuss] feature proposal
On Wed, 29 Jul 2009 17:34:53 -0700 Roman V Shaposhnik wrote:
> On the read-write front: wouldn't it be cool to be able to snapshot things by:
> $ mkdir .zfs/snapshot/<name>

I've followed this thread but I fail to see the advantages of this. I guess I miss something here. Can you explain to me why the above would be better (nice to have) than "zfs snapshot whatever@now"?

--
Dick Hoogendijk -- PGP/GnuPG key: 01D2433D
+ http://nagual.nl/ | SunOS 10u7 5/09 | OpenSolaris 2009.06 rel
+ All that's really worth doing is what we do for others (Lewis Carrol)
Re: [zfs-discuss] Best ways to contribute WAS: Fed up with ZFS causing data loss
I'm going to reply because I think you're being a little short-sighted here.

In response to your 'patches welcome' comment: I'd love to submit a patch, but since I've had no programming training and my only real experience is with Visual Basic, I doubt I'm going to be much use. I'm more likely to fundamentally break something than help.

> > Saying "if you don't like it, patch it" is an ignorant cop-out, and a
> > troll response to people's problems with software.

Yup, I'm with Robby here, and I'll explain why below.

> bs. I'm entirely *outside* of Sun and just tired of hearing whining and
> complaints about features not implemented. So the facts are a bit more
> clear in case you think I'm ignorant...

Your attitude is showing through. Chill dude.

> #1 The source has been available and modified from those outside sun for I
> think 3 years??

Great. I'm not a programmer; explain again how this helps me?

> #2 I fully agree the threshold to contribute is *significantly* high. (I'm
> working on a project to reduce this)

Ok, that does sound helpful, thanks.

> #3 zfs unlike other things like the build system are extremely well
> documented. There are books on it, code to read and even instructors
> (Max Bruning) who can teach you about the internals. My project even
> organized a free online training for this

Again, brilliant if you're a programmer.

> This isn't zfs-haters or zfs-.
>
> Use it, love it or help out...

I'll take all three thanks, and let me explain how I do that:

I use this forum as a way to share my experiences with other ZFS users, and with the ZFS developers. In case you hadn't noticed, this is a community, and there are more people here than just Sun developers. If I find something I don't like, yes, I'm vocal. Sometimes I'm wrong and people educate me; other times people agree with me, and if the consensus is that it's serious enough, I file a bug.

What you need to understand is that software isn't created by developers alone. A full team is made up of interface architects, designers, programmers, testers and support staff. No one person has all the skills needed. I don't program, and any attempt I made there would likely be more of a hindrance than a help, so instead I test. I've put in huge amounts of time testing Solaris, far more than I would need to if we were just implementing it internally, and if I find a problem I first discuss it in these forums, and then write a bug report if it's necessary. I've reported 10 bugs to Sun so far, 6 of which have now been fixed. Hell, there's an old post from me that on its own resulted in lively discussion from a dozen or more people, culminating in 3-4 significant bug reports, and a 15-page PDF writeup I created, summarizing several weeks of testing and discussion for the developers.

I've also done my bit by coming on these forums, sharing my experience with others, and helping other people when they come across a problem that I already know how to solve. You coming here and accusing me of not helping is incredibly narrow-minded. I may not be a programmer, but I've put in hundreds of hours of work on ZFS, helping to improve the quality of the product and doing my bit to get more people using it.

> documentation, patches to help lower the barrier of entry, irc support,
> donations, detailed and accurate feedback on needed features and lots of
> other things welcomed.. maybe there's a more productive way to get what
> you need implemented?
>
> I think what I'm really getting at is instead of dumping on this list
> all the problems that need to be fixed and the long drawn out stories..
> File a bug report.. put the time in to explore the issue on your own..
> I'd bet that if even 5% of the developers using zfs sent a patch of some
> nature we would avoid this whole thread.

"Dumping on the list". You mean sharing problems with the community so other people can be aware of the issues? As I explained above, this is a community, and in case you haven't realised yet, it's not made up entirely of programmers. A lot of us on here are network admins, and to us these posts are valuable. I come here regularly to read posts exactly like this from other people, because that way I get to benefit from the experiences of other admins.

> Call me a troll if you like.. I'm still going to lose my tact every once
> in a while when all I see is whiny/noisy threads for days.. I actually
> don't mean to single you out.. there just seems to be a lot of
> negativity lately..

These "whiney" threads are very helpful to those of us actually using the software, but don't worry, I also have a tendency to lose my tact when faced with "whiney" programmers. :-p
Re: [zfs-discuss] [n/zfs-discuss] Strange speeds with x4500, Solaris 10 10/08
Ross wrote:
> Great idea, much neater than most of my suggestions too :-)

What is? Please keep some context for those of us on email!

--
Ian.