Re: [zfs-discuss] ZFS loses configuration
You will have to uncomment the zpool import -a line in /mnt/eon0/.exec for this to automatically import your pools on startup. (took a while before I found this too...) Other than that, for me, EON is great! br, syljua -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
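If you'd rather script the change than edit by hand, here is a minimal sketch. The path /mnt/eon0/.exec comes from the post above; the helper name is hypothetical, and it assumes the line is commented out with a leading "#" as in a stock EON image:

```shell
# Uncomment the "zpool import -a" line in EON's startup script so pools
# are imported automatically at boot. Edits the file in place via a temp
# copy. Helper name is hypothetical.
uncomment_zpool_import() {
  f="$1"
  sed 's/^#[[:space:]]*\(zpool import -a\)/\1/' "$f" > "$f.tmp" && mv "$f.tmp" "$f"
}

# Example: uncomment_zpool_import /mnt/eon0/.exec
```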
[zfs-discuss] Old quota tools
Hi List, Does anybody have scripts available which mimic the ufs quota tools on ZFS? A tool I use relies on the old quota tools (quota, edquota, quotaon, quotaoff, repquota, quotacheck). I use ZFS filesystem quotas and reservations for the /export/home/username filesystems. I would like the tool to manage the quotas, but I'm not able to change its code. I can only change the quota commands, and it still expects them to behave like the ufs quota commands. Thanks, Martijn -- YoungGuns Kasteleinenkampweg 7b 5222 AX 's-Hertogenbosch T. 073 623 56 40 F. 073 623 56 39 www.youngguns.nl KvK 18076568
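I'm not aware of drop-in replacements, but for the reporting side a small formatting shim over `zfs list` can mimic repquota's layout. This is only a sketch under stated assumptions (hypothetical helper name, dataset-level quotas as described above, reporting only — edquota/quotacheck semantics can't be emulated this way):

```shell
# Print a ufs repquota-style report for ZFS filesystems.
# Reads "name<TAB>used<TAB>quota" lines on stdin, as produced by e.g.:
#   zfs list -Hp -o name,used,quota -r rpool/export/home
# Formatting only; hypothetical helper name.
zfs_quota_report() {
  awk -F'\t' 'BEGIN { printf "%-30s %12s %12s\n", "FILESYSTEM", "USED", "QUOTA" }
              { printf "%-30s %12s %12s\n", $1, $2, $3 }'
}

# Example (on a real system):
#   zfs list -Hp -o name,used,quota -r rpool/export/home | zfs_quota_report
```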
[zfs-discuss] etc on separate pool
Is it possible to have /etc on a separate ZFS pool in OpenSolaris? The purpose is to have a rw non-persistent main pool and a rw persistent /etc... I've tried to make a legacy etcpool/etc file system and mount it in /etc/vfstab... Is it possible to extend the boot archive in such a way that it includes most of the files necessary for mounting /etc from a separate pool? Has anyone tried such a configuration?
Re: [zfs-discuss] zones and other filesystems
On 01/21/10 17:03, Thomas Burgess wrote: I'm pretty new to opensolaris. I come from FreeBSD. Naturally, after using FreeBSD for a while I've been big on the use of FreeBSD jails, so I just had to try zones. I've figured out how to get zones running, but now I'm stuck and need help. Is there anything like nullfs in opensolaris... or maybe there is a more solaris way of doing what I need to do. Basically, what I'd like to do is give a specific zone access to 2 zfs filesystems which are available to the global zone. My new zones are in: /export/home/zone1 and /export/home/zone2. What I'd like to do is give them access to: /tank/nas/Video and /tank/nas/JeffB. I'm sure I looked over something hugely easy and important... thanks.

You can delegate the datasets to the zone:

# zonecfg -z zone1
add dataset
set name=tank/nas/Video
end
add dataset
set name=tank/nas/JeffB
end
exit
# zoneadm -z zone1 reboot

Thanks, Zoram
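For completeness: if the zone should only see the directories (the closer analogue to FreeBSD's nullfs) rather than gain administrative control of the datasets, a lofs loopback mount can be configured instead of `add dataset`. A sketch, reusing the zone and path names from the post above:

```shell
# Loopback-mount a global-zone directory into zone1.
# Unlike "add dataset", the zone sees the mounted files but does not
# get administrative control of the underlying ZFS dataset.
zonecfg -z zone1 <<'EOF'
add fs
set dir=/tank/nas/Video
set special=/tank/nas/Video
set type=lofs
end
EOF
zoneadm -z zone1 reboot
```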
Re: [zfs-discuss] Dedup memory overhead
On 21 Jan 2010, at 22:55, Daniel Carosone wrote:

On Thu, Jan 21, 2010 at 05:04:51PM +0100, erik.ableson wrote: What I'm trying to get a handle on is how to estimate the memory overhead required for dedup on that amount of storage.

We'd all appreciate better visibility of this. This requires: - time and observation and experience, and - better observability tools and (probably) data exposed for them

I'd guess that since every written block is going to go and ask for the hash keys, this should result in this data living in the ARC based on the MFU ruleset. The theory being that, as a result, if I can determine the maximum memory requirement for these keys, I know what my minimum memory baseline requirements will be to guarantee that I won't be caught short. So the question is how much memory or L2ARC would be necessary to ensure that I'm never going back to disk to read out the hash keys.

I think that's a wrong goal for optimisation. For performance (rather than space) issues, I look at dedup as simply increasing the size of the working set, with a goal of reducing the amount of IO (avoided duplicate writes) in return.

True, but as a practical matter, we've seen that overall performance drops off a cliff if you overstep your memory bounds and the system is obliged to go to disk to evaluate a new block to write against the hash keys. This is compounded by the fact that the ARC is full, so it's obliged to go straight to disk, further exacerbating the problem. It's this particular scenario that I'm trying to avoid, and from the business aspect of selling ZFS-based solutions (whether to a client or to an internal project) we need to be able to ensure that the performance is predictable, with no surprises. Realizing of course that all of this is based on a slew of uncontrollable variables (size of the working set, IO profiles, ideal block sizes, etc.).
The empirical approach of "give it lots and we'll see if we need to add an L2ARC later" is not really viable for many managers (despite the fact that the real world works like this).

The trouble is that the hash function produces (we can assume) random hits across the DDT, so the working set depends on the amount of data and the rate of potentially dedupable writes as well as the actual dedup hit ratio. A high rate of writes also means a large amount of data in ARC waiting to be written at the same time. This makes analysis very hard (and pushes you very fast towards that very steep cliff, as we've all seen).

I don't think it would be random, since _any_ write operation on a deduplicated filesystem would require a hash check, forcing them to live in the MFU. However, I agree that a high write rate would result in memory pressure on the ARC, which could result in the eviction of the hash keys. So the next factor to include in memory sizing is the maximum write rate (determined by IO availability). So with a team of two GbE cards, I could conservatively say that I need to size for inbound write IO of 160MB/s, worst case accumulated for the 30 second flush cycle, so say about 5GB of memory (leaving aside ZIL issues etc.). Noting that this is all very back-of-the-napkin estimation, and I also need to have some idea of what my physical storage is capable of ingesting, which could add to this value.

I also think a threshold on the size of blocks to try deduping would help. If I only dedup blocks (say) 64k and larger, I might well get most of the space benefit for much less overhead.

Well - since my primary use case is iSCSI presentation to VMware backed by zvols, and I can manually force the block size on volume creation to 64k, this reduces the unpredictability a little bit. That's based on the hypothesis that zvols use a fixed block size.
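The back-of-the-napkin sizing above can be scripted. A sketch, assuming roughly 270 bytes of in-core footprint per DDT entry — figures quoted for this vary and the on-disk entry is smaller, so treat that constant (and the helper name) as an assumption, not a ZFS guarantee:

```shell
# Estimate RAM needed to keep the whole DDT resident:
#   entries = unique (post-dedup) data / average block size
#   bytes   = entries * per-entry overhead (~270 B assumed)
ddt_ram_estimate() {
  unique_gb="$1"      # unique data in GiB
  blocksize_kb="$2"   # average block size in KiB (e.g. 64 for fixed-block zvols)
  awk -v g="$unique_gb" -v b="$blocksize_kb" 'BEGIN {
    entries = g * 1024 * 1024 / b
    printf "%.1f GiB for %.0f entries\n", entries * 270 / (1024*1024*1024), entries
  }'
}

# e.g. 5 TiB of unique data in 64 KiB blocks:
#   ddt_ram_estimate 5120 64
```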
[zfs-discuss] ZFS raidz2 on Aberdeen SCSI DAS problems
Hi, I'm trying to build an OpenSolaris storage server, but I'm experiencing regular zpool corruptions after one or two days of operation. I would like it if someone would comment on the hardware that I use in this setup, and give me some pointers on how to troubleshoot this. The machine that OpenSolaris is installed on has a Supermicro Intel X7DCT motherboard and an LSI22320SE SGL SCSI HBA. An Aberdeen XDAS P6 Series 3U SCSI DAS is attached to the HBA (16 bays with 2TB Hitachi drives), and the drives are configured as Pass Through. I built just one testPool with one vdev containing 8 drives, in raidz2. This is the zpool status output after the zpool crash:

  pool: testPool
 state: DEGRADED
status: One or more devices are faulted in response to persistent errors. Sufficient replicas exist for the pool to continue functioning in a degraded state.
action: Replace the faulted device, or use 'zpool clear' to mark the device repaired.
 scrub: resilver completed after 0h0m with 0 errors on Fri Jan 22 09:29:51 2010
config:

        NAME         STATE     READ WRITE CKSUM
        testPool     UNAVAIL       0     0     0  insufficient replicas
          raidz2     UNAVAIL     128     0        insufficient replicas
            c10t0d0  FAULTED     695     3        too many errors
            c10t0d1  FAULTED     589     3        too many errors
            c10t0d2  ONLINE        2     0     0
            c10t0d3  ONLINE        2     1     0  6K resilvered
            c10t0d4  ONLINE        4     4     0  5.50K resilvered
            c10t0d5  ONLINE        2     8     0  4K resilvered
            c10t0d6  DEGRADED      1     9     3  too many errors
            c10t0d7  ONLINE        3     8     0  3.50K resilvered

errors: 3 data errors, use '-v' for a list

And these are the relevant lines from my /var/adm/messages:

Jan 22 08:02:54 disk    got firmware SCSI bus reset.
Jan 22 08:02:54 disk    log info = 0
Jan 22 08:03:07 disk scsi: [ID 365881 kern.info] /p...@0,0/pci8086,6...@4/pci10b5,8...@0/pci1000,1...@8 (mpt0):
Jan 22 08:03:07 disk    Rev. 8 LSI, Inc. 1030 found.
Jan 22 08:03:07 disk scsi: [ID 365881 kern.info] /p...@0,0/pci8086,6...@4/pci10b5,8...@0/pci1000,1...@8 (mpt0):
Jan 22 08:03:07 disk    mpt0 supports power management.
Jan 22 08:03:07 disk scsi: [ID 365881 kern.info] /p...@0,0/pci8086,6...@4/pci10b5,8...@0/pci1000,1...@8 (mpt0):
Jan 22 08:03:07 disk    mpt0 unrecognized capability 0x6.
Jan 22 08:03:10 disk scsi: [ID 365881 kern.info] /p...@0,0/pci8086,6...@4/pci10b5,8...@0/pci1000,1...@8 (mpt0):
Jan 22 08:03:10 disk    mpt0: IOC Operational.
Jan 22 08:03:13 disk scsi: [ID 365881 kern.info] /p...@0,0/pci8086,6...@4/pci10b5,8...@0/pci1000,1...@8,1 (mpt1):
Jan 22 08:03:13 disk    Rev. 8 LSI, Inc. 1030 found.
Jan 22 08:03:13 disk scsi: [ID 365881 kern.info] /p...@0,0/pci8086,6...@4/pci10b5,8...@0/pci1000,1...@8,1 (mpt1):
Jan 22 08:03:13 disk    mpt1 supports power management.
Jan 22 08:03:13 disk scsi: [ID 365881 kern.info] /p...@0,0/pci8086,6...@4/pci10b5,8...@0/pci1000,1...@8,1 (mpt1):
Jan 22 08:03:13 disk    mpt1 unrecognized capability 0x0.
Jan 22 08:03:13 disk scsi: [ID 365881 kern.info] /p...@0,0/pci8086,6...@4/pci10b5,8...@0/pci1000,1...@8,1 (mpt1):
Jan 22 08:03:13 disk    mpt1: IOC Operational.
Jan 22 08:04:50 disk fmd: [ID 441519 daemon.error] SUNW-MSG-ID: ZFS-8000-FD, TYPE: Fault, VER: 1, SEVERITY: Major
Jan 22 08:04:50 disk EVENT-TIME: Fri Jan 22 08:04:50 GMT 2010
Jan 22 08:04:50 disk PLATFORM: X7DCT, CSN: 0123456789, HOSTNAME: disk
Jan 22 08:04:50 disk SOURCE: zfs-diagnosis, REV: 1.0
Jan 22 08:04:50 disk EVENT-ID: 857d4e64-9a2f-e6fb-94c2-9337566aa6c9
Jan 22 08:04:50 disk DESC: The number of I/O errors associated with a ZFS device exceeded
Jan 22 08:04:50 disk    acceptable levels. Refer to http://sun.com/msg/ZFS-8000-FD for more information.
Jan 22 08:04:50 disk AUTO-RESPONSE: The device has been offlined and marked as faulted. An attempt
Jan 22 08:04:50 disk    will be made to activate a hot spare if available.
Jan 22 08:04:50 disk IMPACT: Fault tolerance of the pool may be compromised.
Jan 22 08:04:50 disk REC-ACTION: Run 'zpool status -x' and replace the bad device.

Can I build a ZFS storage server with this kind of hardware? If yes, how can I troubleshoot the problem?
Re: [zfs-discuss] etc on separate pool
On 22 Jan 2010, at 08:55, Alexander wrote: Is it possible to have /etc on a separate ZFS pool in OpenSolaris? The purpose is to have a rw non-persistent main pool and a rw persistent /etc... I've tried to make a legacy etcpool/etc file system and mount it in /etc/vfstab... Is it possible to extend the boot archive in such a way that it includes most of the files necessary for mounting /etc from a separate pool? Has anyone tried such a configuration? What does the live CD do? Cheers, Chris
Re: [zfs-discuss] etc on separate pool
Is it possible to extend the boot archive in such a way that it includes most of the files necessary for mounting /etc from a separate pool? Has anyone tried such a configuration? What does the live CD do? I'm not sure that it is the same configuration, but maybe it is quite similar... The LiveCD has a ramdisk which is mounted on boot, and /etc is on this ramdisk... In a real system configuration we would need some way to sync the real /etc and the ramdisk (or boot archive) /etc. With a ramdisk this may be a problem. But 1) I don't understand deeply how the LiveCD boots (maybe I need to look at this process more attentively), and 2) in my opinion its boot process is quite specific and quite different from real system behavior, so I'm not sure that these practices can be adopted... However, I'm not confident of (1)...
Re: [zfs-discuss] 2gig file limit on ZFS?
On Thursday 21 Jan 2010 22:00:55 Daniel Carosone wrote: Best would be to plug the ext3 disk into something that can read it fully, and copy over the network. Linux, NetBSD, maybe newer opensolaris. Note that this could be running in a VM on the same box, if necessary. Yep, done. Ubuntu gave me some grief about mounting part elements of a raid set, but I managed it and the files are now copying over to the OS box happily. The problem must have been the drivers loaded to read the ext3 partition. split(1)? Har, har!!! :-)
Re: [zfs-discuss] Does OpenSolaris mpt driver support LSI 2008 controller
http://lsi.com/storage_home/products_home/internal_raid/megaraid_sas/6gb_s_value_line/sas9260-8i/index.html 2009.06 didn't have the drivers integrated, so those aren't the open source ones. As I said, it is possible that 2010.03 will resolve this. But we do not put development releases in production. From: Tim Cook [mailto:t...@cook.ms] Sent: Thursday, January 21, 2010 5:45 PM To: Moshe Vainer Cc: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] Does OpenSolaris mpt driver support LSI 2008 controller On Thu, Jan 21, 2010 at 7:37 PM, Moshe Vainer mvai...@doyenz.com wrote: Vanilla 2009.06, mr_sas drivers from the LSI website. To answer your other question - the mpt driver is very solid on 2009.06. Are you sure those are the open source drivers he's referring to? LSI has a habit of releasing their own drivers with similar names. It sounds to me like that's what you were using. On that front, exactly where did you find the driver? They have nothing listed on the downloads page: http://lsi.com/storage_home/products_home/host_bus_adapters/sas_hbas/internal/sas9211-8i/index.html -- --Tim
Re: [zfs-discuss] Does OpenSolaris mpt driver support LSI 2008 controller
I thought I made it very clear - mr_sas drivers from the LSI website. No intention to bash anything, just a user experience. Sorry if that was misunderstood. From: Tim Cook [mailto:t...@cook.ms] Sent: Thursday, January 21, 2010 6:07 PM To: Moshe Vainer Cc: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] Does OpenSolaris mpt driver support LSI 2008 controller On Thu, Jan 21, 2010 at 8:05 PM, Moshe Vainer mvai...@doyenz.com wrote: http://lsi.com/storage_home/products_home/internal_raid/megaraid_sas/6gb_s_value_line/sas9260-8i/index.html 2009.06 didn't have the drivers integrated, so those aren't the open source ones. As I said, it is possible that 2010.03 will resolve this. But we do not put development releases in production. You should probably make that clear from the start then. You just bashed the open source drivers based on your experience with something completely different. -- --Tim
Re: [zfs-discuss] zfs zvol available space vs used space vs reserved space
Hi Dan, Thanks for your reply. I'm not sure about that, as it shows different values for different zvols.

# zfs list -o space
NAME        AVAIL   USED  USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD
tank        1.33T  27.2T         0   32.0K              0      27.2T
tank/test1  2.28T     1T         0   51.4G           973G          0
tank/test2  2.33T     1T         0   1.31G          1023G          0
tank/test3  1.38T   100G         0   50.4G          49.6G          0

Thanks, Younes

On Thu, Jan 21, 2010 at 10:48 PM, Daniel Carosone d...@geek.com.au wrote: On Thu, Jan 21, 2010 at 07:33:47PM -0800, Younes wrote: Hello all, I have a small issue with zfs. I create a 1TB volume.

# zfs get all tank/test01
NAME         PROPERTY              VALUE   SOURCE
tank/test01  used                  1T      -
tank/test01  available             2.26T   -
tank/test01  referenced            79.4G   -
tank/test01  reservation           none    default
tank/test01  refreservation        1T      local
tank/test01  usedbydataset         79.4G   -
tank/test01  usedbychildren        0       -
tank/test01  usedbyrefreservation  945G    -

I've trimmed some non-relevant properties. What bugs me is the available: 2.26T. Any ideas on why that is? That's the available space in the rest of the pool. This includes space that could be used (i.e., available for) potential snapshots of the volume (which would show in usedbychildren), since the volume size is a refreservation, not a reservation. -- Dan.
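The arithmetic checks out against the figures above: for each zvol, the AVAIL that zfs list reports is roughly the pool-wide AVAIL plus that volume's own USEDREFRESERV (space the refreservation holds but the dataset hasn't referenced yet). A sketch to verify the property math (helper name is mine; values in GiB):

```shell
# For a zvol with a refreservation, zfs list shows roughly:
#   AVAIL(vol) = AVAIL(pool) + usedbyrefreservation(vol)
zvol_avail() {
  pool_avail_gib="$1"
  usedbyrefreserv_gib="$2"
  awk -v p="$pool_avail_gib" -v r="$usedbyrefreserv_gib" \
    'BEGIN { printf "%.2fT\n", (p + r) / 1024 }'
}

# tank AVAIL is 1.33T (~1362 GiB); tank/test1 holds 973G of refreservation:
#   zvol_avail 1362 973   ->  2.28T, matching the tank/test1 AVAIL above
```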
Re: [zfs-discuss] [ext3-discuss] 2gig file limit on ZFS?
Hi, it would be very good to know the version of the driver used for ext3fs. From where and how was the driver installed? Best regards, Milan

On Thu, 21 Jan 2010 at 12:08 -0800, Richard Elling wrote: CC'ed to ext3-disc...@opensolaris.org because this is an ext3-on-Solaris issue. ZFS has no problem with large files, but the older ext3 did. See also the ext3 project page and documentation, especially http://hub.opensolaris.org/bin/view/Project+ext3/Project_status -- richard

On Jan 21, 2010, at 11:58 AM, Michelle Knight wrote: Hi Folks, Situation: 64-bit OpenSolaris on AMD, 2009-6 111b - I can't successfully update the OS. I've got three external 1.5TB drives in a raidz pool connected via USB. Hooked on to an IDE channel is a 750GB hard drive that I'm copying the data off. It is an ext3 drive from an Ubuntu server. Copying is being done on the machine using the cp command as root. So far, two files have failed...

/mirror2/applications/Microsoft/Operating Systems/Virtual PC/vm/XP-SP2/XP-SP2 Hard Disk.vhd: File too large
/mirror2/applications/virtualboximages/xp/xp.tar.bz2: File too large

The files are...

-rwxr-x--- 1 admin applications 4177570654 Nov  4 08:02 xp.tar.bz2
-rwxr-x--- 1 admin applications 2582259712 Feb 14  2007 XP-SP2 Hard Disk.vhd

The system is a home server and contains files of all types and sizes. Any ideas please?
[zfs-discuss] Remove ZFS Mount Points
Can I move the below mounts under / ?

rpool/export       /export
rpool/export/home  /export/home

It was a result of the default install... Thanks
Re: [zfs-discuss] zfs send/receive as backup - reliability?
On Thu, Jan 21, 2010 at 11:28 AM, Richard Elling richard.ell...@gmail.com wrote: On Jan 21, 2010, at 3:55 AM, Julian Regel wrote: Until you try to pick one up and put it in a fire safe! Then you backup to tape from the x4540 whatever data you need. In case of enterprise products you save on licensing here, as you need one client license per x4540 but in fact can back up data from many clients. Which brings us full circle... What do you then use to back up to tape, bearing in mind that the Sun-provided tools all have significant limitations? Poor choice of words. Sun resells NetBackup and (IIRC) that which was formerly called NetWorker. Thus, Sun does provide enterprise backup solutions. (Symantec nee Veritas) NetBackup and (EMC nee Legato) NetWorker are different products that compete in the enterprise backup space. Under the covers NetBackup uses gnu tar to gather file data for the backup stream. At one point (maybe still the case), one of the claimed features of NetBackup was that if a tape is written without multiplexing, you can use gnu tar to extract the data. This seems to be most useful when you need to recover master and/or media servers, and to be able to extract your data after you no longer use NetBackup. -- Mike Gerdts http://mgerdts.blogspot.com/
Re: [zfs-discuss] Remove ZFS Mount Points
On Fri, 22 Jan 2010, Tony MacDoodle wrote: Can I move the below mounts under / ? rpool/export /export rpool/export/home /export/home

Sure. Just copy the data out of the directory, do a zfs destroy on the two filesystems, and copy it back. For example:

# mkdir /save
# cp -r /export/home /save
# zfs destroy rpool/export/home
# zfs destroy rpool/export
# mkdir /export
# mv /save/home /export
# rmdir /save

I'm sure there are other ways to do it, but that's the gist. Regards, markm
[zfs-discuss] Removing large holey file does not free space 6792701 (still)
Hello, I mentioned this problem a year ago here and filed 6792701, and I know it has been discussed since. It should have been fixed in snv_118, but I can still trigger the same problem. This is only triggered if the creation of a large file is aborted, for example by loss of power, crash or SIGINT to mkfile(1M). The bug should probably be reopened, but I post it here since some people were seeing something similar. Example and attached zdb output:

filer01a:/$ uname -a
SunOS filer01a 5.11 snv_130 i86pc i386 i86pc Solaris
filer01a:/$ zpool create zpool01 raidz2 c4t0d0 c4t1d0 c4t2d0 c4t4d0 c4t5d0 c4t6d0
filer01a:/$ zfs list zpool01
NAME     USED  AVAIL  REFER  MOUNTPOINT
zpool01  123K  5.33T  42.0K  /zpool01
filer01a:/$ df -h /zpool01
Filesystem  Size  Used  Avail  Use%  Mounted on
zpool01     5.4T   42K   5.4T     1%  /zpool01
filer01a:/$ mkfile 1024G /zpool01/largefile
^C
filer01a:/$ zfs list zpool01
NAME     USED  AVAIL  REFER  MOUNTPOINT
zpool01  160G  5.17T   160G  /zpool01
filer01a:/$ ls -hl /zpool01/largefile
-rw------- 1 root root 1.0T 2010-01-22 15:02 /zpool01/largefile
filer01a:/$ rm /zpool01/largefile
filer01a:/$ sync
filer01a:/$ zfs list zpool01
NAME     USED  AVAIL  REFER  MOUNTPOINT
zpool01  160G  5.17T   160G  /zpool01
filer01a:/$ df -h /zpool01
Filesystem  Size  Used  Avail  Use%  Mounted on
zpool01     5.4T  161G   5.2T     3%  /zpool01
filer01a:/$ ls -l /zpool01
total 0
filer01a:/$ zfs list -t all zpool01
NAME     USED  AVAIL  REFER  MOUNTPOINT
zpool01  160G  5.17T   160G  /zpool01
filer01a:/$ zpool export zpool01
filer01a:/$ zpool import zpool01
filer01a:/$ zfs list zpool01
NAME     USED  AVAIL  REFER  MOUNTPOINT
zpool01  160G  5.17T   160G  /zpool01
filer01a:/$ zdb -ddd zpool01
[cut]
Object  lvl  iblk  dblk  dsize  lsize  %full  type
     5    5   16K  128K   160G     1T  15.64  ZFS plain file
[cut]

zpool01.zdb
Description: Binary data

Henrik
http://sparcv9.blogspot.com
Re: [zfs-discuss] etc on separate pool
Hi Alexander, I'm not sure about the OpenSolaris release specifically, but for the SXCE and Solaris 10 releases, we provide this requirement: http://docs.sun.com/app/docs/doc/817-2271/zfsboot-1?a=view

* Solaris OS Components – All subdirectories of the root file system that are part of the OS image, with the exception of /var, must be in the same dataset as the root file system. In addition, all Solaris OS components must reside in the root pool with the exception of the swap and dump devices.

Maybe someone else can comment on their OpenSolaris experiences. I'm not sure we've done enough testing to relax this requirement for OpenSolaris releases. In the meantime, I would suggest following the above requirement until we're sure alternate configurations are supportable. Thanks, Cindy

On 01/22/10 05:17, Alexander wrote: Is it possible to extend the boot archive in such a way that it includes most of the files necessary for mounting /etc from a separate pool? Has anyone tried such a configuration? What does the live CD do? I'm not sure that it is the same configuration, but maybe it is quite similar... The LiveCD has a ramdisk which is mounted on boot, and /etc is on this ramdisk... In a real system configuration we would need some way to sync the real /etc and the ramdisk (or boot archive) /etc. With a ramdisk this may be a problem. But 1) I don't understand deeply how the LiveCD boots (maybe I need to look at this process more attentively), and 2) in my opinion its boot process is quite specific and quite different from real system behavior, so I'm not sure that these practices can be adopted... However, I'm not confident of (1)...
Re: [zfs-discuss] etc on separate pool
On 01/22/10 01:55, Alexander wrote: Is it possible to have /etc on a separate ZFS pool in OpenSolaris? The purpose is to have a rw non-persistent main pool and a rw persistent /etc... I've tried to make a legacy etcpool/etc file system and mount it in /etc/vfstab... Is it possible to extend the boot archive in such a way that it includes most of the files necessary for mounting /etc from a separate pool? Has anyone tried such a configuration?

There have been efforts (some ongoing) to enable what you are trying to do, but they involve substantial changes to Solaris configuration and system administration. As Solaris works right now, it is not supported to have /etc in a separate dataset, let alone a separate pool. Lori
Re: [zfs-discuss] zfs zvol available space vs used space vs reserved space
Younes, Including your zpool list output for tank would be helpful, because zfs list includes the AVAILABLE pool space. Determining volume space is a bit trickier because volume size is set at creation time but the allocated size might not be consumed. I include a simple example below that might help. The Nevada release revises the zpool list output to include SIZE, ALLOC, and FREE, which helps clarify the values. Cindy

A mirrored pool tank of 2 x 136GB disks:

# zpool create tank mirror c1t1d0 c1t2d0
# zpool list tank
NAME  SIZE   USED  AVAIL  CAP  HEALTH  ALTROOT
tank  136G  76.5K   136G   0%  ONLINE  -

Review how much space is available for datasets:

# zfs list -r tank
NAME  USED  AVAIL  REFER  MOUNTPOINT
tank   72K   134G    21K  /tank

Approx 2 GB of pool space is consumed for metadata. I create two volumes, 10GB and 20GB in size:

# zfs create -V 10G tank/vol1
# zfs create -V 20G tank/vol2
# zfs list -r tank
NAME       USED   AVAIL  REFER  MOUNTPOINT
tank       30.0G   104G    21K  /tank
tank/vol1    10G   114G    16K  -
tank/vol2    20G   124G    16K  -

In the above output, USED is 30.0G due to the creation of the volumes. If we check the pool space consumed:

# zpool list tank
NAME  SIZE  USED  AVAIL  CAP  HEALTH  ALTROOT
tank  136G  124K   136G   0%  ONLINE  -

USED is only 124K because the volumes contain no data yet; USED also includes a small amount of metadata.

On 01/21/10 20:53, younes naguib wrote: Hi Dan, Thanks for your reply. I'm not sure about that, as it shows different values for different zvols.

# zfs list -o space
NAME        AVAIL   USED  USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD
tank        1.33T  27.2T         0   32.0K              0      27.2T
tank/test1  2.28T     1T         0   51.4G           973G          0
tank/test2  2.33T     1T         0   1.31G          1023G          0
tank/test3  1.38T   100G         0   50.4G          49.6G          0

Thanks, Younes

On Thu, Jan 21, 2010 at 10:48 PM, Daniel Carosone d...@geek.com.au wrote: On Thu, Jan 21, 2010 at 07:33:47PM -0800, Younes wrote: Hello all, I have a small issue with zfs. I create a 1TB volume.

# zfs get all tank/test01
NAME         PROPERTY              VALUE   SOURCE
tank/test01  used                  1T      -
tank/test01  available             2.26T   -
tank/test01  referenced            79.4G   -
tank/test01  reservation           none    default
tank/test01  refreservation        1T      local
tank/test01  usedbydataset         79.4G   -
tank/test01  usedbychildren        0       -
tank/test01  usedbyrefreservation  945G    -

I've trimmed some non-relevant properties. What bugs me is the available: 2.26T. Any ideas on why that is? That's the available space in the rest of the pool. This includes space that could be used (i.e., available for) potential snapshots of the volume (which would show in usedbychildren), since the volume size is a refreservation, not a reservation. -- Dan.
[zfs-discuss] zero out block / sectors
Is there a way to zero out unused blocks in a pool? I'm looking for ways to shrink the size of an opensolaris virtualbox VM and using the compact subcommand will remove zero'd sectors. Thanks, John Hoogerdijk Sun Microsystems of Canada IMO Network Computer NC Ltd. 808 240 Graham Avenue Winnipeg, Manitoba, R3C 0J7, Canada Phone: 204.927.1932 Cell: 204.230.6720 Fax: 204.927.1939 Email: john.hoogerd...@sun.com
Re: [zfs-discuss] hard drive choice, TLER/ERC/CCTL
dc == Daniel Carosone d...@geek.com.au writes: w == Willy willy.m...@gmail.com writes: sb == Simon Breden sbre...@gmail.com writes:

First of all, I've been so far assembling vdev stripes from different manufacturers, such that one manufacturer can have a bad batch or firmware bug killing all their drives at once without losing my pool. Based on recent drive problems I think this is a really wise idea.

w http://www.csc.liv.ac.uk/~greg/projects/erc/

dead link?

w Unfortunately, smartmontools has limited SATA drive support in w opensolaris, and you cannot query or set the values.

also the driver stack is kind of a mess with different mid-layers depending on which SATA low-level driver you use, and many proprietary no-source low-level drivers, neither of which you have to deal with on Linux. Maybe in a decade it will get better if the oldest driver we have to deal with is AHCI, but yes smartmontools vs. uscsi still needs fixing!

w I have 4 of the HD154UI Samsung Ecogreens, and was able to set w the error reporting time using HDAT2. The settings would w survive a warm reboot, but not a powercycle.

after stfw this seems to be some MesS-DOS binary-only tool. Maybe you can run it in virtualbox and snoop on its behavior---this worked for me with Wine and a lite-on RPC tool. At least on Linux you can for example run CD burning programs from within Wine---it is that good.

sb RAID-version drives at 50%-100% price premium, I have decided sb not to use Western Digital drives any longer, and have sb explained why here: sb http://breden.org.uk/2009/05/01/home-fileserver-a-year-in-zfs/

IMHO it is just a sucker premium because the feature is worthless anyway. From the discussion I've read here, the feature is designed to keep drives which are *reporting failures* to still be considered *GOOD*, and to not drop out of RAIDsets in RAID-on-a-card implementations with RAID-level timeouts 60 seconds.
It is a workaround for huge modern high-BER drives and RAID-on-card firmware that's (according to some person's debatable idea) not well-matched to the drive. Of course they are going to sell it as this big valuable enterprise optimisation, but at its root it has evolved as a workaround for someone else's broken (from WD's POV) software. The solaris timeout, because of m * n * o multiplicative layered speculative retry nonsense, is 60 seconds or 180 seconds or many hours, so solaris is IMHO quite broken in this regard but also does not benefit from the TLER workaround: the long-TLER drives will not drop out of RAIDsets on ZFS even if they report an error now and then. What's really needed for ZFS or RAID in general is (a) for drives to never spend more than x% of their time attempting recovery, so they don't effectively lose ALL the data on a partially-damaged drive by degrading performance to the point it would take n years to read out what data they're able to deliver, and (b) RAID-level smarts to dispatch reads for redundant data when a drive becomes slow without reporting failure, and to diagnose drives as failed based on statistical measurements of their speed. TLER does not deliver (a), because reducing error retries to 5 seconds is still a 10^3 slowdown instead of 10^4 and thus basically no difference, and the hard drive can never do (b): it's a ZFS-level feature. So my question is, have you actually found cases where ZFS needs TLER adjustments, or are you just speculating and synthesizing ideas from a mess of whitepaper marketing blurbs? Because a 7-second-per-read drive will fuck your pool just as badly as a 70-second-per-read drive: you're going to have to find and unplug it before the pool will work again.
Re: [zfs-discuss] zero out block / sectors
John Hoogerdijk wrote: Is there a way to zero out unused blocks in a pool? I'm looking for ways to shrink the size of an OpenSolaris VirtualBox VM, and using the compact subcommand will remove zeroed sectors.

Not yet, but this has been discussed here before. It is something I want to look at after I've got encryption support integrated. No promises, but I want it for this and other purposes. -- Darren J Moffat
Re: [zfs-discuss] ZFS Dedup Performance
We're having to split data across multiple pools if we enable dedup, 1+ TB each (one 6x750GB pool is particularly bad). The timeouts cause COMSTAR / iSCSI to fail; Windows clients are dropping the persistent targets due to timeouts (over 15 seconds, it seems). This is causing bigger problems. Disabling dedup is an option, but it shouldn't be *THAT* much load, I wouldn't think. Having it on a cache drive is reasonable; however, if this is required, OpenSolaris should add something like a DDTCacheDevice so we can dedicate a device to it separate from the secondarycache. I'll drop in a 150GB cache drive tonight to see if it improves things. Steve Radich www.BitShop.com -- This message posted from opensolaris.org
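For anyone trying the same experiment, adding an L2ARC cache device is a one-liner. A hedged sketch with made-up pool and device names; substitute your own. The dedup table is metadata, so it can spill into the cache device once the ARC can't hold it:

```shell
# Add a dedicated cache (L2ARC) device to the pool (names are placeholders).
zpool add tank1 cache c7t2d0           # dedicate the new 150GB drive as L2ARC
zpool status tank1                     # the drive appears under a "cache" section
zfs set secondarycache=metadata tank1  # optionally restrict L2ARC to metadata (incl. the DDT)
```

The secondarycache=metadata setting is the closest thing currently available to the "DDTCacheDevice" idea above, though it also keeps other metadata there.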
Re: [zfs-discuss] ZFS Dedup Performance
I should note that trying zfs set primarycache=metadata tank1 took a few minutes. It seems that changing what is cached in RAM should be instant (we don't need to flush the data out of RAM, just stop putting it back in). During this, disk I/O seemed slow, though that could have been unrelated.
Re: [zfs-discuss] zfs send/receive as backup - reliability?
On Wed, Jan 20, 2010 at 08:11:27AM +1300, Ian Collins wrote: True, but I wonder how viable its future is. One of my clients requires 17 LTO4 tapes for a full backup, which cost more and take up more space than the equivalent in removable hard drives.

What kind of removable hard drives are you getting that are cheaper than tape? LTO4 media is less than 2.5 cents/GB for us (before compression, acquisition cost only). -- Darren
Re: [zfs-discuss] hard drive choice, TLER/ERC/CCTL
Thanks for your reply Miles. I think I understand your points, but unfortunately my historical knowledge of the need for TLER etc. solutions is lacking. How I've understood it to be (as generic as possible, but possibly inaccurate as a result):

1. In simple non-RAID, single-drive 'desktop' PC scenarios, if your one drive is experiencing read/write errors, you have no redundant source of data to help with reconstruction/recovery, so you REALLY NEED the drive to try as hard as possible to recover from the error; therefore a long 'deep recovery' process may be kicked off to try to fix/recover the problematic data being read/written.

2. Historically, in hardware RAID arrays, where redundant data *IS* available, you really DON'T want a drive with trivial occasional block read errors to be kicked from the array, so the idea was to have drives experiencing read errors report the problem quickly to the hardware RAID controller, so that the controller can quickly reconstruct the missing data using the redundant parity data.

3. With ZFS, I think you're saying that if, for example, there's a block read error, then even with a RAID EDITION (TLER) drive you're still looking at a circa 7-second delay before the error is reported to ZFS, and if you're using a cheap standard non-RAID edition drive then you're looking at a likely circa 60/70-second delay before ZFS is notified. Either way, you say that ZFS won't kick the drive, yes? And worst case, depending on arbitrary 'unknowns' relating to the particular drive's firmware chemistry/storage stack and to the storage array's responsiveness, 'some time' could be 'mucho time' if you're unlucky.

And to summarise, you don't see any point in spending a high premium on RAID-edition drives if using them with ZFS, yes?
And also, you don't think that using non-RAID edition drives presents a significant additional risk of data loss? Cheers, Simon http://breden.org.uk/2009/05/01/home-fileserver-a-year-in-zfs/
Re: [zfs-discuss] Best 1.5TB drives for consumer RAID?
Well, I've purchased 5 Barracuda LP 1.5TB drives. They run very quiet and cool, 5 in a cage, and vibration is nearly zero. Reliability? Well, every HDD is unreliable, and every major brand has problems at this time, so go for the best bang for the buck. In my country Seagate has the best RMA service, with a turnaround of about 1 week or so; WD is 3-4 weeks. Samsung has no direct RMA service, and Hitachi IMHO has one foot out of the HDD business, with no attractive product at the moment. The enterprise SATA class of HDD is a joke: the same construction as the consumer line, only with longer warranties and a hefty money premium. If you need a real enterprise-class HDD, you want SAS, not SATA.
Re: [zfs-discuss] zfs send/receive as backup - reliability?
On Thu, Jan 21, 2010 at 12:38:56AM +0100, Ragnar Sundblad wrote: On 21 jan 2010, at 00.20, Al Hopper wrote: I remember from about 5 years ago (before LTO-4 days) that streaming tape drives would go to great lengths to ensure that the drive kept streaming - because it took so much time to stop, back up and stream again. And one way the drive firmware accomplished that was to write blocks of zeros when there was no data available.

I haven't seen drives that fill out with zeros. Sounds like an ugly solution, but maybe it could be useful in some strange case.

It was closer to 15 years ago than 5, but this may be a reference to the first release of the DLT 7000. That version came out with only 4MB of RAM buffer, which is insufficient to buffer at speed during a stop/start cycle. It didn't write zeros, but it would disable the on-drive compression to try to keep up the speed of bits being written to tape. So the effect was similar in that the capacity of the media was reduced. The later versions had 8MB buffers and that behavior was removed. -- Darren
Re: [zfs-discuss] Does OpenSolaris mpt driver support LSI 2008 controller
OK, gotcha. Regarding my request for robustness feedback on the other driver, I was in fact referring to the mpt_sas driver that James says is used for the non-RAID LSI SAS2008-based cards like the SuperMicro AOC-USAS2-L8e (as opposed to the RAID-capable AOC-USAS2-L8i / LSI SAS 9211-8i cards, which use the mr_sas driver). As far as I'm aware, the standard mpt driver is used for the card I already own, the LSI SAS1068E-based AOC-USAS-L8i etc. Cheers, Simon http://breden.org.uk/2009/05/01/home-fileserver-a-year-in-zfs/
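If you want to confirm which driver actually binds to a given HBA on your own box rather than going by card model names, a hedged sketch using standard Solaris commands (run in the global zone):

```shell
# Show device tree nodes together with the driver bound to each (-D),
# filtering for the mpt-family HBAs discussed above.
prtconf -D | grep -i mpt
# List the kernel modules currently loaded, to see which of
# mpt / mpt_sas / mr_sas is actually in use.
modinfo | egrep 'mpt|mr_sas'
```

The grep patterns are just illustrative; the authoritative mapping is whatever prtconf -D reports for the HBA's device node.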
Re: [zfs-discuss] zero out block / sectors
On Fri, Jan 22, 2010 at 1:00 PM, John Hoogerdijk john.hoogerd...@sun.com wrote: Is there a way to zero out unused blocks in a pool? I'm looking for ways to shrink the size of an opensolaris virtualbox VM and using the compact subcommand will remove zero'd sectors.

I've long suspected that you should be able to just use mkfile or dd if=/dev/zero ... to create a file that consumes most of the free space, then delete that file. Certainly it is not an ideal solution, but it seems quite likely to be effective. -- Mike Gerdts http://mgerdts.blogspot.com/
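That suggestion can be sketched like this (all names are illustrative, and the count= bound is only there so the example is safe to run; in real use you would drop count= and let dd run until the filesystem fills, i.e. until ENOSPC). One caveat: if compression is enabled on the dataset, ZFS may compress the zeros away and never touch the free blocks on disk, defeating the trick.

```shell
# Zero-fill sketch: write zeros over the free space, then delete the file.
# MNT defaults to a throwaway directory here; inside the VM you would point
# it at a filesystem on the pool and omit count=.
MNT=${MNT:-$(mktemp -d)}
dd if=/dev/zero of="$MNT/zerofill.tmp" bs=1M count=8 2>/dev/null
sync                       # make sure the zero blocks actually hit the disk
rm "$MNT/zerofill.tmp"     # free the space again, now zeroed
# Then, on the VirtualBox host (hypothetical image name):
#   VBoxManage modifyhd opensolaris.vdi --compact
```

The host-side compact step is what actually reclaims the zeroed sectors from the VDI file.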
Re: [zfs-discuss] zfs send/receive as backup - reliability?
A Darren Dunham wrote: On Wed, Jan 20, 2010 at 08:11:27AM +1300, Ian Collins wrote: True, but I wonder how viable its future is. One of my clients requires 17 LT04 types for a full backup, which cost more and takes up more space than the equivalent in removable hard drives. What kind of removable hard drives are you getting that are cheaper than tape? It's not the raw cost per GB, it's the way the tapes are used. To aid recovery times, a number of different backup sets (groups of filesystems) are written, so the tapes aren't all used to capacity. -- Ian.