[zfs-discuss] Re: [osol-help] Adding a new partition to the system
Antonio wrote:
> Hi all,
>
> First of all, let me say that after a few days using it (and after several *years* of using Linux daily), I'm delighted with OpenSolaris 2008.11. It's going to be my OS of choice. The thing is, I installed it in a 16 GB partition on my hard disk, and I'd like to add another partition to the system (I have various other partitions holding Linux, Windows, and other things). So the questions are:
>
> 1. How do I add an existing partition to OpenSolaris? (Should I change the partition type or something? Should I grow ZFS, or mount the extra partition somewhere else?)

Yes. You can create a new zpool from your free/spare partition. I had the same problem: I wanted to use a Linux partition as a mirror. So here is how to do it (a consolidated example follows at the end of this message). Follow this blog - http://blogs.sun.com/pradhap/entry/mount_ntfs_ext2_ext3_in

* install FSWpart and FSWfsmisc
* run prtpart to find out your disk ID
* list the partition device names: prtpart <disk ID> -ldevs
* create a zpool from the Linux partition, e.g. zpool create trunk /dev/dsk/c9d0p3
* check it out: zpool list or zpool status

> 2. Would you please recommend a good introduction to Solaris/OpenSolaris? I'm used to Linux and I'd like to get up to speed with OpenSolaris.

Sure, the OpenSolaris Bible :)
http://blogs.sun.com/observatory/entry/two_more_chapters_from_the

Hope this helps,

Regards,
Jan Hlodan

> Thanks in advance,
> Antonio
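Putting those steps together, a minimal end-to-end sketch (the disk path c9d0p0/p3 and the pool name 'trunk' follow the example above and are only illustrations; your controller/disk numbers will differ):

    $ pfexec prtpart /dev/rdsk/c9d0p0 -ldevs      # map partitions to /dev/dsk device names
    $ pfexec zpool create trunk /dev/dsk/c9d0p3   # build a pool on the Linux partition
    $ zpool list                                  # confirm the pool exists
    $ zpool status trunk                          # and that its device is ONLINE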
Re: [zfs-discuss] ZFS on SAN?
On 14-Feb-09, at 2:40 AM, Andras Spitzer wrote:
> Damon,
>
> Yes, we can provide simple concat inside the array (even though today we provide RAID5 or RAID1 as our standard, using Veritas with concat). The question is more whether it's worth switching the redundancy from the array to the ZFS layer. The RAID5/1 features of the high-end EMC arrays also provide performance improvements; that's why I wonder what the pros and cons of such a switch would be.
>
> So, you're telling me that even if the SAN provides redundancy (HW RAID5 or RAID1), people still configure ZFS with either raidz or mirror?

Without doing so, you don't get the benefit of checksummed self-healing.

--Toby

> Regards,
> sendai
>
> On Sat, Feb 14, 2009 at 6:06 AM, Damon Atkins damon.atk...@_no_spam_yahoo.com.au wrote:
>> Andras,
>> If you can get concat disks or RAID-0 disks inside the array, then use raidz (if the I/O volume is not large, or is mostly sequential); if the I/O load is very high, use a ZFS mirror. You cannot spread a zpool over multiple EMC arrays using SRDF if you are not using EMC PowerPath. HDS, for example, does not support anything other than mirror or RAID5 configurations, so raidz or a ZFS mirror results in a lot of wasted disk space. People still use raidz on HDS RAID5, though, as the top-of-the-line HDS arrays are very fast and they want the features offered by ZFS.
>> Cheers
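To make the self-healing point concrete, a minimal sketch (the LUN names c2t0d0 and c3t0d0 are hypothetical): present two array LUNs to the host and let ZFS mirror them, so a block that fails its checksum on one LUN can be repaired from the copy on the other.

    $ pfexec zpool create tank mirror c2t0d0 c3t0d0   # ZFS-layer redundancy on top of array LUNs
    $ pfexec zpool scrub tank                         # read everything, verify checksums, repair
    $ zpool status tank                               # healed blocks show up in the CKSUM column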
[zfs-discuss] Re: [osol-help] Adding a new partition to the system
Hi Antonio,

did you try to recreate this partition, e.g. with GParted? Maybe something is wrong with this partition. Can you also post what prtpart <disk ID> -ldevs says?

Regards,
Jan Hlodan

Antonio wrote:
> Hi Jan,
>
> I tried out what you suggested a while ago, but zpool fails on pool creation. That is, when I issue zpool create trunk /dev/dsk/c9d0p3, the command fails saying that there's no such file or directory. And the disk is correct!! What I think is that /dev/dsk/c9d0p3 is a symbolic name used by FSWpart, and it's not a valid device name for zpool.
>
> Thanks anyway,
> Antonio
Re: [zfs-discuss] zfs destroy hanging
I think you can kill the destroy command process using traditional methods. Perhaps your slowness issue is because the pool is an older format; I've not had these problems since upgrading to the ZFS version that comes by default with 2008.11.

On Fri, Feb 13, 2009 at 4:14 PM, David Dyer-Bennet d...@dd-b.net wrote:
> This shouldn't be taking anywhere *near* half an hour. The snapshots differ trivially, by one or two files and less than 10k of data (they're test results from working on my backup script). But so far, it's still sitting there after more than half an hour.
>
> local...@fsfs:~/src/bup2# zfs destroy ruin/export
> cannot destroy 'ruin/export': filesystem has children
> use '-r' to destroy the following datasets:
> ruin/export/h...@bup-20090210-202557utc
> ruin/export/h...@20090210-213902utc
> ruin/export/home/local...@first
> ruin/export/home/local...@second
> ruin/export/home/local...@bup-20090210-202557utc
> ruin/export/home/local...@20090210-213902utc
> ruin/export/home/localddb
> ruin/export/home
> local...@fsfs:~/src/bup2# zfs destroy -r ruin/export
>
> It's still hung. Ah, here's zfs list output from shortly before I started the destroy:
>
> ruin                        474G  440G   431G  /backups/ruin
> ruin/export                35.0M  440G    18K  /backups/ruin/export
> ruin/export/home           35.0M  440G    19K  /export/home
> ruin/export/home/localddb    35M  440G  27.8M  /export/home/localddb
>
> As you can see, the ruin/export/home filesystem (and subs) is NOT large. iostat shows no activity on pool ruin over a minute:
>
> local...@fsfs:~$ pfexec zpool iostat ruin 10
>                capacity     operations    bandwidth
> pool         used  avail   read  write   read  write
> ----------  -----  -----  -----  -----  -----  -----
> ruin         474G   454G     10      0  1.13M    840
> ruin         474G   454G      0      0      0      0
> ruin         474G   454G      0      0      0      0
> ruin         474G   454G      0      0      0      0
> ruin         474G   454G      0      0      0      0
> ruin         474G   454G      0      0      0      0
> ruin         474G   454G      0      0      0      0
> ruin         474G   454G      0      0      0      0
> ruin         474G   454G      0      0      0      0
>
> The pool still thinks it is healthy:
>
> local...@fsfs:~$ zpool status -v ruin
>   pool: ruin
>  state: ONLINE
> status: The pool is formatted using an older on-disk format. The pool can
>         still be used, but some features are unavailable.
> action: Upgrade the pool using 'zpool upgrade'. Once this is done, the
>         pool will no longer be accessible on older software versions.
>  scrub: scrub completed after 4h42m with 0 errors on Mon Feb 9 19:10:49 2009
> config:
>
>         NAME      STATE  READ WRITE CKSUM
>         ruin      ONLINE     0     0     0
>           c7t0d0  ONLINE     0     0     0
>
> errors: No known data errors
>
> There is still a process out there trying to run that destroy. It doesn't appear to be using much CPU time:
>
> local...@fsfs:~$ ps -ef | grep zfs
> localddb  7291  7228  0 15:10:56 pts/4  0:00 grep zfs
>     root  7223  7101  0 14:18:27 pts/3  0:00 zfs destroy -r ruin/export
>
> Running 2008.11:
>
> local...@fsfs:~$ uname -a
> SunOS fsfs 5.11 snv_101b i86pc i386 i86pc Solaris
>
> Any suggestions? Eventually I'll kill the process by the gentlest way that works, I suppose (if it doesn't complete).
>
> --
> David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
> Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
> Photos: http://dd-b.net/photography/gallery/
> Dragaera: http://dragaera.info
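Before resorting to kill or a reset, it may be worth capturing stack traces of the stuck process - a sketch, using PID 7223 from the ps output above:

    $ pfexec pstack 7223          # user-level stack of the hung 'zfs destroy'
    $ echo "0t7223::pid2proc | ::walk thread | ::findstack -v" | pfexec mdb -k
                                  # kernel stacks for the process, to see where in ZFS it is blocked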
[zfs-discuss] SSD - slow down with age
A useful article about long-term use of the Intel SSD X25-M: http://www.pcper.com/article.php?aid=669 - "Long-term performance analysis of Intel Mainstream SSDs".

Would a ZFS cache device (ZIL or L2ARC) based on an SSD see this kind of issue? Maybe a periodic scrub via a full-disk erase would be a useful maintenance step.

Nicholas
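For context, this is how an SSD ends up in each of those two roles (the pool and device names are hypothetical); the write-heavy slog and the read-mostly L2ARC stress a drive quite differently, so the article's write-degradation findings would matter mostly for the former:

    $ pfexec zpool add tank log c5t0d0     # separate log device (ZIL) - small, write-intensive
    $ pfexec zpool add tank cache c5t1d0   # L2ARC cache device - large, read-mostly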
[zfs-discuss] Re: [osol-help] Adding a new partition to the system
Antonio wrote:
> I can mount those partitions fine using ext2fs, so I assume I won't need to run GParted at all. This is what prtpart says about my setup.
>
> Kind regards,
> Antonio
>
> r...@antonio:~# prtpart /dev/rdsk/c3d0p0 -ldevs
>
> Fdisk information for device /dev/rdsk/c3d0p0
>
> ** NOTE **
> /dev/dsk/c3d0p0      - Physical device referring to the entire physical disk
> /dev/dsk/c3d0p1 - p4 - Physical devices referring to the 4 primary partitions
> /dev/dsk/c3d0p5 ...  - Virtual devices referring to logical partitions
>
> Virtual device names can be used to access EXT2 and NTFS on logical partitions
>
> /dev/dsk/c3d0p1    Solaris x86
> /dev/dsk/c3d0p2    Solaris x86
> /dev/dsk/c3d0p3    Solaris x86
> /dev/dsk/c3d0p4    DOS Extended
> /dev/dsk/c3d0p5    Linux native
> /dev/dsk/c3d0p6    Linux native
> /dev/dsk/c3d0p7    Linux native
> /dev/dsk/c3d0p8    Linux native
> /dev/dsk/c3d0p9    Linux swap
> /dev/dsk/c3d0p10   Solaris x86

Hi Antonio,

and what does the 'zpool create' command say?

    $ pfexec zpool create test /dev/dsk/c3d0p5

or

    $ pfexec zpool create -f test /dev/dsk/c3d0p5

Regards,
jh
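One check worth making before the zpool create (an assumption on my part, not something from the thread): the earlier 'no such file or directory' error usually means the /dev node is missing rather than the partition being bad, and the p5+ names above are virtual devices provided by the FSW tools, which may not exist as real nodes as far as zpool is concerned.

    $ ls -lL /dev/dsk/c3d0p5 /dev/rdsk/c3d0p5   # do the device nodes actually exist?
    $ pfexec devfsadm -v                        # rebuild /dev links if a node is missing
    $ pfexec zpool create test /dev/dsk/c3d0p5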
Re: [zfs-discuss] zfs destroy hanging
On Sat, February 14, 2009 13:04, Blake wrote:
> I think you can kill the destroy command process using traditional methods.

kill and kill -9 failed. In fact, rebooting failed; I had to use a hard reset (it shut down most of the way, but then got stuck).

> Perhaps your slowness issue is because the pool is an older format. I've not had these problems since upgrading to the ZFS version that comes by default with 2008.11.

We can hope. In case that's the cause, I upgraded the pool format (after considering whether I'd need to access it with older software; hope I was right :-)). The pool did import and scrub cleanly, anyway. That's hopeful.

Also, this particular pool is a scratch pool at the moment, so I'm not risking losing data, only risking losing confidence in ZFS. It's also a USB external disk.

--
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info
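For reference, the upgrade step looks like this (the pool name comes from the thread; note the upgrade is one-way, as the status message above warns):

    $ pfexec zpool upgrade        # no arguments: list pools and their on-disk versions
    $ pfexec zpool upgrade -v     # list every version this software release supports
    $ pfexec zpool upgrade ruin   # upgrade the pool to the current version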
Re: [zfs-discuss] zfs destroy hanging
On Sat, 14 Feb 2009 15:40:04 -0600 (CST), David Dyer-Bennet d...@dd-b.net wrote:
> kill and kill -9 failed. In fact, rebooting failed; I had to use a hard
> reset (it shut down most of the way, but then got stuck). [...]

Hi David,

if this happens to you again, you could help get more data on the problem by capturing a crash dump, either via savecore (if you have a dedicated dump device) or by forcing one:

(dedicated dump device)
    # savecore -L /var/crash/`uname -n`

or

    # reboot -dq

(forced, 64-bit mode)
    # echo '0>rip' | mdb -kw

(forced, 32-bit mode)
    # echo '0>eip' | mdb -kw

Try the command-line options first; only use the mdb kick in the guts if the other two fail. Once you've got the core, you could post the output of ::status and $C when run over the core with mdb -k.

James C. McPherson
--
Senior Kernel Software Engineer, Solaris
Sun Microsystems
http://blogs.sun.com/jmcp    http://www.jmcp.homeunix.com/blog
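A sketch of that last step, once savecore has written the dump files (the .0 suffix is savecore's numbering and will vary):

    # cd /var/crash/`uname -n`
    # mdb unix.0 vmcore.0
    > ::status     # panic summary for this dump
    > $C           # stack backtrace of the panicking thread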
Re: [zfs-discuss] ZFS: unreliable for professional usage?
Hey guys,

I'll let this die in a sec, but I just wanted to say that I went and read the on-disk format document again this morning, and to be honest, Richard, without the description you just wrote I really wouldn't have known that uberblocks are in a 128-entry circular queue that's 4x redundant.

Please understand that I'm not asking for answers to these notes; this post is purely to illustrate to you ZFS guys that, much as I appreciate having the ZFS docs available, they are very tough going for anybody who isn't a ZFS developer. I consider myself well above average in IT ability, and I've spent quite a lot of time in the past year reading around ZFS, but even so I would definitely have come to the wrong conclusion regarding uberblocks. Richard's post I can understand really easily, but in the on-disk format docs that information is spread over seven pages of quite technical detail, and for a user like myself it raises as many questions as it answers:

On page 6 I learn that labels are stored on each vdev, as well as each disk. So there will be a label on the pool, the mirror (or raid group), and the disk. I know the disk labels are at the start and end of the disk, and it sounds like the mirror vdev's are in the same place, but where is the root vdev label? The example given doesn't mention its location at all.

Then, on page 7, it sounds like the entire label is overwritten whenever on-disk data is updated - "any time on-disk data is overwritten, there is potential for error". To me, it sounds like it's not a 128-entry queue, but just a group of 4 labels, all of which are overwritten as data goes to disk.

Then finally, on page 12, the uberblock is mentioned (although, as an aside, the first time I read these docs I had no idea what the uberblock actually was). It does say that only one uberblock is active at a time, but with it being part of the label I'd just assume they were overwritten as a group.

And that's why I'll often throw ideas out - I can either rely on my own limited knowledge of ZFS to say whether something will work, or I can take advantage of the excellent community we have here and post the idea for all to see. It's a quick way for good ideas to be improved upon, and bad ideas consigned to the bin. I've done it before in my rather lengthy 'zfs availability' thread. My thoughts there were thrashed out nicely, with some superb additions (namely the concept of lop-sided mirrors, which I think are a great idea).

Ross

PS. I've also found why I thought you had to search for these blocks: it was after reading this thread, where somebody used mdb to search a corrupt pool to try to recover data:
http://opensolaris.org/jive/message.jspa?messageID=318009

On Fri, Feb 13, 2009 at 11:09 PM, Richard Elling richard.ell...@gmail.com wrote:
> Tim wrote:
>> On Fri, Feb 13, 2009 at 4:21 PM, Bob Friesenhahn bfrie...@simple.dallas.tx.us wrote:
>>> On Fri, 13 Feb 2009, Ross Smith wrote:
>>>> However, I've just had another idea. Since the uberblocks are pretty vital in recovering a pool, and I believe it's a fair bit of work to search the disk to find them, might it be a good idea to allow ZFS to store uberblock locations elsewhere for recovery purposes?
>>> Perhaps it is best to leave decisions on these issues to the ZFS designers who know how things work.
>> Previous descriptions from people who do know how things work didn't make it sound very difficult to find the last 20 uberblocks. It sounded like they were at known points for any given pool.
>>> Those folks have surely tired of this discussion by now and are working on actual code rather than reading idle discussion between several people who don't know the details of how things work.
>> People who don't know how things work often aren't tied down by the baggage of knowing how things work, which leads to creative solutions that those who are weighed down didn't think of. I don't think it hurts in the least to throw out some ideas. If they aren't valid, it's not hard to ignore them and move on. It surely isn't a waste of anyone's time to spend five minutes reading a response and weighing whether the idea is valid.
> OTOH, anyone who followed this discussion the last few times, has looked at the on-disk format documents, or reviewed the source code would know that the uberblocks are kept in a 128-entry circular queue which is 4x redundant, with two copies each at the beginning and end of the vdev. Other metadata is, by default, 2x redundant and spatially diverse. Clearly, the failure mode being hashed out here has resulted in the defeat of those protections. The only real question is how fast Jeff can roll out the feature to allow reverting to previous uberblocks. The procedure for doing this by hand has long been known, and was posted on this forum -- though it is tedious.
> -- richard
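As a footnote to Richard's description, the labels and the active uberblock can be inspected directly with zdb (a sketch; the pool name 'tank' and the device path are hypothetical, and the slice suffix varies with how the pool was built):

    # zdb -l /dev/dsk/c7t0d0s0   # dump the four vdev labels stored on a device
    # zdb -u tank                # print the currently active uberblock of a pool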
Re: [zfs-discuss] ZFS on SAN?
as == Andras Spitzer wsen...@gmail.com writes:

    as> So, you're telling me that even if the SAN provides redundancy
    as> (HW RAID5 or RAID1), people still configure ZFS with either
    as> raidz or mirror?

There's some experience suggesting that, in the case where the storage device or the FC mesh glitches or reboots while the ZFS host stays up across the reboot, you are less likely to lose the whole pool to ``ZFS-8000-72: The pool metadata is corrupted and cannot be opened. Destroy the pool and restore from backup.'' if you have ZFS-level redundancy than if you don't.

Note that this ``corrupt and cannot be opened'' is a different problem from ``not being able to self-heal.'' When you need self-healing and don't have it, you usually shouldn't lose the whole pool. You should get a message in 'zpool status' telling you the name of a file that has unrecoverable errors. Any attempt to read the file returns an I/O error (not the marginal data). Then you have to go delete that file to clear the error, but otherwise the pool keeps working. In this self-heal case, if you'd had ZFS-layer redundancy you'd get a count in the checksum column of one device and wouldn't have to delete the file; in fact, you wouldn't even know the name of the file that got healed.

Some people have been trying to blame the ``corrupt and cannot be opened'' failures on bit-flips supposedly happening inside the storage or the FC cloud - the same kind of bit-flip that causes the other, self-healable problem - but I don't buy it. I think it's probably cache-sync / write-barrier problems that are killing the unredundant pools on SANs.
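One host-side check related to that last point (a sketch; zfs_nocacheflush is the standard OpenSolaris tunable, though an array that acknowledges and then drops a SYNCHRONIZE CACHE is outside the host's control): make sure cache flushes haven't been disabled on the host itself.

    # echo "zfs_nocacheflush/D" | mdb -k   # 0 means ZFS still issues cache-flush commands
    # grep zfs_nocacheflush /etc/system    # look for a persistent override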