Re: [zfs-discuss] Which directories must be part of rpool?
On 26-Sep-09, at 9:56 AM, Frank Middleton wrote:
> On 09/25/09 09:58 PM, David Magda wrote:
>> ... Similar definition for /tmp on Linux FWIW:
> Yes, but unless they fixed it recently (>= RHFC11), Linux doesn't
> actually nuke /tmp, which seems to be mapped to disk. One side effect
> is that (like MSWindows) AFAIK there isn't a native tmpfs, ...

Are you sure about that? My Linux systems do.

http://lxr.linux.no/linux+v2.6.31/Documentation/filesystems/tmpfs.txt

--Toby
Re: [zfs-discuss] extremely slow writes (with good reads)
This controller card, you have turned off any raid functionality, yes? ZFS has total control of all discs, by itself? No hw raid intervening?
Re: [zfs-discuss] raidz failure, trying to recover
Oh, PS: this is on Solaris 5.11 snv_99 - thanks!

liam
[zfs-discuss] raidz failure, trying to recover
Long story short, my cat jumped on my server at my house, crashing two drives at the same time. It was a 7-drive raidz (next time I'll do raidz2). The server crashed complaining about a drive failure, so I rebooted into single-user mode, not realizing that two drives had failed. I put in a new 500 GB replacement and had ZFS start a replace operation, which failed at about 2% because there were two broken drives. At that point I turned off the computer and sent both drives to a data recovery place. They were able to recover the data on one of the two drives (the one that I had started the replace operation on) - great - that should be enough to get my data back.

I popped the newly recovered drive back in. It had an older txg number than the other drives, so I made a backup of each drive and then modified the txg number to an earlier value so they all match. However, I am still unable to mount the array - I'm getting the following error (it doesn't matter if I use -f or -F):

bash-3.2# zpool import data
  pool: data
    id: 6962146434836213226
 state: UNAVAIL
status: One or more devices are missing from the system.
action: The pool cannot be imported. Attach the missing devices and try again.
   see: http://www.sun.com/msg/ZFS-8000-6X
config:

        data          UNAVAIL  missing device
          raidz1      DEGRADED
            c0t0d0    ONLINE
            c0t1d0    ONLINE
            replacing ONLINE
              c0t2d0  ONLINE
              c0t7d0  ONLINE
            c0t3d0    UNAVAIL  cannot open
            c0t4d0    ONLINE
            c0t5d0    ONLINE
            c0t6d0    ONLINE

        Additional devices are known to be part of this pool, though
        their exact configuration cannot be determined.

Now I should have enough online devices to mount and get my data off, but no luck. I'm not really sure where to go at this point. Do I have to fake a c0t3d0 drive so it thinks all drives are there? Can somebody point me in the right direction?

thanks,
liam

p.s. To help me find which uberblocks to modify to reset the txg, I wrote a little Perl program which finds and prints the information needed to revert to an earlier txg value. It's a little messy since I wrote it quickly, super late at night - but maybe it will help somebody else out: http://liam821.com/findUberBlock.txt (it's just a Perl script). It's easy to run. It pulls in 256 KB of data and scans it (or skips the first X KB if you use -s ###) and then searches for uberblocks. (Remember there are four labels: two at the front, at offsets 0 and 256 KB, and two at the end of the disk. You need to manually figure out the end skip value...) Calculating the GUID always seems to fail because the number is too large for Perl, so it returns a negative number. Meh, it wasn't important enough to try to figure out.
(The info below has NOTHING to do with my disk problem above; it's a happy and healthy server that I wrote the tool on.)

- find newest txg number

bash-3.00# /tmp/findUberBlock /dev/dsk/c0t1d0 -n
block=148 (0025000) transaction=15980419

- print verbose output

bash-3.00# /tmp/findUberBlock /dev/dsk/c0t1d0 -n -v
block=148 (0025000)
zfs_ver=3 (0003 )
transaction=15980419 (d783 00f3 )
guid_sum=-14861410676147539 (7aad 2fc9 33a0 ffcb)
timestamp=1253958103 (e1d7 4abd ) (Sat Sep 26 02:41:43 2009)
raw =
0025000 b10c 00ba 0003
0025010 d783 00f3 7aad 2fc9 33a0 ffcb
0025020 e1d7 4abd 0001

- list all uberblocks

bash-3.00# /tmp/findUberBlock /dev/dsk/c0t1d0 -l
block=145 (0024400) transaction=15980288
block=146 (0024800) transaction=15980289
block=147 (0024c00) transaction=15980290
block=148 (0025000) transaction=15980291
block=149 (0025400) transaction=15980292
block=150 (0025800) transaction=15980293
block=151 (0025c00) transaction=15980294
block=152 (0026000) transaction=15980295
block=153 (0026400) transaction=15980296
block=154 (0026800) transaction=15980297
block=155 (0026c00) transaction=15980298
block=156 (0027000) transaction=15980299
block=157 (0027400) transaction=15980300
block=158 (0027800) transaction=15980301
.
.
.

- skip 256 KB into the disk and find the newest uberblock

bash-3.00# /tmp/findUberBlock /dev/dsk/c0t1d0 -n -s 256
block=507 (7ec00) transaction=15980522

Now let's say I want to go back in time on this; the program can help me do that. If I wanted to go back in time to txg 15980450:

bash-3.00# /tmp/findUberBlock /dev/dsk/c0t1d0 -t 15980450
dd if=/dev/zero of=/dev/dsk/c0t1d0 bs=1k oseek=180 count=1 conv=notrunc
dd if=/dev/zero of=/dev/dsk/c0t1d0 bs=1k oseek=181 count=1 conv=notrunc
dd if=/dev/zero of=/dev/dsk/c0t1d0 bs=1k oseek=182 count=1 conv=notrunc
dd if=/dev/zero of=/dev/dsk/c0t1d0 bs=1k oseek=183 count=1 conv=notrunc
dd if=/dev/zero of=/dev/dsk/c0t1d0 bs=1k oseek=184 count=1 conv=notrunc
...
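For anyone trying this at home: before zeroing any uberblocks, it is probably worth backing up the label regions and confirming what zdb reports for each label (a sketch, not part of the original tool; the device name follows the examples above - adjust for your own disks):

# back up the front two labels, L0 and L1 (first 512 KB of the device)
dd if=/dev/dsk/c0t1d0 of=/root/c0t1d0-front-labels.bin bs=1k count=512

# print all four labels; each carries a txg= line you can compare
# against what findUberBlock reports
zdb -l /dev/dsk/c0t1d0s0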
Re: [zfs-discuss] Help! System panic when pool imported
Richard Elling wrote:
> Assertion failures indicate bugs. You might try another version of
> the OS. In general, they are easy to search for in the bugs database.
> A quick search reveals
> http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6822816
> but that doesn't look like it will help you. I suggest filing a new
> bug at the very least.

I have redispatched 6822816, so it needs to be reevaluated since more information is available now.

victor

On Sep 24, 2009, at 10:21 PM, Albert Chin wrote:
> Running snv_114 on an X4100M2 connected to a 6140. Made a clone of a
> snapshot a few days ago:
>   # zfs snapshot a...@b
>   # zfs clone a...@b tank/a
>   # zfs clone a...@b tank/b
> The system started panicking after I tried:
>   # zfs snapshot tank/b...@backup
> So, I destroyed tank/b:
>   # zfs destroy tank/b
> then tried to destroy tank/a:
>   # zfs destroy tank/a
> Now, the system is in an endless panic loop, unable to import the
> pool at system startup or with zpool import. The panic dump is:
>
> panic[cpu1]/thread=ff0010246c60: assertion failed:
>   0 == zap_remove_int(mos, ds_prev->ds_phys->ds_next_clones_obj, obj, tx)
>   (0x0 == 0x2), file: ../../common/fs/zfs/dsl_dataset.c, line: 1512
>
> ff00102468d0 genunix:assfail3+c1 ()
> ff0010246a50 zfs:dsl_dataset_destroy_sync+85a ()
> ff0010246aa0 zfs:dsl_sync_task_group_sync+eb ()
> ff0010246b10 zfs:dsl_pool_sync+196 ()
> ff0010246ba0 zfs:spa_sync+32a ()
> ff0010246c40 zfs:txg_sync_thread+265 ()
> ff0010246c50 unix:thread_start+8 ()
>
> We really need to import this pool. Is there a way around this? We do
> have snv_114 source on the system if we need to make changes to
> usr/src/uts/common/fs/zfs/dsl_dataset.c. It seems like the zfs destroy
> transaction never completed and it is being replayed, causing the
> panic. This cycle continues endlessly.
>
> --
> albert chin (ch...@thewrittenword.com)
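One workaround that has come up on this list for panic-on-import loops (a sketch, not verified against snv_114 - treat the tunables as an assumption, and use them only long enough to rescue the data) is to relax the failing assertions via /etc/system and retry the import:

# append to /etc/system, then reboot
set zfs:zfs_recover = 1
set aok = 1

# after the reboot, attempt the import again and copy the data off
# ('tank' is the pool name from the post above)
zpool import tank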
Re: [zfs-discuss] Which directories must be part of rpool?
On 09/25/09 09:58 PM, David Magda wrote:
> The contents of /var/tmp can be expected to survive between boots
> (e.g., /var/tmp/vi.recover); /tmp is nuked on power cycles (because
> it's just memory/swap):

Yes, but does mapping it to /tmp have any issues regarding booting or image-update in the context of this thread? IMO nuking is a good thing - /tmp and /var/tmp get really cluttered up after a few months, the downside of robust hardware and software :-). Not sure I really care about recovering vi edits in the case of UPS failure...

> If a program is creating and deleting large numbers of files, and
> those files aren't needed between reboots, then it really should be
> using /tmp.

Quite. But some lazy programmer of 3rd-party software decided to use the default tmpnam() function, and I don't have access to the code :-(.

> tmpnam()
> The tmpnam() function always generates a file name using the path
> prefix defined as P_tmpdir in the stdio.h header. On Solaris systems,
> the default value for P_tmpdir is /var/tmp.

The Linux man page has a similar definition, but with /tmp. FWIW: Yes, but unless they fixed it recently (>= RHFC11), Linux doesn't actually nuke /tmp, which seems to be mapped to disk. One side effect is that (like MSWindows) AFAIK there isn't a native tmpfs, so programs that create and destroy large numbers of files run orders of magnitude slower there than on Solaris - assuming the application doesn't use /var/tmp for them :-). Compilers and code generators are typical of applications that do this, though they don't usually do synchronous I/O as said programmer appears to have done. I suppose /var/tmp on ZFS would never actually write these files unless they were written synchronously.

In the context of this thread, for those of us with space-constrained boot disks/SSDs, is it OK to map /var/tmp to /tmp, and /var/crash, /var/dump, and swap to a separate data pool, in the context of being able to reboot and install new images? I've been doing so for a long time now with no problems that I know of. Just wondering what the gurus think...

Haven't seen any definitive response regarding /opt, which IMO should be a good candidate since the installer makes it a separate fs anyway. /usr/local can definitely be kept on a separate pool. I wouldn't move /root. I keep a separate /export/home/root and have root cd to it via a script in /root that also sets HOME, although I noticed on snv123 that logging on as root succeeded even though it couldn't find bash (it defaulted to using sh). This may be a snv123 bug, but it is a huge improvement on past behavior. I daresay logging on as root might also work if root's home directory was awol. Haven't tried it...

Cheers -- Frank
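A sketch of the /var/tmp-to-/tmp mapping discussed above, assuming the stock Solaris vfstab syntax (clear out /var/tmp first, since anything left there will be hidden by the mount, and tmpfs contents vanish on reboot):

# /etc/vfstab - literally map /var/tmp onto /tmp with a loopback mount
/tmp  -  /var/tmp  lofs  -  yes  -

# ...or give /var/tmp its own tmpfs instead:
# swap  -  /var/tmp  tmpfs  -  yes  -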
Re: [zfs-discuss] raidz failure, trying to recover
On second thought - I used zdb -l to look at each device, and it looks like my dd didn't have the desired effect. I'm still showing a newer txg number for all of my drives except c0t2d0 (the replacement drive that the recovery place fixed). (This is probably why it won't mount, eh?) Is there anything else I need to do to roll back each drive so the txg numbers match, other than dd'ing over each newer txg entry on the disk? (I booted OpenSolaris 2009.06, which I've read is a little more forgiving.)

root@opensolaris:~# zdb -l /dev/dsk/c7t0d0s0 | more
LABEL 0
    version=13
    name='data'
    state=0
    txg=778014
    pool_guid=6962146434836213226
    hostid=63246693
    hostname='media'
    top_guid=18396265026227018612
    guid=6801152981449012737

root@opensolaris:~# zdb -l /dev/dsk/c7t1d0s0 | more
LABEL 0
    version=13
    name='data'
    state=0
    txg=778014
    pool_guid=6962146434836213226
    hostid=63246693
    hostname='media'
    top_guid=18396265026227018612
    guid=7077979893178320090

root@opensolaris:~# zdb -l /dev/dsk/c7t2d0s0 | more
LABEL 0
    version=13
    name='data'
    state=0
    txg=777842
    pool_guid=6962146434836213226
    hostid=63246693
    hostname='media'
    top_guid=18396265026227018612
    guid=7489495842431367457

r...@opensolaris:~# zpool status
  pool: data
 state: FAULTED
status: One or more devices could not be opened. There are insufficient
        replicas for the pool to continue functioning.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-3C
 scrub: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        data          FAULTED      0     0     1  corrupted data
          raidz1      DEGRADED     0     0     6
            c7t0d0    ONLINE       0     0     0
            c7t1d0    ONLINE       0     0     0
            replacing ONLINE       0     0     0
              c7t2d0  ONLINE       0     0     0
              c7t7d0  ONLINE       0     0     0
            c0t3d0    UNAVAIL      0     0     0  cannot open
            c7t4d0    ONLINE       0     0     0
            c7t5d0    ONLINE       0     0     0
            c7t6d0    ONLINE       0     0     0

thanks,
liam
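One thing worth checking (a suggestion, not from Liam's output above, which "| more" cuts off after LABEL 0): zdb -l prints four labels per device - two at the front and two at the end of the disk - and all four need to agree after a rollback. A quick loop over the pool's reachable disks (skipping the unavailable c0t3d0) makes the comparison easy:

# list every label's txg on each disk; they should all match
for d in c7t0d0 c7t1d0 c7t2d0 c7t4d0 c7t5d0 c7t6d0 c7t7d0; do
  echo "== $d =="
  zdb -l /dev/dsk/${d}s0 | grep -E 'LABEL|txg='
done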
Re: [zfs-discuss] Best way to convert checksums
I had this same question and was recommended to use rsync or zfs send; I used both, just to be safe. With zfs send, you create a snapshot and then send the snapshot; after deleting the snapshot on the target, you have identical copies.
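A minimal sketch of the zfs send approach described above (pool and filesystem names are placeholders; you may also want -R on the send to carry child datasets and snapshots across):

# snapshot the source and replicate it to the new pool
zfs snapshot oldpool/fs@migrate
zfs send oldpool/fs@migrate | zfs receive newpool/fs

# once you're satisfied with the copy, drop the snapshot on the target
zfs destroy newpool/fs@migrate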
[zfs-discuss] Problem: ZFS partition rewritten, how to recover data???
I had a ZFS partition, around 1.37 TB, created with zfs113 for Mac. Then, under FreeBSD 7.2, following a guide on the wiki, I ran 'zpool create trunk' and ended up overwriting the partition. Now the question is: how do I recover the partition, or recover the data from it? Thanks
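A first diagnostic step (a suggestion, not from the original post; the device name is a placeholder for wherever the partition lives under FreeBSD) is to see whether any of the old pool's labels survived the new zpool create, since ZFS keeps two labels at each end of the device:

# inspect the labels on the partition; old pool names/guids may remain
zdb -l /dev/ad0s1d

# also check whether the old pool shows up as destroyed but importable
zpool import -D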
Re: [zfs-discuss] raidz failure, trying to recover
Have you considered buying support? Maybe you will get guaranteed help then.
[zfs-discuss] Borked zpool, missing slog/zil
Hmmm - this is an annoying one. I'm currently running an OpenSolaris install (2008.11 upgraded to 2009.06):

SunOS shemhazai 5.11 snv_111b i86pc i386 i86pc Solaris

with a zpool made up of one raidz vdev and a small ramdisk-based ZIL. I usually swap out the ZIL for a file-based copy when I need to reboot (zpool replace /dev/ramdisk/slog /root/slog.tmp), but this time I had a brain fart and forgot to. The server came back up and I could sort of work on the zpool, but it was complaining, so I did my replace command and it happily resilvered. Then I restarted one more time in order to test bringing everything up cleanly, and this time it can't find the file-based ZIL. I try importing and it comes back with:

zpool import
  pool: siovale
    id: 13808783103733022257
 state: UNAVAIL
status: One or more devices are missing from the system.
action: The pool cannot be imported. Attach the missing devices and try again.
   see: http://www.sun.com/msg/ZFS-8000-6X
config:

        siovale   UNAVAIL  missing device
          raidz1  ONLINE
            c8d0  ONLINE
            c9d0  ONLINE
            c10d0 ONLINE
            c11d0 ONLINE

        Additional devices are known to be part of this pool, though
        their exact configuration cannot be determined.

Now, the file still exists, so I don't know why it can't seem to find it, and I thought the missing-ZIL issue was corrected in this version (or did I miss something?). I've looked around for solutions to bring it back online and ran across this method:

http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg16545.html

but before I jump in on this one, I was hoping there was a newer, cleaner approach that I missed somehow. Ideas appreciated...

Erik
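For reference, a sketch of the swap-before-reboot routine described above (the pool name comes from the import output; the 512 MB size and the ramdisk name are assumptions):

# before reboot: move the slog from the ramdisk to a file
mkfile 512m /root/slog.tmp
zpool replace siovale /dev/ramdisk/slog /root/slog.tmp

# after reboot: recreate the ramdisk and move the slog back
ramdiskadm -a slog 512m
zpool replace siovale /root/slog.tmp /dev/ramdisk/slog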
Re: [zfs-discuss] Which directories must be part of rpool?
On 09/26/09 12:11 PM, Toby Thain wrote:
>> Yes, but unless they fixed it recently (>= RHFC11), Linux doesn't
>> actually nuke /tmp, which seems to be mapped to disk. One side effect
>> is that (like MSWindows) AFAIK there isn't a native tmpfs, ...
>
> Are you sure about that? My Linux systems do.
> http://lxr.linux.no/linux+v2.6.31/Documentation/filesystems/tmpfs.txt

OK, so you can mount a tmpfs on /tmp and /var/tmp, but that's not the default, at least as of RHFC10. I have files in /tmp going back to Feb 2008 :-). Evidently, quoting Wikipedia, tmpfs "is supported by the Linux kernel from version 2.4 and up" (http://en.wikipedia.org/wiki/TMPFS), and FC1 was 6 years ago. Solaris /tmp has been a tmpfs since 1990... Now back to the thread...
Re: [zfs-discuss] extremely slow writes (with good reads)
> This controller card, you have turned off any raid functionality,
> yes? ZFS has total control of all discs, by itself? No hw raid
> intervening?

Yes, it's an LSI 150-6 with the BIOS turned off, which turns it into a dumb SATA card.

Paul
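A quick sanity check along these lines (a suggestion, not from the thread; 'tank' is a placeholder pool name): with the controller's RAID BIOS disabled, each physical disk should show up as its own device rather than as a single hardware-RAID volume.

format </dev/null    # lists every disk the OS can see
iostat -En           # per-device details; vendor/model should be the
                     # bare drives, not a RAID controller volume
zpool status tank    # confirm the pool is built on individual disks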
Re: [zfs-discuss] ZFS pool replace single disk with raidz
Alas, you need the fix for http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=4852783. Until that arrives, mirror the disk or rebuild the pool.

--chris
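A sketch of the mirror alternative suggested above (pool and device names are placeholders): attaching a second disk to the existing single-disk vdev converts it into a two-way mirror, which at least buys redundancy while waiting for raidz conversion support.

zpool attach mypool c0t0d0 c0t1d0   # c0t0d0 is the existing disk
zpool status mypool                 # watch the resilver complete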
Re: [zfs-discuss] Which directories must be part of rpool?
Frank Middleton wrote:
> I suppose /var/tmp on ZFS would never actually write these files
> unless they were written synchronously. In the context of this
> thread, for those of us with space-constrained boot disks/SSDs, is it
> OK to map /var/tmp to /tmp, and /var/crash, /var/dump, and swap to a
> separate data pool, in the context of being able to reboot and
> install new images? I've been doing so for a long time now with no
> problems that I know of. Just wondering what the gurus think...

Moving /var/tmp works OK. I had a system root pool on a CF card and moved busy filesystems off to another pool. I'm not sure which filesystem caused the problem, but this system was impossible to live upgrade. Swap and dump are volumes, so they can be anywhere (they both have commands to add/remove devices).

> Haven't seen any definitive response regarding /opt, which IMO should
> be a good candidate since the installer makes it a separate fs anyway.

Most of /opt can be relocated, but as I said, I was unable to live upgrade the box. I only moved staroffice, and then created filesystems with mountpoints in /opt before adding applications that install to /opt.

See http://www.sun.com/bigadmin/features/articles/nvm_boot.jsp

--
Ian.
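Since swap and dump are ZVOLs, relocating them looks roughly like this (a sketch; pool names and sizes are illustrative, and the old root-pool volume name assumes the installer default):

# create a swap volume on the data pool and switch over to it
zfs create -V 4G datapool/swap
swap -a /dev/zvol/dsk/datapool/swap
swap -d /dev/zvol/dsk/rpool/swap    # then update /etc/vfstab to match

# create a dump volume on the data pool and repoint the dump device
zfs create -V 2G datapool/dump
dumpadm -d /dev/zvol/dsk/datapool/dump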
Re: [zfs-discuss] Which directories must be part of rpool?
On 09/26/09 05:25 PM, Ian Collins wrote:
> Most of /opt can be relocated

There isn't much in there on a vanilla install (x86 snv111b):

# ls /opt
DTT  SUNWmlib

> http://www.sun.com/bigadmin/features/articles/nvm_boot.jsp

You pretty much answered the OP with this link. Thanks for posting it!

Cheers -- Frank
Re: [zfs-discuss] Which directories must be part of rpool?
On 26-Sep-09, at 2:55 PM, Frank Middleton wrote:
> On 09/26/09 12:11 PM, Toby Thain wrote:
>>> Yes, but unless they fixed it recently (>= RHFC11), Linux doesn't
>>> actually nuke /tmp, which seems to be mapped to disk. One side
>>> effect is that (like MSWindows) AFAIK there isn't a native
>>> tmpfs, ...
>>
>> Are you sure about that? My Linux systems do.
>> http://lxr.linux.no/linux+v2.6.31/Documentation/filesystems/tmpfs.txt
>
> OK, so you can mount a tmpfs on /tmp and /var/tmp, but that's not the
> default,

It has long been the default in Gentoo. This system in particular was installed in 2004.

> at least as of RHFC10. I have files in /tmp going back to Feb 2008
> :-). Evidently, quoting Wikipedia, tmpfs "is supported by the Linux
> kernel from version 2.4 and up" (http://en.wikipedia.org/wiki/TMPFS),
> and FC1 was 6 years ago. Solaris /tmp has been a tmpfs since 1990...

The question wasn't who was first.

--Toby

> Now back to the thread...
Re: [zfs-discuss] Borked zpool, missing slog/zil
Do you have a backup copy of your zpool.cache file? If you have that file, ZFS will happily mount a pool on boot without its slog device - it'll just flag the slog as FAULTED, and you can do your normal replace. I used that for a long while on a test server with a ramdisk slog, and I never needed to swap it out for a file-based slog. However, without a backup of that file to make ZFS load the pool on boot, I don't believe there is any way to import that pool.
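In concrete terms, the precaution described above might look like this (a sketch; the path of the backup copy is arbitrary):

# while the pool and its slog are healthy, stash a copy of the cache
cp /etc/zfs/zpool.cache /root/zpool.cache.good

# recovery: put the known-good copy back and reboot; ZFS should bring
# the pool up with the slog marked FAULTED, ready for 'zpool replace'
cp /root/zpool.cache.good /etc/zfs/zpool.cache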