Re: [zfs-discuss] Deleting large amounts of files
On Tue, Jul 20, 2010 at 1:40 PM, Ulrich Graef ulrich.gr...@oracle.com wrote:
> When you are writing to a file and dedup is currently enabled, the data is entered into the dedup table of the pool. (There is one dedup table per pool, not per zfs.) Switching off dedup does not change this data.

Yes, I suppose so (just as enabling dedup or compression doesn't alter on-disk data).

> After switching off dedup, the dedup table is used until this file is deleted or overwritten. Deleting or overwriting then accesses the dedup table and corrects the reference count.

Is there a way to see which files are using dedup? Or should I just copy everything to a new ZFS?
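There is, as far as I can tell, no per-file dedup report: the DDT tracks blocks, not files. What can be checked is the pool-level DDT, and a file rewritten while dedup is off no longer references the table. A rough sketch (commands assumed, not from the thread; the pool name tera is taken from the later posts):

  # Pool-wide dedup ratio; 1.00x suggests little or nothing is still deduped
  zpool get dedupratio tera

  # DDT histogram (entry counts and on-disk/in-core sizes)
  zdb -DD tera

  # With dedup=off, copy-and-rename rewrites a file without DDT references
  cp bigfile bigfile.tmp && mv bigfile.tmp bigfile

zfs send/recv into a new filesystem achieves the same thing for a whole dataset.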
Re: [zfs-discuss] Deleting large amounts of files
Hi, thanks for answering.

> How large is your ARC / your main memory? Probably too small to hold all metadata (1/1000 of the data amount). => metadata has to be read again and again.

Main memory is 8GB. ARC (according to arcstat.pl) usually stays at 5-7GB.

> A recordsize smaller than 128k increases the problem.

recordsize is the default, 128k.

> It's a data volume, perhaps raidz or raidz2, and you are using an older ZPOOL version?

It's raidz, pool version is 22.

> Reading is done for the whole raidz stripe when you are reading a block. => the whole raidz stripe has the attributes of a single disk (see Roch's blog).
> The number of files is not specified.

Some 20 files were deleted, each about 4GB in size.

> Updating the dedup table needs random access of the table.

dedup was enabled at some point, but I disabled it long ago. Does it still matter? Should I copy all these files again (or zfs send) to un-dedup those blocks?

> ~60 reads per second is normal for a SATA disk with 7200 RPM.

Shouldn't ~60 reads per second at about 128k (not counting prefetch) be about 7MB/s, instead of the 144KB/s (!) I'm getting?

> So far nothing surprising...
> Regards, Ulrich

----- Original Message -----
From: drge...@gmail.com
To: zfs-discuss@opensolaris.org
Sent: Monday, July 19, 2010 5:14:03 PM GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna
Subject: [zfs-discuss] Deleting large amounts of files

Hello, I think this is the second time this has happened to me. A couple of years ago, I deleted a big (500G) zvol and the machine started to hang some 20 minutes later (out of memory); even rebooting didn't help. But with the great support from Victor Latushkin, who on a weekend helped me debug the problem (abort the transaction and restart it again, which required some black magic and recompiling of ZFS), it worked out.

Now I'm facing a similar problem. I was writing about 20GB (from CIFS) to a filesystem. While that was going on, I deleted some old files, freeing up about 60GB in the process. After Windows was done deleting those (it was instant), I tried to delete another file, which I didn't have permission to. So I SSHed to the machine and removed it manually (pfexec rm file). And that's where the problems started.

First, I noticed the rm wasn't instant. It was taking long (over 5 minutes). I tried Ctrl-C, Ctrl-Z, another SSH and kill; nothing worked. After a while it died with "killed". I did a zfs list and noticed the free space wasn't updated. I tried sync; it also hangs. I try a reboot - it won't, I guess it's waiting for the sync to finish. So I hard-reboot the machine. When it comes back I can access the ZFS pool again. I go to the directory where I tried to delete the files with rm: the files are still there (they weren't before the reboot). I try a sync again. Same result (hang). top shows a decreasing amount of free memory. zpool iostat 5 shows:

rpool   69.4G  79.6G      0      0      0      0
tera    3.12T   513G     63      0   144K      0
------  -----  -----  -----  -----  -----  -----
rpool   69.4G  79.6G      0      0      0      0
tera    3.12T   513G     63      0   142K      0
------  -----  -----  -----  -----  -----  -----
rpool   69.4G  79.6G      0      0      0      0
tera    3.12T   513G     62      0   142K      0
------  -----  -----  -----  -----  -----  -----
rpool   69.4G  79.6G      0      0      0      0
tera    3.12T   513G     64      0   144K      0
------  -----  -----  -----  -----  -----  -----
rpool   69.4G  79.6G      0      0      0      0
tera    3.12T   513G     65      0   148K      0

Could this be related to the fact that I THINK I enabled deduplication on this pool a while ago (but then disabled it for performance reasons)? What should I do? Do I have to wait for these reads to finish? Why are they so slow anyway?
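One way to tell whether leftover DDT entries, rather than ordinary metadata, are what the deletes are grinding through is to look at the ARC and the dedup table directly. A sketch, again assuming pool tera (these commands are not from the thread):

  # ARC size, target (c) and metadata usage from the live kernel
  echo "::arc" | mdb -k

  # Same counters via kstat
  kstat -m zfs -n arcstats

  # DDT histogram; total entries times roughly 300 bytes in core
  # approximates the RAM needed to keep the dedup table cached
  zdb -DD tera

On the throughput question: if most of those ~60 reads per second are small DDT or metadata blocks of a few hundred bytes to a few KB, rather than full 128k records, the resulting bandwidth lands in the 100-150KB/s range the iostat shows.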
Thanks, Hernan
[zfs-discuss] problems with ludelete
Hi, I'm not sure if this is the right place to ask. I'm having a little trouble deleting old Solaris installs:

[EMAIL PROTECTED]:~]# lustatus
Boot Environment           Is       Active Active    Can    Copy
Name                       Complete Now    On Reboot Delete Status
-------------------------- -------- ------ --------- ------ ----------
b90                        yes      no     no        yes    -
snv95                      yes      no     no        yes    -
snv101                     yes      yes    yes       no     -

[EMAIL PROTECTED]:~]# lu
lu          lucancel    lucreate    ludelete    lufslist    lumount
lustatus    luupgrade   luactivate  lucompare   lucurr      ludesc
lumake      lurename    luumount    luxadm

[EMAIL PROTECTED]:~]# lustatus
Boot Environment           Is       Active Active    Can    Copy
Name                       Complete Now    On Reboot Delete Status
-------------------------- -------- ------ --------- ------ ----------
b90                        yes      no     no        yes    -
snv95                      yes      no     no        yes    -
snv101                     yes      yes    yes       no     -

[EMAIL PROTECTED]:~]# ludelete b90
System has findroot enabled GRUB
Checking if last BE on any disk...
ERROR: lulib_umount: failed to umount BE: snv95.
ERROR: This boot environment b90 is the last BE on the above disk.
ERROR: Deleting this BE may make it impossible to boot from this disk.
ERROR: However you may still boot solaris if you have BE(s) on other disks.
ERROR: You *may* have to change boot-device order in the BIOS to accomplish this.
ERROR: If you still want to delete this BE b90, please use the force option (-f).
Unable to delete boot environment.

[EMAIL PROTECTED]:~]# ludelete snv95
System has findroot enabled GRUB
Checking if last BE on any disk...
ERROR: lulib_umount: failed to umount BE: snv95.
ERROR: This boot environment snv95 is the last BE on the above disk.
ERROR: Deleting this BE may make it impossible to boot from this disk.
ERROR: However you may still boot solaris if you have BE(s) on other disks.
ERROR: You *may* have to change boot-device order in the BIOS to accomplish this.
ERROR: If you still want to delete this BE snv95, please use the force option (-f).
Unable to delete boot environment.

If anyone could help me I'd appreciate it.

Thanks,
Hernan
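Speculative, but the first ERROR line suggests lulib_umount is tripping over a stale snv95 mount rather than anything specific to b90. A sketch of what the messages themselves point at (verify the BE state with lustatus/lufslist before forcing anything):

  # Clear any stale mount of the snv95 BE first
  luumount snv95

  # The error text offers -f as the escape hatch if the last-BE check still misfires
  ludelete -f b90
  ludelete -f snv95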
Re: [zfs-discuss] can anyone help me?
No, weird situation. I unplugged the disks from the controller (I have them labeled) before upgrading to snv89. After the upgrade, the controller names changed.
Re: [zfs-discuss] can anyone help me?
Thanks for your answer.

> After looking at your posts my suggestion would be to try the OpenSolaris 2008.05 Live CD and to import your pool using the CD. That CD is nv86 + some extra fixes.

I upgraded snv85 to snv89 to see if it helped, but it didn't. I'll try to download the 2008.05 CD again (the ISO for that is one of the things trapped in the pool I can't import).

> But an upgrade from Sol10 to NV is untested and nothing I would recommend at all. A fresh install of snvXY is what I know works.

Didn't know that. I was simply following the N+2 rule, upgrading 10 to 11.
Re: [zfs-discuss] can anyone help me? [SOLVED]
Well, I finally managed to solve my issue, thanks to the invaluable help of Victor Latushkin, whom I can't thank enough. I'll post a more detailed step-by-step record of what he and I did (well, all credit to him actually) to solve this.

Actually, the problem is still there (destroying a huge zvol or clone is slow, takes a LOT of memory, and will die when it runs out of memory), but now I'm able to import my zpool and everything is there. What Victor did was hack ZFS (libzfs) to force a rollback to abort the endless destroy, which was re-triggered every time the zpool was imported, as it was inconsistent. With this custom version of libzfs, setting an environment variable makes libzfs bypass the destroy and jump to rollback, undoing the last destroy command. I'll be posting the long version of the story soon.

Hernán
Re: [zfs-discuss] can anyone help me?
I'll provide you with the results of these commands soon. But for the record, Solaris does hang (it dies out of memory, I can't type anything on the console, etc.). What I can do is boot with -k and get to kmdb when it's hung (BREAK over the serial line). I have a crash dump I can upload.

I checked the disks with the drive manufacturers' tests and found no errors. The controller is an NForce4 SATA on-board. zpool version is the latest (10). The non-default settings were removed; those were only for testing. No other non-default eeprom settings (other than the serial console options, but those were added after the problem started).
Re: [zfs-discuss] can anyone help me?
Here's the output. Numbers may be a little off because I'm doing a nightly build and compressing a crash dump with bzip2 at the same time.

                     extended device statistics
   r/s    w/s   Mr/s   Mw/s  wait  actv  wsvc_t  asvc_t  %w  %b  device
   3.7   19.4    0.1    0.3   3.3   0.0   142.7     1.6   1   3  c0d0
   0.0    0.0    0.0    0.0   0.0   0.0     0.0     0.0   0   0  c0t1d0
   0.0    0.0    0.0    0.0   0.0   0.0     0.1    12.6   0   0  c5t0d0
   0.0    0.0    0.0    0.0   0.0   0.0     0.1    13.0   0   0  c5t1d0
   0.0    0.0    0.0    0.0   0.0   0.0     0.1    12.6   0   0  c6t0d0
   0.0    0.0    0.0    0.0   0.0   0.0     0.1    13.4   0   0  c6t1d0
                     extended device statistics
   r/s    w/s   Mr/s   Mw/s  wait  actv  wsvc_t  asvc_t  %w  %b  device
  25.9   12.0    1.3    0.3   0.0   0.2     0.0     4.4   0  14  c0d0
   0.0    0.0    0.0    0.0   0.0   0.0     0.0     0.0   0   0  c0t1d0
  75.2    0.0   75.2    0.0   0.0   1.0     0.1    12.7   0  96  c5t0d0
  68.2    0.0   68.2    0.0   0.0   0.9     0.1    13.1   0  89  c5t1d0
  71.7    0.0   71.7    0.0   0.0   0.9     0.1    13.1   0  94  c6t0d0
  62.8    0.0   62.8    0.0   0.0   0.9     0.1    14.0   0  88  c6t1d0
                     extended device statistics
   r/s    w/s   Mr/s   Mw/s  wait  actv  wsvc_t  asvc_t  %w  %b  device
  24.0   16.0    0.6    0.3   0.0   0.0     0.1     0.8   0   3  c0d0
   0.0    0.0    0.0    0.0   0.0   0.0     0.0     0.0   0   0  c0t1d0
  65.5    0.0   65.5    0.0   0.0   0.9     0.1    14.2   0  93  c5t0d0
  59.0    0.0   59.0    0.0   0.0   0.9     0.1    14.9   0  88  c5t1d0
  67.5    0.0   67.5    0.0   0.0   0.9     0.1    13.2   0  89  c6t0d0
  66.5    0.0   66.5    0.0   0.0   0.9     0.1    14.0   0  93  c6t1d0
                     extended device statistics
   r/s    w/s   Mr/s   Mw/s  wait  actv  wsvc_t  asvc_t  %w  %b  device
  47.0   15.5    0.8    0.2   0.1   0.1     1.9     1.6   3   5  c0d0
   0.0    0.0    0.0    0.0   0.0   0.0     0.0     0.0   0   0  c0t1d0
  55.5    0.0   55.5    0.0   0.0   0.8     0.1    14.5   0  80  c5t0d0
  73.0    0.0   73.0    0.0   0.0   1.0     0.1    13.2   0  96  c5t1d0
  72.5    0.0   72.5    0.0   0.0   1.0     0.1    13.3   0  96  c6t0d0
  68.0    0.0   68.0    0.0   0.0   1.0     0.1    14.3   0  97  c6t1d0
                     extended device statistics
   r/s    w/s   Mr/s   Mw/s  wait  actv  wsvc_t  asvc_t  %w  %b  device
   0.0    9.5    0.0    0.2   0.0   0.0     0.0     0.3   0   0  c0d0
   0.0    0.0    0.0    0.0   0.0   0.0     0.0     0.0   0   0  c0t1d0
  65.0    0.0   65.0    0.0   0.0   0.9     0.1    14.5   0  94  c5t0d0
  73.5    0.0   73.5    0.0   0.0   0.9     0.1    12.8   0  94  c5t1d0
  75.0    0.0   75.0    0.0   0.0   0.9     0.1    11.8   0  89  c6t0d0
  68.5    0.0   68.5    0.0   0.0   0.9     0.1    13.9   0  95  c6t1d0
[zfs-discuss] can anyone help me?
Seriously, can anyone help me? I've been asking for a week. No relevant answers; just a couple of replies, but none solved my problem or even pointed me in the right direction, and my posts were bumped down into oblivion. I don't know how else to ask. My home server has been offline for over a week now because of a ZFS issue. Please, can anyone help me? I refuse to believe that "the world's most advanced filesystem" is so fragile that a simple, textbook administration command can render it useless.
Re: [zfs-discuss] can anyone help me?
FWIW, here are my previous posts:

http://www.opensolaris.org/jive/thread.jspa?threadID=61301&tstart=30
http://www.opensolaris.org/jive/thread.jspa?threadID=62120&tstart=0
[zfs-discuss] Dtracing ZFS/ZIL
Hello. I'm still having problems with my array. It's been replaying the ZIL (I think) for a week now and it hasn't finished. Now I don't know if it will ever finish: is it starting from scratch every time? I'm dtracing the ZIL and this is what I get:

  0  46882            dsl_pool_zil_clean:return
  0  46881            dsl_pool_zil_clean:entry
  0  46882            dsl_pool_zil_clean:return
  0  46881            dsl_pool_zil_clean:entry
  0  46882            dsl_pool_zil_clean:return
  0  46881            dsl_pool_zil_clean:entry
  0  46882            dsl_pool_zil_clean:return
  0  46881            dsl_pool_zil_clean:entry
  0  46882            dsl_pool_zil_clean:return
  0  46881            dsl_pool_zil_clean:entry
  0  46882            dsl_pool_zil_clean:return
  0  46881            dsl_pool_zil_clean:entry
  0  46882            dsl_pool_zil_clean:return
  0  46881            dsl_pool_zil_clean:entry
  0  46882            dsl_pool_zil_clean:return
  0  46881            dsl_pool_zil_clean:entry
  0  46882            dsl_pool_zil_clean:return

Does this mean that the ZIL is being updated? Or am I starting over from scratch every time it reboots? (Remember that I'm rebooting every 15 minutes, because otherwise the machine hangs when it runs out of memory.)

Hernan
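For reference, output like the above can come from a plain fbt one-liner; aggregating the calls is less noisy than printing every probe firing. A sketch (probe names assumed to exist on this build):

  # Count dsl_pool_zil_clean calls instead of printing every entry/return
  dtrace -n 'fbt:zfs:dsl_pool_zil_clean:entry { @n["zil_clean"] = count(); }'

  # An actual log replay should register here instead
  dtrace -n 'fbt:zfs:zil_replay:entry { @n["zil_replay"] = count(); }'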
Re: [zfs-discuss] help with a BIG problem,
Hello, thanks for your suggestion. I tried setting zfs_arc_max to 0x30000000 (768MB, out of 3GB). The system ran for almost 45 minutes before it froze. Here's an interesting piece of arcstat.pl output, which I noticed just as it was passing by:

    Time  read  miss  miss%  dmis  dm%  pmis  pm%  mmis  mm%  arcsz     c
15:17:41   152   152    100   152  100     0    0   152  100     2G  805M
15:17:42   139   139    100   139  100     0    0   139  100     2G  805M
State Changed
15:17:43   188   188    100   188  100     0    0   188  100     2G  805M
15:17:44   150   150    100   150  100     0    0   150  100     2G  805M
15:17:45   151   151    100   151  100     0    0   151  100     2G  805M
15:17:46   149   149    100   149  100     0    0   149  100     2G  805M
15:17:47   161   161    100   161  100     0    0   161  100     2G  805M
15:17:48   153   153    100   153  100     0    0   153  100     2G  219M
15:17:49   140   140    100   140  100     0    0   140  100     2G  100M
15:17:50   143   143    100   143  100     0    0   143  100     2G  100M
15:17:51   145   145    100   145  100     0    0   145  100     2G  100M

Notice how it suddenly drops c from 805M to 100M in 2 seconds. Also, arcsz is 2G, which is weird because it shouldn't grow beyond 0x30000000 (768M), right? And it's also weird to get a 100% miss ratio.

Here's top just before it froze:

last pid:  5253;  load avg:  0.47, 0.37, 0.33;  up 0+00:44:53   15:20:14
77 processes: 75 sleeping, 1 running, 1 on cpu
CPU states: 57.5% idle, 1.0% user, 41.6% kernel, 0.0% iowait, 0.0% swap
Memory: 3072M phys mem, 28M free mem, 2055M swap, 1994M free swap

  PID USERNAME LWP PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
 1248 root       1  59    0 5940K 2736K sleep    0:14  0.82% arcstat.pl
 5206 root       9  59    0   47M 4892K sleep    0:01  0.35% java
  855 root       2  59    0 5076K 1588K sleep    0:09  0.33% apcupsd
 3134 root       1  59    0 5152K 1764K sleep    0:02  0.26% zpool
 1261 root       1  59    0 4104K  588K cpu      0:03  0.22% top
 3125 root       1  59    0 6352K 1536K sleep    0:00  0.06% sshd
 1151 root       1  59    0 6352K 1504K sleep    0:00  0.05% sshd
   62 root       1  59    0 1832K  540K sleep    0:01  0.05% powernowd
  849 root       1  59    0   11M 1100K sleep    0:00  0.05% snmpd
  465 proxy      1  59    0   15M 2196K run      0:00  0.04% squid
  271 daemon     1  59    0 6652K  264K sleep    0:00  0.03% rcapd
 1252 root       1  59    0 6352K 1292K sleep    0:00  0.02% sshd
    7 root      14  59    0   12M 5412K sleep    0:04  0.02% svc.startd
  880 root       1  59    0 6276K 2076K sleep    0:00  0.02% httpd
  847 root       1  59    0 2436K 1148K sleep    0:00  0.02% dhcpagent

And finally, zpool iostat 1:

tera  1.51T  312G  207  0  1.22M  0
tera  1.51T  312G  141  0   854K  0
tera  1.51T  312G   70  0   427K  0
tera  1.51T  312G  204  0  1.20M  0
tera  1.51T  312G  187  0  1.10M  0
tera  1.51T  312G  179  0  1.05M  0
tera  1.51T  312G  120  0   743K  0
tera  1.51T  312G   94  0   580K  0
tera  1.51T  312G   77  0   471K  0
tera  1.51T  312G  115  0   696K  0

This shows very poor read performance for a 4x SATA2 array (this array usually saturates my gigabit ethernet). And it's not that the kernel is processing that much data, because the CPU is 57% idle and I THINK powernowd is making it run at 900MHz.

Hernán
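For reference, the usual way to cap the ARC on this vintage of Solaris is an /etc/system tunable followed by a reboot; 0x30000000 is the 768MB figure mentioned above:

  * in /etc/system (reboot to apply):
  set zfs:zfs_arc_max = 0x30000000

Afterwards, echo "::arc" | mdb -k should report the new c_max; if arcsz still climbs past it, the cap never took effect.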
Re: [zfs-discuss] help with a BIG problem,
No, this is a 64-bit system (athlon64) with 64-bit kernel of course.
Re: [zfs-discuss] help with a BIG problem,
So, I think I've narrowed it down to two things:

* ZFS tries to destroy the dataset every time the pool is imported, because the last attempt didn't finish destroying it.
* In the process, ZFS makes the kernel run out of memory and die.

So I thought of two options, but I'm not sure if I'm right:

Option 1: Destroy is an atomic operation

If destroy is atomic, then I guess what it's trying to do is look up all the blocks that need to be deleted/unlinked/released/freed (not sure which is the word). After it has that list, it will write it to the ZIL (remember, this is just what I suppose; correct me if I'm wrong!) and start to physically delete the blocks, until the operation is done and it's finally committed. If this is the case, then the process will be restarted from scratch every time the system is rebooted. But I read that, apparently, in previous versions, rebooting while destroying a clone that was taking too long made the clone reappear intact next time. This, and the fact that zpool iostat shows only reads and no or very few writes, is what leads me to think this is how it works.

If this is the case, I'd like to abort this destroy. After importing the pool, I will have everything as it was, and maybe I can delete snapshots before the clone's parent snapshot and maybe this will speed up the destroy process, or just leave the clone.

Option 2: Destroy is not atomic

By this I don't mean that it's not atomic as in "if the operation is canceled, it will be left in an incomplete state", but as in "if the system is rebooted, the operation will RESUME at the point where it died". If this is the case, maybe I can write a script to reboot the computer after a fixed amount of time, and run it on boot:

  zpool import xx
  sleep 20 seconds
  rm /etc/zfs/zpool.cache
  sleep 1800 seconds
  reboot

This would work under the assumption that the list of blocks to be deleted is flushed to the ZIL or something before boot, to allow the operation to restart at the same point. This is a very nasty hack, but it may do the trick, only in a very slow fashion: zpool iostat shows 1MB/s of reads when it's doing the destroy. The dataset in question has 450GB, which means the operation will take 5 days to finish if it needs to read the whole dataset to destroy it, or 7 days if it also needs to go through the other snapshots (600GB total).

So, my only viable option seems to be to abort this. How can I do that? Disable the ZIL, maybe? Delete the ZIL? Scrub after this?

Thanks,
Hernán
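Purely as a sketch, the Option 2 workaround written out; it assumes the pool is tera, that the script is started from an rc script right after boot, and that destroy progress really does persist across reboots:

  #!/bin/sh
  # Let the import chew on the pending destroy for a fixed window,
  # then reboot before the kernel exhausts memory.
  zpool import -f tera &            # may never return, so run it in the background
  sleep 20
  rm -f /etc/zfs/zpool.cache        # keep the next boot from hanging on the pool
  sleep 1800
  reboot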
[zfs-discuss] help with a BIG problem, can't import my zpool anymore
Hello, I'm having a big problem here, disastrous maybe. I have a zpool consisting of 4x500GB SATA drives. This pool was born on S10U4 and was recently upgraded to snv85 because of iSCSI issues with some initiator.

Last night I was doing housekeeping, deleting old snapshots. One snapshot failed to delete because it had a dependent clone. So I tried to destroy that clone. Everything went wrong from there. The deletion was taking an excessively long time (over 40 minutes). zpool status hangs when I call it. zfs list too. zpool iostat showed disk activity. Other services not dependent on the pool were running, but the iSCSI this machine was serving was unbearably slow. At one point, I lost all iSCSI, SSH, web, and all other services. Ping still worked.

So I go to the server and notice that the fans are running at 100%. I try to get a console (local VGA + keyboard) but the monitor shows no signal. No disk activity seemed to be happening at the moment. So I do the standard procedure (reboot). Solaris boots but stops at "hostname: blah". I see disk activity from the pool disks, so I let it boot. 30 minutes later, it still hadn't finished. I thought (correctly) that the system was waiting to mount the ZFS before booting, but for some reason it doesn't. I call it a day and let the machine do its thing. 8 hours later I return. CPU is cold, disks are idle and... Solaris stays at the same "hostname: blah".

Time to reboot again, this time in failsafe. zpool import shows that the devices are detected and online. I delete /etc/zfs/zpool.cache and reboot. Solaris starts normally with all services running, but of course no ZFS. zpool import shows the available pool, no errors. I do zpool import -f pool... 20 minutes later I'm still waiting for the pool to mount. zpool iostat shows activity:

               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
tera        1.51T   312G    274      0  1.61M  2.91K
tera        1.51T   312G    308      0  1.82M      0
tera        1.51T   312G    392      0  2.31M      0
tera        1.51T   312G    468      0  2.75M      0

but the mountpoint /tera is still not populated (and zpool import still hasn't exited). zpool status shows:

  pool: tera
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        tera        ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c1d0    ONLINE       0     0     0
            c2d0    ONLINE       0     0     0
            c3d0    ONLINE       0     0     0
            c4d0    ONLINE       0     0     0

errors: No known data errors

What's going on? Why is it taking so long to import?

Thanks in advance,
Hernan
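For anyone hitting the same hang, the recovery steps scattered through the post above, collected as a sketch (in failsafe boot the root filesystem is typically offered for mounting under /a; adjust the path if yours differs):

  # From the failsafe boot environment:
  rm /a/etc/zfs/zpool.cache      # stop the normal boot from blocking on the pool
  reboot

  # After the normal boot comes up without the pool:
  zpool import                   # list importable pools
  zpool import -f tera           # re-import; the pending destroy resumes here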
Re: [zfs-discuss] help with a BIG problem, can't import my zpool anymore
I got more info. I can run zpool history and this is what I get:

2008-05-23.00:29:40 zfs destroy tera/[EMAIL PROTECTED]
2008-05-23.00:29:47 [internal destroy_begin_sync txg:3890809] dataset = 152
2008-05-23.01:28:38 [internal destroy_begin_sync txg:3891101] dataset = 152
2008-05-23.07:01:36 zpool import -f tera
2008-05-23.07:01:40 [internal destroy_begin_sync txg:3891106] dataset = 152
2008-05-23.10:52:56 zpool import -f tera
2008-05-23.10:52:58 [internal destroy_begin_sync txg:3891112] dataset = 152
2008-05-23.12:17:49 [internal destroy_begin_sync txg:3891114] dataset = 152
2008-05-23.12:27:48 zpool import -f tera
2008-05-23.12:27:50 [internal destroy_begin_sync txg:3891120] dataset = 152
2008-05-23.13:03:07 [internal destroy_begin_sync txg:3891122] dataset = 152
2008-05-23.13:56:52 zpool import -f tera
2008-05-23.13:56:54 [internal destroy_begin_sync txg:3891128] dataset = 152

Apparently, it starts destroying dataset #152, which is the parent snapshot of the clone I issued the command to destroy. Not sure how it works, but I ordered the deletion of the CLONE, not the snapshot (which I was going to destroy anyway). The question is still: why does it hang the machine? Why can't I access the filesystems? Isn't it supposed to import the zpool, mount the ZFSs and then do the destroy in the background?
Re: [zfs-discuss] help with a BIG problem, can't import my zpool anymore
I let it run for about 4 hours. When I returned, still the same: I can ping the machine but I can't SSH to it or use the console. Please, I need urgent help with this issue!
Re: [zfs-discuss] help with a BIG problem, can't import my zpool anymore
I let it run while watching top, and this is what I got just before it hung. Look at free mem. Is this memory allocated to the kernel? Can I allow the kernel to swap?

last pid:  7126;  load avg:  3.36, 1.78, 1.11;  up 0+01:01:11   21:16:49
88 processes: 78 sleeping, 9 running, 1 on cpu
CPU states: 22.4% idle, 0.4% user, 77.2% kernel, 0.0% iowait, 0.0% swap
Memory: 3072M phys mem, 31M free mem, 2055M swap, 1993M free swap

  PID USERNAME LWP PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
 7126 root       9  58    0   45M 4188K run      0:00  0.71% java
 4821 root       1  59    0 5124K 1724K run      0:03  0.46% zfs
 5096 root       1  59    0 5124K 1724K run      0:03  0.45% zfs
 2470 root       1  59    0 4956K 1660K sleep    0:06  0.45% zfs
Re: [zfs-discuss] help with a BIG problem,
I forgot to post arcstat.pl's output:

    Time   read   miss  miss%  dmis  dm%  pmis  pm%  mmis  mm%  arcsz     c
22:32:37   556K   525K     94  515K   94    9K   98  515K   97     1G    1G
22:32:38     63     63    100    63  100     0    0    63  100     1G    1G
22:32:39     74     74    100    74  100     0    0    74  100     1G    1G
22:32:40     76     76    100    76  100     0    0    76  100     1G    1G
State Changed
22:32:41     75     75    100    75  100     0    0    75  100     1G    1G
22:32:42     77     77    100    77  100     0    0    77  100     1G    1G
22:32:43     72     72    100    72  100     0    0    72  100     1G    1G
22:32:44     80     80    100    80  100     0    0    80  100     1G    1G
State Changed
22:32:45     98     98    100    98  100     0    0    98  100     1G    1G

Sometimes c is 2G. I tried the mkfile and swap suggestion, but I get:

[EMAIL PROTECTED]:/]# mkfile -n 4g /export/swap
[EMAIL PROTECTED]:/]# swap -a /export/swap
/export/swap may contain holes - can't swap on it.

/export is the only place where I have enough free space. I could add another drive if needed.
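For the record, the -n is what produces the "may contain holes" error: it creates a sparse file, which swap -a refuses. Allocating the blocks for real works, as the follow-up below confirms; a sketch:

  mkfile 4g /export/swap         # no -n, so the blocks are actually allocated
  swap -a /export/swap
  swap -l                        # confirm the new swap file is listed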
Re: [zfs-discuss] help with a BIG problem,
Oops, replied too fast. Ran without -n, and the swap space was added successfully... but it didn't work. It died out of memory again.