Re: [zfs-discuss] ZFS slows down over a couple of days
Hi all,

thanks a lot for your suggestions. I have checked all of them, and neither the network itself nor any other check indicated a problem. Alas, I think I know what is going on... ehh... my current zpool has two vdevs that are not evenly sized, as shown by zpool iostat -v:

zpool iostat -v obelixData 5

                        capacity     operations    bandwidth
pool                   alloc   free   read  write   read  write
-------------------    -----  -----  -----  -----  -----  -----
obelixData             13,1T  5,84T     36    227   348K  21,5M
  c9t21D023038FA8d0    6,25T  59,3G     21     98   269K  9,25M
  c9t21D02305FF42d0    6,84T  5,78T     15    129  79,2K  12,3M
-------------------    -----  -----  -----  -----  -----  -----

So, the small vdev is actually more than 99% full, which is likely the root cause of this issue, especially since RAIDs tend to take tremendous performance hits once they exceed 90% space utilization.
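A quick back-of-the-envelope check of the numbers above: the first LUN has 6,25T allocated and only 59,3G free, i.e. about 99% full, while the second LUN has 6,84T of roughly 12,6T in use, about 54%. With one vdev that close to full, ZFS has to work much harder to find free space for new allocations on it, which fits the gradual slow-down described in this thread.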
Re: [zfs-discuss] ZFS slows down over a couple of days
Stephan, The "vmstat" shows you are not actually short of memory; The "pi" and "po" columns are zero, so the system is not having to do any paging, and it seems unlike the system is slow directly because of RAM shortage. With the ARC, it's not unusual for vmstat to show little free memory, but the system will give up that RAM when an application asks for it. You can tell if this is happening a lot by: echo "::arc" | mdb -k | grep throttle If the value of "memory_throttle_count" is large, that will indicate that apps are often asking the kernel to give up ARC memory. Also, as you said, the "iostat" figures look idle. You can tell more using "iostat -xn 1", which will give service times & percent-busy figures for the actual devices. It could be that something about the networking involved is what is actually slow. You could find out if it's a local bottleneck by trying some simple I/O tests on the server itself, maybe: dd if=/dev/zero of=/file/in/zpool bs=1024k and watching what iostat shows, etc. Another test is to try a network-only test, maybe using "ttcp" between the server and a client. This could tell you if it's network or storage that's causing the slow-down. If you don't have "ttcp", something silly like, on a client running: dd if=/dev/zero bs=1024k | ssh -c blowfish server "dd of=/dev/null bs=1024k" You can watch network throughput on the server using: dladm show-link -s -i 1 Regards, Marion ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS slows down over a couple of days
On 12.01.11 18:49, SR wrote:
> You may need to adjust zfs_arc_max in /etc/system to avoid memory contention
> http://www.thezonemanager.com/2009/03/filesystem-cache-optimization.htm
> Suresh

I thought I had that covered through this entry in /etc/system:

  set zfs:zfs_arc_max = 17179869184

I do also think that arc_summary.pl showed exactly that...

Cheers,
budy

--
Stephan Budach
Jung von Matt/it-services GmbH
Glashüttenstraße 79
20357 Hamburg

Tel: +49 40-4321-1353
Fax: +49 40-4321-1114
E-Mail: stephan.bud...@jvm.de
Internet: http://www.jvm.com

Geschäftsführer: Ulrich Pallas, Frank Wilhelm
AG HH HRB 98380
Re: [zfs-discuss] ZFS slows down over a couple of days
You may need to adjust zfs_arc_max in /etc/system to avoid memory contention:

http://www.thezonemanager.com/2009/03/filesystem-cache-optimization.htm

Suresh
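For illustration, the setting being discussed looks like this in /etc/system (the 16 GiB value is simply the one Stephan already has in place; 16 * 1024^3 = 17179869184 bytes), and after a reboot the live limit can be cross-checked against the arcstats kstat:

  * /etc/system: cap the ZFS ARC at 16 GiB
  set zfs:zfs_arc_max = 17179869184

  # after the reboot, confirm the limit the kernel actually uses (in bytes)
  kstat -p zfs:0:arcstats:c_max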
Re: [zfs-discuss] ZFS slows down over a couple of days
On 12.01.11 16:32, Jeff Savit wrote:
> Stephan,
> There are a bunch of tools you can use, mostly provided with Solaris 11 Express, plus arcstat and arc_summary, which are available as downloads. The latter tools will tell you the size and state of the ARC, which may be specific to your issue since you cite memory.
> For the list, could you describe the ZFS pool configuration (zpool status) and summarize output from vmstat, iostat, and zpool iostat? Also, it might be helpful to issue 'prstat -s rss' to see if any process is growing its resident memory size.
> An excellent source of information is the "ZFS Evil Tuning Guide" (just Google those words), which has a wealth of information.
> I hope that helps (for a start at least),
> Jeff
>
> On 01/12/11 08:21 AM, Stephan Budach wrote:
>> Hi all,
>> I have exchanged my Dell R610 in favor of a Sun Fire 4170 M2, which has 32 GB RAM installed. I am running Sol11Expr on this host and I use it primarily to serve Netatalk AFP shares. From day one I have noticed that the amount of free RAM decreased, and along with that decrease the overall performance of ZFS decreased as well. Now, since I am still quite a Solaris newbie, I cannot seem to track down where the heck all the memory has gone and why ZFS performs so poorly after an uptime of only 5 days. I can reboot Solaris, which I did for testing, and that brings the performance back to reasonable levels, but otherwise I am quite at my wits' end.
>> To give some numbers: the ZFS performance decreases down to 1/10th of the initial throughput, either read or write.
>> Anybody having some tips up their sleeves where I should start looking for the missing memory?
>> Cheers,
>> budy

Sure - here we go. First of all, the zpool configuration:

zpool status -v
  pool: obelixData
 state: ONLINE
status: The pool is formatted using an older on-disk format. The pool can
        still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'. Once this is done, the
        pool will no longer be accessible on older software versions.
  scan: scrub repaired 0 in 15h29m with 0 errors on Mon Nov 15 21:42:52 2010
config:

        NAME                 STATE     READ WRITE CKSUM
        obelixData           ONLINE       0     0     0
          c9t21D023038FA8d0  ONLINE       0     0     0
          c9t21D02305FF42d0  ONLINE       0     0     0

errors: No known data errors

This pool consists of two FC LUNs which are exported from two FC RAIDs (no comments on that one, please, I am still working on the transition to another zpool config! ;) )
Next up are arcstat.pl and arc_summary.pl:

perl /usr/local/de.jvm.scripts/arcstat.pl

    time  read  miss  miss%  dmis  dm%  pmis  pm%  mmis  mm%  arcsz     c
17:13:33     0     0      0     0    0     0    0     0    0    15G   16G
17:13:34    71     0      0     0    0     0    0     0    0    15G   16G
17:13:35     3     0      0     0    0     0    0     0    0    15G   16G
17:13:36   30K     0      0     0    0     0    0     0    0    15G   16G
17:13:37   13K     0      0     0    0     0    0     0    0    15G   16G
17:13:38    72     0      0     0    0     0    0     0    0    15G   16G
17:13:39    12     0      0     0    0     0    0     0    0    15G   16G
17:13:40    45     0      0     0    0     0    0     0    0    15G   16G
17:13:41    57     0      0     0    0     0    0     0    0    15G   16G
17:13:42  1.3K     8      0     8    0     0    0     6    0    15G   16G
17:13:43    45     0      0     0    0     0    0     0    0    15G   16G
17:13:44  1.5K    15      1    13    0     2   50     4    0    15G   16G
17:13:45   122     0      0     0    0     0    0     0    0    15G   16G
17:13:46    74     0      0     0    0     0    0     0    0    15G   16G
17:13:47    88     0      0     0    0     0    0     0    0    15G   16G
17:13:48   19K    67      0    25    0    42   42     4    0    16G   16G
17:13:49   24K    31      0     0    0    31    9     0    0    15G   16G
17:13:50    41     0      0     0    0     0    0     0    0    15G   16G

perl /usr/local/de.jvm.scripts/arc_summary.pl

System Memory:
        Physical RAM:  32751 MB
        Free Memory :  5615 MB
        LotsFree:      511 MB

ZFS Tunables (/etc/system):
        set zfs:zfs_arc_max = 17179869184

ARC Size:
        Current Size:             16383 MB (arcsize)
        Target Size (Adaptive):   16384 MB (c)
        Min Size (Hard Limit):    2048 MB (zfs_arc_min)
        Max Size (Hard Limit):    16384 MB (zfs_arc_max)

ARC Size Breakdown:
        Most Recently Used Cache Size:    73%   12015 MB (p)
        Most Frequently Used Cache Size:  26%   4368 MB (c-p)

ARC Efficency:
        Cache Access Total:        300030668
        Cache Hit Ratio:     92%   277102547   [Defined State for buffer]
        Cache Miss Ratio:     7%   22928121    [Undefined State for Buffer]
        REAL Hit Ratio:      84%   253621864   [MRU/MFU Hits Only]

        Data Demand Efficiency:    98%
        D
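A quick cross-check of the arc_summary figures: 277102547 hits out of 300030668 total accesses is about 92.4%, and 253621864 MRU/MFU hits is about 84.5%, so the printed percentages match the raw counters. The ARC itself looks healthy and is sitting right at its 16384 MB cap, with about 5.6 GB of RAM still reported free.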
Re: [zfs-discuss] ZFS slows down over a couple of days
Stephan,

There are a bunch of tools you can use, mostly provided with Solaris 11 Express, plus arcstat and arc_summary, which are available as downloads. The latter tools will tell you the size and state of the ARC, which may be specific to your issue since you cite memory.

For the list, could you describe the ZFS pool configuration (zpool status) and summarize output from vmstat, iostat, and zpool iostat? Also, it might be helpful to issue 'prstat -s rss' to see if any process is growing its resident memory size.

An excellent source of information is the "ZFS Evil Tuning Guide" (just Google those words), which has a wealth of information.

I hope that helps (for a start at least),
Jeff

On 01/12/11 08:21 AM, Stephan Budach wrote:
> Hi all,
> I have exchanged my Dell R610 in favor of a Sun Fire 4170 M2, which has 32 GB RAM installed. I am running Sol11Expr on this host and I use it primarily to serve Netatalk AFP shares. From day one I have noticed that the amount of free RAM decreased, and along with that decrease the overall performance of ZFS decreased as well. Now, since I am still quite a Solaris newbie, I cannot seem to track down where the heck all the memory has gone and why ZFS performs so poorly after an uptime of only 5 days. I can reboot Solaris, which I did for testing, and that brings the performance back to reasonable levels, but otherwise I am quite at my wits' end.
> To give some numbers: the ZFS performance decreases down to 1/10th of the initial throughput, either read or write.
> Anybody having some tips up their sleeves where I should start looking for the missing memory?
> Cheers,
> budy

--
Jeff Savit | Principal Sales Consultant
Phone: 602.824.6275 | Email: jeff.sa...@oracle.com | Blog: http://blogs.sun.com/jsavit
Oracle North America Commercial Hardware
Operating Environments & Infrastructure S/W Pillar
2355 E Camelback Rd | Phoenix, AZ 85016
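One more quick check that can help answer "where has all the memory gone": mdb's ::memstat dcmd gives a page-level breakdown of RAM usage (kernel, ZFS file data, anonymous/application memory, free). It can take a little while to run on a 32 GB box:

  # overall breakdown of physical memory
  echo "::memstat" | mdb -k

  # current ARC size and limits as the kernel sees them
  echo "::arc" | mdb -k

If "ZFS File Data" plus "Kernel" accounts for most of the RAM, the ARC is the likely consumer; if "Anon" keeps growing, some application is.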
[zfs-discuss] ZFS slows down over a couple of days
Hi all,

I have exchanged my Dell R610 in favor of a Sun Fire 4170 M2, which has 32 GB RAM installed. I am running Sol11Expr on this host and I use it primarily to serve Netatalk AFP shares.

From day one I have noticed that the amount of free RAM decreased, and along with that decrease the overall performance of ZFS decreased as well. Now, since I am still quite a Solaris newbie, I cannot seem to track down where the heck all the memory has gone and why ZFS performs so poorly after an uptime of only 5 days. I can reboot Solaris, which I did for testing, and that brings the performance back to reasonable levels, but otherwise I am quite at my wits' end.

To give some numbers: the ZFS performance decreases down to 1/10th of the initial throughput, either read or write.

Anybody having some tips up their sleeves where I should start looking for the missing memory?

Cheers,
budy