Re: [zfs-discuss] ZFS monitoring
On Mon, Feb 11, 2013 at 05:39:27PM +0100, Jim Klimov wrote:
> On 2013-02-11 17:14, Borja Marcos wrote:
>> On Feb 11, 2013, at 4:56 PM, Tim Cook wrote:
>>> The zpool iostat output has all sorts of statistics I think would be
>>> useful/interesting to record over time.
>>
>> Yes, thanks :) I think I will add them, I just started with the esoteric
>> ones. Anyway, still there's no better way to read it than running zpool
>> iostat and parsing the output, right?
>
> I believe, in this case you'd have to run it as a continuous process and
> parse the outputs after the first one (overall uptime stat, IIRC).
>
> Also note that on problems with the ZFS engine itself, zpool may lock up
> and thus halt your program - so have it ready to abort an outstanding
> statistics read after a timeout and perhaps log an error. And if pools
> are imported/exported during work, the zpool iostat output changes
> dynamically, so you basically need to parse its text structure every time.
>
> The zpool iostat -v might be even more interesting though, as it lets you
> see per-vdev statistics and perhaps notice imbalances, etc...
>
> All that said, I don't know if this data isn't also available as some set
> of kstats - that would probably be a lot better for your cause. Inspect
> the zpool source to see where it gets its numbers from... and perhaps
> make and RTI relevant kstats, if they aren't yet there ;) On the other
> hand, I am not certain how Solaris-based kstats interact or correspond to
> structures in FreeBSD (or Linux for that matter)?..

I made kstat data available on FreeBSD via the 'kstat' sysctl tree:

# sysctl kstat

-- 
Pawel Jakub Dawidek                       http://www.wheelsystems.com
FreeBSD committer                         http://www.FreeBSD.org
Am I Evil? Yes, I Am!                     http://tupytaj.pl

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
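The 'kstat' sysctl tree Pawel describes exposes counters as name=value lines, which makes a collector like Devilator straightforward to feed. A minimal sketch of turning that output into numbers; the sysctl names and values below are illustrative sample data, and the live subprocess call is shown only in a comment:

```python
# Sketch: parse FreeBSD's kstat sysctl tree into a dict of counters.
# On a real FreeBSD box you would capture the live output with e.g.:
#   out = subprocess.check_output(
#       ["sysctl", "-e", "kstat.zfs.misc.arcstats"], text=True)

SAMPLE = """\
kstat.zfs.misc.arcstats.hits=123456
kstat.zfs.misc.arcstats.misses=7890
kstat.zfs.misc.arcstats.size=536870912
"""

def parse_kstat_sysctl(text):
    """Turn 'name=value' lines into {short_name: int_value}."""
    stats = {}
    for line in text.splitlines():
        name, _, value = line.partition("=")
        if not value:
            continue
        key = name.rsplit(".", 1)[-1]   # e.g. 'hits'
        try:
            stats[key] = int(value)
        except ValueError:
            pass                        # skip non-numeric kstats
    return stats

stats = parse_kstat_sysctl(SAMPLE)
print(stats["hits"], stats["misses"])
```

Parsing name=value pairs this way avoids the fixed-width-column scraping that `zpool iostat` output requires.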
Re: [zfs-discuss] ZFS monitoring
On Feb 12, 2013, at 11:25 AM, Pawel Jakub Dawidek wrote:
> I made kstat data available on FreeBSD via the 'kstat' sysctl tree:

Yes, I am using the data. I wasn't sure how to get something meaningful from it, but I've found the arcstats.pl script and I am using it as a model. Suggestions are always welcome, though :)

(The sample pages I put on devilator.frobula.com aren't using the better-organized graphs, though; it's just a crude parameter dump.)

Borja.
Re: [zfs-discuss] ZFS monitoring
On Mon, Feb 11, 2013 at 9:53 AM, Borja Marcos bor...@sarenet.es wrote:
> Hello,
>
> I'm updating Devilator, the performance data collector for Orca and
> FreeBSD, to include ZFS monitoring. So far I am graphing the ARC and
> L2ARC size, L2ARC writes and reads, and several hit/miss data pairs.
>
> Any suggestions to improve it? What other variables can be interesting?
>
> An example of the current state of the program is here:
> http://devilator.frobula.com
>
> Thanks,
> Borja.

The zpool iostat output has all sorts of statistics I think would be useful/interesting to record over time.

--Tim
Re: [zfs-discuss] ZFS monitoring
On Feb 11, 2013, at 4:56 PM, Tim Cook wrote:
> The zpool iostat output has all sorts of statistics I think would be
> useful/interesting to record over time.

Yes, thanks :) I think I will add them, I just started with the esoteric ones. Anyway, there's still no better way to read it than running zpool iostat and parsing the output, right?

Borja.
Re: [zfs-discuss] ZFS monitoring
On 02/11/2013 04:53 PM, Borja Marcos wrote:
> Hello,
>
> I'm updating Devilator, the performance data collector for Orca and
> FreeBSD, to include ZFS monitoring. So far I am graphing the ARC and
> L2ARC size, L2ARC writes and reads, and several hit/miss data pairs.
>
> Any suggestions to improve it? What other variables can be interesting?
>
> An example of the current state of the program is here:
> http://devilator.frobula.com

Hi Borja,

I've got one thing up for review in Illumos for upstreaming: #3137 L2ARC Compression. This adds another kstat called l2_asize, which tells you how big the L2ARC actually is, taking compression into account, e.g.

# kstat -n arcstats 5 | egrep '(l2_size|l2_asize)'
l2_asize    25708032
l2_size     29117952

You can use this to track L2ARC compression efficiency, etc. If the kstat is not present, then L2ARC compression isn't available.

Anyway, just a quick thought.

--
Saso
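Given the two kstats Saso mentions, the compression efficiency he refers to is just the ratio of logical to allocated size. A small sketch using the numbers from his example output:

```python
# Sketch: L2ARC compression efficiency from the l2_size/l2_asize kstats.
# l2_size is the logical (uncompressed) size cached in L2ARC; l2_asize is
# the bytes actually allocated on the cache device. Values are the ones
# from the example kstat output above.
l2_size = 29117952    # logical bytes stored in L2ARC
l2_asize = 25708032   # allocated (compressed) bytes on the device

ratio = l2_size / l2_asize                 # > 1.0 means compression helps
saved_pct = 100.0 * (1 - l2_asize / l2_size)
print(f"compression ratio {ratio:.2f}x, {saved_pct:.1f}% space saved")
```

Graphing both raw kstats plus this derived ratio over time would show whether compressible data is actually landing in the L2ARC.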
Re: [zfs-discuss] ZFS monitoring
On 2013-02-11 17:14, Borja Marcos wrote:
> On Feb 11, 2013, at 4:56 PM, Tim Cook wrote:
>> The zpool iostat output has all sorts of statistics I think would be
>> useful/interesting to record over time.
>
> Yes, thanks :) I think I will add them, I just started with the esoteric
> ones. Anyway, still there's no better way to read it than running zpool
> iostat and parsing the output, right?

I believe, in this case you'd have to run it as a continuous process and parse the outputs after the first one (overall uptime stat, IIRC).

Also note that on problems with the ZFS engine itself, zpool may lock up and thus halt your program - so have it ready to abort an outstanding statistics read after a timeout and perhaps log an error. And if pools are imported/exported during work, the zpool iostat output changes dynamically, so you basically need to parse its text structure every time.

The zpool iostat -v might be even more interesting though, as it lets you see per-vdev statistics and perhaps notice imbalances, etc...

All that said, I don't know if this data isn't also available as some set of kstats - that would probably be a lot better for your cause. Inspect the zpool source to see where it gets its numbers from... and perhaps make and RTI relevant kstats, if they aren't yet there ;) On the other hand, I am not certain how Solaris-based kstats interact or correspond to structures in FreeBSD (or Linux for that matter)?..

HTH,
//Jim Klimov
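The approach Jim describes - run `zpool iostat <interval>` as a long-lived child, discard the first since-boot sample, and re-parse each text block - can be sketched roughly as below. The column layout assumed here is the usual 7-column interval output; a real collector would read from `subprocess.Popen(["zpool", "iostat", "5"], ...)` line by line with a timeout around each read, exactly because zpool can hang. A canned sample stands in for the live output:

```python
# Sketch: parse one interval block of `zpool iostat` text output.
# SAMPLE mimics the usual header + separator + data-row layout; real
# output would come from a Popen pipe, wrapped in a read timeout so a
# hung zpool aborts the statistics read instead of hanging the collector.

SAMPLE = """\
               capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
tank         102G   798G     12     34   1.2M   3.4M
"""

def parse_iostat_block(text):
    """Return {pool: (read_ops, write_ops)} from one zpool iostat block."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        # data rows have 7 columns; skip the header and '----' separator
        if len(fields) == 7 and not fields[0].startswith(("-", "pool")):
            pool, _alloc, _free, rops, wops, _rbw, _wbw = fields
            stats[pool] = (int(rops), int(wops))
    return stats

print(parse_iostat_block(SAMPLE))
```

As Jim notes, the set of rows changes whenever pools are imported or exported (and `-v` adds nested vdev rows), so the parser has to treat every block as fresh text rather than assume a fixed pool list.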
Re: [zfs-discuss] ZFS monitoring - best practices?
Ray,

Here is my short list of Performance Metrics I track on 7410 Performance Rigs via 7000 Analytics.

Cheers,
Joel.

m:analytics datasets ls
Datasets:

DATASET      STATE   INCORE  ONDISK  NAME
dataset-000  active   1016K   75.9M  arc.accesses[hit/miss]
dataset-001  active    390K   37.9M  arc.l2_accesses[hit/miss]
dataset-002  active    242K   13.7M  arc.l2_size
dataset-003  active    242K   13.7M  arc.size
dataset-004  active    958K   86.1M  arc.size[component]
dataset-005  active    242K   13.7M  cpu.utilization
dataset-006  active    477K   46.2M  cpu.utilization[mode]
dataset-007  active    648K   59.7M  dnlc.accesses[hit/miss]
dataset-008  active    242K   13.7M  fc.bytes
dataset-009  active    242K   13.7M  fc.ops
dataset-010  active    242K   12.8M  fc.ops[latency]
dataset-011  active    242K   12.8M  fc.ops[op]
dataset-012  active    242K   13.7M  ftp.kilobytes
dataset-013  active    242K   12.8M  ftp.kilobytes[op]
dataset-014  active    242K   13.7M  http.reqs
dataset-015  active    242K   12.8M  http.reqs[latency]
dataset-016  active    242K   12.8M  http.reqs[op]
dataset-017  active    242K   13.7M  io.bytes
dataset-018  active    439K   43.7M  io.bytes[op]
dataset-019  active    308K   29.6M  io.disks[utilization=95][disk]
dataset-020  active   2.93M   87.2M  io.disks[utilization]
dataset-021  active    242K   13.7M  io.ops
dataset-022  active   9.85M    274M  io.ops[disk]
dataset-023  active   20.0M    827M  io.ops[latency]
dataset-024  active    438K   43.6M  io.ops[op]
dataset-025  active    242K   13.7M  iscsi.bytes
dataset-026  active    242K   13.7M  iscsi.ops
dataset-027  active   1.45M   91.1M  iscsi.ops[latency]
dataset-028  active    248K   14.8M  iscsi.ops[op]
dataset-029  active    242K   13.7M  ndmp.diskkb
dataset-030  active    242K   13.8M  nfs2.ops
dataset-031  active    242K   12.8M  nfs2.ops[latency]
dataset-032  active    242K   13.8M  nfs2.ops[op]
dataset-033  active    242K   13.8M  nfs3.ops
dataset-034  active   8.82M    163M  nfs3.ops[latency]
dataset-035  active    327K   18.1M  nfs3.ops[op]
dataset-036  active    242K   13.8M  nfs4.ops
dataset-037  active   2.31M   97.8M  nfs4.ops[latency]
dataset-038  active    311K   17.2M  nfs4.ops[op]
dataset-039  active    242K   13.7M  nic.kilobytes
dataset-040  active    970K   84.5M  nic.kilobytes[device]
dataset-041  active    943K   77.1M  nic.kilobytes[direction=in][device]
dataset-042  active    457K   31.1M  nic.kilobytes[direction=out][device]
dataset-043  active    503K   49.1M  nic.kilobytes[direction]
dataset-044  active    242K   13.7M  sftp.kilobytes
dataset-045  active    242K   12.8M  sftp.kilobytes[op]
dataset-046  active    242K   13.7M  smb.ops
dataset-047  active    242K   12.8M  smb.ops[latency]
dataset-048  active    242K   13.7M  smb.ops[op]
dataset-049  active    242K   12.8M  srp.bytes
dataset-050  active    242K   12.8M  srp.ops[latency]
dataset-051  active    242K   12.8M  srp.ops[op]

Cheers,
Joel.

On 04/08/10 14:06, Ray Van Dolson wrote:
> We're starting to grow our ZFS environment and really need to start
> standardizing our monitoring procedures.
>
> OS tools are great for spot troubleshooting and sar can be used for some
> trending, but we'd really like to tie this into an SNMP-based system
> that can generate graphs for us (via RRD or other).
>
> Whether or not we do this via our standard enterprise monitoring tool or
> write some custom scripts I don't really care... but I do have the
> following questions:
>
> - What metrics are you guys tracking? I'm thinking:
>   - IOPS
>   - ZIL statistics
>   - L2ARC hit ratio
>   - Throughput
>   - IO Wait (I know there's probably a better term here)

Utilize Latency instead of IO Wait.

> - How do you gather this information? Some but not all is available via
>   SNMP. Has anyone written a ZFS-specific MIB or plugin to make the info
>   available via the standard Solaris SNMP daemon? What information is
>   available only via zdb/mdb?

On 7000 appliances, this is easy via Analytics. On Solaris, you need to pull data from kstats and/or DTrace scripts and then archive the data in a similar manner...

> - Anyone have any RRD-based setups for monitoring their ZFS environments
>   they'd be willing to share or talk about?
> Thanks in advance,
> Ray

--
http://www.oracle.com/
Joel Buckley | +1.303.272.5556
Oracle Open Storage Systems
500 Eldorado Blvd
Broomfield, CO 80021-3400
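One detail worth noting for any RRD-style setup like the one Ray asks about: kstat hit/miss counters (arc.accesses and friends) are cumulative since boot, so a collector graphs the delta between successive samples, not the raw values. A small sketch of the per-interval hit-ratio computation; the sample counter values are made up for illustration:

```python
# Sketch: derive a per-interval hit ratio from two cumulative
# (hits, misses) counter samples, as an RRD/time-series feeder would.

def hit_ratio(prev, curr):
    """Hit ratio over one interval from two cumulative (hits, misses) samples."""
    delta_hits = curr[0] - prev[0]
    delta_misses = curr[1] - prev[1]
    total = delta_hits + delta_misses
    return delta_hits / total if total else 0.0

sample_t0 = (1000, 100)   # cumulative (hits, misses) at time t0
sample_t1 = (1900, 200)   # same counters one sampling interval later

print(f"ARC hit ratio this interval: {hit_ratio(sample_t0, sample_t1):.1%}")
```

RRDtool's COUNTER data-source type does this differencing for you; if the store only takes GAUGE values, the collector has to compute the delta itself as above, and should also guard against counter resets (a negative delta) after a reboot.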