Re: [zfs-discuss] ZFS monitoring

2013-02-12 Thread Pawel Jakub Dawidek
On Mon, Feb 11, 2013 at 05:39:27PM +0100, Jim Klimov wrote:
> On 2013-02-11 17:14, Borja Marcos wrote:
>
>> On Feb 11, 2013, at 4:56 PM, Tim Cook wrote:
>>
>>> The zpool iostat output has all sorts of statistics I think would be
>>> useful/interesting to record over time.
>>
>> Yes, thanks :) I think I will add them, I just started with the esoteric
>> ones.
>>
>> Anyway, still there's no better way to read it than running zpool iostat
>> and parsing the output, right?
 
 
> I believe in this case you'd have to run it as a continuous process
> and parse the outputs after the first one (which is the overall-uptime
> statistic, IIRC). Also note that on problems with the ZFS engine itself,
> zpool may lock up and thus halt your program - so have it ready to abort
> an outstanding statistics read after a timeout and perhaps log an error.
>
> And if pools are imported or exported while it runs, the zpool iostat
> output changes dynamically, so you basically need to re-parse its text
> structure every time.
>
> zpool iostat -v might be even more interesting, though, as it lets
> you see per-vdev statistics and perhaps notice imbalances, etc...
>
> All that said, I don't know whether this data is also available as some
> set of kstats - that would probably suit your purpose much better.
> Inspect the zpool source to see where it gets its numbers from...
> and perhaps add and RTI the relevant kstats, if they aren't there yet ;)
>
> On the other hand, I am not certain how Solaris-based kstats correspond
> to the equivalent structures in FreeBSD (or Linux, for that matter).

I made kstat data available on FreeBSD via the 'kstat' sysctl tree:

# sysctl kstat
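
For example, the ZFS ARC counters end up under kstat.zfs.misc.arcstats, so a
collector can read individual values directly (the value below is made up,
just to show the output format):

# sysctl kstat.zfs.misc.arcstats.size
kstat.zfs.misc.arcstats.size: 1073741824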

-- 
Pawel Jakub Dawidek   http://www.wheelsystems.com
FreeBSD committer http://www.FreeBSD.org
Am I Evil? Yes, I Am! http://tupytaj.pl




Re: [zfs-discuss] ZFS monitoring

2013-02-12 Thread Borja Marcos

On Feb 12, 2013, at 11:25 AM, Pawel Jakub Dawidek wrote:

> I made kstat data available on FreeBSD via the 'kstat' sysctl tree:

Yes, I am using that data. I wasn't sure how to get something meaningful
out of it, but I found the arcstats.pl script and I am using it as a model.
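
In case it helps anyone, the core of what I am doing boils down to something
like this (a rough Python sketch of the idea only, not the actual collector
code; the kstat.zfs.misc.arcstats names are the ones FreeBSD exposes in the
sysctl tree):

#!/usr/bin/env python
# Rough sketch: read the ARC hit/miss counters from FreeBSD's kstat sysctl
# tree and print a hit ratio, roughly the calculation arcstats.pl performs.
import subprocess

def arcstat(name):
    # e.g. kstat.zfs.misc.arcstats.hits, as listed by `sysctl kstat`
    out = subprocess.check_output(
        ["sysctl", "-n", "kstat.zfs.misc.arcstats." + name])
    return int(out.decode().strip())

hits = arcstat("hits")
misses = arcstat("misses")
total = hits + misses
ratio = 100.0 * hits / total if total else 0.0
print("ARC accesses: %d  hit ratio: %.1f%%" % (total, ratio))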

Suggestions are always welcome, though :)

(The sample pages I put up at devilator.frobula.com aren't using the
better-organized graphs yet, though; it's just a crude parameter dump.)





Borja.



Re: [zfs-discuss] ZFS monitoring

2013-02-11 Thread Tim Cook
On Mon, Feb 11, 2013 at 9:53 AM, Borja Marcos <bor...@sarenet.es> wrote:

> Hello,
>
> I'm updating Devilator, the performance data collector for Orca and
> FreeBSD, to include ZFS monitoring. So far I am graphing the ARC and L2ARC
> size, L2ARC writes and reads, and several hit/miss data pairs.
>
> Any suggestions to improve it? What other variables might be interesting?
>
> An example of the current state of the program is here:
>
> http://devilator.frobula.com
>
> Thanks,
>
> Borja.



The zpool iostat output has all sorts of statistics I think would be
useful/interesting to record over time.

--Tim


Re: [zfs-discuss] ZFS monitoring

2013-02-11 Thread Borja Marcos

On Feb 11, 2013, at 4:56 PM, Tim Cook wrote:

> The zpool iostat output has all sorts of statistics I think would be
> useful/interesting to record over time.


Yes, thanks :) I think I will add them, I just started with the esoteric ones.

Anyway, still there's no better way to read it than running zpool iostat and 
parsing the output, right? 





Borja.



Re: [zfs-discuss] ZFS monitoring

2013-02-11 Thread Sašo Kiselkov
On 02/11/2013 04:53 PM, Borja Marcos wrote:
>
> Hello,
>
> I'm updating Devilator, the performance data collector for Orca and FreeBSD,
> to include ZFS monitoring. So far I am graphing the ARC and L2ARC size, L2ARC
> writes and reads, and several hit/miss data pairs.
>
> Any suggestions to improve it? What other variables might be interesting?
>
> An example of the current state of the program is here:
>
> http://devilator.frobula.com

Hi Borja,

I've got one thing up for review in Illumos for upstreaming: #3137 L2ARC
Compression. This adds another kstat called l2_asize, which tells you
how big the L2ARC actually is, taking into account compression, e.g.

# kstat -n arcstats 5 | egrep '(\<l2_size\>|\<l2_asize\>)'
l2_asize                        25708032
l2_size                         29117952

You can use this to track L2ARC compression efficiency, etc. If the
kstat is not present, then L2ARC compression isn't available.
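
From the sample above, for instance, the effective compression ratio is
l2_size / l2_asize = 29117952 / 25708032, or about 1.13 - i.e. the L2ARC
holds roughly 13% more data than the raw space it occupies on the cache
devices.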

Anyway, just a quick thought.

--
Saso
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS monitoring

2013-02-11 Thread Jim Klimov

On 2013-02-11 17:14, Borja Marcos wrote:
>
> On Feb 11, 2013, at 4:56 PM, Tim Cook wrote:
>
>> The zpool iostat output has all sorts of statistics I think would be
>> useful/interesting to record over time.
>
> Yes, thanks :) I think I will add them, I just started with the esoteric ones.
>
> Anyway, still there's no better way to read it than running zpool iostat and
> parsing the output, right?



I believe in this case you'd have to run it as a continuous process
and parse the outputs after the first one (which is the overall-uptime
statistic, IIRC). Also note that on problems with the ZFS engine itself,
zpool may lock up and thus halt your program - so have it ready to abort
an outstanding statistics read after a timeout and perhaps log an error.

And if pools are imported or exported while it runs, the zpool iostat
output changes dynamically, so you basically need to re-parse its text
structure every time.
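
Roughly along these lines, perhaps - just an untested Python sketch of the
idea, nothing from a real collector; it assumes the plain seven-column
zpool iostat layout (no -v) and an arbitrary 30-second timeout:

#!/usr/bin/env python
# Untested sketch: keep "zpool iostat <interval>" running, parse its
# plain-text output, discard the first report (cumulative since import),
# and give up if zpool stops producing output (e.g. ZFS has locked up).
import subprocess, threading, queue, sys

proc = subprocess.Popen(["zpool", "iostat", "5"],
                        stdout=subprocess.PIPE, universal_newlines=True)
lines = queue.Queue()
reader = threading.Thread(target=lambda: [lines.put(l) for l in proc.stdout])
reader.daemon = True
reader.start()

seen, report = set(), 0
while True:
    try:
        line = lines.get(timeout=30)       # abort a read that never returns
    except queue.Empty:
        sys.stderr.write("zpool iostat stalled, giving up\n")
        proc.kill()
        break
    fields = line.split()
    # Data rows have 7 columns: name, alloc, free, rops, wops, rbw, wbw.
    if len(fields) != 7 or fields[0] == "pool" or fields[0].startswith("-"):
        continue                           # header and separator lines
    if fields[0] in seen:                  # a pool name repeats => new report
        report, seen = report + 1, set()
    seen.add(fields[0])
    if report >= 1:                        # skip the since-import averages
        print("%s ops r/w: %s/%s" % (fields[0], fields[3], fields[4]))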

zpool iostat -v might be even more interesting, though, as it lets
you see per-vdev statistics and perhaps notice imbalances, etc...

All that said, I don't know whether this data is also available as some
set of kstats - that would probably suit your purpose much better.
Inspect the zpool source to see where it gets its numbers from...
and perhaps add and RTI the relevant kstats, if they aren't there yet ;)

On the other hand, I am not certain how Solaris-based kstats correspond
to the equivalent structures in FreeBSD (or Linux, for that matter).

HTH,
//Jim Klimov



Re: [zfs-discuss] ZFS monitoring - best practices?

2010-04-08 Thread Joel Buckley

Ray,

Here is my short list of Performance Metrics I track on 7410 Performance 
Rigs via 7000 Analytics.



m:analytics datasets ls
Datasets:

DATASET STATE   INCORE ONDISK NAME
dataset-000 active   1016K  75.9M arc.accesses[hit/miss]
dataset-001 active390K  37.9M arc.l2_accesses[hit/miss]
dataset-002 active242K  13.7M arc.l2_size
dataset-003 active242K  13.7M arc.size
dataset-004 active958K  86.1M arc.size[component]
dataset-005 active242K  13.7M cpu.utilization
dataset-006 active477K  46.2M cpu.utilization[mode]
dataset-007 active648K  59.7M dnlc.accesses[hit/miss]
dataset-008 active242K  13.7M fc.bytes
dataset-009 active242K  13.7M fc.ops
dataset-010 active242K  12.8M fc.ops[latency]
dataset-011 active242K  12.8M fc.ops[op]
dataset-012 active242K  13.7M ftp.kilobytes
dataset-013 active242K  12.8M ftp.kilobytes[op]
dataset-014 active242K  13.7M http.reqs
dataset-015 active242K  12.8M http.reqs[latency]
dataset-016 active242K  12.8M http.reqs[op]
dataset-017 active242K  13.7M io.bytes
dataset-018 active439K  43.7M io.bytes[op]
dataset-019 active308K  29.6M io.disks[utilization=95][disk]
dataset-020 active   2.93M  87.2M io.disks[utilization]
dataset-021 active242K  13.7M io.ops
dataset-022 active   9.85M   274M io.ops[disk]
dataset-023 active   20.0M   827M io.ops[latency]
dataset-024 active438K  43.6M io.ops[op]
dataset-025 active242K  13.7M iscsi.bytes
dataset-026 active242K  13.7M iscsi.ops
dataset-027 active   1.45M  91.1M iscsi.ops[latency]
dataset-028 active248K  14.8M iscsi.ops[op]
dataset-029 active242K  13.7M ndmp.diskkb
dataset-030 active242K  13.8M nfs2.ops
dataset-031 active242K  12.8M nfs2.ops[latency]
dataset-032 active242K  13.8M nfs2.ops[op]
dataset-033 active242K  13.8M nfs3.ops
dataset-034 active   8.82M   163M nfs3.ops[latency]
dataset-035 active327K  18.1M nfs3.ops[op]
dataset-036 active242K  13.8M nfs4.ops
dataset-037 active   2.31M  97.8M nfs4.ops[latency]
dataset-038 active311K  17.2M nfs4.ops[op]
dataset-039 active242K  13.7M nic.kilobytes
dataset-040 active970K  84.5M nic.kilobytes[device]
dataset-041 active943K  77.1M nic.kilobytes[direction=in][device]
dataset-042 active457K  31.1M nic.kilobytes[direction=out][device]
dataset-043 active503K  49.1M nic.kilobytes[direction]
dataset-044 active242K  13.7M sftp.kilobytes
dataset-045 active242K  12.8M sftp.kilobytes[op]
dataset-046 active242K  13.7M smb.ops
dataset-047 active242K  12.8M smb.ops[latency]
dataset-048 active242K  13.7M smb.ops[op]
dataset-049 active242K  12.8M srp.bytes
dataset-050 active242K  12.8M srp.ops[latency]
dataset-051 active242K  12.8M srp.ops[op]

Cheers,
Joel.

On 04/08/10 14:06, Ray Van Dolson wrote:
> We're starting to grow our ZFS environment and really need to start
> standardizing our monitoring procedures.
>
> OS tools are great for spot troubleshooting and sar can be used for
> some trending, but we'd really like to tie this into an SNMP based
> system that can generate graphs for us (via RRD or other).
>
> Whether or not we do this via our standard enterprise monitoring tool
> or write some custom scripts I don't really care... but I do have the
> following questions:
>
> - What metrics are you guys tracking?  I'm thinking:
>     - IOPS
>     - ZIL statistics
>     - L2ARC hit ratio
>     - Throughput
>     - IO Wait (I know there's probably a better term here)

Utilize Latency instead of IO Wait.

> - How do you gather this information?  Some but not all is
>   available via SNMP.  Has anyone written a ZFS specific MIB or
>   plugin to make the info available via the standard Solaris SNMP
>   daemon?  What information is available only via zdb/mdb?

On 7000 appliances, this is easy via Analytics.

On Solaris, you need to pull the data from kstats and/or DTrace scripts and
then archive it in a similar manner...
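
For the kstat side, something as simple as the following gives you the raw
ARC counters in parseable form (pick whichever statistics you need; the
arcstats names below are the usual ones):

# kstat -p zfs:0:arcstats:hits
# kstat -p zfs:0:arcstats:misses
# kstat -p zfs:0:arcstats:size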


> - Anyone have any RRD-based setups for monitoring their ZFS
>   environments they'd be willing to share or talk about?
>
> Thanks in advance,
> Ray



--
Joel Buckley | +1.303.272.5556
Oracle Open Storage Systems
500 Eldorado Blvd
Broomfield, CO 80021-3400
http://www.oracle.com/
