On Fri, 16 May 2008 12:52:09 -0700 Hal Rosenstock <[EMAIL PROTECTED]> wrote:
> On Thu, 2008-05-15 at 13:27 -0700, Ira Weiny wrote: > > I decided to write a little HOWTO to help people to set it up. > > Nice writeup :-) > > > 5) Can be run in a standby SM > > I thought it was changed so that it can run in a standalone mode without > SM. Am I confusing this with something else ? > I think you are right I should have said standalone. However, can't it also work in a standby SM? yea, from the patch which Sasha applied: opensm/perfmgr: PerfMgr for SM standby and inactive states Here is an updated patch with the correction. Ira >From 9be13c3da4d34ad0a736ced4c9e3bb5e13a24bb6 Mon Sep 17 00:00:00 2001 From: Ira K. Weiny <[EMAIL PROTECTED]> Date: Thu, 15 May 2008 08:19:17 -0700 Subject: [PATCH] Add a Performance Manager HOWTO to the docs and the dist Signed-off-by: Ira K. Weiny <[EMAIL PROTECTED]> --- opensm/Makefile.am | 3 +- opensm/doc/performance-manager-HOWTO.txt | 153 ++++++++++++++++++++++++++++++ opensm/opensm.spec.in | 2 +- 3 files changed, 156 insertions(+), 2 deletions(-) create mode 100644 opensm/doc/performance-manager-HOWTO.txt diff --git a/opensm/Makefile.am b/opensm/Makefile.am index 3811963..4c79f49 100644 --- a/opensm/Makefile.am +++ b/opensm/Makefile.am @@ -24,8 +24,9 @@ endif man_MANS = man/opensm.8 man/osmtest.8 various_scripts = $(wildcard scripts/*) +docs = doc/performance-manager-HOWTO.txt -EXTRA_DIST = autogen.sh opensm.spec $(various_scripts) $(man_MANS) +EXTRA_DIST = autogen.sh opensm.spec $(various_scripts) $(man_MANS) $(docs) dist-hook: $(EXTRA_DIST) if [ -x $(top_srcdir)/../gen_chlog.sh ] ; then \ diff --git a/opensm/doc/performance-manager-HOWTO.txt b/opensm/doc/performance-manager-HOWTO.txt new file mode 100644 index 0000000..f0380f3 --- /dev/null +++ b/opensm/doc/performance-manager-HOWTO.txt @@ -0,0 +1,153 @@ +OpenSM Performance manager HOWTO +================================ + +Introduction +============ + +OpenSM now includes a performance manager which collects Port counters from +the subnet and stores them internally in OpenSM. + +Some of the features of the performance manager are: + + 1) Collect port data and error counters per v1.2 spec and store in + 64bit internal counts. + 2) Automatic reset of counters when they reach approximatly 3/4 full. + (While not guarenteeing that counts will not be missed this does + keep counts incrementing as best as possible given the current + hardware limitations.) + 3) Basic warnings in the OpenSM log on "critical" errors like symbol + errors. + 4) Automatically detects "outside" resets of counters and adjusts to + continue collecting data. + 5) Can be run when OpenSM is in standby or inactive states. + +Known issues are: + + 1) Data counters will be lost on high data rate links. Sweeping the + fabric fast enough for a DDR link is not practical. + 2) Default partition support only. + + +Setup and Usage +=============== + +Using the Performance Manager consists of 3 steps: + + 1) compiling in support for the perfmgr (Optionally: the console + socket as well) + 2) enabling the perfmgr and console in opensm.opts + 3) retrieving data which has been collected. + 3a) using console to "dump data" + 3b) using a plugin module to store the data to your own + "database" + +Step 1: Compile in support for the Performance Manager +------------------------------------------------------ + +Because of the performance manager's experimental status, it is not enabled at +compile time by default. (This will hopefully soon change as more people use +it and confirm that it does not break things... ;-) The configure option is +"--enable-perf-mgr". + +At this time it is really best to enable the console socket option as well. +OpenSM can be run in an "interactive" mode. But with the console socket option +turned on one can also make a connection to a running OpenSM. The console +option is "--enable-console-socket". This option requires the use of +tcp_wrappers to ensure security. Please be aware of your configuration for +tcp_wrappers as the commands presented in the console can affect the operation +of your subnet. + +The following configure line includes turning on the performance manager as +well as the console: + + ./configure --enable-perf-mgr --enable-console-socket + + +Step 2: Enable the perfmgr and console in opensm.opts +----------------------------------------------------- + +Turning the Perfmorance Manager on is pretty easy, set the following options in +the opensm.opts config file. (Default location is +/var/cache/opensm/opensm.opts) + + # Turn it all on. + perfmgr TRUE + + # sweep time in seconds + perfmgr_sweep_time_s 180 + + # Dump file to dump the events to + event_db_dump_file /var/log/opensm_port_counters.log + +Also enable the console socket and configure the port for it to listen to if +desired. + + # console [off|local|socket] + console socket + + # Telnet port for console (default 10000) + console_port 10000 + +As noted above you also need to set up tcp_wrappers to prevent unauthorized +users from connecting to the console.[*] + + [*] As an alternate you can use the loopback mode but I noticed when + writing this (OpenSM v3.1.10; OFED 1.3) that there are some bugs in + specifying the loopback mode in the opensm.opts file. Look for this to + be fixed in newer versions. + + [**] Also you could use "local" but this is only useful if you run + OpenSM in the foreground of a terminal. As OpenSM is usually started + as a daemon I left this out as an option. + +Step 3: retrieve data which has been collected +---------------------------------------------- + +Step 3a: Using console dump function +------------------------------------ + +The console command "perfmgr dump_counters" will dump counters to the file +specified in the opensm.opts file. In the example above +"/var/log/opensm_port_counters.log" + +Example output is below: + +<snip> +"SW1 wopr ISR9024D (MLX4 FW)" 0x8f10400411f56 port 1 (Since Mon May 12 13:27:14 2008) + symbol_err_cnt : 0 + link_err_recover : 0 + link_downed : 0 + rcv_err : 0 + rcv_rem_phys_err : 0 + rcv_switch_relay_err : 2 + xmit_discards : 0 + xmit_constraint_err : 0 + rcv_constraint_err : 0 + link_integrity_err : 0 + buf_overrun_err : 0 + vl15_dropped : 0 + xmit_data : 470435 + rcv_data : 405956 + xmit_pkts : 8954 + rcv_pkts : 6900 + unicast_xmit_pkts : 0 + unicast_rcv_pkts : 0 + multicast_xmit_pkts : 0 + multicast_rcv_pkts : 0 +</snip> + + +Step 3b: Using a plugin module +------------------------------ + +If you want a more automated method of retrieving the data OpenSM provides a +plugin interface to extend OpenSM. The header file is osm_event_plugin.h. +The functions you register with this interface will be called when data is +collected. You can then use that data as appropriate. + +An example plugin can be configured at compile time using the +"--enable-default-event-plugin" option on the configure line. This plugin is +very simple. It logs "events" recieved from the performance manager to a log +file. I don't recomend using this directly but rather use it as a templat to +create your own plugin. + diff --git a/opensm/opensm.spec.in b/opensm/opensm.spec.in index feabfef..c36d6f2 100644 --- a/opensm/opensm.spec.in +++ b/opensm/opensm.spec.in @@ -125,7 +125,7 @@ fi %{_sbindir}/opensm %{_sbindir}/osmtest %{_mandir}/man8/* -%doc AUTHORS COPYING README +%doc AUTHORS COPYING README doc/performance-manager-HOWTO.txt %{_sysconfdir}/init.d/opensmd %{_sbindir}/sldd.sh %config(noreplace) @OPENSM_CONFIG_DIR@/opensm.conf -- 1.5.1 _______________________________________________ general mailing list general@lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general