Re: [collectd] curl_xml plugin

2010-01-13 Thread Amit Gupta
On Mon, Jan 11, 2010 at 9:26 PM, Florian Forster o...@verplant.org wrote:

 Hi Amit,

 thank you very much for your patch :)


Thanks for applying the patch :)


 I did the following changes:

  * Re-order all of the functions to get rid of the forward declarations.
  * Replace the string and boolean handling functions for config options
   with functions declared in src/configfile.h. This way the
   functionality is not duplicated.
  * Put the type_instance setting and value parsing into separate
   functions. This makes “cx_submit_xpath_values” considerably
   shorter.

 Please let me know if my changes broke anything – unfortunately I didn't
 have the possibility to test them :/


I did some sanity testing and the plugin seems to be working fine.


  #define CX_KEY_MAGIC 0x43484b59UL /* CHKY */
  #define CX_IS_KEY(key) (key)->magic == CX_KEY_MAGIC

 What do we need this hack for? I've found only one place where
 “c_avl_insert” is called, so the elements in the tree should all be of
 the same type, right? It'd be great if you could remove or document this.


There is no need for this magic key. It just happens to be there since I
borrowed the code from curl_json :). I will clean this up.


 Maybe the entire AVL tree should be removed: As far as I see it is only
 used to iterate over the values (rather than searching for a specific
 key), so a linked list is probably more appropriate.


Yeah, I agree. I have replaced the AVL tree with a linked list.


 Both “cx_check_type” and “cx_submit_xpath_values” call “plugin_get_ds”.
 I think one call could be optimized away.

This is also done.

Please find the updated patch attached.

Regards
Amit


 Best regards,
 —octo

 [0] http://git.verplant.org/?p=collectd.git;a=blob_plain;f=src/curl_xml.c;hb=refs/heads/ag/curl_xml
 
 --
 Florian octo Forster
 Hacker in training
 GnuPG: 0x91523C3D
 http://verplant.org/



--- collectd-latest/src/curl_xml.c	Wed Jan 13 16:23:42 2010
+++ collectd-latest-mine/src/curl_xml.c	Wed Jan 13 15:39:22 2010
@@ -23,7 +23,7 @@
 #include "common.h"
 #include "plugin.h"
 #include "configfile.h"
-#include "utils_avltree.h"
+#include "utils_llist.h"
 
 #include <libxml/parser.h>
 #include <libxml/tree.h>
@@ -32,8 +32,6 @@
 #include <curl/curl.h>
 
 #define CX_DEFAULT_HOST "localhost"
-#define CX_KEY_MAGIC 0x43484b59UL /* CHKY */
-#define CX_IS_KEY(key) (key)->magic == CX_KEY_MAGIC
 
 /*
  * Private data structures
@@ -79,7 +77,7 @@
   size_t buffer_size;
   size_t buffer_fill;
 
-  c_avl_tree_t *tree; /* tree of xpath blocks */
+  llist_t *list; /* list of xpath blocks */
 };
 typedef struct cx_s cx_t; /* }}} */
 
@@ -138,25 +136,26 @@
   sfree (xpath);
 } /* }}} void cx_xpath_free */
 
-static void cx_tree_free (c_avl_tree_t *tree) /* {{{ */
+static void cx_list_free (llist_t *list) /* {{{ */
 {
-  char *name;
-  void *value;
+  llentry_t *le;
 
-  while (c_avl_pick (tree, (void *) &name, (void *) &value) == 0)
+  le = llist_head (list);
+  while (le != NULL)
   {
-    cx_xpath_t *key = (cx_xpath_t *)value;
+    llentry_t *le_next;
 
-    if (CX_IS_KEY(key))
-      cx_xpath_free (key);
-    else
-      cx_tree_free ((c_avl_tree_t *)value);
+    le_next = le->next;
 
-    sfree (name);
+    sfree (le->key);
+    cx_xpath_free (le->value);
+
+    le = le_next;
   }
 
-  c_avl_destroy (tree);
-} /* }}} void cx_tree_free */
+  llist_destroy (list);
+  list = NULL;
+} /* }}} void cx_list_free */
 
 static void cx_free (void *arg) /* {{{ */
 {
@@ -173,9 +172,8 @@
     curl_easy_cleanup (db->curl);
   db->curl = NULL;
 
-  if (db->tree != NULL)
-    cx_tree_free (db->tree);
-  db->tree = NULL;
+  if (db->list != NULL)
+    cx_list_free (db->list);
 
   sfree (db->buffer);
   sfree (db->instance);
@@ -190,11 +188,8 @@
   sfree (db);
 } /* }}} void cx_free */
 
-static int cx_check_type (cx_xpath_t *xpath) /* {{{ */
+static int cx_check_type (const data_set_t *ds, cx_xpath_t *xpath) /* {{{ */
 {
-  const data_set_t *ds;
-
-  ds = plugin_get_ds (xpath->type);
   if (!ds)
   {
     WARNING ("curl_xml plugin: DataSet `%s' not defined.", xpath->type);
@@ -373,7 +368,7 @@
     if ( (tmp_size == 0) && (is_table) )
     {
       WARNING ("curl_xml plugin: "
-          "relative xpath expression for 'Instance' \"%s\" doesn't match "
+          "relative xpath expression for 'InstanceFrom' \"%s\" doesn't match "
           "any of the nodes. Skipping the node.", xpath->instance);
       xmlXPathFreeObject (instance_node_obj);
       return (-1);
@@ -382,7 +377,7 @@
     if (tmp_size > 1)
     {
       WARNING ("curl_xml plugin: "
-          "relative xpath expression for 'Instance' \"%s\" is expected "
+          "relative xpath expression for 'InstanceFrom' \"%s\" is expected "
           "to return only one text node. Skipping the node.", xpath->instance);
       xmlXPathFreeObject (instance_node_obj);
       return (-1);
@@ -425,7 

Re: [collectd] Strange SNMP collection glitches

2010-01-13 Thread Florian Forster
On Wed, Jan 13, 2010 at 11:57:05AM +0100, Mirko Buffoni wrote:
 If that is the case, how could I solve this behavior which is going to
 cause the graphs to be unusable due to the oversized scale factor?

One way is to replace the COUNTER data source with a DERIVE data source
and set the minimum value to zero. DERIVE data sources are documented
at [0]. Then, when the counter is reset to zero, the rate conversion
((new value - old value) / interval) will result in a negative value
(because new value is zero). This negative value is then ignored due
to the minimum value being zero. The downside is that this will happen
also when the counter overflows, so you will occasionally lose a
legitimate value.

You can change the DS type of existing RRD files using
  rrdtool tune --data-source-type …
  (see rrdtune(1))

Another possibility is to set a correct maximum value. For example, if
you have a 10 Mbps line you could set the maximum to 1250000 (bytes/s).

You can change the maximum value of a DS using
  rrdtool tune --maximum …
  (see rrdtune(1))
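
For example, combining the two suggestions above (the file name and
data-source names below are assumptions; check `rrdtool info` for the
real DS names in your files):

```shell
# Inspect the data sources first (names below are assumed).
rrdtool info if_octets.rrd | grep '^ds\['

# Switch both data sources from COUNTER to DERIVE with a minimum of 0,
# so that counter resets are discarded instead of drawn as spikes.
rrdtool tune if_octets.rrd \
  --data-source-type rx:DERIVE --minimum rx:0 \
  --data-source-type tx:DERIVE --minimum tx:0

# Alternatively (or additionally), cap the rate for a 10 Mbps line.
rrdtool tune if_octets.rrd --maximum rx:1250000 --maximum tx:1250000
```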

 In the meantime, I'll try to increase traffic to speed up the counter
 to reach the overflow point.

I don't think that will cause the problem. COUNTER data sources handle
*overflows* correctly, only counter *resets* are a problem. Resets are
often caused when an interface is taken down and up again. Especially
WAN interfaces are prone to this.
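
The reset-vs-overflow distinction above can be sketched as follows.
This is a minimal illustration of the DERIVE behavior, not rrdtool's
actual code; the function name and the -1.0 "unknown" sentinel are
inventions for the example:

```c
/* Sketch of the rate conversion described above.  A DERIVE data
 * source computes (new_value - old_value) / interval; with the DS
 * minimum set to zero, a negative rate -- which is what a counter
 * reset produces -- is discarded instead of drawn as a huge spike. */
static double derive_rate (double old_value, double new_value,
                           double interval)
{
  double rate = (new_value - old_value) / interval;
  if (rate < 0.0)  /* below the minimum of zero: treat as unknown */
    return -1.0;   /* sentinel for "ignore this sample" */
  return rate;
}
```

A counter that advanced from 1000 to 1500 over 10 seconds yields a
rate of 50/s; one that was reset from 1000 to 0 yields a negative
rate, which the minimum of zero suppresses. A COUNTER source would
instead interpret that reset as a wrap and report an enormous rate.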

Regards,
—octo

P.S.: I've CC'd the mailing list because your first question is probably
  interesting for other people, too.

[0] http://collectd.org/wiki/index.php/Data_source
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x91523C3D
http://verplant.org/


signature.asc
Description: Digital signature
___
collectd mailing list
collectd@verplant.org
http://mailman.verplant.org/listinfo/collectd


Re: [collectd] Strange SNMP collection glitches

2010-01-13 Thread Mirko Buffoni
At 12.07 13/01/2010 +0100, you wrote:
On Wed, Jan 13, 2010 at 11:57:05AM +0100, Mirko Buffoni wrote:
  If that is the case, how could I solve this behavior which is going to
  cause the graphs to be unusable due to the oversized scale factor?

One way is to replace the COUNTER data source with a DERIVE data source
and set the minimum value to zero. DERIVE data sources are documented
at [0]. Then, when the counter is reset to zero, the rate conversion
((new value - old value) / interval) will result in a negative value
(because new value is zero). This negative value is then ignored due
to the minimum value being zero. The downside is that this will happen
also when the counter overflows, so you will occasionally lose a
legitimate value.

You can change the DS type of existing RRD files using
   rrdtool tune --data-source-type …
   (see rrdtune(1))

Another possibility is to set a correct maximum value. For example, if
you have a 10 Mbps line you could set the maximum to 1250000 (bytes/s).

You can change the maximum value of a DS using
   rrdtool tune --maximum …
   (see rrdtune(1))

I set the correct maximum value on the RRD archive for both data
sources, tx and rx.

rrdtool tune archive.rrd --maximum rx:240
rrdtool tune archive.rrd --maximum tx:240

With a dump I can see that the change has been applied correctly.
However, the past daily/weekly/monthly graphs are unchanged and retain
those autoscaled values.  I wouldn't want to change collection.cgi, but
I'd like to readjust the past rrd values to the new constraints.  Is this
possible?


Mirko




Re: [collectd] Strange SNMP collection glitches

2010-01-13 Thread Florian Forster
Hi Mirko,

On Wed, Jan 13, 2010 at 12:36:20PM +0100, Mirko Buffoni wrote:
 with a dump I see that the change has been done correctly.
 However the past daily/weekly/monthly graphs are unchanged and retain
 those autoscaled values.  I wouldn't want to change collection.cgi,
 but I'd like to readjust past rrd values to the new constraints.  Is
 this possible?

yes, this is done by dumping the RRD file to its XML representation
(rrddump(1)) and then restoring the binary file with the --range-check
option (rrdrestore(1)).
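
A minimal round trip might look like this (file names are examples):

```shell
# Dump the binary RRD file to its XML representation.
rrdtool dump if_octets.rrd > if_octets.xml

# Restore it; --range-check replaces values outside the configured
# minimum/maximum with *UNKNOWN* while rebuilding the file.
rrdtool restore --range-check if_octets.xml if_octets-fixed.rrd

# Verify, then move the fixed file into place.
rrdtool info if_octets-fixed.rrd | head
mv if_octets-fixed.rrd if_octets.rrd
```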

Regards,
—octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x91523C3D
http://verplant.org/




[collectd] rrdcached plugin question

2010-01-13 Thread Mark Moseley
I just started messing with rrdcached+collectd, so it's not impossible
that I've obtusely missed something in the docs.

* Is it possible to set RRARows and RRATimespan in the rrdcached
plugin like you can for rrdtool? If I try using them, collectd
complains that:

[2010-01-13 18:06:36] Plugin `rrdcached' did not register for value `RRARows'.
[2010-01-13 18:06:36] Plugin `rrdcached' did not register for value `RRATimespan'.
[2010-01-13 18:06:36] Plugin `rrdcached' did not register for value `RRATimespan'.
[2010-01-13 18:06:36] Plugin `rrdcached' did not register for value `RRATimespan'.
[2010-01-13 18:06:36] Plugin `rrdcached' did not register for value `RRATimespan'.
[2010-01-13 18:06:36] Plugin `rrdcached' did not register for value `RRATimespan'.

And any rrd files created have the usual collectd defaults, instead of
what I put. And they only seem to be registered in src/rrdtool.c. I
also tried turning on the rrdtool plugin but without a DataDir in
the hopes it might have some effect. Is there a trick here that I'm
missing? We've got non-technical folks running reports from our rrd
data, and they balk at the built-in intervals, so I've got the
following because it falls on nice, grok-able datapoints, spaced by 5
mins, 1 day, etc.

Here's my rrdcached (pointers most welcome):

/usr/bin/rrdcached -l unix:/var/run/rrdcached/rrdcached.sock \
  -b /var/lib/collectd/rrd2 -j /var/run/rrdcached \
  -w 3600 -z 3600 -f 7200 -t 10 -p /var/run/rrdcached/rrdcached.pid

I've been working with a super stripped-down collectd.conf, to try to
rule out other things. Here it is:

# Hostname  localhost
FQDNLookup  true
BaseDir /var/lib/collectd
PIDFile /var/run/collectd/collectd.pid
PluginDir   /usr/lib/collectd
TypesDB /usr/share/collectd/types.db
Interval30
ReadThreads 20

LoadPlugin logfile

<Plugin logfile>
LogLevel error
File STDOUT
Timestamp true
</Plugin>

LoadPlugin network
# LoadPlugin rrdtool
LoadPlugin rrdcached


<Plugin network>
Listen 10.20.2.2 25826
TimeToLive 128
</Plugin>


# <Plugin rrdtool>
#   DataDir /var/lib/collectd/rrd
#   CacheTimeout300
#   CacheFlush  600
#   RandomTimeout   60
#
#   # StepSize  30
#   # HeartBeat 120
#   WritesPerSecond 500
#   RRARows 2400
#   RRATimespan 4800
#   RRATimespan 144000
#   RRATimespan 72
#   RRATimespan 432
#   RRATimespan 4320
# </Plugin>

<Plugin rrdcached>
DataDir /var/lib/collectd/rrd2
CreateFiles true
DaemonAddress   unix:/var/run/rrdcached/rrdcached.sock

# StepSize  30
# HeartBeat 120
WritesPerSecond 500
RRARows 2400
RRATimespan 4800
RRATimespan 144000
RRATimespan 72
RRATimespan 432
RRATimespan 4320
</Plugin>



[collectd] Processes plugin

2010-01-13 Thread Mark Moseley
Since the 4.9.0 upgrade, I see this popping up on all of my boxes:

Jan 13 20:35:39 server collectd[8501]: rrdtool plugin: rrd_update_r
(/var/lib/collectd/rrd/server/processes-httpd/ps_disk_octets.rrd)
failed: not a simple integer: '-1719325917'

It's not happening every collectd interval but it looks like once
every 3-7 intervals, and always seems to be the ps_disk_octets metric.

Here's a grab of non-nan for processes-httpd/ps_disk_octets.rrd MAX:

1263426420: 8.9397714667e+06 8.8867349333e+06
1263426480: 8.9397714667e+06 8.8867349333e+06
1263426540: 2.8722361756e+06 2.8611358122e+06
1263426600: 2.8722361756e+06 2.8611358122e+06
1263429840: 6.9840988933e+06 6.9629671933e+06
1263429900: 6.9840988933e+06 6.9629671933e+06
1263429960: 4.1813118933e+06 4.1651611100e+06
1263430020: 2.4956140633e+06 2.4853952000e+06
1263432180: 1.5469198067e+07 1.5460964333e+07
1263432240: 3.5285845300e+06 3.5101065467e+06

There are big holes there and the 'nan' rows are about 60% of the
file. The biggest recorded value is 28858203.767. The lowest number
reported in the error message is -2147483522 (ranges all the way up to
-5). Presumably something's overflowing :)
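
The numbers are consistent with a signed 32-bit wraparound: -2147483522
is just past INT32_MAX (2147483647). A hypothetical illustration of
that failure mode (not collectd's actual code; the function is invented
for the example):

```c
#include <stdint.h>

/* Sketch of the suspected failure mode: a byte counter kept in a
 * signed 32-bit integer.  Once the sum passes INT32_MAX it wraps
 * around to a large negative number, much like the -2147483522 in
 * the log message above. */
static int32_t add_wrapping (int32_t total, int32_t bytes)
{
  /* Perform the addition in unsigned arithmetic, where wraparound is
   * well-defined in C, then convert back to signed. */
  return (int32_t) ((uint32_t) total + (uint32_t) bytes);
}
```

Adding 200 to a total of 2147483522 produces -2147483574 instead of
2147483722, and rrdtool then rejects the negative string as "not a
simple integer".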

Other background: These are all Debian Etch, running collectd 4.9.0.
They're all 32-bit boxes, all running fairly new linux kernels, all
with CONFIG_TASK_IO_ACCOUNTING=y. The example above is from a box
running 2.6.32.3, but I see this happening on other boxes regardless
of the kernel (even down to 2.6.29.x and beyond).

The above example is a pretty heavily loaded web server. Though it's
serving *only* read-only web traffic, it does write a good deal of
logs out, so it's not impossible for it to have very high IO numbers.

This isn't a big deal, just a minor annoyance, but I figured I'd mention it.
