Re: [collectd] [PATCH] Fix strtok_r check in configure.in

2010-05-01 Thread Florian Forster
Hi,

sorry for re-doing your work, but I felt it'd be best to mimic the
behavior of src/Makefile.am as closely as possible. I've therefore added
a check for GCC before the block and set "-Wall -Werror" if the compiler
is GCC. Hope this works as intended.

Regards,
—octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x91523C3D
http://verplant.org/


signature.asc
Description: Digital signature
___
collectd mailing list
collectd@verplant.org
http://mailman.verplant.org/listinfo/collectd


[collectd] Version 4.10.0 available

2010-05-01 Thread Florian Forster
Hello everybody,

I'm pleased to announce the release of collectd 4.10.0, a new minor
release of collectd.

This version introduces several new features, including three new
plugins for retrieving and parsing XML files, reading Modbus/TCP
registers and receiving PHP profiling information. Improvements to
existing plugins include Python 3 support in the Python plugin,
manually setting an interface in the Network plugin and support for
pinging "dynamic DNS" hosts in the Ping plugin.

This will probably be the last (feature) release of major version 4. The
next version with new features will be version 5.0. I hope to be able to
release 5.0 in July, in time for collectd's five year anniversary ;)


Download
--------


The new version is available in source-code form from collectd's
download page. The direct download links are:

  Version 4.10.0:

  * http://collectd.org/files/collectd-4.10.0.tar.bz2
    SHA-1: eda40d632d24dd5ec3d124a305d5c1a4ac8d62d7

  * http://collectd.org/files/collectd-4.10.0.tar.gz
    SHA-1: 5763f28b8721be115afed58752f5de76c627bba5


Thanks
------

Thanks to everybody who helped with this new version. In particular, the
release contains new code by:

  * Amit Gupta
  * Andrés J. Díaz
  * Chris Buben
  * Clément Stenac
  * Fabian Schuh
  * Julien Ammous
  * Lorin Scraba
  * Max Henkel
  * Peter Warasin
  * Phoenix Kayo
  * Sebastian Harl
  * Stefan Völkel
  * Sven Trenkel
  * Vaclav Malek


ChangeLog
---------
2010-05-01, Version 4.10.0
  * collectd: JSON output now includes the "dstypes" and "dsnames"
    fields. This makes it easier for external applications to interpret
    the data. Thanks to Chris Buben for his work.
  * collectd: The new "Timeout" option can be used to specify a
    "timeout" for missing values. This is used in the threshold checking
    code to detect missing values. Thanks to Andrés J. Díaz for the
    patch.
  * apache plugin: Support for "IdleWorkers" (Apache 1.*: "IdleServers")
    has been added.
  * curl plugin: The new "ExcludeRegex" allows to easily exclude certain
    lines from the match.
  * curl_xml plugin: This new plugin allows to read XML files using cURL
    and extract metrics included in the files. Thanks to Amit Gupta for
    his work.
  * filecount plugin: The new "IncludeHidden" option allows to include
    "hidden" files and directories in the statistics. Thanks to Vaclav
    Malek for the patch.
  * logfile plugin: The new "PrintSeverity" option allows to include the
    severity of a message in the output. Thanks to Clément Stenac for
    his patch.
  * memcachec plugin: The new "ExcludeRegex" allows to easily exclude
    certain lines from the match.
  * modbus plugin: This new plugin allows to read registers from
    Modbus-TCP enabled devices.
  * network plugin: The new "Interface" option allows to set the
    interface to be used for multicast and, if supported, unicast
    traffic. Thanks to Max Henkel for his work.
  * openvpn plugin: The "CollectUserCount" and "CollectIndividualUsers"
    options allow more detailed control over how to report sessions of
    multiple users. Thanks to Fabian Schuh for his work.
  * pinba plugin: This new plugin receives timing information from the
    Pinba PHP extension, which can be used for profiling PHP code and
    webserver performance.
  * ping plugin: The new "MaxMissed" allows to re-resolve a host's
    address when it doesn't reply to a number of ping requests. Thanks
    to Stefan Völkel for the patch.
  * postgresql plugin: The "Interval" config option has been added. The
    plugin has been relicensed under the 2-clause BSD license. Thanks to
    Sebastian Harl for his work.
  * processes plugin: Support for "code" and "data" virtual memory sizes
    has been added. Thanks to Clément Stenac for his patch.
  * python plugin: Support for Python 3 has been implemented. Thanks to
    Sven Trenkel for his work.
  * routeros plugin: Support for collecting CPU load, memory usage, used
    and free disk space, sectors written and number of bad blocks from
    MikroTik devices has been added.
  * swap plugin: Support for Linux < 2.6 has been added. Thanks to Lorin
    Scraba for his patch.
  * tail plugin: The new "ExcludeRegex" allows to easily exclude certain
    lines from the match. Thanks to Peter Warasin for his patch.
  * write_http plugin: The "StoreRates" option has been added. Thanks to
    Paul Sadauskas for his patch.
  * regex match: The "Invert" option has been added. Thanks to Julien
    Ammous for his patch.


Best regards,
—octo


Re: [collectd] [PATCH] configure.in: have_htonll should depend on linker check

2010-05-19 Thread Florian Forster
Hi Max,

On Tue, May 04, 2010 at 10:17:53AM +0200, Max Henkel wrote:
> During cross-compiling I've observed, that commit 35602ac1 introduced
> an unresolvable (regarding cross-compiling) configure error.

thanks for catching this :) I've applied the patch to the collectd-4.9
branch.

Regards,
—octo


Re: [collectd] Tail plugin "missing" events

2010-05-19 Thread Florian Forster
Hi Gregory,

On Wed, May 05, 2010 at 01:56:56PM +0100, Gregory Giguashvili wrote:
> I'm seeing these low values when displaying the results using "rrdtool
> graph". I've had some time to read through collectd code and rrdtool
> documentation and it looks like I need something like GaugeInc type
> which would sum-up pattern occurrences in the log file and display
> numbers as is without extrapolating them over period of time. 

if you're using RRDtool, the number will be aggregated eventually.
Your best bet is to multiply the value by the time difference between
two samples, which should result in roughly the number of lines matched
in the displayed time period.
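
As a rough illustration of that multiplication, here is a minimal Python
sketch. The function name and the sample numbers are made up for this
example and are not part of collectd or RRDtool; it only assumes that the
stored value is a rate in events per second.

```python
def events_in_interval(rate_per_second, start, end):
    """Approximate the number of matched lines between two timestamps.

    RRDtool hands back a rate (events per second), so multiplying it by
    the elapsed time gives roughly the number of events in that period.
    """
    if end < start:
        raise ValueError("end must not be before start")
    return rate_per_second * (end - start)


# Hypothetical sample: an average of 0.25 matches per second over 60 seconds.
print(events_in_interval(0.25, 1273064400, 1273064460))
```

The result is only an approximation, because RRDtool has already averaged
the rate over the consolidation interval.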

Reading those graphs is not trivial though, since the number in the
graph will increase with the timespan being displayed as a result of
more and more events being shown in the graph.

Personally I think what you're trying to achieve is best done with a
COUNTER or DERIVE data source, but if you send a patch for a
"GaugeInc" type I don't have any general objections ;)

Regards,
—octo


Re: [collectd] swap plugin improvements

2010-05-19 Thread Florian Forster
Hi Aurélien,

thank you very much for your patches :)

On Tue, May 04, 2010 at 03:05:11PM +0200, colle...@wattapower.net wrote:
> The second patch adds support for the two-args version of swapctl() in
> the actual swap plugin code.

I've applied the first two patches to the master branch, so they will be
included in the next feature release of collectd.

> As a consequence the returned metrics are quite different. "reserved"
> is no more, and "used" + "free" reflects the total amount of on-disk
> swap space. This is in contrast to the libkstat implementation where
> "used" + "free" + "reserved" added up to more than on-disk swap, as it
> comprised a variable amount of RAM.
> 
> In my opinion the previous metrics have their interest and should be
> preserved. But as they reflect the behavior of the Solaris virtual
> memory subsystem rather than strictly speaking swap devices, maybe
> they should be moved to another plugin like vmem ?

I don't know enough about the memory management of Solaris to have an
opinion here, maybe someone with more knowledge can share his opinion?

In general, now would be the perfect time for such a change since the
next release may include backwards incompatible changes, such as this
one. I've added a note to the “Plans for 5.0” wiki page [0], so I don't
forget to check up on this patch before releasing 5.0.

Regards,
—octo

[0] 


Re: [collectd] I don't know the ASN type

2010-05-19 Thread Florian Forster
Hi,

On Sat, May 08, 2010 at 04:56:50PM +0200, Flyinvap wrote:
> Is it possible to have oid included in message for this type log ? I
> can be useful to find quickly the wrong data.

that's a good idea. I've added this to the collectd-4.9 branch [0].

Regards,
—octo

[0] 



Re: [collectd] interval in network packets?

2010-05-19 Thread Florian Forster
Hi Thorsten,

On Fri, May 07, 2010 at 11:26:09AM -0700, Thorsten von Eicken wrote:
> Stupid question: what is the interval in network packets used for? I
> would think that the only interval that matters in the end is the one
> defined in the RRDs.

the interval is used to set the correct "step" setting when creating new
RRD files, for instance. Another use is to detect values that have timed
out: By default values that are collected every 10 seconds time out
after 20 seconds while values that are collected once a minute time out
after two minutes.
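
The timeout rule can be sketched in a few lines of Python. This is an
illustration only, not collectd's actual code: the function name, the
"timeout_factor" parameter and the strict comparison at the boundary are
assumptions; only the factor of two comes from the numbers above.

```python
def has_timed_out(last_update, interval, now, timeout_factor=2):
    """Return True if no value arrived within timeout_factor * interval.

    All arguments are in seconds. The default factor of 2 reproduces the
    examples above: 10 s -> 20 s and 60 s -> 120 s.
    """
    return (now - last_update) > timeout_factor * interval


# A value collected every 10 seconds, last seen 25 seconds ago:
print(has_timed_out(last_update=1000, interval=10, now=1025))
# A value collected once a minute, last seen 90 seconds ago:
print(has_timed_out(last_update=1000, interval=60, now=1090))
```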

> Does collectd do any combining of values within an interval?

No.

> Does the interval in the network packets have to match the one used in
> the final RRD or else inaccuracies result?

Short answer: Yes; longer answer: If the "step" (== interval) of the RRD
files is wider / longer than the actual interval in between two updates,
you will lose accuracy but everything will still work smoothly. If the
actual interval is larger than the "heartbeat" (which is 2*step by
default in RRD files created by collectd), you will end up with gaps in
your graphs or (depending on the "XFF" setting) totally empty graphs.
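
The two cases can be summarized in a small Python sketch. The function
and the returned labels are invented for this example; the case where the
update interval lies between "step" and "heartbeat" is not covered by the
explanation above and is simply treated as unproblematic here.

```python
def classify_update_interval(actual_interval, step, heartbeat=None):
    """Describe what to expect for a given update interval (in seconds).

    heartbeat defaults to 2 * step, as in RRD files created by collectd.
    """
    if heartbeat is None:
        heartbeat = 2 * step
    if actual_interval > heartbeat:
        return "gaps"              # updates arrive too rarely
    if actual_interval < step:
        return "reduced accuracy"  # several updates fall into one step
    return "fine"                  # assumption: nothing special happens


print(classify_update_interval(actual_interval=10, step=60))
print(classify_update_interval(actual_interval=300, step=60))
print(classify_update_interval(actual_interval=60, step=60))
```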

HTH,
—octo


Re: [collectd] Bugfix for libcollectdclient lcc_putval.

2010-05-19 Thread Florian Forster
Hi Johan,

On Wed, May 12, 2010 at 10:44:41AM +0200, Johan Van den Brande wrote:
> The wire format of the putval command is missing a space behind the
> closing double quote of the identifier.

thanks for the pointer and the patch :) I've applied your patch to the
collectd-4.9 branch.

Regards,
—octo


Re: [collectd] Compilation fixes for AIX

2010-05-19 Thread Florian Forster
Hi Aurelien,

On Wed, May 12, 2010 at 11:55:03AM +0200, Aurelien Reynaud wrote:
> here are two minor fixes to allow compilation of the snmp and
> processes plugins on AIX.

thanks for your patches, I've applied them to the collectd-4.9 branch.

Best regards,
—octo


Re: [collectd] Segfault with python plugin

2010-05-19 Thread Florian Forster
Hi Giorgio,

On Sat, May 15, 2010 at 05:40:54PM -0400, Giorgio Lansing wrote:
> I'm using collectd-4.10.0 on Ubuntu 9.10 64-bit with python 2.6.4rc2.
> My python plugins seem to be working fine and collect data
> successfully, but when I checked the logs I noticed that segfaults
> were occurring frequently and collectd was restarting every 5 minutes.

this problem is *very hard* to debug without the appropriate debugging
information. To create that, you first need to compile with debugging
symbols enabled ("-g") and ideally with optimization disabled ("-O0").
So the appropriate way to run configure would be:

  $ ./configure ... CFLAGS="-g -O0"

Then you need to tell your system to write a "core file" when a program
dies due to a segmentation fault. You can do this with:

  $ ulimit -c unlimited

If this is in effect, the system will write a core dump when the process
dies. The file is called "core" or "core.$PID" and resides in the
"current working directory" of the process, usually /var/lib/collectd.

You then need to point a debugger at the core file, for example the GNU
debugger "gdb":

  $ gdb /usr/sbin/collectd /var/lib/collectd/core

The debugger will open a shell-like interface where you tell it to print
a "stack backtrace" using "bt full":

  (gdb) bt full

The command will print (a lot of) output which will hopefully help us to
solve the problem.

HTH,
—octo


Re: [collectd] attempting to use gmond plugin with collectd

2010-05-22 Thread Florian Forster
Hi Raymond,

On Fri, May 21, 2010 at 12:19:51PM -0400, Raymond Flanery wrote:
> […] The only plugins I have turned on are syslog, gmond and csv. if I
> start collectd the records in syslog look like the following:
> 
> collectd[18878]: Initialization complete, entering read-loop.
> collectdmon[18852]: Warning: collectd terminated with exit status 127
> collectdmon[18852]: Warning: restarting collectd

That sounds like collectd is crashing, likely due to a bug in the gmond
plugin. Since that plugin is probably not the most widely used one, even
obvious bugs in that plugin are possible :(

It looks like you're using a Debian package. Could you please install
the "collectd-dbg" package (which provides the debugging symbols) and
enable the creation of "core dumps" by setting
  ENABLE_COREFILES=1
in /etc/default/collectd. When collectd crashes after an
  /etc/init.d/collectd restart
it should create a core dump in the working directory of the daemon,
usually
  /var/lib/collectd

If you're using either x86 or x86_64, it's enough if you could provide
that core file to me. If you use another architecture, a stack backtrace
would be very useful.

I've just created a wiki page, "Core file" [0], which describes the
debugging process in some more detail. It'd be great if you would follow
that, provide that core file or stack backtrace and tell me if the
instructions are understandable. :)

Regards,
—octo

[0] 


Re: [collectd] gmond plugin for collectd - part 2

2010-05-22 Thread Florian Forster
Hi Raymond,

On Fri, May 21, 2010 at 02:55:29PM -0400, Raymond Flanery wrote:
> May 21 14:52:54 node250 collectd[23920]: No read-functions are registered.
> May 21 14:52:54 node250 collectd[23920]: Exiting normally.
> May 21 14:52:54 node250 collectd[23920]: gmond plugin: Stopping receive 
> thread.
> 
> Am I wrong in thinking that the gmond plugin is a read-function?

Yes, the gmond plugin doesn't have a read function, because it gets
its data "asynchronously". It starts a new thread which waits for
incoming network packets and dispatches them whenever it receives any.
There is no "read function" which needs to be called periodically. So,
if you only need the "csv", "syslog" and "gmond" plugins, this message
is correct but nothing to worry about.

(Setups like this are the reason why this message is merely a "notice",
not a "warning". It's noteworthy, but not a problem.)

Regards,
—octo


Re: [collectd] Varnish plugin

2010-06-01 Thread Florian Forster
Hi Jérôme,

thank you very much for your patch :)

I have a couple of small suggestions / questions regarding the code,
maybe you could help me out:

* You forgot to add yourself as copyright holder at the beginning of the
  file. I guess you copied the "apache" plugin and modified it. Since
  there's basically nothing of the "apache" plugin left, I'd suggest you
  simply remove those other lines.

* The new "types" introduced by the patch, "varnish_cache" and
  "varnish_connections", have multiple "data sources" each. This is
  deprecated and only used when there's a good reason for it. I think
  the "cache_ratio" type could be used instead of "varnish_cache". Many
  other plugins define their own "foo_connections" type, for example
  "memcached_connections" and "nginx_connections". Feel free to copy one
  of those definitions to "varnish_connections". [*]

* The only argument passed to "VSL_OpenStats" is a NULL pointer called
  "varnish_instance_name". Would it be reasonable to let the user
  configure something else here and use NULL as a default?

* When "VSL_OpenStats" fails, is it possible to get a more detailed
  error description or at least an error code?

* The member (value_list_t *)->values_len is of type "int". You assign
  to it a "size_t" without cast, which may result in warnings (and hence
  problems) with some optimization settings.

Best regards,
—octo

[*] In the long run, I'd prefer to have *one* type for this purpose that
is used by all plugins. I might change that in an attempt to unify
the behavior eventually.


Re: [collectd] Varnish plugin

2010-06-01 Thread Florian Forster
Hi Jérôme,

On Tue, Jun 01, 2010 at 04:07:58PM +0200, Jerome Renard wrote:
> > * The only argument passed to "VSL_OpenStats" is a NULL pointer
> >   called  "varnish_instance_name". Would it be reasonable to let the
> >   user  configure something else here and use NULL as a default?
> 
> Yeah I already thought about that and I was not sure what to do here.
> Even though it is in theory possible to run multiple varnishd
> instances on the same host, in practice it is quite common to have
> only one instance per host.

I was just curious. If having only one instance is the usual case,
going with that is alright. But in my experience, if it is possible to
have multiple instances, *someone* will come up with a reason to have
multiple instances and will want data about all of them ;)

So being able to configure multiple instances would definitely be nice
to have, but if you have better things to do that's no problem either –
let whoever needs it do the work ;)

> But maybe a relevant alternative would be to provide a configuration
> variable for this, and if it is empty, then set varnish_instance_name
> to NULL and thus get statistics for the current instance.

Should you want to look into that, here's how I'd do it:

* Register a simple config callback function. Take a look at
  "interface_config" in src/interface.c for an example.
* For every "Instance" (or whatever you choose to name it) option add the
  corresponding "value" string to a (module-)global "char **".
* In the read function, do:
    if the global "char **" is not NULL:
      iterate over all instances and record statistics in turn
    else:
      call VSL_OpenStats with NULL

> > * The member (value_list_t *)->values_len is of type "int". You
> > assign  to it a "size_t" without cast, which may result in warnings
> > (and hence  problems) with some optimization settings.
> 
> Ok I'll see how I can fix that,

A simple cast should do the trick.

> could you please give me the optimization settings you use to get such
> warnings ? So I can reproduce the issue locally and fix it :)

Sorry, but I didn't compile the code yet :/ When using GCC, "-O0" and an
x86_64 machine you should get a warning due to assigning a 64-bit value
to a 32-bit integer … But then again, I am hardly ever able to reproduce
such stuff with GCC ;)
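
Why a compiler may complain about such an assignment can be shown with
Python's ctypes module, which mimics fixed-width C integers. This is only
an illustration of the truncation, assuming a 32-bit "int" and a 64-bit
"size_t"; the helper name is made up.

```python
import ctypes


def assign_to_int(value):
    """Mimic storing a 64-bit size_t value in a 32-bit signed C int."""
    return ctypes.c_int32(value).value


print(assign_to_int(3))            # small counts survive unchanged
print(assign_to_int(2**32 + 3))    # the upper 32 bits are silently dropped
print(assign_to_int(2**31))        # wraps around to a negative number
```

The number of values in a value list is tiny in practice, so an explicit
cast is harmless there; it merely tells the compiler the narrowing is
intended.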

> On a practical side, would you prefer I fork the project on github so
> you can pull my changes when you think everything is OK for you ? Or
> do you prefer getting patches through email ?

Either way works for me. Pulling from a public repository is a tad
easier than applying the patch from a mail, but I value the publicity of
a public mailing list, too. So just pick what *you* prefer ;)

Regards,
—octo


Re: [collectd] [PATCH] Add InterfaceFormat setting to libvirt plugin

2010-06-01 Thread Florian Forster
Hi Ruben,

On Sun, May 30, 2010 at 02:57:18PM +0200, Ruben Kerkhof wrote:
> So let's introduce the InterfaceFormat setting When set to 'address'
> it uses the mac address of the interface instead of the path.

thank you very much for your patch :) I've applied it to the master
branch, so it will be included in the next feature release.

Best regards,
—octo


Re: [collectd] Solaris vmstat info available in any current plugins

2010-06-01 Thread Florian Forster
Hi Cleveland,

On Thu, May 27, 2010 at 10:59:43AM -0400, Cleveland Mark-RGKW63 wrote:
> Do any of the current plugins available on Solaris provide similar
> information to what you can get from vmstat? In particular I am
> interested in process information w/ respect to number that are
> running/blocked/waiting (r b w) and paging info, page in/page out info
> (pi po)

generally, this information is collected by the "processes" plugin.
Under Solaris, it is able to determine the number of running, blocked,
idle, … processes only.

The number of pagefaults, memory sizes etc. is determined on a
per-process basis if requested by the user. This is not yet possible
under Solaris.

Global page-in/page-out counters could be added to the "vmem" plugin,
which is Linux-only at the moment.

Regards,
—octo


Re: [collectd] trying to use rrdcached instead rrdtool plugin

2010-06-01 Thread Florian Forster
Hi Israel,

On Thu, May 27, 2010 at 01:13:58PM +0200, Israel Garcia wrote:
> me again.. I've now collectd working with rrdcached.. BUT, (always
> there's a BUT), graphs delay almost 8 min to show... My collectd
> clients send info in 30 sec... any ideas?

does the user running RRDtool have permission to write to the UNIX
socket?

Does RRDtool know that it is supposed to use the daemon to flush cached
data?

Regards,
—octo


Re: [collectd] Varnish plugin

2010-06-07 Thread Florian Forster
Hi Jérôme,

On Fri, Jun 04, 2010 at 05:42:10PM +0200, Jerome Renard wrote:
> You can indeed help if you want to, I work on a fork of Florian's
> master branch, you can get it from my github account :
> - http://github.com/jeromer/collectd

I've changed some things in the Varnish plugin based on your current
master branch. The changes are available on Github from the jr/varnish
branch [0].

Sorry that I didn't just merge your branch: I changed the author name to
your full name. While I was at it I added a "Varnish plugin: " prefix to
the commit messages because that makes writing the ChangeLog entry
easier.

Other than that, there are two new "features": The configure script now
uses pkg-config to determine the correct C- and LD-flags. Unfortunately,
the Debian package is broken in this respect so that building under
Debian currently needs some hands-on fixing. I opened bug
reports but I doubt this will be fixed soon – a similar bug report has
been open since November 2009 :(

I also implemented configuration for multiple instances. You can now
configure multiple instances using a configuration like:

  <Plugin "varnish">
    <Instance "…">
      MonitorCache true
      MonitorConnections true
    </Instance>
    <Instance "…">
      MonitorCache true
      MonitorConnections true
      MonitorBackend true
    </Instance>
  </Plugin>

To only select the default instance, you can omit the instance name:

  <Plugin "varnish">
    <Instance>
      MonitorCache true
      MonitorConnections true
    </Instance>
  </Plugin>

If no configuration is specified, a set of default metrics is collected
from the default instance. Currently that's "MonitorCache" and
"MonitorConnections" but you're the expert, so it'd be great if you
could come up with a "sane" set of defaults.

> > AFAIK, the default varnish instance name is the host's FQDN. Maybe
> > would it be an idea have the instance name appear in the path where
> > the rrd files get saved (sort of like the cpu and disk plugins do) ?
> > Although I agree that in 99% of the cases people would have only one
> > instance per host.

I'm using the "plugin instance" to store the instance name. If the
instance name is the default instance, then the plugin instance is left
empty. So in the default case, there is no clutter. I hope that's
reasonable…?

Unfortunately I'm flying blind because I don't have a Varnish instance
running anywhere. It'd be awesome if you could test the changes for me.

Regards,
—octo

[0] 


Re: [collectd] Varnish plugin

2010-06-08 Thread Florian Forster
Hi Marc,

On Tue, Jun 08, 2010 at 10:45:51AM +0200, Marc Fournier wrote:
> A few relevant pieces below:

could you look for relevant information in "config.log", the file
created by the configure script? Something like

  grep -C 10 varnish config.log

should do the trick I hope.

> /usr/lib64/libvarnish.so
> /usr/lib64/libvarnishapi.so
> /usr/lib64/libvarnishcompat.so

Also, an "ldd" of each of those files would be interesting for the RHEL
box. On Debian Squeeze I know it's broken :/

> Can you elaborate on the "hands-on fixing" you mentioned ?

Sure. Under Debian, the libvarnish-dev package has at least three bugs:

  - "libvarnishapi" uses symbols from "libvarnish" but doesn't link
against it. This means that you can't simply use "-lvarnishapi" –
you need to specify "-lvarnish", too.

  - The pkg-config file doesn't account for this, i.e. the following
output isn't enough.

> $ pkg-config --libs varnishapi
> -lvarnishapi

  - The package is missing the "libvarnishapi.so" symlink, so the linker
only finds the static libraries.

To work around these problems, do the following:

  - Create the required symlinks by hand:
  
/usr/lib # ln -s libvarnish.so.1 libvarnish.so
/usr/lib # ln -s libvarnishapi.so.1 libvarnish.so

  - Adjust the .pc file to include the required libraries. Under Debian,
edit /usr/lib/pkgconfig/varnishapi.pc so that the "Libs:" line
reads:

Libs: -L${libdir} -lvarnishapi -lvarnish -lpcre

HTH,
—octo


Re: [collectd] Varnish plugin

2010-06-08 Thread Florian Forster
Hi Jérôme,

On Tue, Jun 08, 2010 at 03:12:26PM +0200, Jerome Renard wrote:
> can you please pull the changes from my branch:
> - http://github.com/jeromer/collectd/

I just cherry-picked your changes. Could you please reset your master
branch to my “jr/varnish” branch? I don't mind the cherry-picking but
ultimately being able to simply merge the changes makes collaboration
easier.

If you need assistance to do that, please let me know.

Regards,
—octo


Re: [collectd] Varnish plugin

2010-06-08 Thread Florian Forster
Hi Marc,

On Tue, Jun 08, 2010 at 03:08:22PM +0200, Marc Fournier wrote:
> /usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../lib64/libvarnishapi.so: 
> undefined reference to `VRE_compile'
> /usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../lib64/libvarnishapi.so: 
> undefined reference to `VRE_exec'

this looks precisely like the problem on Debian. “VRE_exec” is available
from "libvarnish", so the library is missing that dependency.

> /usr/lib64/libvarnishapi.so:
> libc.so.6 => /lib64/libc.so.6 (0x003ebb20)
> /lib64/ld-linux-x86-64.so.2 (0x003ebae0)

As I thought: "libvarnishapi" is not linked against "libvarnish".

> As debian squeeze ships varnish 2.1 too, I'm wondering if the problem
> wouldn't be in varnish 2.1 itself, not in the way the distros package
> varnish ?

Yeah, I just confirmed that by looking at the current Varnish SVN trunk.

> >   - Create the required symlinks by hand:
> >   
> > /usr/lib # ln -s libvarnish.so.1 libvarnish.so
> > /usr/lib # ln -s libvarnishapi.so.1 libvarnish.so
> 
> You probably meant:
>   /usr/lib # ln -s libvarnishapi.so.1 libvarnishapi.so

Yeah, that's a typo. Sorry.

> I suppose this must also be followed by a "ldconfig" run ?

Worked for me without; YMMV.

> /usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../lib64/libvarnish.so: 
> undefined reference to `strlcpy'

That's new. I thought the GNU libc did implement "strlcpy", too..?

> readelf shows that /usr/lib/libvarnish.so has a reference to strlcpy
> but /usr/lib/libvarnishapi.so doesn't. Maybe adding -lvarnish finally
> isn't a good idea ?

"libvarnish" is needed for the "VRE_exec" and "VRE_compile" symbols, so
you can't just leave that out.

Regards,
—octo


Re: [collectd] Varnish plugin

2010-06-08 Thread Florian Forster
On Tue, Jun 08, 2010 at 10:18:28PM +0200, Florian Forster wrote:
> Yeah, I just confirmed that by looking at the current Varnish SVN trunk.

I just sent a patch to the "varnish-dev" mailing list [0]. I hope they
pick it up soon.

—octo

[0] <http://lists.varnish-cache.org/pipermail/varnish-dev/2010-June/002494.html>


Re: [collectd] Varnish plugin

2010-06-09 Thread Florian Forster
Hi Jérôme,

On Wed, Jun 09, 2010 at 08:04:21AM +0200, Jerome Renard wrote:
> While reading this email once again I am wondering if I did the right
> thing here.
> What I did is to clone your repository and checkout the jr/varnish
> branch but I do not have enough rights to push changes I guess.

I have taken your changes and modified the meta information, such as the
commit message. This resulted in a different commit-id and thus, from
the point of view of Git, in an entirely new branch.

The *content* doesn't differ, though:

  o...@alyja:~/collectd $ git diff jr/varnish..jeromer/master
  o...@alyja:~/collectd $ 

So what I'm asking is that you use my "jr/varnish" branch as the base
for new development. Here's how you do that. Beware: The following
commands will remove all changes you have in your working directory
(i.e. uncommitted stuff) and all commits after your (currently) latest
commit, 3e916c4 "- Updated configuration directives + doc".

  # Create a backup branch
  $ git checkout -b backup/master master
  # Add my Github repository as an additional "remote":
  $ git remote add -t jr/varnish octo git://github.com/octo/collectd.git
  # Fetch / update the "jr/varnish" branch from my repository:
  $ git remote update
  # Reset your master branch. This is what may cause data loss. Make
  # sure "git diff" doesn't return anything!
  $ git checkout master
  $ git reset --hard octo/jr/varnish
  # Force-update the branch in your Github repository:
  $ git push origin +master:master

Hope this helps. If not, let me know or ping me in IRC ;)
—octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x91523C3D
http://verplant.org/




Re: [collectd] Varnish plugin

2010-06-09 Thread Florian Forster
On Tue, Jun 08, 2010 at 03:07:12PM +0200, Jerome Renard wrote:
> I am thinking about documenting this workaround for Debian. Iwas
> thinking about adding a README file especially for that purpose, do
> you prefer a README.varnish file or a README.debian one ?

I'd rather get this fixed in Debian than document broken behavior.

I think documenting this in your wiki would be a good compromise. Once
the issue is fixed in Debian, we can easily update or remove the note in
the wiki.

Regards,
—octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x91523C3D
http://verplant.org/




Re: [collectd] Varnish plugin

2010-06-11 Thread Florian Forster
Hi Jérôme,

On Fri, Jun 11, 2010 at 08:40:30AM +0200, Jerome Renard wrote:
> Are you still running the plugin on your production machines ?
> Did you nothing any (new ?) issue so far ?

I've changed the data sets / types used for storing the values
yesterday. Some of the values are "gauge" values while most of them are
"counters". The plugin now uses generic data sets with gauge or derive
data sources, as needed.

I'm currently thinking about moving information from the type instance
to the plugin instance. Take the storage components for example: The
files are currently named


  host/varnish[-instance]/...
    .../total_requests-storage-file
    .../requests-storage-file-outstanding
    .../bytes-storage-file-allocated
    .../bytes-storage-file-free
    .../total_requests-storage-mem
    .../requests-storage-mem-outstanding
    .../bytes-storage-mem-outstanding
    .../bytes-storage-mem-allocated
    .../bytes-storage-mem-free
    .../total_requests-storage-synth
    .../requests-storage-synth-outstanding
    .../bytes-storage-synth-outstanding
    .../bytes-storage-synth-allocated
    .../bytes-storage-synth-free

If we move some of the information into the plugin instance, the data is
structured better:

  host/varnish-default-storage-file/...
    .../total_requests
    .../requests-outstanding
    .../bytes-allocated
    .../bytes-free
  host/varnish-default-storage-mem/...
    .../total_requests
    .../requests-outstanding
    .../bytes-outstanding
    .../bytes-allocated
    .../bytes-free
  host/varnish-default-storage-synth/...
    .../total_requests
    .../requests-outstanding
    .../bytes-outstanding
    .../bytes-allocated
    .../bytes-free

Here, "default" is the Varnish instance name (which would no longer be
allowed to be empty).

> Do you think the plugin should monitor more stuff or is that
> sufficient for now ?

I think the following might be interesting:

  s_sess       Total Sessions
  s_req        Total Requests
  s_pipe       Total pipe
  s_pass       Total pass
  s_fetch      Total fetch
  s_hdrbytes   Total header bytes
  s_bodybytes  Total body bytes

> @Florian :
> Do you think we should focus on the Wiki page that will document the
> plugin now ?

I already set up a preliminary wiki page at [0]. Feel free to edit it to
your liking ;) You need to log-in to upload example graphs, though.

Regards,
-octo

[0] 
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x91523C3D
http://verplant.org/




Re: [collectd] Varnish plugin

2010-06-12 Thread Florian Forster
Hi Jérôme,

On Fri, Jun 11, 2010 at 07:29:23PM +0200, Jerome Renard wrote:
> I tested the "cache_result" type, and when the DERIVE type is used, I
> do not get the correct numbers.

> If I compare the output I get from varnishstat and the result I get
> from the rrd generated graphs the values are completely different.

as far as I can see, varnishstat divides counter values by the uptime. The
result is therefore roughly the all-time average rate of that counter.

With collectd, you get the rate of the last $Interval seconds, for
example the last 10 seconds in the default configuration. This value may
of course be *much* smaller or larger, depending on how busy the system
was compared to the average usage.
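
To make the difference concrete, here is a minimal Python sketch with made-up numbers. The counter values, the uptime and the function names are assumptions for illustration only; nothing here is taken from varnishstat or collectd itself.

```python
def alltime_average(counter, uptime_seconds):
    """Rate as varnishstat appears to show it: counter divided by uptime."""
    return counter / uptime_seconds


def interval_rate(previous, current, interval_seconds):
    """Rate over the last interval: change since the previous read."""
    return (current - previous) / interval_seconds


# Hypothetical numbers: one day of uptime, then a busy 10 second window.
uptime = 86400
before = 860_000   # counter value 10 seconds ago
after = 864_000    # counter value now

print("all-time average:", alltime_average(after, uptime))
print("last 10 seconds: ", interval_rate(before, after, 10))
```

With these numbers the two figures differ by more than an order of magnitude even though both are derived from the same counter, which is the kind of mismatch described above.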

> I guess this is due to de derive type.

"COUNTER" and "DERIVE" only differ in the overflow and counter-reset
cases. Since overflows are next to impossible with 64bit counters, and
because the counters are reset to zero when varnish is restarted, I
opted for optimizing for the latter case, i.e. used "DERIVE".
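
A rough Python model of the two behaviours, as I understand RRDtool's handling of these data source types. The function names are hypothetical, and the minimum of 0 for the derive case is an assumption about how the data set is defined:

```python
def counter_rate(previous, current, seconds):
    """COUNTER: a decrease is interpreted as a 32 bit or 64 bit wrap-around."""
    diff = current - previous
    if diff < 0:
        diff += 2**32                  # assume a 32 bit counter wrapped
        if diff < 0:
            diff += 2**64 - 2**32      # otherwise assume a 64 bit wrap
    return diff / seconds


def derive_rate(previous, current, seconds, minimum=0.0):
    """DERIVE: plain difference; rates below the minimum become unknown."""
    rate = (current - previous) / seconds
    if minimum is not None and rate < minimum:
        return None                    # stored as "unknown" (NaN) in the RRD
    return rate


# Counter reset after a restart: the value drops from 1000 to 5.
print(counter_rate(1000, 5, 10))   # huge bogus spike
print(derive_rate(1000, 5, 10))    # None, i.e. a gap in the graph
```

In the reset case COUNTER produces a very large bogus rate, while DERIVE with a minimum of zero simply drops the sample.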

You can find a more in-depth explanation of the difference between
"COUNTER" and "DERIVE" in the collectd wiki at [0].

> But the problem is that I never get any value, it is always set to
> zero, here is what I get from the debug:

Is the counter value actually increasing? Otherwise a rate of (close to)
zero is expected and normal.

Regards,
-octo

[0] 
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x91523C3D
http://verplant.org/




Re: [collectd] collectd & hostapd + mac80211

2010-06-12 Thread Florian Forster
Hi Michael,

On Wed, Jun 09, 2010 at 09:22:05PM +0200, Michael Markstaller wrote:
> After searching a bit, I wondered wether there's already something to
> monitor wifi using hostapd (and mac80211) ?

not that I'm aware of. The "Wireless" plugin reads from /proc/wireless
and the "MadWifi" plugin works some specialized MadWifi magic to access
detailed statistics.

Regards,
—octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x91523C3D
http://verplant.org/




Re: [collectd] Help With Plugin Exec

2010-06-12 Thread Florian Forster
Hi Ravi,

On Wed, Jun 09, 2010 at 12:01:24PM -0700, Raviprakash Ramanujam wrote:
> I have a python script that needs to run under root privileges […]
> […]
> In order to run the script as root, […]

> 
> Exec "nobody:root" "/root/script.py"
> 

this configuration will run the script as user "nobody" and group
"root". Running scripts as root is not possible. You can, however, use
"sudo" inside the script to re-gain root privileges if required.

> exec plugin: exec_read_one: error = logfile plugin: fopen 
> (/var/log/collectd.log) failed: Permission denied

You're doing something mighty weird here: Why is the script printing
something about the "logfile plugin" to STDERR? And why is it trying to
access the log file directly?

Regards,
—octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x91523C3D
http://verplant.org/




Re: [collectd] unixsock-like plugin extension

2010-06-12 Thread Florian Forster
Hi Shaun,

On Fri, Jun 04, 2010 at 12:39:47PM -0700, Shaun Lindsay wrote:
> So, I wrote a plugin I'm calling aggregator to do handle this sort of
> use case.

what you're describing sounds a lot like the "Event infrastructure", an
idea that has been floating around for a long time, but nobody has actually
implemented it yet.

The idea was to implement a new command in the "UnixSock" plugin with
the following syntax:

  EVENT  []

When this command is received, the "UnixSock" plugin looks up the
structure called  or allocates it if necessary. If a  is 
given, the number is added to a counter in the structure, otherwise the
counter is increased by one.

So, for example, if a web-application issues

  EVENT "pageview"

after each page served, you'll end up with a graph showing the pageviews
per second.

A possible extension would be to allow the user to specify a "data
set" / "type" to use. The same web-application could, for example, issue
this command

  EVENT type="total_bytes" "pages" 18063

to use the "total_bytes" data set and add 18 kByte to the current
counter.
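
The bookkeeping behind such a command could look roughly like the following Python sketch. This is only an illustration of the idea: the command does not exist in collectd, and the function name, the default type name "counter" and the in-memory table are all assumptions.

```python
import shlex
from collections import defaultdict

# (type, name) -> running total; entries are allocated on first use.
counters = defaultdict(int)


def handle_event(line, default_type="counter"):
    """Parse one EVENT line and update the matching counter."""
    fields = shlex.split(line)
    if not fields or fields[0] != "EVENT":
        raise ValueError("not an EVENT command: %r" % line)
    args = fields[1:]

    ds_type = default_type
    if args and args[0].startswith("type="):
        ds_type = args.pop(0)[len("type="):]

    if not 1 <= len(args) <= 2:
        raise ValueError("expected a name and an optional number")
    name = args[0]
    increment = int(args[1]) if len(args) == 2 else 1

    counters[(ds_type, name)] += increment
    return counters[(ds_type, name)]


handle_event('EVENT "pageview"')
handle_event('EVENT "pageview"')
handle_event('EVENT type="total_bytes" "pages" 18063')
print(dict(counters))
```

A read callback would then dispatch the totals once per interval, so the graphs show events (or bytes) per second.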

> Again due to my particular use case, it needed to be able to handle a
> large number of concurrent connections (50k potentially), so rather
> than spawning a thread per connection ala unixsock, I kick off one
> thread when the plugin init's and then run a libevent server inside
> that thread and do everything asynchronously.

I think it might be possible to incorporate that into the "UnixSock"
plugin, either as a compile-time or run-time choice. We could, for
example, use libevent if it is available and fall back to the current
implementation otherwise.

> So, then, two questions:  First, did I just reinvent the wheel on
> this?  Is there a plugin that already satisfies this sort of need?

No. The idea is not new, but there is no code in the daemon / plugins
yet.

> Second, if this is new, is anyone else interested in this sort of
> functionality?

Definitely ;)

Regards,
—octo

[0] 
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x91523C3D
http://verplant.org/




Re: [collectd] Large instalaction - Collecting data from

2010-06-12 Thread Florian Forster
Hi Domagoj,

On Fri, Jun 04, 2010 at 11:17:48AM +0200, Domagoj Mikac wrote:
> I am trying to pool SNMP performance data (with SNMP plugin) from
> aproximately 1750 routers.
> I have 4 configuration files for different types of routers  as stated
> bellow:
> 
> routersA.cfg - 850 Host blocks and 5408 Data Blocks
> routersB.cfg - 683 Host blocks and 5839 Data Blocks
> routersC.cfg - 119 Host blocks and 1133 Data Blocks
> routersD.cfg - 99 Host blocks and 906 Data Blocks
> ---
> TOTAL: 1751 Host blocks and 13286 Data Blocks
> 
> My problem is that after restaring collectd it creates only 10252 RRD
> files for 1411 hosts.
> No matter how many new hosts I add in config files it doesn't create
> any new RRDs.

there's no hard limit in the "SNMP" plugin or somewhere else in
collectd, so the number of configuration blocks should not be a problem.

There is, however, a hard limit on the number of file descriptors
libnetsnmp, the SNMP library used by the plugin, can have open at any
time, AFAIK. If I recall correctly, the number was 1024. So with 1751
hosts you may well run into that problem.

I'm not quite sure what we can do about that. Your best bet is probably
to build Net-SNMP yourself and adjust the limit in the sources. Maybe
complaining to the Net-SNMP guys might help with that problem in the
future.

Regards,
—octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x91523C3D
http://verplant.org/




Re: [collectd] Large instalaction - Collecting data from

2010-06-12 Thread Florian Forster
On Sat, Jun 12, 2010 at 11:10:33AM +0200, Florian Forster wrote:
> There is, however, a hard limit on the number of file descriptors
> libnetsnmp, the SNMP library used by the plugin, […]

I may have been a bit quick to pass the blame, here. libnetsnmp uses
select(2), which accepts up to FD_SETSIZE file descriptors.

> I'm not quite sure what we can do about that.

The solution would be to switch from select(2) to poll(2) which doesn't
have this limit because it's not using bitfields to identify file
descriptors. Still, a change in libnetsnmp is required.
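
For illustration only, this is what waiting on a set of sockets with poll(2) looks like in Python. It is not libnetsnmp code; the function name is hypothetical and "select.poll" is only available on Unix-like systems. The point is that descriptors are registered by number, so there is no fixed-size bit set to overflow:

```python
import select
import socket


def wait_readable(socks, timeout_ms):
    """Return the sockets that have data waiting, using poll(2)."""
    poller = select.poll()
    by_fd = {}
    for sock in socks:
        # Descriptors are registered individually; their numeric value
        # is not limited by FD_SETSIZE as it is with select(2).
        poller.register(sock.fileno(), select.POLLIN)
        by_fd[sock.fileno()] = sock
    ready = poller.poll(timeout_ms)
    return [by_fd[fd] for fd, event in ready if event & select.POLLIN]


left, right = socket.socketpair()
left.sendall(b"ping")
print(wait_readable([right], 1000) == [right])
left.close()
right.close()
```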

Regards,
—octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x91523C3D
http://verplant.org/




Re: [collectd] Greetings (oh and some nooby help)

2010-06-23 Thread Florian Forster
Hi,

> collectd appears to be writing rrd files in
> /var/lib/collectd/www.mysite.com

> opendir > (/var/lib/collectd/rrd): No such file or directory at
> ../lib/Collectd/Graph/Common.pm line 263

either directory is fine, but they have to match.

By default the RRDtool plugin uses the …/collectd/rrd directory and the
CSV plugin uses the …/collectd/csv directory, so both plugins can be
active at the same time without getting in each other's way.

> (this feels like the wrong place to me?).

Why's that? It's data written at run-time, so $localstatedir (i.e. /var)
is the right place for this. 

> Am I nearly there?

Yeah, I think so ;)

Regards,
—octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x91523C3D
http://verplant.org/




Re: [collectd] RRDTool slowdowns

2010-07-03 Thread Florian Forster
Hi,

On Wed, Jun 30, 2010 at 04:04:31PM +0200, Alexander Wirt wrote:
> Unfortunatly that is a known problem with rrd 1.4.x, cairo is much
> slower than in 1.2.x. The only way to prevent this is to stay on 1.2  

unfortunately, it appears that graph generation isn't the only thing
that slowed down from 1.3 to 1.4. With RRDtool 1.4, an update causes *a
lot more* data to be read from disk than RRDtool 1.3 does. This results
in a slowdown from ~40 writes per second (RRDtool 1.3) to ~15 writes per
second (RRDtool 1.4) on the same machine.

I do not yet know what the cause of this is, but I'll try to track that
down.

Regards,
—octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x91523C3D
http://verplant.org/




Re: [collectd] [PATCH 3/3] Add custom message for threshold and missings.

2010-07-03 Thread Florian Forster
Hi Andres,

On Sun, Jun 13, 2010 at 07:57:07PM +0200, Andres J. Diaz wrote:
> Add two new options in thresholds, the Message which can define a
> custom message for thresholds and MissingMessage, which define a
> custom message for missing interesting values related with that
> threshold.

I finally got around to looking at your code more thoroughly. Overall it
looks great, thank you very much for your work :)

I've made some changes though. Some of it is just general clean-up and
doesn't directly have to do with your code. Some is coding style
changes. The rest is basically a little rework of the "REPLACE_FIELD",
"FTOA" and "ITOA" macros.

The changes are included in the "ad/msg" branch available from my Git
repository and on Github. It'd be great if you could take a look at (or
better yet: test) the changes, so I can merge them into the master
branch.

> Thanks again to Taizo for the previously work :)

Yes, indeed, thank you, too, Taizo :)

Regards,
—octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x91523C3D
http://verplant.org/




Re: [collectd] collectd v4.6 encryption?

2010-07-09 Thread Florian Forster
Hi,

On Thu, Jul 08, 2010 at 09:53:53AM -0700, N. Tucker wrote:
> Hello, I have a collectd 4.3 server which uses a "network" section that
> looks like this:
> 
> 
> SecurityLevel Encrypt
> AuthFile authfile
> 

the possibility to sign / encrypt network traffic has been added in
version 4.7. The plugin doesn't complain about unknown config options,
because it doesn't (yet) expect "Listen" to be a block, ignoring all
contained options.

> Is it possible that the encrypted networking option was removed for
> one or more versions?

No.

> Is it possible to make collectd 4.6.3 talk to a collectd server with
> encrypted communications?

No.

> The docs imply that there was no encryption support before 4.7, but
> this is obviously false.

No, the docs are accurate. (At least in this respect.)

Regards,
—octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x91523C3D
http://verplant.org/




Re: [collectd] Varnish plugin

2010-07-13 Thread Florian Forster
Hi Jérôme,

On Mon, Jul 12, 2010 at 08:25:38AM +0200, Jerome Renard wrote:
> I just pushed some changes Marc did a few days ago.

those changes are already merged into my master branch.

> What are the next steps now ?

I don't think there's a whole lot to do … Test the plugin as thoroughly as
possible, hunt down bugs if you encounter any and maybe create some
example graphs for the wiki.

Best regards,
—octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x91523C3D
http://verplant.org/




Re: [collectd] Help w/ exec plugin

2010-07-13 Thread Florian Forster
Hi Richard,

On Mon, Jun 28, 2010 at 09:34:18AM +1200, Richard Hanschu wrote:
> I have created a perl script to collect data from a embedded webserver
> on a power meter which creates the following output every 60 seconds:
> 
> tridiumJace/snmp/power-COE2 interval=60 N:468.73

this looks basically okay. Using the plugin "snmp" when in fact the
script is executed via the "exec" plugin is kind of confusing, but
"legal".

> [2010-06-28 08:42:37] exec plugin: Prepending `PUTVAL' to this line: 
> tridiumJace/snmp/power-COE2 interval=60 N:452.65

Please note that with collectd 5 the compatibility code you rely on will
be removed. To be safe for the future please change the output to:

  PUTVAL tridiumJace/snmp/power-COE2 interval=60 N:468.73
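
The script in question is written in Perl; the following is just a small Python sketch of how such a line can be produced. The helper names are hypothetical. Flushing after every line matters because output written to a pipe is usually buffered otherwise:

```python
import sys


def format_putval(identifier, interval, value, timestamp=None):
    """Build one PUTVAL line; "N" means "use the current time"."""
    when = "N" if timestamp is None else str(int(timestamp))
    return "PUTVAL %s interval=%d %s:%s" % (identifier, interval, when, value)


def emit(line, stream=sys.stdout):
    """Write one line and flush so the exec plugin sees it immediately."""
    stream.write(line + "\n")
    stream.flush()


emit(format_putval("tridiumJace/snmp/power-COE2", 60, 468.73))
```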

> If I up the script interval to 1 sec, then the extra data will create
> the following type of entries in the collectd log when too many data
> points are feed to collectd for it's 10 min collection interval:

I have no idea what you're trying to do there. If "interval=60" is given
in the line printed by your script, then the created RRD files will have
a 60 second step – no matter what the global "interval" setting is set
to.

If you submit values more often than specified in the "interval="
option, you'll waste resources but other than that this won't have bad
consequences. If you submit values less often, you may run into
problems, depending on the heartbeat of the RRD file.

> [2010-06-28 08:42:37] uc_update: Value too old: name = 
> tridiumJace/snmp/power-COE2; value time = 1277671357; last cache update = 
> 1277671357;

The most common source for this problem is that a value is submitted
twice. For example your script could be running twice, the script could
print the same line twice or the snmp plugin could create a value with the
same identifier.
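
A simplified model of that check, in Python. This is an assumption about the logic behind the message, not the actual cache code; the class and method names are made up:

```python
class ValueCache:
    """Remember the time of the last accepted update per identifier."""

    def __init__(self):
        self._last_update = {}

    def update(self, name, value_time):
        """Accept a value only if it is newer than the cached one."""
        last = self._last_update.get(name)
        if last is not None and value_time <= last:
            # Same or older timestamp: "Value too old"
            return False
        self._last_update[name] = value_time
        return True


cache = ValueCache()
print(cache.update("tridiumJace/snmp/power-COE2", 1277671357))
print(cache.update("tridiumJace/snmp/power-COE2", 1277671357))
```

Note that in the log line quoted above the value time and the last cache update are identical, which is what you would expect if the same value arrives twice.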

I think that once you solve the "duplicate value" problem, your RRD file
will start working as expected. You may have to delete it so it's
re-created with the correct parameters, though.

> This email may be confidential […]

Uhm, you've sent your email to a public mailing list.

Regards,
—octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x91523C3D
http://verplant.org/




Re: [collectd] Integer overflow in curl_json plugin

2010-07-13 Thread Florian Forster
Hi Garret,

On Tue, Jun 22, 2010 at 04:00:10PM -0700, Garret Heaton wrote:
> The error in the collectd log:
> curl_json plugin: yajl_parse failed: parse error: integer overflow
> #012  ct_running":false,"disk_size":2226049133,"instance_start_tim
> #012 (right here) --^
> 
> The plugin uses the yajl [3] library for parsing json and is currently
> making use of its integer and double callbacks. Apparently if the
> number callback is used it'll use that for all numeric data. Is there
> a way for collectd to capture the data in this way and use it in a
> meaningful way (convert to a different unit, use a "wrapped around"
> counter type, etc)? Or is the only option to use a 64-bit machine for
> the collectd server?

actually, using the "number" callback here is much nicer. By the time it
is called, the "type" of the data set has already been determined and we
can parse the number accordingly as double, int64_t or uint64_t.

I've implemented a patch which you can find in the "collectd-4.9" branch
of my Git repository [0]. It'd be great if you could test it.

"disk_size" sounds a lot like a "gauge" value. If you could append ".0" to
the value you might be able to trick libyajl into parsing the string as
a double value, which is large enough to hold this value (the integer is
later converted to a double anyway, if it's a gauge).
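
The idea of parsing the raw number text according to the data source type can be sketched in Python as follows. This is not the plugin's C code; the function name and the type strings are assumptions. The value from the error message, 2226049133, is larger than 2^31 - 1 = 2147483647, which is presumably why a 32 bit integer parse overflows:

```python
INT64_MIN, INT64_MAX = -2**63, 2**63 - 1
UINT64_MAX = 2**64 - 1


def parse_number(text, ds_type):
    """Convert the raw JSON number text based on the data source type."""
    if ds_type == "gauge":
        return float(text)                       # double
    value = int(text)
    if ds_type == "derive":
        if not INT64_MIN <= value <= INT64_MAX:  # int64_t range
            raise OverflowError("does not fit into int64_t")
        return value
    if ds_type in ("counter", "absolute"):
        if not 0 <= value <= UINT64_MAX:         # uint64_t range
            raise OverflowError("does not fit into uint64_t")
        return value
    raise ValueError("unknown data source type: %r" % ds_type)


print(parse_number("2226049133", "gauge"))
print(int("2226049133") > 2**31 - 1)
```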

Regards,
—octo

[0] 

-- 
Florian octo Forster
Hacker in training
GnuPG: 0x91523C3D
http://verplant.org/




Re: [collectd] How to add clients to collectd

2010-07-13 Thread Florian Forster
On Tue, Jun 22, 2010 at 09:21:30AM +0530, Raja antony wrote:
> How can i add clients to collectd?

Please see .

—octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x91523C3D
http://verplant.org/




Re: [collectd] Sensors not showing

2010-07-14 Thread Florian Forster
Hi,

On Tue, Jul 13, 2010 at 05:33:56PM +0100, Michelle Wright wrote:
> On the CGI web interface I got lots of things showing but not sensors
> even though it is enabled and the 'sensors' command works:

how did you install collectd and libsensors?

There are two versions of libsensors. collectd can work with both, but
once it's been built it needs the version that was present at build
time. Under Debian the correct version is only "recommended" (not
"depended" upon), so depending on your system config you may end up with
the incorrect version.

> I'm really not sure what I should be putting in the collectd.conf file
> under the sensors section? Reading the wiki suggests nothing or am I
> mis reading it?

If you don't provide a  block, the plugin will collect
statistics about all sensors. This is okay for a first step – you can
always ignore unnecessary sensors later.

Regards,
—octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x91523C3D
http://verplant.org/




Re: [collectd] Bug#590656: collectd: apache traffic cut off at 1 GBit/sec

2010-07-28 Thread Florian Forster
Hi,

On Wed, Jul 28, 2010 at 09:18:10AM +0200, Sebastian Harl wrote:
> The reason for setting max. values in the first place is to avoid huge
> peaks in case of a counter reset (e.g., after restarting some
> service). In the future (collectd 5), this will probably be handled by
> using the "DERIVE" data type rather than "COUNTER" in the RRD files.
> In the meantime, I think we should raise the limits.
> 
> Florian, do you agree with this?

yes, for all kinds of counters [*] using DERIVE is preferable to
COUNTER. I think I'll replace all COUNTER data sources with DERIVE in
collectd 5 – users of high-traffic interfaces with 32bit counters will
have to convert their "if_octets" files to COUNTER by hand.

Regards,
—octo

[*] Possible exception: 32bit counters of high-traffic network
interfaces, as found on some switches.
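
A quick back-of-the-envelope calculation in Python shows why this exception exists. The function name is hypothetical and the 10 second interval is collectd's default; the rest is plain arithmetic:

```python
def seconds_until_wrap(counter_bits, bytes_per_second):
    """Time a byte counter of the given width needs to wrap around once."""
    return 2**counter_bits / bytes_per_second


GIGABIT = 1_000_000_000 / 8   # 1 GBit/s expressed in bytes per second

print(seconds_until_wrap(32, GIGABIT))   # roughly half a minute
print(seconds_until_wrap(64, GIGABIT))   # thousands of years
```

At full gigabit speed a 32 bit octet counter wraps about every 34 seconds, so with a 10 second interval a wrap falls into roughly every third or fourth sample. With DERIVE those samples would show a negative difference and be lost, while COUNTER compensates for the wrap.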
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x91523C3D
http://verplant.org/




Re: [collectd] Tail plugin "missing" events

2010-07-28 Thread Florian Forster
Hey Gregory,

On Tue, Jul 27, 2010 at 09:40:08AM +, Gregory Giguashvili wrote:
> I found that the patch was not included in 4.10.1 release. Can you
> please let me know if this patch is going to make it to 4.10.x branch
> eventually?

sorry, but I don't know which patch you're referring to. Was it sent to
the mailing list?

Regards,
—octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x91523C3D
http://verplant.org/




Re: [collectd] [PATCH] Add redis plugin.

2010-08-04 Thread Florian Forster
Hi Andres,

On Fri, Jul 30, 2010 at 04:36:38PM +0200, Andres J. Diaz wrote:
> This commit adds a new redis plugin, which connect to a number of
> redis server and get information about their status, using the
> libcredis > 0.2.2 library.

thank you very much for your patch :) I've pulled the version from
Github. Is that the same as the attached patch?

I've rebased the commit to my master branch. Also, I've applied two
changes: Some compilation fixes (typos in the Makefile and the POD) and
I've re-indented whitespace. I'm sorry about the re-indenting, but I
felt it'd be best to do that as soon as possible.

I pushed my changes to the "ad/redis" branch on Github. It'd be great if
you could take a look and hopefully catch my mistakes ;)

Regards,
—octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x91523C3D
http://verplant.org/




Re: [collectd] size of rrd's

2010-08-04 Thread Florian Forster
Hi Patrick,

On Mon, Jul 19, 2010 at 09:09:08AM +0200, Patrick Matula wrote:
> The rrd files have a fix size. But where is it?

the size of the file depends on the parameters passed to RRDtool when
creating the database file. The default RRD files created by collectd
typically are about 150 kByte in size.

HTH,
—octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x91523C3D
http://verplant.org/




Re: [collectd] FreeBSD interval?

2010-08-04 Thread Florian Forster
Hi Denis,

On Tue, Jul 20, 2010 at 05:14:17PM +0400, Denis Melnikov wrote:
> 'nan' marks intervals when I changed Interval to 80. So 70 is default
> and the only available one, isn't it?

the interval setting is also stored in the RRD files when they are
created. It is not changed, however, if you change the interval setting
to something else later.

If you change the interval in the collectd config, you have to delete
all the RRD files so they are re-created with the correct settings.

Regards,
—octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x91523C3D
http://verplant.org/




Re: [collectd] plugin zfs_arc don't graph

2010-08-04 Thread Florian Forster
Hi Vincent,

On Tue, Jul 27, 2010 at 05:38:00PM +0200, Vincent Kwiatkowski wrote:
> No problem during compilation "zfs_arc..yes"

> Jul 27 15:11:31 PA-OFC-SRV-UAT-2 collectd[28056]: [ID 702911 daemon.notice] 
> uc_update: Value too old: name = PA-OFC-SRV-UAT-2/interface/if_octets-mac; 
> value time = 1280236291; last cache update = 1280236291;

this error message has nothing to do with the "zfs_arc" plugin, but with
the "interface" plugin. You can resolve it by ignoring the interface
called "mac", which for some reason is reported multiple times under
certain versions of Solaris. You can do this using the "Interface" and
"IgnoreSelected" options of the "interface" plugin, which are documented
in the collectd.conf(5) manual page.

Do you have a directory named "zfs_arc" somewhere beneath the RRD data
directory? You can use find(1) to look for it, for example:

  $ find /var/lib/collectd/rrd -name zfs_arc -type d

Regards,
—octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x91523C3D
http://verplant.org/




Re: [collectd] AIX: WPAR and cpu patchs.

2010-08-04 Thread Florian Forster
Hi Manuel,

On Tue, Jul 20, 2010 at 08:18:12PM +0200, Manuel Luis Sanmartín Rozada wrote:
> The wpar patches are for a plugin to collect cpu, load and memory from
> Workload Partitioning in AIX. It was tested with system WPAR in
> aix 6.1.

thank you very much for your patches :) I pulled them into my Git
repository and pushed them to Github [0].

I made some changes to the code, but couldn't test them because I don't
have an AIX machine around. It'd be great if you could tell me whether
it still works or not, so I can pull the changes into the master branch.

Regards,
—octo

[0] 
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x91523C3D
http://verplant.org/




Re: [collectd] [PATCH 1/2] Add collectd-flush command line utility.

2010-08-06 Thread Florian Forster
Hi Håkon,

On Thu, Aug 05, 2010 at 04:37:45PM +0200, hakon-dugstad.john...@telenor.com 
wrote:
> collectd-flush is a small command-line utility which uses
> libcollectdclient to flush collectd through the unixsock plugin.

thank you for that patch :) This scratches an itch I've had for some
time, too. I never did anything about it, though, since just telnet'ing
to the appropriate port worked for me ;)

> I am no C wizard, so please bear with me if I have done something a
> stupid way. :)

Your code looks fine to me :) I've found one missing "break", but that's
it. I'm a bit concerned about using "getopt_long", since - to the best
of my knowledge - that's a GNU extension and not portable. Unless I'm
mistaken I will probably remove the long option names and go back to
"getopt".

I've applied the patches to the master branch and pushed them to my Git
repository.
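
On the "getopt" point above: parsing with short options only is straightforward. The sketch below uses Python's "getopt" module, which follows the same conventions as the C function. The option letters and the meaning given to them are hypothetical and not necessarily those of the submitted utility:

```python
import getopt


def parse_args(argv):
    """Parse short options only, in the style of POSIX getopt(3)."""
    settings = {"socket": None, "plugin": None, "timeout": -1, "help": False}
    # A colon after a letter means that the option takes an argument.
    opts, rest = getopt.getopt(argv, "s:p:t:h")
    for flag, value in opts:
        if flag == "-s":
            settings["socket"] = value
        elif flag == "-p":
            settings["plugin"] = value
        elif flag == "-t":
            settings["timeout"] = int(value)
        elif flag == "-h":
            settings["help"] = True
    return settings, rest


print(parse_args(["-s", "/tmp/example.sock", "-t", "10"]))
```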

Best regards,
—octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x91523C3D
http://verplant.org/




Re: [collectd] default collectd.conf syslog verbosity

2010-08-08 Thread Florian Forster
Hi Michael,

On Fri, Aug 06, 2010 at 10:22:53PM +0300, Michael Shigorin wrote:
>   Hello Florian,
> is it reasonable to propose adding
> 
> 
>   LogLevel info
> 
> 
> to default configuration or disabling syslog plugin by default
> with a sensible comment in stock collectd.conf that would warn
> of sizeable log traffic?

I much prefer to have some kind of logging in the default config file,
so users have a place to look if something doesn't work. Currently the
"syslog" plugin is used if it was built, otherwise the "logfile" plugin
is used.

The default log level depends on the configure option "--enable-debug".
It is "debug" if debugging was enabled and "info" otherwise. This
assumes that if the user enables debugging, he actually requires
debugging output. [*]

How would you feel about an additional configure option,
"--with-loglevel=info" which would override this default behavior? Would
that help with your package?
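
A rough sketch of how such a compile-time default could be selected.
"COLLECT_DEBUG" stands for "debugging was enabled at configure time";
"DEFAULT_LOG_LEVEL" is a hypothetical macro that the proposed option
would define. Neither the macro name nor the mechanism is final:

```c
#include <syslog.h>

/* Assumption: COLLECT_DEBUG is non-zero if "--enable-debug" was given. */
#ifndef COLLECT_DEBUG
# define COLLECT_DEBUG 0
#endif

/* Assumption: "--with-loglevel=..." would predefine DEFAULT_LOG_LEVEL.
 * If it is not defined, fall back to the behavior described above. */
#ifndef DEFAULT_LOG_LEVEL
# if COLLECT_DEBUG
#  define DEFAULT_LOG_LEVEL LOG_DEBUG
# else
#  define DEFAULT_LOG_LEVEL LOG_INFO
# endif
#endif

/* Returns the log level used when the config file doesn't set one. */
int default_log_level (void)
{
  return (DEFAULT_LOG_LEVEL);
}
```

A packager could then build with debugging enabled and still get "info"
as the default by passing the override.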

> PS: would a commit be more convenient?  We maintain ALT package
> in git either: http://git.altlinux.org/gears/c/collectd.git

Patch or pull request, either way is fine. If you already have a Git
repository available, pulling from it may be a tad easier.

Regards,
-octo

[*] I know that this doesn't hold true for packagers and I'm sorry for
the inconvenience, but I think that this is a reasonable assumption
for Joe Everyuser.
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x91523C3D
http://verplant.org/




Re: [collectd] possible memory leak?

2010-08-18 Thread Florian Forster
Hi Kimo,

On Mon, Aug 16, 2010 at 10:00:38AM -0700, Kimo Rosenbaum wrote:
> Over the course of a couple days active memory slowly increases and
> eventually swap starts getting used. Restarting the collectd instance
> relieves memory pressure and everything is fine for a few more days.

I've observed a similar memory growth on very busy boxes but so far was
unable to track down an error in collectd.

When I noticed that some network packets were dropped, I raised the
receive buffer size (and limit) using sysctl. Strangely enough, this
stopped not only packets being dropped, but also the memory growth. I
still have to find an explanation as to why.

The applicable keys are:
  net.core.rmem_default
  net.core.rmem_max

At least under Debian you can set them persistently (so the setting will
survive a reboot) by adding the keys to the
  /etc/sysctl.conf
configuration file.
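
The sysctl keys above set system-wide defaults and limits. For
comparison, this is a sketch of the per-socket counterpart, which an
application can request itself via SO_RCVBUF. It is not what I did on
the boxes mentioned above, and the function name is made up. On Linux
the request is capped by net.core.rmem_max, and the kernel reports
twice the value it was asked for:

```c
#include <sys/types.h>
#include <sys/socket.h>

/* Asks the kernel for a receive buffer of "bytes" bytes on socket "fd".
 * Returns the size the kernel reports afterwards, or -1 on error. */
int request_receive_buffer (int fd, int bytes)
{
  int actual = 0;
  socklen_t actual_len = sizeof (actual);

  if (setsockopt (fd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof (bytes)) != 0)
    return (-1);

  /* Read the value back: the kernel may have adjusted the request. */
  if (getsockopt (fd, SOL_SOCKET, SO_RCVBUF, &actual, &actual_len) != 0)
    return (-1);

  return (actual);
}
```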

It'd be awesome if you could give it a shot and give me some feedback if
you observe any changes. Also, your exact kernel version would be
interesting.

Regards,
--octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x91523C3D
http://verplant.org/




Re: [collectd] [PATCH] New plugin - lpar

2010-08-18 Thread Florian Forster
Hi Aurélien,

thank you very much for your patch :)

Manuel has recently sent a patch for "Workload Partitioning" (WPAR),
also an AIX virtualization technique. Could you or Manuel enlighten me
how these things relate to one another? Would it make sense to combine
both plugins into one?

I have a couple of questions / comments regarding the data being
collected, too:

  * You are calculating the time difference yourself and calculating a
rate from that. I'd prefer to use a DERIVE or COUNTER data source
type for this kind of data rather than converting the counters to a
"gauge" in the plugin.

I have changed the generic CPU stuff to use the "cpu" type, which
currently is of type COUNTER and will probably be converted to
DERIVE before 5.0 is released. You can find the changes in my Git
repository in the "ar/lpar" branch.

  * What's the deal with the minimum, entitled, maximum "proc capacity"?
Is that something that actually does change often? It sounds more
like a static configuration thing. Why do you divide that number by
100? Is that some magical number required here?

  * Why do you use the chassis serial number as plugin instance? I'd
expect that this information would be either assigned to the host
name or that the partition's ID ("lpar_id") would be used as plugin
instance. If the partition is moved to another system, the physical
ID you're using changes, and this seems to be on purpose. I'd
however expect that you'd look for something that *doesn't* change
to identify the partition. Something like:
  hostname = "lpar_pool-%x", pool_id
  plugin  = "lpar"
  plugin_instance = "partition-%x", lpar_id / "global"
What am I missing?

  * Why not use the name included in the struct,
(perfstat_partition_total_t *)->name?

  * What's this code doing?:
> + dlt_pit = lparstats.pool_idle_time - 
> last_pool_idle_time;
> + total = (double)lparstats.phys_cpus_pool;
> + idle  = (double)dlt_pit / XINTFRAC / 
> (double)delta_time_base;
Why don't you use the "pool_busy_time" member?

Regards,
—octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x91523C3D
http://verplant.org/




[collectd] SSC Serv 2.0.0 is available.

2010-08-26 Thread Florian Forster
Hello,

I'm pleased to announce version 2.0.0 of “SSC Serv”, the first stable
version of the next generation of SSC Serv. The system statistics
collection service has been in public beta testing for five months and
has proven to be very stable.

While most changes from version 1.2 are not directly visible to the
user, there are some new features that have been requested repeatedly:

  * The DF, Disk and Interface plug-ins now feature a selection list
that is very similar to the “ignore list” feature of several
collectd plug-ins. This makes it possible to ignore the loopback
network interface, for example.

  * It is now easily possible to configure multiple destinations for the
Network plug-in or use an alternative port.

  * The Memory plug-in now reports data with far more detail. Instead of
trying to approximate the behavior of the collectd plug-in under
Linux, it uses the Windows-specific names.

  * The "DF" plugin is using the relatively new “df_complex” type to
report its statistics now.


Download


Customers can download x86 (32 bit) and x86-64 (64 bit) installer
packages from the “Download” page in the customer's area.


Free version


There's a free (as in “beer”) version of SSC Serv 2 available as well.
The free version comes with the CPU and Interface plug-ins and is
limited to one network destination to send data to. You can download the
free version from:

  https://ssc-serv.com/download.shtml


About
-

“SSC Serv” is a service for Microsoft®  Windows®, which collects system
statistics and submits them to a central statistics server. It is
similar to and compatible with collectd, a free and open-sourced
solution for UNIX® systems. SSC Serv is intended to be an outpost of
collectd in the Windows® world and isn't really useful without a
collectd server instance.

For more information on SSC Serv, please see its homepage at:

  http://ssc-serv.com/


Best regards,
Florian Forster
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x91523C3D
http://verplant.org/




Re: [collectd] mysql plugin display same data for different databases

2010-09-02 Thread Florian Forster
Hi Salimane,

On Thu, Sep 02, 2010 at 05:54:43PM +0800, Salimane Adjao Moustapha wrote:
>  even though i specified different databases , collection3 display the same
> data for all the databases.
> 
>  
> Host "10.0.0.1"

> 
> Host "10.0.0.1"

> 
>  Host "10.0.0.1"

thanks to MySQL, the terminology here is a bit confusing. The term
"database" is used both for the MySQL *process* and for an abstract
unit -- a set of tables -- within such a process.

The MySQL plugin collects information concerning the *process*, not the
set of tables. Therefore, connecting to the same process three times
*should* produce the same statistics three times.

In all likelihood, you can omit the "Database" option within the
"" block altogether -- "selecting" a "database" is only
necessary to satisfy some weird access configurations, afaIk.

Regards,
--octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x0C705A15
http://octo.it/




Re: [collectd] [PATCH] New plugin - lpar

2010-09-03 Thread Florian Forster
Hi Aurélien,

sorry for the late reply :/

On Fri, Aug 20, 2010 at 12:47:58AM +0200, Aurélien Reynaud wrote:
> LPARs (Logical Partitions) and WPARs (Workload Partitions) are indeed
> similar as they are both virtualization techniques. Please note that
> I'm not familiar with WPARs, so I may be somewhat mistaken in my
> comparison.
> […]
> IMHO these concepts are different enough to warrant separate plugins.

Yes, I agree fully.

> As an example, imagine an IBM power system with 16 CPUs. […]

Thanks for the explanation, it helped greatly in understanding what your
plugin does :)

> >   * You are calculating the time difference yourself and calculating
> > a rate from that. I'd prefer to use a DERIVE or COUNTER data
> > source […]
> 
> Well, I find using a counter/derive more elegant myself. I just fail
> to understand how it can work. The original counters are expressed in
> 'processor time spent in xxx mode' where time is not in seconds but in
> custom cpu-clock dependent 'ticks'.

That's correct and is the same for other systems, for example Linux.
That's why we collect CPU usage in "jiffies", the Linux term for these
ticks. The question ("Why don't the rates add up to 100?") has
been asked a lot, so there's an extensive FAQ entry about this at


> If I'm not mistaken, using the raw counters the graphs will show cpu
> usage scaled by a factor of 'ticks_per_second' which we cannot
> compensate for as this value isn't known outside the host running the
> plugin.

Many people expect the CPU usage to be in percent. You can easily
calculate that in the front-end as

  percent busy = 100.0 * busy / (busy + idle + )

For this to work, you'll need to track the time assigned to a different
partition and report that as a separate state. For example (assuming
only "busy" and "idle" states for brevity):

  time_diff = now  - time_old;
  busy_diff = busy - busy_old;
  idle_diff = idle - idle_old;
  /* unav == unavailable */
  unav_diff = (entitled_per_second * time_diff) - (busy_diff + idle_diff);
  unav_counter += unav_diff;
  *_old = *;

I'm not sure if and how the calculation needs to be modified if that
"donate_flag" is set. I could imagine that in this case all processor
ticks are accounted for and we wouldn't need to do any of the above.
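
Spelled out as compilable C, the bookkeeping above could look roughly
like this. The struct and function names are made up for illustration,
and the clamp for partitions that used more than their entitlement is
an addition of this sketch. The "unav" term in the percentage is one
reading of the formula above, not something the formula spells out:

```c
#include <stdint.h>

/* Hypothetical per-partition state; names are illustrative only. */
struct lpar_ticks
{
  uint64_t time_old;     /* time of the previous reading, in seconds */
  uint64_t busy_old;     /* busy counter at the previous reading */
  uint64_t idle_old;     /* idle counter at the previous reading */
  uint64_t unav_counter; /* ticks that were not available to us */
};

/* Accounts for the ticks the partition was entitled to but did not get. */
void lpar_account (struct lpar_ticks *st, uint64_t now,
    uint64_t busy, uint64_t idle, uint64_t entitled_per_second)
{
  uint64_t time_diff = now - st->time_old;
  uint64_t busy_diff = busy - st->busy_old;
  uint64_t idle_diff = idle - st->idle_old;
  uint64_t entitled = entitled_per_second * time_diff;
  uint64_t seen = busy_diff + idle_diff;

  /* Clamp at zero: a partition may have used more than its entitlement. */
  if (entitled > seen)
    st->unav_counter += entitled - seen;

  st->time_old = now;
  st->busy_old = busy;
  st->idle_old = idle;
}

/* Front-end side: percentage of busy ticks, counting "unavailable"
 * ticks as a separate state. Arguments are differences, not counters. */
double percent_busy (uint64_t busy, uint64_t idle, uint64_t unav)
{
  uint64_t total = busy + idle + unav;

  if (total == 0)
    return (0.0);
  return (100.0 * (double) busy / (double) total);
}
```

With an entitlement of 100 ticks per second, ten seconds between
readings, 300 busy ticks and 500 idle ticks, 200 ticks end up in the
"unavailable" counter and the partition was busy 30 percent of the time.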

> As shown in the example above, entitlement is the processor capacity
> each LPAR gets allocated by the hypervisor for its use. Once set, it
> does not change but it can be dynamically adjusted by the admin to
> meet workload changes. I find it useful to have both cpu usage and
> entitlement on graphs: this allows to tell at a glance whether the cpu
> resource are sufficent or overkill at a given time.

It makes sense with your explanation above. I'd track it in the way
described above, though, i.e. in terms of "processor ticks not available
to the partition" rather than in its absolute form. Does that make
sense to you?

> >   * Why do you use the chassis serial number as plugin instance? […]
> 
> You're right, there is a secret purpose to this. ;-)
> I would like to graph the total cpu usage of the chassis itself, by
> adding up the individual metrics of each participating LPAR. But there
> is no static list of LPARs since they can be moved across chassis'.
> So I need to know which rrd's are to be considered when graphing a
> given chassis.

I see, that makes sense. Though I would imagine that a user (i.e. not a
sysadmin) cares more about a partition than a chassis, so I think it'd
make sense to allow for both scenarios.

> Another possibility would be a config option like "ReportByChassis =
> true" which would tell the plugin to use the chassis's serial instead
> of the hostname.

I think the "ReportBySerial" configuration option you have implemented
in the meantime is exactly what I had in mind ;)

> >   * What's this code doing?:
> 
> […]
> After all, who am I to argue with IBM coders ?

"IBM guys do it like this" is good enough an explanation for me ;)
Either there is no difference at all (i.e. we don't lose) or there's
some subtle difference we don't know about (i.e. we win ;).

Regards,
—octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x0C705A15
http://octo.it/




Re: [collectd] [PATCH] lpar plugin: new attempt

2010-09-03 Thread Florian Forster
Hi Aurélien,

On Wed, Sep 01, 2010 at 10:28:38PM +0200, Aurélien Reynaud wrote:
> The patch is against the current 4.10 branch, rather than against
> ar/lpar, because it is more of a complete rewrite than just fixes. I
> could provide a patch against ar/lpar however if you prefer so.

thanks for your update :)

I've applied your patch on top of the "ar/lpar" branch and rebased it to
master. I've also added / improved the calculation of "ticks not
available to the partition". Since I don't have an AIX box available,
I'm totally flying blind here. It'd be great if you could give it a go
and possibly fix the compilation error still contained in the code ;)

Regards,
—octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x0C705A15
http://octo.it/




Re: [collectd] AIX: WPAR and cpu patchs.

2010-09-03 Thread Florian Forster
Hi Manuel,

On Wed, Aug 25, 2010 at 04:07:31PM +0200, Manuel Luis Sanmartín Rozada wrote:
> cc1: warnings being treated as errors
> cpu.c: In function 'cpu_read':
> cpu.c:566: warning: unused variable 'temp'

fixed that, thanks :)

> In the wpar plugin I change some strings, and the wpar cpu code.
> The cpu part was wrong, I send an old version.
> I need to do some calculations to convert the physical tics to
> something like cpu total from 0 to 100.

I've applied the changes that fix your typos, too.

I'm having reservations regarding the mangling of the CPU counters,
though. The basic unit here *is* ticks and I think it should be left to
front-ends to convert this to a percentage if the user wishes so. Also,
submitting the raw ticks would be consistent with the CPU and LPAR
plugins.

Best regards,
—octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x0C705A15
http://octo.it/




Re: [collectd] Strange issue with exec and unixsock plugin

2010-09-04 Thread Florian Forster
Hi Mariusz,

thanks for all the effort you put into tracking down this problem :)

On Fri, Sep 03, 2010 at 11:33:45PM +0200, XANi wrote:
> Okay ive found another thing. While playing around with git bisect ive
> got to the point when
> * commit f6c23c8 have that "bug"
> * commit cbc3671 have bug (had to remove -Werror to compile it, else
>   it would fail during make)
> * commit d37ebe6 seems to work propertly, I can't trigger that bug

The commit you point out here introduces a heap to schedule calling of
the read callbacks. I've fixed a bug in the calculation of the parent
node in commit 69149f2 (initially committed to "collectd-4.9" and has
been merged to "collectd-4.10" and "master"). Could you check if the
problem persists after this commit?
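
To illustrate why the parent calculation matters, here is a generic
sketch of a binary min-heap stored in a zero-based array. This is not
the code from collectd and doesn't reproduce the actual bug; it only
shows where an off-by-one in the parent index would break the ordering:

```c
#include <stddef.h>

/* Index of the parent node in a zero-based array; only valid for i > 0. */
size_t heap_parent (size_t i)
{
  return ((i - 1) / 2);
}

/* Moves the element at index i up until its parent is not larger. */
void heap_sift_up (long *heap, size_t i)
{
  while (i > 0)
  {
    size_t p = heap_parent (i);
    long tmp;

    if (heap[p] <= heap[i])
      break;

    tmp = heap[p];
    heap[p] = heap[i];
    heap[i] = tmp;
    i = p;
  }
}

/* Appends "value" to a heap of *len elements and restores the heap
 * property. The caller must make sure the array has room for it. */
void heap_insert (long *heap, size_t *len, long value)
{
  heap[*len] = value;
  heap_sift_up (heap, *len);
  (*len)++;
}
```

If the parent index is computed incorrectly, elements are compared with
the wrong node and the smallest element (here: the next read callback
due) may not end up at the root.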

Regards,
--octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x0C705A15
http://octo.it/




Re: [collectd] [PATCH] collection3: fix multiple hosts selection issue

2010-09-04 Thread Florian Forster
Hi Jerome,

On Wed, Sep 01, 2010 at 02:35:58PM -0400, Jerome Oufella wrote:
> This patch addresses this issue by modifying the name of the hash key
> in the group_files_by_plugin_instance function, making it less prone
> to name collisions by prefixing it by the host name.

thanks for your fix :) I've applied it to the "collectd-4.9" branch.

Regards,
--octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x0C705A15
http://octo.it/




Re: [collectd] segfault in notify_email

2010-09-05 Thread Florian Forster
Hi Manuel,

On Fri, Aug 27, 2010 at 09:28:17AM +0200, Manuel CISSE wrote:
> I'm using collectd 4.10.1 (Debian sid stock package) and I'm
> experiencing segfaults in notify_email quite a few times a day.

I've added serialization code to the plugin which protects access to the
library using a mutex. It'd be great if you could test the patch.
Currently, the fix is only in the "collectd-4.9" branch. If you need it
merged to 4.10 or master, please let me know.
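
The idea behind the fix, as a minimal sketch: every call into the
library goes through one mutex, so only one thread is inside the
library at any time. "unsafe_send" is a made-up stand-in for the
library call; the real plugin code looks different:

```c
#include <pthread.h>

static pthread_mutex_t session_lock = PTHREAD_MUTEX_INITIALIZER;
static int messages_sent = 0;

/* Hypothetical stand-in for a library call that is not thread-safe. */
static int unsafe_send (const char *subject)
{
  (void) subject;
  messages_sent++;
  return (0);
}

/* Serializes access to the library: callers from different threads
 * wait here instead of entering the library concurrently. */
int send_notification (const char *subject)
{
  int status;

  pthread_mutex_lock (&session_lock);
  status = unsafe_send (subject);
  pthread_mutex_unlock (&session_lock);

  return (status);
}

/* Returns the number of messages sent so far. */
int sent_count (void)
{
  int count;

  pthread_mutex_lock (&session_lock);
  count = messages_sent;
  pthread_mutex_unlock (&session_lock);

  return (count);
}
```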

Also, could you please open a bug report for Debian? If you do so,
Sebastian might be able to get this fix into Debian Lenny. (Provided
the problem is fixed, of course ;)

Regards,
--octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x0C705A15
http://octo.it/




Re: [collectd] segfault in notify_email

2010-09-06 Thread Florian Forster
Hey Manuel,

On Mon, Sep 06, 2010 at 03:19:55PM +0200, Manuel CISSE wrote:
> I'm currently testing your patch since a few hours, for now it seems
> to work. I'll let you know if something happens but I think the
> problem is fixed.

good to know, thanks for your feedback :) Please keep me informed if the
problem happens to turn up again..

> Done (bug #595756).

Just out of curiosity, what bug tracker do you reference here?

Regards,
—octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x0C705A15
http://octo.it/




Re: [collectd] segfault in notify_email

2010-09-06 Thread Florian Forster
Hi Manuel,

On Mon, Sep 06, 2010 at 05:28:30PM +0200, Manuel CISSE wrote:
> Debian BTS (maybe I should have created an account to submit it to
> collectd BTS instead ?).

no, no, the Debian bug tracker is perfectly fine. The collectd
bug tracker never caught on, so it's basically a legacy now.

It might be a good idea to send a "it appears to work now" message to
the Debian BTS, in order to inform people not reading the mailing list
of the progress. You can do so by sending an email to
@bugs.debian.org (with "" replaced, of course).

Regards,
—octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x0C705A15
http://octo.it/




Re: [collectd] [PATCH] New plugin - lpar

2010-09-07 Thread Florian Forster
Hi Aurélien,

On Tue, Sep 07, 2010 at 11:50:09AM +0200, Aurélien Reynaud wrote:
> > Many people expect the CPU usage to be in percent. You can easily
> > calculate that in the front-end as

> What I am saying here is that we cannot just compute a ratio with the
> raw counters, we need to have the final result expressed in CPUs.

You're right: While it is of course possible to calculate the
percentage of one state relative to the sum of all states, it is much
more interesting to take the "entitlement" as reference. Since "uncapped"
partitions can use more ticks than they are entitled to, calculating a
percentage relative to the entitlement becomes impossible, let alone
something relative to the entire processor pool.

> What I am saying here is that we cannot just compute a ratio with the
> raw counters, we need to have the final result expressed in CPUs.

Yeah, since the LPAR documentation itself always talks in terms of
"virtual CPUs" we might as well adopt that terminology in the plugin.

Currently I tend towards the solution I think you originally
implemented: Introduce a "vcpu_gauge" (or simply "vcpu") type and track
usage in terms of virtual CPUs. I.e. if the partition spends 123% of a
virtual CPU executing user code, store 1.23 in the "user" value.
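
As a tiny sketch of that conversion: divide the ticks spent in one
state by the ticks that elapsed in the same period. The function name
and the choice of units are assumptions of this sketch, not the
plugin's actual code:

```c
/* Converts the ticks spent in one state into "virtual CPUs".
 * Both arguments must use the same unit and cover the same period. */
double ticks_to_vcpus (unsigned long long state_ticks_diff,
    unsigned long long elapsed_ticks)
{
  if (elapsed_ticks == 0)
    return (0.0); /* no time has passed, nothing to report */

  return ((double) state_ticks_diff / (double) elapsed_ticks);
}
```

A partition that spent 123 ticks in user mode while 100 ticks elapsed
would be reported as using 1.23 virtual CPUs.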

Best regards,
--octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x0C705A15
http://octo.it/




Re: [collectd] What is GaugeMax of tail plugin?

2010-09-09 Thread Florian Forster
Hi Denis,

On Thu, Sep 09, 2010 at 07:52:07PM +0400, Denis Melnikov wrote:
> Now if I see for example 783.3 on my graph I believe I can see " 200
> resp=783.3" in the logfile. But I don't see it! Instead, I find " 200
> resp=1537.387".

I believe this is due to the way RRDtool builds the "Primary Data Point"
(PDP). It basically comes down to this: RRDtool uses fixed intervals. If
you use a ten second step, for example, these intervals would start at
  t=0, t=10, ..., t=1284049090, t=1284049100, ...
and last for ten seconds each. If you add an update to RRDtool at an
arbitrary point in time, say
  value[t=17]=200
  value[t=27]=1000
  value[t=37]=100
then the value for PDP[t=20] will be calculated from the values t=17 and
t=27 as
  PDP[20] = 7/10 * value[17] + 3/10 * value[27] = 140 + 300 = 440
  PDP[30] = 7/10 * value[27] + 3/10 * value[37] = 700 +  30 = 730

So the problem is that the "Consolidation Function" (CF) is not taken
into account when building a PDP -- only to create "Consolidated Data
Points" (CDP).
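
The arithmetic above as a small sketch. This is not RRDtool's code,
just the time-weighted average it boils down to, and the names are
made up. One simplification: time not covered by any update counts as
zero here, whereas RRDtool would treat it as unknown:

```c
#include <stddef.h>

struct rrd_update
{
  long time;    /* time of the update, in seconds */
  double value; /* value passed with the update */
};

/* Time-weighted average over the interval (start, end]. The value of
 * update i is taken to describe the span (time[i-1], time[i]], so the
 * first update only serves as the left boundary for the second one. */
double pdp_average (const struct rrd_update *u, size_t n,
    long start, long end)
{
  double sum = 0.0;
  size_t i;

  for (i = 1; i < n; i++)
  {
    long lo = (u[i - 1].time > start) ? u[i - 1].time : start;
    long hi = (u[i].time < end) ? u[i].time : end;

    if (hi <= lo)
      continue; /* this update doesn't overlap the interval */

    sum += (double) (hi - lo) * u[i].value;
  }

  return (sum / (double) (end - start));
}
```

With updates at t=7, 17, 27 and 37 and the values from the example
above, the interval ending at t=20 yields 440 and the one ending at
t=30 yields 730.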

> `rrdtool fetch` lists as follows:
> 
> 1284033470: 2.463450e+01
> 1284033480: 7.832815e+02
> 1284033490: 7.794455e+02
> 1284033500: 1.910200e+01

This output supports the argument given above: The value (1537.387) was
apparently distributed over two PDPs:
  7.832815e+02 + 7.794455e+02 = 1562.727

> Is it possible?

With RRDtool, no. Not that I'm aware of, anyway. You could use the "CSV"
plugin, for example. That doesn't do any of that magic and will store
1537.387 as you'd expect.

> Otherwise, what is a benefit of GaugeMax?

If *multiple* lines are matched by the regular expression within one
interval, only the largest of the matched values will be passed on to
the write plugins. As long as the value is within collectd, it *is*
1537.387. The unintended averaging happens after the value has left
collectd's control.

Hope this helps.. Regards,
--octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x0C705A15
http://octo.it/




Re: [collectd] What is GaugeMax of tail plugin?

2010-09-11 Thread Florian Forster
Good morning,

On Fri, Sep 10, 2010 at 01:33:05PM +0400, Denis Melnikov wrote:
> Why collectd cannot update RRD for fixed points in time?

imho the problem is that the CF is not taken into account at the PDP
level. I imagine this is due to RRDtool's roots in MRTG: When you
read interface counters (or any other COUNTER or DERIVE data source) you
already have implicit averaging over the interval in which you read the
counter. And for the AVERAGE CF the PDP calculation does *exactly* what
you'd expect, i.e. it calculates the average of the time-shifted
interval to the highest possible precision.

> There will be some time shift, that may be done optional for those who
> take care of it.

What you could do is write a "target" for collectd to do something like:

  vl->time -= (vl->time % vl->interval);

If you use such a target in the *PreCache* chain, collectd won't mind the
changed time and RRDtool will most likely produce a value much closer to
what you expect. See the [[Chains]] wiki page for an in-depth
explanation of the filter subsystem of collectd.
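
The core of such a target, as a stand-alone sketch. The struct is a
stripped-down, made-up stand-in for the real value list, and the code
needed to register the target with the filter subsystem is left out:

```c
#include <time.h>

/* Hypothetical stand-in for the value list; only the two members
 * needed here are included. */
struct value_list_sketch
{
  time_t time;  /* timestamp of the values, in seconds */
  int interval; /* collection interval, in seconds */
};

/* Rounds the timestamp down to a multiple of the interval, so that
 * updates fall exactly on RRDtool's interval boundaries. */
void align_to_interval (struct value_list_sketch *vl)
{
  if (vl->interval <= 0)
    return; /* nothing sensible to align to */

  vl->time -= (vl->time % vl->interval);
}
```

With a ten second interval, a value read at t=1284049097 would be
stored as if it had been read at t=1284049090.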

Regards,
—octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x0C705A15
http://octo.it/




Re: [collectd] [PATCH] lpar plugin update

2010-09-11 Thread Florian Forster
Hi Aurélien,

thanks for the update :) I've pushed the changes to the "ar/lpar"
branch.

On Thu, Sep 09, 2010 at 10:43:16PM +0200, Aurélien Reynaud wrote:
> - get back to the original implementation with gauges only. A new type
>   "vcpu" is created (it was "lpar_pcpu" in the original)

Yeah, makes a lot more sense now that I understand what's going on ;)

> - the "consumed" metric might seem superfluous at first sight […]
>   But I thought it might come in handy when dealing with dedicated
>   partitions, where donated and stolen values are no easy concepts.

Would it make sense to activate this metric only if the partition is a
dedicated partition and donations have been enabled?

Likewise, would it make sense to submit "entitled" capacity only if the
partition is a shared partition? For dedicated partitions you should be
able to calculate "entitlement" as:

  entitled = user + sys + wait + idle + {busy,idle}_donated

> I posted a fix ("Fix errno thread-safety under AIX") on Sat, 19 Jun
> 2010, which if I am not mistaken has not been merged yet.

Thanks for the reminder, I must have overlooked that email. I applied
the fix to the collectd-4.9 branch and will merge it to master
eventually.

> + ssnprintf (typinst, sizeof (typinst), "pool-%X-total", 
> lparstats.pool_id);
> + lpar_submit (typinst, (double) pool_max_ns / XINTFRAC / 
> (double) ticks);

I'd prefer to account "busy" and "used" (non-busy) rather than "busy"
and "total". Do you see any problem with changing that?

> + save_last_values (&lparstats);

I think it might be easier to keep a (module) global
"perfstat_partition_total_t" around and simply do

  memcpy (&lparstats_old, lparstats_new, sizeof (lparstats_old));

in the "save values" function. What do you think?
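
Roughly like this. The struct is a made-up stand-in with three
counters, since the real "perfstat_partition_total_t" is only available
on AIX; the member names are assumptions:

```c
#include <string.h>

/* Hypothetical stand-in for perfstat_partition_total_t. It contains
 * only counters and no pointers, so a flat copy is sufficient. */
struct partition_stats_sketch
{
  unsigned long long timebase_last;
  unsigned long long puser;
  unsigned long long psys;
};

/* Module-global copy of the values seen in the previous iteration. */
static struct partition_stats_sketch lparstats_old;

/* Remembers the current reading for the next iteration. */
void save_last_values (const struct partition_stats_sketch *lparstats_new)
{
  memcpy (&lparstats_old, lparstats_new, sizeof (lparstats_old));
}

/* Example use of the saved state: user ticks since the last reading. */
unsigned long long user_ticks_since_last (
    const struct partition_stats_sketch *now)
{
  return (now->puser - lparstats_old.puser);
}
```

This replaces one "last_..." variable per counter with a single struct.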

Regards,
—octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x0C705A15
http://octo.it/




Re: [collectd] PATCH apache.c: support IBM HTTP Server.

2010-09-11 Thread Florian Forster
Hi Manuel,

On Wed, Sep 08, 2010 at 11:01:23AM +0200, Manuel Luis Sanmartín Rozada wrote:
> IBM HTTP Server is a version of apache 2 that comes with Websphere.
> IBM change the server name header to:
> 
> Server: IBM_HTTP_Server

thanks for the patch, I've applied it to the master branch.

Regards,
—octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x0C705A15
http://octo.it/




Re: [collectd] New plugin - tempernet

2010-09-11 Thread Florian Forster
Hi Nicolas,

On Tue, Aug 10, 2010 at 07:43:19PM +1000, Nicolas Guillaumin wrote:
> I just wrote a new plugin to collect temperature readings from a cheap
> TemperNET USB sensor (
> http://www.pcsensor.com/index.php?_a=viewProd&productId=15 ).
> Please find attached the patch file + temper.c plugin source. The
> patch is against release v4.10.1

thank you very much for your patch :)

> Usual warning applies ;-)
> - I've not coded in C since my student years,

And I have a couple of requests for changes ;) Namely:

  * If you pass a numeric argument to a function directly, i.e. without
assigning it to a variable first, please add the name of the
argument in the call. For example, instead of "foo (1, 0);" do
something like "foo (/* index = */ 1, /* flags = */ 0);". This makes
calls such as
  usb_control_msg(sensor->handle, 0x21, 9, 0x200, 0x01, (char *) buf, 32, 
USB_TIMEOUT);
readable.

  * In the above call, rather than passing "32" (presumably the size of
the buffer) use "sizeof (buf)". The same applies to other functions
operating on the buffer, for example
"memset (buf, 0, sizeof (buf));" [*]

  * Speaking of which: "bzero" is a BSD extension. Please use "memset"
instead.

  * You only ever allocate one instance of "struct tempernet". It'd be
easier to program and read if you'd simply use two static variables
instead of the struct.

  * If you do allocate memory, please check the return value of "malloc"
and friends.

  * You seem to copy several functions verbatim from the TEMPer driver
by Robert Kavaler (linked in the header of the file). You should
therefore list him and yourself as copyright holders. Also,
originally the code is not under the GPL but some sort of 1-clause
BSD license. Please do one of the following:

* Provide your code under the same license.
* Make it clear which parts of the plugin are GPL and which are the
  original license.
* Contact the original author whether he agrees with the relicensing
  of his code. Please CC me or the collectd mailing list if possible.

Sorry this has become such a long list ;) I think that all the problems
are easy to fix, though.
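
To make the code-related requests above more concrete, here is a small
sketch that applies them: annotated numeric arguments, "sizeof (buf)"
instead of a literal size, "memset" instead of "bzero" and a checked
"malloc". "read_sensor" is a made-up stand-in for the USB transfer and
has nothing to do with the real device protocol:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a USB transfer: fails for odd sensor
 * indices and returns a fake reading in buf[0] otherwise. */
int read_sensor (int index, unsigned char *buf, size_t buf_size)
{
  if ((buf_size < 1) || ((index % 2) != 0))
    return (-1);

  buf[0] = (unsigned char) (20 + index);
  return (0);
}

/* Reads all sensors and returns a newly allocated array with the
 * successful readings; their number is stored in *ret_num.
 * Returns NULL if memory allocation fails. */
int *read_all_sensors (int sensors_num, int *ret_num)
{
  unsigned char buf[32];
  int *temperatures;
  int i;

  *ret_num = 0;

  temperatures = malloc (sizeof (*temperatures) * (size_t) (sensors_num + 1));
  if (temperatures == NULL)
    return (NULL);

  for (i = 0; i < sensors_num; i++)
  {
    memset (buf, 0, sizeof (buf));

    if (read_sensor (/* index = */ i, buf, sizeof (buf)) != 0)
      continue; /* skip sensors that could not be read */

    temperatures[*ret_num] = (int) buf[0];
    (*ret_num)++;
  }

  return (temperatures);
}
```

The caller is responsible for calling "free" on the returned array.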

Please also consider adopting the following coding style:
-- 8< --
  for (many iterations)
  {
    if (!condition)
    {
      error handling;
      continue;
    }

    many;
    more;
    code;
  }
-- >8 --

This is much easier on the eye than:
-- 8< --
  for (many iterations)
  {
    if (condition)
    {
      many;
      more;
      code;
    }
    else
    {
      error handling;
    }
  }
-- >8 --
  
> - I've never used autoconf/automake before.

Don't worry about it, I'll take care of that.

Regards,
—octo

[*] "memset" is such a common function that passing "0" as the second
argument without description is okay as a special exception ;)
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x0C705A15
http://octo.it/




Re: [collectd] [PATCH] lpar plugin update

2010-09-11 Thread Florian Forster
Hey Aurélien,

On Sat, Sep 11, 2010 at 03:00:28PM +0200, Aurélien Reynaud wrote:
> No problem, I've been throught this myself! Your will to thoroughly
> understand every part of the code is IMHO the main reason behind
> collectd's excellent quality.

yeah, and now I'm craving for a POWER7-based system and am beginning to
think that 10,000 EUR for a server is "cheap" and it's all your fault! ;)

> > > - the "consumed" metric might seem superfluous at first sight […]
> > 
> > Would it make sense to activate this metric only if the partition is
> > a dedicated partition and donations have been enabled? Likewise,
> > would it make sense to submit "entitled" capacity only if the
> > partition is a shared partition?
> […]
> 'consumed' is a generic concept, materializing the hardware processing
> power really allocated to the partition by the hypervisor. 
> 
> Be it a shared or dedicated partition, comparing this value to the
> hard/soft limits you configured (entitlement, capping, donation) is
> possibly what admins care the most about. Many will be satisfied with
> just having the 'entitled' and 'consumed' metrics.

You're probably right. I'm sure "my admins" would threaten me in the
most dreadful ways [*] if I was about to remove anything I/O specific
from a plugin such as this, though ;)

> So maybe we could collect only entitled and consumed by default, and
> provide some 'Details = true/false' option to enable the collection of
> the different processor states.

Would be okay for me. Power to the user ;)

> > > I posted a fix ("Fix errno thread-safety under AIX") [...]
> > I applied the fix to the collectd-4.9 branch and will merge it to
> > master eventually.
> Thanks.

No, thank you for figuring this out :) I didn't find anything useful on
"_THREAD_SAFE", by the way, other than something along the lines of
"your thread provider should set this". So I'm assuming that an equally
good solution would be to include  before . Just a
guess, though.

> > I'd prefer to account "busy" and "used" (non-busy) rather than
> > "busy" and "total". Do you see any problem with changing that?
> 
> None. I can provide a patch, unless you want to code it yourself. I
> just find a little confusing to have busy/used as metrics. Maybe
> busy/idle or used/unused would be more straightforward?

Oh, "used" was a typo. I meant "idle", of course. The reason for this
(and to some degree, the "consumed" / "entitled" business above) is that
I prefer each tick to be accounted for *once*. With busy/total, the busy
ticks would be included in both the "busy" metric and the "total"
metric. When both are equally available (or trivially computable), I
prefer used/unused over used/total.

I'll probably implement this tonight, unless you beat me to it. It
should basically be one subtraction and a changed string .. ;)

> >   memcpy (&lparstats_old, lparstats_new, sizeof (lparstats_old));
> 
> Fine with me. Just need to make sure no struct members are pointers to
> other structs.

Currently only the counters and the time_base (or base_time?), which is
another counter, are required. I think a comment along the lines of
"don't use pointers from this struct" should be sufficient. But I doubt
that there are pointers in the struct -- we never need to free it, so
the called code wouldn't know when it was safe to free the pointed to
data.

> IMHO the implementation requiring the fewest lines of code will be the
> best here...

My thinking .. We save ~10 global variables and updating the state
becomes a trivial "memcpy".
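
A minimal sketch of that idea, assuming a hypothetical "lpar_stats_t"
struct that holds only plain counters (the member names are made up for
illustration and are not the fields of the real AIX struct):

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the statistics struct. It contains only
 * counters and no pointers, so a shallow copy is a complete copy. */
typedef struct {
  uint64_t user;
  uint64_t sys;
  uint64_t idle;
  uint64_t timebase;
} lpar_stats_t;

/* Previous sample; one struct instead of one global per counter. */
static lpar_stats_t lparstats_old;

/* Returns the idle ticks elapsed since the previous sample, then
 * stores the new sample as the old one with a single memcpy. */
uint64_t lpar_update(const lpar_stats_t *lparstats_new) {
  uint64_t idle_diff = lparstats_new->idle - lparstats_old.idle;
  memcpy(&lparstats_old, lparstats_new, sizeof(lparstats_old));
  return idle_diff;
}
```

Because the copy is shallow, this only stays correct as long as none of
the copied members is a pointer.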

Regards,
--octo

[*] Like cutting off my caffeine supply.
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x0C705A15
http://octo.it/


[collectd] Announcing “Collection 4” (C4)

2010-09-13 Thread Florian Forster
Hi everybody,

I've been working on a new front-end for collectd for some time and
finally reached a state worth publishing. “Collection 4” (abbreviation
“C4”) is intended to supersede "Collection 3" eventually, but this
doesn't mean that I won't accept patches for Collection 3 anymore, of
course. C4 is far from finished, so expect changes in functionality
frequently and also in between patch releases.

The main goal of C4 is performance. Collection 3 had a lot of startup
overhead, because a lot of Perl modules needed to be parsed even when
only totally unrelated graphs were to be drawn ([0]). C4 uses FastCGI to
use the same process to handle multiple requests, reducing startup
costs by several orders of magnitude. Because the available graphs and
the list of required files for each are kept in memory, most requests
can be handled without disk access once the process has been
initialized. In my experience, the browser is the bottleneck now. C4
tries to be friendly in this respect by limiting the amount of data sent
to the browser to a useful size ([1]).

The second central feature is graph configuration. While many front-ends
do a great job of displaying arbitrary data nicely, ultimately I want to
be able to configure the graphs to look exactly the way I want them to
look. That's why that feature was present in Collection 3 and in
Collection 2, too. The configuration syntax used in C4 is more flexible
than the syntax used in Collection 3. It allows creating a graph from
multiple identifiers (think: files) but also allows one identifier to
appear in multiple graphs. Unfortunately, this is not yet fully
implemented, so users cannot see this feature in all its glory.

Another important aspect is "data providers": C4 doesn't entirely focus
on RRD files as the source of its data, but abstracts away the source of
the data to make it easy to add new "providers" in the future. This
means, however, that reading ("fetching") the data needs to be separate
from rendering ("graphing") the data ⇒ no rrdgraph(1) ([2]). Graphs are
currently rendered using gRaphaël, which lacks many graphing features
supported by the configuration (hence the limitation in the previous
paragraph). In the near future I hope to see a data provider for
RRDCacheD (using "rrdc_fetch") and possibly CSV files.

The HTML-based interface provided by the front-end is very simple and
leaves a lot of room for improvements. Patches are especially welcome in
this respect. (Almost?) all the actions are also available in a JSON-
encoded form. My hope is that better front-end programmers than me use
this to create nice front-ends based on C4 ([3]), basically using it as a
back-end ([4]).

So, without further ado, here's the link to the homepage, wiki page and
download links:

  Homepage:  http://octo.it/c4/
  Wiki:  http://collectd.org/wiki/index.php/Collection_4
  Download:  http://octo.it/c4/files/collection-4.0.0.tar.bz2

Needless to say: Feedback is *very* welcome. So are patches :)

Best regards,
—octo

[0] When using the front-end on a smaller scale, you probably don't
notice this too much, but when you have 50k+ RRD files and the
system is very busy just updating RRD files, you can deliver maybe
two graphs per second. I would have expected loading of Perl modules
to be a CPU-bound operation once the files containing the modules
are in the page cache, but experience tells a different story.
[1] For example, printing a list of all hosts may result in 100+ kByte
sent to the browser on large setups. Having such a list in the menu
may be nice when you have 10 hosts. When you have 1000, this will
render the interface slow.
[2] This is only a half-truth: Support for rrdgraph(1)-based images is
currently present but not very maintainable. It *might* be
removed in the future or at the very least it will be reorganized.
[3] Lindsay / Visage, I'm looking at you! ;)
[4] Hey, if it's a front-end and a back-end at the same time, did I just
create one of those "middlewares" businesses always get all excited
about?! ;)
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x0C705A15
http://octo.it/


Re: [collectd] Collection 4 - Two programs you should check for in configure

2010-09-20 Thread Florian Forster
Hi Andreas,

thank you for reporting this problem!

On Fri, Sep 17, 2010 at 03:30:00PM +0200, Andreas Maus wrote:
> To cut a long story short, I think the configure script should exit
> with an error if lex/flex and bison/yacc are not found.

f?lex and yacc|bison are only required when building the sources from
scratch. The tarballs created by "make dist" will include the generated
.c and .h files, so lex and yacc are not required when building a
release.

You are however right that we should make it easier for a user to find
this out. As a first step I have added a note regarding this to the
README file. I'll probably also change the new "autogen.sh" file to
report an error in this case.

Best regards,
--octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x0C705A15
http://octo.it/


Re: [collectd] multiple sections?

2010-09-24 Thread Florian Forster
Hi Neal,

On Thu, Sep 23, 2010 at 04:17:22PM -0700, Neal Tucker wrote:
> Is there a way I can define two  sections without having one
> clobber the other?  Is there another way to solve this problem?

yes, all you have to do is repeat the '' block. For
example:

-- 8< --
# File "group-0/task-a"

  Exec "task-a-user" "/usr/local/lib/collectd/task-a"

-- >8 --

-- 8< --
# File "group-1/task-b"

  Exec "task-b-user" "/usr/lib/collectd/task-b" "arg0"

-- >8 --

HTH. Regards,
—octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x0C705A15
http://octo.it/


Re: [collectd] [PATCH] lpar plugin: use pool_idle_time to account for cpu pool usage

2010-09-26 Thread Florian Forster
Hi Aurelien,

On Thu, Sep 23, 2010 at 11:29:49AM +0200, Aurelien Reynaud wrote:
> The current implementation uses pool_busy_time (expressed in ns) but
> experience shows this metric isn't accurate: It shows lower cpu usage
> for the entire pool than the sum of the participating lpars.
> Using pool_idle_time (expressed in clock ticks) in contrast is almost
> a perfect match.

thanks for the update! :) So what you're saying is that "busy + idle"
may not be equal to "max"? If so, what happens to the missing CPU
cycles? Would it make sense to keep track of this separately? Something
like "missing = max - (idle + busy)" could be used, for example.

I think I remember something about ticks varying in the time they
consume, due to power-saving facilities built into the CPUs. This would
explain why the (physical) CPU time available to the cluster is measured
in nanoseconds rather than ticks. Also, if there are more and shorter
ticks in the same wallclock time due to power-saving measures, this
would explain the perceived lower CPU usage when converting the ns back
to ticks using a larger "ns per tick" constant. So maybe the "missing"
metric above could be named "power_save". What do you think?
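
As a sketch of the proposed extra metric: the function name is made up,
and clamping negative results to zero is an assumption of this example,
not something the plugin does.

```c
/* Computes the part of the pool capacity that is accounted for neither
 * as idle nor as busy. All arguments use the same unit. */
double pool_missing(double max, double idle, double busy) {
  double missing = max - (idle + busy);

  /* Assumption: measurement jitter may push the result slightly below
   * zero; report zero in that case instead of a negative value. */
  return (missing < 0.0) ? 0.0 : missing;
}
```

With max = 8, idle = 5 and busy = 2 this yields 1, i.e. one CPU's worth
of time that would show up as "missing" (or "power_save").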

Regarding the patch, I'd like to propose one tweak:

> -#define NS_TO_TICKS(ns) ((ns) / XINTFRAC)
> [...]
> + pool_idle_cpus = (double) (lparstats.pool_idle_time - 
> lparstats_old.pool_idle_time) / XINTFRAC / (double) ticks;

I'd really like to keep this macro: "diff / XINTFRAC / ticks" doesn't do
a good job at describing to the reader what's going on. With the macro
this becomes "NS_TO_TICKS (diff) / ticks": you can see without looking
at the macro's implementation that "diff" is converted from nanoseconds
to ticks and then divided by ticks, which results in a ratio.
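
A small sketch of what I mean. XINTFRAC is provided by the AIX system
headers; the fallback value below is a made-up placeholder so that the
example compiles elsewhere, and "pool_cpus" is a hypothetical helper.

```c
/* Assumption: placeholder value, only used when the system headers do
 * not define XINTFRAC. */
#ifndef XINTFRAC
#define XINTFRAC 2.0
#endif

/* Converts nanoseconds to ticks; the name documents the unit change. */
#define NS_TO_TICKS(ns) ((ns) / XINTFRAC)

/* Ratio of a time difference (in ns) to the elapsed ticks. */
double pool_cpus(unsigned long long diff_ns, unsigned long long ticks) {
  return NS_TO_TICKS((double)diff_ns) / (double)ticks;
}
```

The return statement reads as "nanoseconds to ticks, divided by ticks"
without the reader having to know what XINTFRAC is.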

Best regards,
--octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x0C705A15
http://octo.it/


Re: [collectd] Problem in compiling ORACLE plugin

2010-09-27 Thread Florian Forster
Hi Carlo,

On Sun, Sep 26, 2010 at 12:13:20PM +0200, ROMAGNOLI Carlo wrote:
> configure:20337: gcc -o conftest -g -O2  
> -I/home/xxx/instantclient/rdbms/public  -L/home/xxx/instantclient/lib 
> -lclntsh conftest.c -ldl  >&5
> /usr/bin/ld: skipping incompatible /home/xxx/instantclient/lib/libclntsh.so 
> when searching for -lclntsh
> /usr/bin/ld: cannot find -lclntsh

I haven't seen this error message yet, but my guess would be that it's a
32bit library and you're building a 64bit executable -- or vice versa.
To narrow the problem down, could you please provide the output of the
following commands?:

  file /home/xxx/instantclient/lib/libclntsh.so
  ldd /home/xxx/instantclient/lib/libclntsh.so
  objdump -x /home/xxx/instantclient/lib/libclntsh.so

Regards,
--octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x0C705A15
http://octo.it/


Re: [collectd] Illegal attemp to update ...

2010-09-28 Thread Florian Forster
Hi,

On Tue, Sep 28, 2010 at 05:46:24PM +0700, Anh K. Huynh wrote:
> I see the following messages from collectd:
> 
> > illegal attempt to update using time 1285669914 when last update
> > time is 1285669915 

> What kind of problem is this?

somehow a value with the timestamp "1285669914" is submitted to the
daemon. The daemon checks its internal cache and finds out that the last
data received for this metric is "1285669915" -- i.e. the "new" value is
older than the value before that.

The most common cause for this is that the metric is collected twice.
Maybe the name isn't unique or maybe you're running two instances of
collectd on a client machine.
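
The check boils down to something like the following sketch. This is an
illustration and not the daemon's actual code; the struct and function
names are hypothetical, and rejecting *equal* timestamps as well is an
assumption of this example.

```c
#include <time.h>

/* Hypothetical cache entry: remembers when a metric was last updated. */
typedef struct {
  time_t last_update;
} cache_entry_t;

/* Returns 0 and records the time if "t" is newer than the last update.
 * Returns -1 if the value is not newer and has to be rejected. */
int cache_update(cache_entry_t *entry, time_t t) {
  if (t <= entry->last_update)
    return -1;
  entry->last_update = t;
  return 0;
}
```

If the same metric is submitted by two sources, the slower of the two
will regularly run into the rejecting branch.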

Regards,
--octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x0C705A15
http://octo.it/


Re: [collectd] Using SNMP plugin with collectd

2010-09-28 Thread Florian Forster
Hi Kakoli,

On Tue, Sep 28, 2010 at 10:03:26AM +, Sen, Kakoli wrote:
> I need to collect cpu/memory/disk usage of windows node.

if you could use an evaluation version of SSC Serv [0], please let me
know. The "free edition" is capable of collecting CPU and interface
statistics, the full version collects memory, disk (among others) as
well.

> In collectd.conf, in snmp Data block it is given as
> Table true
> Values "HOST-RESOURCES-MIB::hrProcessorLoad "
> 
> But for a multi-processor node, in the same csv file values of both
> processors are coming.

When configuring a *table*, the plugin expects multiple values to be
returned. If you do not specify the "Instance" option, the last part of
the OID will be used as instance, in the hope that this is unique.

If, however, the data is returned as

  hrProcessorLoad.1.0 = 12
  hrProcessorLoad.2.0 = 13

both values will use "0" (the last part of the OID) as instance.
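
To illustrate the clash, here is a sketch that takes the last part of
an OID given as a dotted string. The plugin itself works on numeric
OIDs rather than strings, so "oid_last_part" is a hypothetical helper
for illustration only.

```c
#include <string.h>

/* Returns a pointer to the last sub-identifier of a dotted OID string,
 * or the whole string if it contains no dot. */
const char *oid_last_part(const char *oid) {
  const char *dot = strrchr(oid, '.');
  return (dot != NULL) ? dot + 1 : oid;
}
```

Both "hrProcessorLoad.1.0" and "hrProcessorLoad.2.0" end in "0", so the
two values would end up with the same instance.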

Could you provide an snmpwalk of the entire table? This would make it a
lot easier to tell you how to configure the plugin.

Best regards,
--octo

[0] 
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x0C705A15
http://octo.it/


Re: [collectd] MySQL plugin reports wrong numbers on mysqld restart

2010-09-29 Thread Florian Forster
Hi Bostjan, Mariusz and everybody else ;)

On Wed, Sep 29, 2010 at 02:47:08PM +0200, XANi wrote:
> Dnia 2010-09-29, śro o godzinie 03:09 +0200, Bostjan Skufca pisze:
> > Maybe it's time to make a script for automated dump/change obviously
> > invalid data to NaN/import of rrds.

What's obvious to you is not obvious for a script. So I'm afraid you'll
be stuck with some manual tuning or write a script specifically suited
for your needs.

> > Do you know how  why it was not DERIVE specified at rrd creation?
> Datatype was not DERIVE probably because mysql plugin (as many other)
> is much older than actual DERIVE support in collectd :)

100% right: Support for DERIVE was added relatively late – much later
than the MySQL plugin. The next major version, version 5.0, will likely
convert most of the COUNTER data sources to DERIVE – it's the better
default in most cases.

> > Probably I should consult google and manuals for this, but: is there
> > an easy way to change from COUNTER to DERIVE or do I have to dump
> > and reload the data?
> As for changing counter to derive, simple rrdtool dump to xml file and
> find/replace should be enough

That's too complicated. You can use "rrdtool tune" (rrdtune(1)) to do
this without much hassle:

  rrdtool tune mysql_commands-select.rrd --data-source-type value:DERIVE
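
To illustrate why DERIVE copes better with a restarting server, here is
a sketch of the two rate calculations. It is a simplification and not
RRDtool's actual code: the COUNTER variant only handles a 32-bit wrap,
and the DERIVE variant assumes the data source has a minimum of 0, so
that negative rates are treated as unknown.

```c
#include <stdint.h>

/* COUNTER-like: a decreasing value is taken for a 32-bit wrap-around,
 * which turns a counter reset into a huge bogus rate. */
double counter_rate(uint64_t prev, uint64_t cur, double interval) {
  double diff = (double)cur - (double)prev;
  if (diff < 0.0)
    diff += 4294967296.0; /* 2^32 */
  return diff / interval;
}

/* DERIVE-like with a minimum of 0: stores the rate and returns 0, or
 * returns -1 if the rate is negative and therefore unknown. */
int derive_rate(int64_t prev, int64_t cur, double interval, double *rate) {
  double r = ((double)cur - (double)prev) / interval;
  if (r < 0.0)
    return -1;
  *rate = r;
  return 0;
}
```

A counter dropping from 1000000 to 500 within ten seconds gives a rate
of several hundred million per second in the first function and an
unknown value in the second.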

Regards,
—octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x0C705A15
http://octo.it/


Re: [collectd] Collection4 - SI units for JSON graphs?

2010-10-01 Thread Florian Forster
Hi Andreas,

On Thu, Sep 30, 2010 at 02:28:13PM +0200, Andreas Maus wrote:
> While testing the new Collection 4 frontend it seems that the JSON
> generated output has some problems with large values and messes up the
> labels for the y-axis (see json.png).

> Is there an option to use SI units for the y-axis in the JSON
> generated graphs?

this is something that the underlying JavaScript library, gRaphaël,
doesn't support.

I'm no expert in JavaScript, so if you or anyone else on the list knows
a more powerful graphing library, please let me know. If anyone feels up
to the task: The (unmangled) JavaScript source code is up on Github, I
guess the author would welcome contributions ;)

Regards,
--octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x0C705A15
http://octo.it/


Re: [collectd] swap plugin improvements

2010-10-06 Thread Florian Forster
Hi Matthew and Aurélien,

sorry for the very delayed reply. Fortunately I've kept track of this
patch in the version 5.0 wiki page, otherwise I would have forgotten
about it.

On Tue, May 25, 2010 at 12:01:40PM +0200, Aurélien REYNAUD wrote:
> To be clear on the subject, the solaris swap command can be run with
> one of the two arguments -l or -s.

Okay, if I understand you correctly, then "-l" reports the swap usage
from the device's perspective while "-s" uses the point of view from
virtual memory.

I think either view has its advantages, so I think the best way to go is
to let the user choose what to see. I can't really think of a clever name
for the config option though. What do you think about "ReportPhysical" /
"ReportVirtual"? (Setting both to true will result in both views being
collected.)

With regard to the physical view: Would it make sense to report this on
a per-disk (per-partition) basis? For example:

> r...@uv8801xr:/root> swap -l
> swapfile             dev    swaplo   blocks     free
> /dev/vx/dsk/swapvol  242,5      16 24576704 23138752

This could be reported as

  uv8801xr/swap-vx-dsk-swapvol/swap-used = 24576704 - 23138752
  uv8801xr/swap-vx-dsk-swapvol/swap-free = 23138752

We could possibly configure this behavior with something like
  ReportPhysical no/yes/separate/combined
where "yes" and "combined" are synonymous.
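
A sketch of how the per-device naming from the example above could be
derived. Both helpers are hypothetical and only mirror the example;
nothing like this exists in the plugin yet.

```c
#include <stdio.h>
#include <string.h>

/* Builds an instance name such as "vx-dsk-swapvol" from a device path
 * such as "/dev/vx/dsk/swapvol". */
void swap_instance(const char *path, char *buf, size_t buflen) {
  if (buflen == 0)
    return;
  if (strncmp(path, "/dev/", 5) == 0)
    path += 5;
  snprintf(buf, buflen, "%s", path);
  for (char *p = buf; *p != '\0'; p++)
    if (*p == '/')
      *p = '-';
}

/* "used" is simply the difference of the two columns of "swap -l". */
unsigned long long swap_used(unsigned long long blocks,
                             unsigned long long free_blocks) {
  return blocks - free_blocks;
}
```

With "separate", each device would be dispatched under its own
instance; with "combined", the values of all devices would be summed
up before dispatching.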

Regards,
—octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x0C705A15
http://octo.it/


Re: [collectd] Using SNMP plugin with collectd

2010-10-06 Thread Florian Forster
Hi Kakoli,

On Sun, Oct 03, 2010 at 04:16:21PM +, Sen, Kakoli wrote:
> host.hrDevice.hrProcessorTable.hrProcessorEntry.hrProcessorLoad.15
> host.hrDevice.hrProcessorTable.hrProcessorEntry.hrProcessorLoad.16
> 
> I have put the following block in Collectd.conf for capturing this data:
> 
> 
>Type "cpustats_percent"
>Table true
>Instance "HOST-RESOURCES-MIB::hrProcessorEntry"
>Values "HOST-RESOURCES-MIB::hrProcessorLoad"
> 

with this configuration, the SNMP plugin will do two "SNMP walks", one
over "hrProcessorEntry" and one over "hrProcessorLoad", using successive
"GETNEXT" requests.

With GETNEXT, either OID will be used as the root of the respective
sub-tree. The first GETNEXT request will return the first leaf in the
tree, then the second leaf and so forth until the returned OID is
outside the scope of the sub-tree.

When all is done, you'll have two lists of values, one for "Instance"
and one for "Values". Ideally that's something like:

  Values:   23, 42, 4711
  Instance: "foo", "bar", "qux"

The list of values returned from the "Instance" and the "Values" OIDs
will be matched together by their order. In our example:

  "foo":   23
  "bar":   42
  "qux": 4711

As you can see, ideally both lists have the same number of items. With
your configuration, the "hrProcessorLoad" sub-tree is contained within
the "hrProcessorEntry" subtree, which is a very strong indicator that
something is wrong. In your case, the "Instance" list will probably
contain garbage, because your root is too high up in the OID tree. And
this is likely the source of your problems.
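
The matching by order can be sketched like this. The names are
hypothetical and the code is an illustration, not the plugin's
implementation; in particular, returning an error when the lists differ
in length is a choice made for this example.

```c
#include <stddef.h>

typedef struct {
  const char *instance;
  long value;
} pair_t;

/* Matches the i-th instance with the i-th value. Returns the number of
 * pairs written to "out", or -1 if the two lists differ in length. */
int pair_by_order(const char *const *instances, size_t n_instances,
                  const long *values, size_t n_values, pair_t *out) {
  if (n_instances != n_values)
    return -1;
  for (size_t i = 0; i < n_values; i++) {
    out[i].instance = instances[i];
    out[i].value = values[i];
  }
  return (int)n_values;
}
```

If the "Instance" walk covers a wider sub-tree than the "Values" walk,
the first list gets longer and the positions no longer line up.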

> We need the configuration to be generic, applicable to any windows
> node.

If you didn't set the "Instance" option, the plugin would use the last
part of the value OID as instance. For example, "hrProcessorLoad.42"
translates to ".1.3.6.1.2.1.25.3.3.1.2.42", the last part would be "42".
I guess this is your best shot at getting unique identifiers. I admit
that calling the CPUs "15" and "16" when there are only two CPUs total
is confusing, but I don't know any elegant and generic way to cope with
that.
 
If that solution doesn't satisfy your needs, I recommend taking a look
at SSC Serv [0]. It reports CPU usage broken up into IRQ, System and
User code and – of course – uses the usual zero-based numbering.

Regards,
—octo

[0] 
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x0C705A15
http://octo.it/


Re: [collectd] [PATCH] Plugin for Linux Software-RAID devices

2010-10-06 Thread Florian Forster
Hi Michael,

thank you very much for your patch :) Especially thanks for including
changes to the manpage and the build system!

On Wed, Oct 06, 2010 at 12:26:24AM +0200, Michael Hanselmann wrote:
> It reports the number of component devices, number of devices in
> array, number of active, working, failed and spare disks.

What's the difference between "number of component devices", "number of
devices in array", and "number of active disks"? From looking at the
"mdu_array_info_t" struct it looks like

  active == working + failed + spare

and that "active_disks" is equal to either "nr_disks" or "raid_disks".

Your code looks very clean and readable, great job! :) I have some
comments nonetheless …

> +static const char *config_keys[] =
> +{
> + "Device",
> + "IgnoreSelected",
> + NULL
> +};
> +static int config_keys_num = STATIC_ARRAY_SIZE (config_keys);

You should remove the NULL entry from this array (preferred) or subtract
one from "config_keys_num" here.
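
A sketch of the preferred variant. The macro is redefined here so the
example is self-contained (I'm assuming it matches the usual
sizeof-based definition), and "config_key_known" is a hypothetical
lookup that merely stands in for whatever code walks the array.

```c
#include <stddef.h>
#include <string.h>

/* Assumption: same idea as the helper macro used in the patch. */
#define STATIC_ARRAY_SIZE(a) (sizeof(a) / sizeof(*(a)))

/* No NULL terminator: the length is passed around explicitly. */
static const char *config_keys[] = {"Device", "IgnoreSelected"};
static int config_keys_num = STATIC_ARRAY_SIZE(config_keys);

/* Walks exactly config_keys_num entries. If a trailing NULL were
 * counted, it would be handed to strcmp() here. */
int config_key_known(const char *key) {
  for (int i = 0; i < config_keys_num; i++)
    if (strcmp(config_keys[i], key) == 0)
      return 1;
  return 0;
}
```

With the NULL entry left in, STATIC_ARRAY_SIZE would evaluate to three
rather than two.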

> + goto out_close;

Please remove that "goto". Calling "close" and "return" in each case is
only one line more for each case, totalling three more lines (for
which you can remove four lines at the end of the function ;).

> + if (st.st_rdev != makedev (MD_MAJOR, minor))
> + {
> + WARNING ("md: Major/minor of %s are %i:%i, should be %i:%i",
> +  path, (int)major(st.st_rdev), (int)minor(st.st_rdev),
> +  (int)MD_MAJOR, minor);
> + goto out_close;
> + }

Since "major" and "minor" are macros defined by the (GNU) libc, having
variables of the same name around is very confusing IMHO. To be honest,
I'm surprised the preprocessor is fine with this.

> + if (ioctl (fd, GET_ARRAY_INFO, &array) < 0) {
> […]
> + assert (array.md_minor == minor);
> […]
> + md_submit (minor, "md_disks", "number", array.nr_disks);

I think I wouldn't assert this here. If you want to make absolutely
certain that the ioctl didn't return something else, you should check
using a normal "if" condition and return an error. And if you replace
"minor" with "array.md_minor" in the call to "md_submit", the worst that
can happen without the check is that a minor number is dispatched
although it was ignored using its device name.
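
A sketch of the "if" variant. The struct is a hypothetical stand-in for
the two "mdu_array_info_t" members used here, and fprintf() takes the
place of the WARNING macro so that the example is self-contained.

```c
#include <stdio.h>

/* Hypothetical stand-in for the relevant mdu_array_info_t members. */
typedef struct {
  int md_minor;
  int nr_disks;
} array_info_t;

/* Returns 0 if the ioctl result describes the expected device and -1
 * otherwise, instead of aborting the whole daemon via assert(). */
int md_check_array(const array_info_t *array, int expected_minor) {
  if (array->md_minor != expected_minor) {
    fprintf(stderr, "md: Expected minor %i, but ioctl returned %i\n",
            expected_minor, array->md_minor);
    return -1;
  }
  return 0;
}
```

The read callback would then close the file descriptor and return an
error when the check fails.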

Regards,
—octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x0C705A15
http://octo.it/


Re: [collectd] [PATCH] lpar plugin: use pool_idle_time to account for cpu pool usage

2010-10-06 Thread Florian Forster
Hi Aurélien,

I just added the changes you sent me to the LPAR plugin, i.e. "pool
busy time" is now calculated from "pool idle time" and not the other way
around.

It'd be awesome if you could test my changes and tell me whether the
plugin now works as expected or if further changes are required. I'll
merge the branch as soon as you give me the thumbs-up ;)

Regards,
—octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x0C705A15
http://octo.it/


Re: [collectd] collectd exec and perl

2010-10-28 Thread Florian Forster
Hi,

On Thu, Oct 28, 2010 at 12:07:03PM +0200, Mariusz Gronczewski wrote:
> if u use "interval" you have to use actual time instead of "N:" (at
> least it didn't want to work otherwise for me)

no, you should be able to mix "interval=" and "N:" as you wish.

> If it still dont work, turn off input buffering
> $| = 1;

If you use "N:", you *have* to disable output buffering. Please see [0]
for details.
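
The same applies to Exec programs written in other languages. As a
sketch in C (the function names and the identifier in the usage note
are made up), each line is flushed right after it has been written, so
that "N" is resolved close to the time the value was read:

```c
#include <stdio.h>

/* Formats one PUTVAL line using "N" (now) as the timestamp. Returns
 * what snprintf() returns. */
int format_putval(char *buf, size_t buflen, const char *ident,
                  int interval, double value) {
  return snprintf(buf, buflen, "PUTVAL \"%s\" interval=%d N:%g\n",
                  ident, interval, value);
}

/* Writes one value to stdout and flushes immediately. Returns 0 on
 * success and -1 on error. */
int submit_value(const char *ident, int interval, double value) {
  char line[256];
  int len = format_putval(line, sizeof(line), ident, interval, value);

  if (len < 0 || (size_t)len >= sizeof(line))
    return -1;
  if (fputs(line, stdout) == EOF)
    return -1;
  return (fflush(stdout) == 0) ? 0 : -1;
}
```

Calling submit_value ("myhost/exec-test/gauge-foo", 10, 42.0) once per
interval would be the C counterpart of a Perl script with "$| = 1;".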

Regards,
—octo

[0] 
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x0C705A15
http://octo.it/


Re: [collectd] lpar plugin final patches

2010-11-06 Thread Florian Forster
Hi Aurélien,

thank you very much for your continued work on the plugin and your
patience with me ;)

On Thu, Oct 14, 2010 at 10:09:26PM +0200, Aurélien Reynaud wrote:
> here is a final series of patches regarding the lpar plugin.
> [...]
> As far as I am concerned it is ready for inclusion.

I applied your patches and merged the "ar/lpar" branch to the "master"
branch. So the plugin will definitely make it into version 5.0. :)

Thanks again and regards,
--octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x0C705A15
http://octo.it/


Re: [collectd] collectd on solaris10 sparcv

2010-11-17 Thread Florian Forster
Hello anonymous user ;)

On Wed, Nov 17, 2010 at 11:53:29AM -0500, dlsirr...@upsfreight.com wrote:
> SO_BINDTODEVICE will not work with Solaris. wondering if there's a
> workaround. 

You're right, that option is not available on Solaris. I'm assuming you
have the problem within a  block. If you need the "Interface"
option within a  block, could you please elaborate why you
need it?

The common way (and the reason why this option is rarely used) is to let
the operating system choose which interface / source address to use. It
usually does this using the routing table.

An alternative for Solaris might be to add a "BindAddress" option to
 blocks. This is something we would need to add to the
sources, though -- there's no code available currently.

Last but not least: The node name given is passed to getaddrinfo(3). It
*might* be possible to add the interface name to the address, for
example to use IPv6 link-local addresses. I have absolutely no idea if
this would work with other addresses as well, and I'm certain that using
link-local addresses is an "interesting" (read: bad) idea. But if you
must try, it should work somewhat like this:

  Server "fe80::d55f:1534%eth0"

Hope this helps.. Regards,
--octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x0C705A15
http://octo.it/


Re: [collectd] lua plugin

2010-11-23 Thread Florian Forster
Hi Julien,

On Mon, Nov 22, 2010 at 11:03:14PM +0100, Schmurfy wrote:
> this discussion started with octo as private messages on github but he
> wisely asked me to post it on the mailing list so anyone can take part
> in it so here I am !

thanks for taking this here :) And sorry for not looking closer at your
branch yet. I certainly hope to make some time later this week.

> […] the problem comes from the fact that the official lua distribution
> is a static library which cannot be embedded in the shared library
> built for collectd (I tried but failed miserably)

The problem is that in order to be able to link static libraries into
shared objects (dynamic libraries), the static library has to be
compiled as position-independent code (-fPIC). This is an annoying
problem, because many people, including many maintainers of packages of
libraries, don't know this and have a hard time reproducing this bug,
because it doesn't show up on an x86 system.

About the two options:

> The first one is to link to a shared library release of lua […]
> The second one is to include the lua sources themselves into the collectd
> plugin, […]

In my opinion, the second option is out of the question. We've done this
with liboping and libiptc in the past and nothing good came out of this.
You have all the problems you have with shared objects [*] but in an
even more severe variant [**]. We still include libltdl because that's
the way libtool works, but with the same result [0].

I'd go for a variant of the first option: Simply do a normal check for
the library using autoconf and link with libtool. This will
automatically get you the shared library if it is available and fall
back to the static library if necessary. Being the default behavior this
is what most people, including package maintainers, expect and know how
to deal with.

I'd like to comment on the (apparently common) "other languages should be
statically linked" misconception: Shared objects have a system to cope
with incompatible binary versions. A dynamic library implementing a
scripting language is nothing special and certainly no exception. If the
language itself changes, this is either transparent to the application
(annoying for the user, but not our problem) or it's an API change, for
which numerous detection mechanisms and solutions exist. All the reasons
for dynamic linking apply to scripting languages and DSLs just as well.

In the worst case, the API provided to scripts depends on the scripting
language's version. If so, that's how it is. Period.

To back my claims up by facts:

 * The Python plugin supports Python 2 *and* Python 3, which has a
   substantially changed concept of what a "string" is. Changes to a
   scripting language hardly ever get any more severe than this.
 * The Java plugin has been tested with Java 1.4 through 1.6.
 * The Perl plugin works with Perl 5.8 and 5.10. I'm not sure about
   version 5.6, but I *think* that worked, too.

If we can handle these complex and mainly "stand alone" languages and
they manage to maintain mechanisms to work with changes made to them,
I'm confident a language *intended to be integrated* will manage to keep
up, too.

> I was also thinking about something else but linked to lua: have
> anyone thought about using a higher level language to configure
> filters ?

That does sound like an interesting thought. It'll be a lot of work to
implement though. Be sure to toss around some ideas on the list before
implementing anything.

Regards,
—octo

[*]  In particular: You need to rebuild collectd and other binaries the
 library has been linked into when there's a security problem in the
 library. 
[**] You cannot determine that collectd uses liblua using the package's
 build dependencies, making it extra hard to realize there's a need
 to rebuild collectd.
[0]  
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x0C705A15
http://octo.it/


Re: [collectd] lua plugin

2010-11-23 Thread Florian Forster
Hi again,

On Tue, Nov 23, 2010 at 09:33:23PM +0100, Schmurfy wrote:
> > I'd go for a variant of the first option: Simply do a normal check for
> > the library using autoconf and link with libtool.
> >
> I will certainly need your help on this ;)

no problem ;) While I still don't particularly *like* autoconf, I at
least learned to work with it. I'll look into it as soon as I find some
time for it.

> > I'd like a comment on the (apparently common) "other languages should be
> > statically linked" misconception: […]
> >
> As far as I am concerned it was more a thought than a deeply
> anchored misconception ^^

Sorry, I think I got carried away there … I worked with another project
for a while and those guys insisted on pulling in the sources of their
tiny language (Squirrel in that case).

> I just wish to find a way that makes the installation painless for
> users.

In my experience that's synonymous with "make the package maintainer
happy" ;)

Best regards,
—octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x0C705A15
http://octo.it/


Re: [collectd] lua plugin

2010-11-24 Thread Florian Forster
Hi,

On Tue, Nov 23, 2010 at 10:20:10PM +0100, Florian Forster wrote:
> no problem ;) While I still don't particularly *like* autoconf, I at
> least learned to work with it. I'll look into it as soon as I find
> some time for it.

I've done some improvements to the build system and the plugin itself. I
rebased the commits to the current master branch, so you probably have
to (re)set your branch to there instead of merging the changes.

I didn't take a close look yet, but it appears the build scripts have
changed between Lua 5.0 and 5.1 – from a custom build script to a
pkg-config file. I'll try to incorporate this eventually.

Unfortunately I don't have time to explain my changes in detail; I hope
you'll figure out what I did. As always, I tried to keep the code as
self-explanatory as possible, but this stack based VM doesn't exactly
make this easy :/

Best regards,
—octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x0C705A15
http://octo.it/


Re: [collectd] Vmware Client SDK plugin

2010-11-25 Thread Florian Forster
Hi Edward,

On Tue, Nov 23, 2010 at 05:33:03PM -0800, Edward Muller wrote:
> I'm not sure the process to get this merged in, but I figured it would
> be useful to some people.

thank you very much for your code :) Sending it to the mailing list is,
of course, a great first step ;) I'm certain many people welcome a plugin
for VMware.

> This is against collectd-4.7.4, so it may be a little stale as well.

I think you're not using anything that has changed since then. You may
get trivial merge conflicts in "configure.in" and "src/Makefile.am", of
course. If you want I can rebase the branch to the current master for
you.

> Let me know what's needed to get this merged officially upstream and
> I'll be happy to work on it in my spare time.

What caught my eye first was the long list of dynamically loaded
functions and the dlopen() code to get links to them. Isn't there a
header file available declaring these functions, so we can simply link
against the client library?

If the VMware Client SDK is available to the public I can take a look
myself. Do you know?

The second thing is a missing license. In my experience the licenses of
libraries and SDKs like the VMware Client SDK are hardly ever GPL
compatible. I suggest using the MIT license because it's used by a few
other plugins already [0] (you can copy the license header from the
"NetApp" plugin for example), but any other permissive license should be
fine, too.

The schema in which data is dispatched should probably be changed a bit,
too. For example, the "mapped", "active", "overhead", "balooned",
"swapped", "shared", and "used" memory blocks each have their own
"type". This should be changed to:

  type:  memory
  type instance: mapped, active, overhead, ...

Last and least: The (C-)type "counter_t" and supporting code should be
changed to "derive_t".

Best regards,
—octo

[0] 
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x0C705A15
http://octo.it/


Re: [collectd] Vmware Client SDK plugin

2010-11-25 Thread Florian Forster
On Thu, Nov 25, 2010 at 09:27:50AM +0100, Florian Forster wrote:
> If you want I can rebase the branch to the current master for you.

I've done that: The code is now available from the "em/vmware" branch in
my Github account [0].

—octo

[0] <https://github.com/octo/collectd/tree/em%2Fvmware>
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x0C705A15
http://octo.it/


Re: [collectd] lua plugin

2010-11-26 Thread Florian Forster
Hey,

On Fri, Nov 26, 2010 at 09:46:40PM +0100, Schmurfy wrote:
> I tried your branch but sadly I cannot build it, a dependency was
> added which does not exist under Mac OS X (clock_gettime):

I did some digging and apparently Mac OS X really doesn't have
clock_gettime(2), it's not just a special define required or something
like that.

I implemented a fallback using gettimeofday(2) which, hopefully, is
available everywhere. That function has been deprecated, so it *must* be
in widespread use, right? ;)
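
For reference, a minimal sketch of what such a fallback can look like.
This is not the code from the branch: the helper name now_ns() and the
HAVE_CLOCK_GETTIME macro (which a configure check would have to define)
are assumptions made for illustration only.

```c
#include <stdint.h>
#include <sys/time.h>
#include <time.h>

/* Hypothetical helper, not collectd's actual code: returns the current
 * wall-clock time in nanoseconds since the epoch, or 0 on failure. */
static uint64_t now_ns(void)
{
#if HAVE_CLOCK_GETTIME
  /* Preferred path: nanosecond resolution where clock_gettime(2) exists.
   * HAVE_CLOCK_GETTIME is assumed to be set by the build system. */
  struct timespec ts;
  if (clock_gettime(CLOCK_REALTIME, &ts) == 0)
    return ((uint64_t)ts.tv_sec * 1000000000ULL) + (uint64_t)ts.tv_nsec;
#endif

  /* Fallback path, e.g. for Mac OS X: microsecond resolution only. */
  struct timeval tv;
  if (gettimeofday(&tv, NULL) != 0)
    return 0;
  return ((uint64_t)tv.tv_sec * 1000000000ULL)
         + ((uint64_t)tv.tv_usec * 1000ULL);
}
```

Because gettimeofday(2) only has microsecond resolution, a value from
the fallback path is always a multiple of 1000 ns.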

Hope this helps :)
—octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x0C705A15
http://octo.it/


[collectd] Versions 4.9.4 and 4.10.2 available.

2010-11-27 Thread Florian Forster
Hello everybody,

I've packaged new patch releases of the current stable versions of
collectd. They fix a couple of bugs in collectd, plugins and the
documentation.

Since one of the bugs, the failed assertion in the RRDtool and RRDCacheD
plugins, can be used for a Denial of Service (DoS) attack, all users are
urged to upgrade.


Download
--------

The new version is available in source-code form from collectd's
download page. The direct download links are:

  Version 4.10.2:

  * http://collectd.org/files/collectd-4.10.2.tar.bz2
    SHA-1: 8d83dd2d68ac4c0871774af99079564880abb5ef

  * http://collectd.org/files/collectd-4.10.2.tar.gz
    SHA-1: 4163be3de4f5f7234eca43a23b2205c7931ba6f7

  Version 4.9.4:

  * http://collectd.org/files/collectd-4.9.4.tar.bz2
    SHA-1: ea99c0e5eaa2bebe7ca95adc9be205749d49bf9d

  * http://collectd.org/files/collectd-4.9.4.tar.gz
    SHA-1: f37c56dfccadac591ac8b213f594ea4ac7720495


Thanks
------

Thanks to everybody who helped with this new version. In particular,
bugs have been reported and fixed by:

  * Aurélien Reynaud
  * Sebastian Harl
  * Sven Trenkel


ChangeLog
---------
2010-11-27, Version 4.10.2
  * Documentation: Various documentation fixes.
  * collectd: If including one configuration file fails, continue with
    the rest of the configuration if possible.
  * collectd: Fix a bug in the read function scheduling. In rare cases
    read functions may not have been called as often as requested.
  * collectd: Concurrency issues with errno(3) under AIX have been
    fixed: A thread-safe version of errno has to be requested under AIX.
    Thanks to Aurélien Reynaud for his patch.
  * collectd: A left-over hard-coded 2 has been replaced by the
    configurable timeout value. (Version 4.10.2 only)
  * curl, memcachec, tail plugins: Fix handling of "DERIVE" data
    sources. Matching the end of a string has been improved; thanks to
    Sebastian Harl for the patch.
  * curl_json plugin: Fix a problem when parsing 64bit integers. Reading
    JSON data from non-HTTP sources has been fixed.
  * netapp plugin: Pass the interval setting to the dispatch function.
    Restore compatibility to NetApp Release 7.3. Thanks to Sven Trenkel
    for the patch.
  * network plugin: Be less verbose about unchecked signatures, in order
    to prevent spamming the logs.
  * notify_email plugin: Concurrency problems have been fixed.
  * python plugin: Set "sys.argv", since many scripts don't expect that
    it may not be set. Thanks to Sven Trenkel for the patch.
  * rrdtool, rrdcached plugin: Fix a too strict assertion when creating
    RRD files.
  * swap plugin: A bug which led to incorrect I/O values has been
    fixed. (Version 4.10.2 only)
  * value match: A minor memory leak has been fixed. Thanks to Sven
    Trenkel for the patch.

Best regards,
—octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x91523C3D
http://verplant.org/


Re: [collectd] lua plugin

2010-11-28 Thread Florian Forster
Hi Schmurfy,

I have now implemented basic support for write callbacks and improved
the thread-handling: There is now one Lua thread created for each
callback, which makes it possible for a Lua-based read function to
indirectly call a Lua-based write function in the same script.

On Thu, Nov 25, 2010 at 12:37:33AM +0100, Schmurfy wrote:
> For the read/write callbacks we can call a function by name from the C
> code which is enough in my opinion and should reduce the C code of the
> plugin, with one function called you can call as many as you want in
> the lua code itself as you wish.

As a matter of fact, passing functions is actually easier than looking
up a function by its name, though storing the function pointers is a bit
harder, because they can't be exported to C. Nonetheless, I think the
interface is nicer if we can pass callback functions to the register
functions, because it's more Lua-like. No native Lua code would
reference a function using a string, right?

With the current status, I was able to run the following test-script
successfully:

-- 8< --
 require ("collectd");
 
 function test_read ()
   collectd_dispatch_values ({
     host     = "leeloo.octo.it",
     plugin   = "lua",
     type     = "gauge",
     interval = 10.0,
     values   = { 3.1337 }
   });
   return (0);
 end
 
 collectd_register_read (test_read);
 collectd_register_write (function (vl)
   collectd_info ("Lua: Anonymous write function has been called.");
   return (0);
 end);
-- >8 --

This script demonstrates how to:

 * Register a named function.
 * Register an anonymous function.
 * Register read and write functions.
 * Dispatch values to collectd.

With that the basic points are now covered. There are, of course, still
more interfaces to port to Lua, but read and write callbacks are the
most important ones in my opinion. Currently open todos are:

 * Export oconfig_item_t to Lua and provide configuration hooks.
 * Provide hooks for notifications and make it possible to dispatch
   notifications to collectd.
 * Provide hooks for "match" and "target" callbacks. (Requires the
   configuration stuff above.)

What would you like to focus on now? The filter subsystem as you
indicated in an earlier email?

On Thu, Nov 25, 2010 at 12:37:33AM +0100, Schmurfy wrote:
> I already did most of the work you surely had to do to write this, I
> just wanted to have a build system running before adding more code to
> the plugin.

Sorry about that.. Please make sure to always push your changes to your
Git repository, so I can see that you're working on the plugin and what
you're working on. If you plan to rebase or modify your commits, I
suggest using a special branch to indicate this, for example "lua/wip"
for "work in progress". You can also join our IRC channel [*] so I can
give you a heads up before working on the code.

Regards,
—octo

[*] #collectd on freenode
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x0C705A15
http://octo.it/


Re: [collectd] lua plugin

2010-11-28 Thread Florian Forster
Hi again ;)

On Sun, Nov 28, 2010 at 04:30:10PM +0100, Schmurfy wrote:
> … callbacks …
> do you really think the added complexity of the C code is really worth
> what it brings to the lua side ?

Just to make sure I understand your approach correctly: After the script
is loaded, check if there is a global function called, for example,
"cb_read". If there is, call it every $interval seconds. The script does
not call any register_* function for this to happen – defining the
function with this special name is sufficient.

I think providing the register_* functions is worth the trouble: It is
closer to the C API and the interfaces provided by the Perl and Python
plugins. The Java API is a bit special because Java doesn't provide
function pointers, but you can still register objects implementing a
special interface.

> I am more than disappointed by the way you took over my plugin

Oh, sorry about that :/ I didn't mean to "steal" your plugin, I just got
excited and carried away. When I get excited about something I usually
start hacking away in a frenzy, sorry … I'll hold myself back.

> I am not angry at you (and I hope you won't be at me for being so direct)

No offense taken.

> I finally managed to compile a working lua plugin on my system the
> only problem I had is that the pkg-config package is called "lua" for
> me with 5.1 as its version

Oh $deity! The pkg-config names are the one thing that should be
constant over all platforms … Oh well, I'll see if I can work around
that.

Best regards,
—octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x0C705A15
http://octo.it/


Re: [collectd] lua plugin

2010-12-04 Thread Florian Forster
Hi Schmurfy,

On Sun, Nov 28, 2010 at 04:30:10PM +0100, Schmurfy wrote:
> $PKG_CONFIG --exists lua5.1 2>/dev/null
> becomes
> $PKG_CONFIG --exists "lua >= 5.1" 2>/dev/null

I've fixed this in my "ja/lua" branch. The name used by pkg-config is
not determined fully automatically, but you can specify it manually by
using the following form:

  $ ./configure [$OTHER_OPTIONS] LIBLUA_PKG_CONFIG_NAME="lua"

Regards,
—octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x0C705A15
http://octo.it/


Re: [collectd] swap plugin improvements

2010-12-06 Thread Florian Forster
Hi Aurélien,

I've just tinkered with the Swap plugin so swapctl(2) and kstat(3KSTAT)
can both be used on Solaris. The result is a mess of preprocessor #if
blocks and weird defaults, but I think all in all the default behavior
is intuitive:

  * If swapctl(2) is available, it is the default facility. You can set
    the "ReportPhysical" option to "separate" to get per-device
    statistics.
  * If kstat(3KSTAT) is available, too, you can set "ReportVirtual" to
    "true" to enable virtual swap statistics. You can also set
    "ReportPhysical" to "false" to disable physical swap information and
    get the collectd 4 behavior.
  * If kstat(3KSTAT) is available and swapctl(2) isn't, the plugin will
    behave like it did under collectd 4.

> Sounds reasonable to me, as long as "combined" is the default. The
> total values are what IT execs and developers are interested in. The
> exact physical layout of swap is for admins fine-tuning their
> systems...

I totally agree ;)

On Thu, Oct 07, 2010 at 11:24:24PM +0200, Aurélien Reynaud wrote:
> In my original posting, I suggested moving the "-s" (virtual memory)
> to the vmem plugin which is currently linux-only. Don't you think this
> would make sense?

I guess it'd make sense. I don't have the experience with Solaris to
implement useful vmem statistics, though.

It might make sense to remove the kstat code from the swap plugin
altogether, even if this means that the functionality won't be available
at first. It can then be added to the vmem plugin whenever someone with
enough knowledge of Solaris virtual memory steps up and implements it.
What do you think?

Best regards,
—octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x0C705A15
http://octo.it/


[collectd] Version 5.0.0.beta0 available

2010-12-06 Thread Florian Forster
Hello everybody,

I have packaged and uploaded version 5.0.0.beta0 of collectd. This
version includes a lot of changes to the core daemon, the data sets used
and a number of backwards incompatible changes to plugins. Many of those
changes did not get thorough testing and I expect more new bugs in this
release than usual.

The goal of this beta release is to make these new features easier to
access by the general public in the hope that we can catch as many bugs
as possible before a final 5.0.0 release.

With this release I consider 5.0 "frozen", i.e. feature complete. I'm
sorry that I didn't manage to incorporate all the patches from the
mailing list or new additions cooking on Github. After the 5.0 release I
hope to get back to the 3-4 minor versions per year, so new features get
published in a reasonable time. If your patch is not included in 5.0,
please make sure there is an appropriate section under "Planned
features" on the "Roadmap" wiki page [0].

Depending on how many bugs are reported I may package other beta
versions in the coming days and weeks. If you want to be kept up to date
with new beta releases, please follow @collectd on identi.ca or Twitter.


Download
--------

The new version is available in source-code form from collectd's
download page. The direct download links are:

  Version 5.0.0.beta0

  * http://collectd.org/files/collectd-5.0.0.beta0.tar.bz2
    SHA-1: 724790880c77cd48d433ede976f3dac1053fb555

  * http://collectd.org/files/collectd-5.0.0.beta0.tar.gz
    SHA-1: 7e9124c5f9c91009fb6c10269ab9f28d84967851


ChangeLog
---------

For a preliminary changelog please see:

http://git.verplant.org/?p=collectd.git;a=blob;f=ChangeLog;hb=7ee6787a10ac4d29116511b207a7e539153b5863


Best regards,
—octo

[0] 
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x0C705A15
http://octo.it/


Re: [collectd] Version 5.0.0.beta0 available

2010-12-08 Thread Florian Forster
Hi Thorsten,

> On Tue, Dec 07, 2010 at 09:52:52AM -0800, Thorsten von Eicken wrote:
> [version 5.0.0.beta0 available]
> > What is compatible with what?

in general, sending data from a v4 client to a v5 server should be
relatively painless. In many cases, using the "v5upgrade" target should
convert received v4 data to the appropriate v5 layout.

The actual migration path is still a bit of a work in progress. I've
recently pushed a migration script which will emit a shell script for
tuning and moving RRD files.

If anyone could test the migration (on a separate server, of course) and
report back what worked and what problems were encountered, this
information would be greatly valued!

> > It gets a bit tricky when considering that you changed a lot of the
> > details how plugins report data (e.g. to 'value' and derive instead
> > of counter), that affects all front-ends that display data.

I'm afraid so, but all in all it got easier: The data source name of an
RRD file with exactly one data source is "value". This is no longer a
rule of thumb, it is now a reality. I will not accept any changes which
don't adhere to this rule and I will try very hard not to forget this ;)

The switch to DERIVE should not affect any front-ends. They read gauge
values from the RRD files or directly let RRDtool create graphs, so they
don't get in contact with the data source type.

Scripts and other custom code may have to be adapted if they report
counter values greater than 9223372036854775807. In my experience this
is an unusual case.
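
To illustrate the limit: 9223372036854775807 is the largest value of a
signed 64-bit integer (INT64_MAX). The following sketch assumes that
counter values are unsigned and derive values are signed 64-bit
integers. The function name counter_to_derive() is made up for this
example and is not part of collectd's API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: converts an unsigned counter value to a signed
 * derive value. Returns true and stores the result in *derive if the
 * value fits; returns false (leaving *derive untouched) otherwise. */
static bool counter_to_derive(uint64_t counter, int64_t *derive)
{
  if (counter > (uint64_t)INT64_MAX)
    return false; /* would overflow a signed 64-bit integer */

  *derive = (int64_t)counter;
  return true;
}
```

A script that can hit the "false" case needs to decide for itself what
to do with such values, for example by reporting them as a gauge.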

On Tue, Dec 07, 2010 at 07:33:28PM +0100, Sebastian Harl wrote:
> For a, probably incomplete, list of changes see [1].

I was actually aiming for "complete", so if you find anything missing,
please add it, make a note on the discussion page or send me a message
here.

Best regards,
—octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x0C705A15
http://octo.it/


Re: [collectd] hi. i am new..

2010-12-08 Thread Florian Forster
Hi,

On Tue, Dec 07, 2010 at 07:32:44PM +0800, chenxiong_0815 wrote:
>   I am new in this open source technology, 
> if I have plug-in implementation problems can contact this email
> address? 

yes, this mailing list is used to discuss development of collectd and to
answer questions regarding collectd. If you ask questions here, please
include as much information about the problem as you can. This will
greatly increase the chances of getting a satisfactory answer.

Best regards,
—octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x0C705A15
http://octo.it/


Re: [collectd] Version 5.0.0.beta0 available

2010-12-08 Thread Florian Forster
Hi,

On Wed, Dec 08, 2010 at 10:44:32PM +0700, Anh K. Huynh wrote:
> I can't compile this version. The error can be found at
>   http://viettug.pastebin.com/k7Kxbg2r

thank you very much for reporting this problem! I fixed it earlier
today; the fix is available from the Git repository (it's in the
master branch).

In a couple of hours you can download a new daily snapshot containing
this fix from:

  

Best regards,
—octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x0C705A15
http://octo.it/


Re: [collectd] Version 5.0.0.beta0 available

2010-12-08 Thread Florian Forster
Hi Thorsten,

On Wed, Dec 08, 2010 at 09:04:55AM -0800, Thorsten von Eicken wrote:
> Has the network protocol changed?

a little. v5 uses a much higher accuracy to specify time. This is
represented in an addition to the protocol, which v4 servers don't
understand.

> Can a v5 client send to a v4 server?

No, because without a timestamp the data is invalid. Or maybe just
erroneous and you'll receive tons of errors. I didn't try.

> Any issues when a v4 client sends to v5?

In theory, this should work. Finding out if this works as expected is
what I hope to achieve with the beta version.

> You don't state these things explicitly anywhere.

I'll try to improve the wiki page and possibly other accompanying
documentation.

> The situation we're gonna be in is that we'll have a mix of v4 and v5
> clients sending data to the same servers. I'm wondering whether there's
> any easy way to flag the resulting rrds.

The RRD layout, for example whether the "cpu" type uses COUNTER or
DERIVE, depends on the version of the server -- the client version has
no influence on that. Again, for front-ends this is negligible.

> I know I can open the rrd and list the ds names, but right now I have
> a config table that maps rrd filename to ds names (and associated
> config).

Same for the DS names. So you could use the CTIME to determine whether a
file was created before or after the server was updated to 5.0. But I'd
strongly recommend converting the RRD files with the migration script.
Both v4 and v5 servers can update those RRD files. If a v4 server
creates a new file it will have a "wrong" DS name, though, and you may
need to re-run the migration script.
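
A rough sketch of the CTIME check, in case someone wants to script it
anyway. Note that st_ctime is the time of the last inode status change,
not strictly the creation time, so the result is only an approximation.
The function name changed_before() and the "cutoff" argument (the time
of the server upgrade) are assumptions made for this example.

```c
#include <sys/stat.h>
#include <time.h>

/* Hypothetical helper: returns 1 if the file's ctime is older than
 * "cutoff", 0 if it is not, and -1 if stat(2) fails. */
static int changed_before(const char *path, time_t cutoff)
{
  struct stat st;

  if (stat(path, &st) != 0)
    return -1; /* file missing or not accessible */

  return (st.st_ctime < cutoff) ? 1 : 0;
}
```

Files for which this returns 1 were last changed before the upgrade and
are likely candidates for the migration script.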

> Another thought: would it be possible to insert a "shim" into the v5
> server's stack (perhaps close to the network input) that converts v4
> data to v5 data? It would have to use some heuristics, but it might
> not be that bad.

This is *exactly* what the "v5upgrade" target does. It'd be great if you
could share any experience you may have with this tool. Its wiki page is
at 

Thanks for your questions, it really does help to pinpoint where more
documentation is necessary!

Regards,
--octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x0C705A15
http://octo.it/

