Re: [collectd] New plugin for collectd : write_mysql

Daniel Hilst Thu, 03 May 2012 11:48:06 -0700

On 02/14/2012 09:54 AM, Cyril Feraudet wrote:

Hi all,


Here the write_mysql output plugin patch (from 5.0.2) with modifications
suggested by Octo.

By the way, notification are handled, all memory used is freed on
shutdown (thx Valgrind).

Regards,

Cyril

Le 04.01.2012 10:49, Cyril Feraudet a écrit :

Hi Octo, Hi all,

Thanks for your time. It will permit to me to improve this plugin and
my
knowledge of C.

Le 22.12.2011 01:06, Florian Forster aécrit :

Hi Cyril, thank you very much for your patches! I'm very sorry, but I
can't give the plugin the thorough review it deserves right now. But
maybe you find the following comments helpful. On Tue, Dec 20, 2011
at 03:03:43PM +0200, Cyril Feraudet wrote:

- with_mysql_libs=`$with_mysql_config --libs 2>/dev/null` +
with_mysql_libs=`$with_mysql_config --libs_r 2>/dev/null`

This looks like a bug fix. Would it make sense to add this change to
the 4.10 branch? If so, would you be willing to send a separate patch
for this?

You are right, I will submit a patch soon. write_mysql is often called
than
mysql plugin and non thread safe library make collectd crashing
quickly.

+CREATE TABLE `data` ( + `id` bigint(20) NOT NULL auto_increment, +
`timestamp` double NOT NULL, + `host_id` int(11) NOT NULL, +
`plugin_id` int(11) NOT NULL, + `plugin_instance` varchar(255)
default NULL, + `type_id` int(11) NOT NULL, + `typeinstance`
varchar(255) default NULL, + `dataset_id` int(11) NOT NULL, +
`value` double NOT NULL,

One of the reasons why I didn't write a "write_dbi" or "write_mysql"
plugin before is because I was uncertain about the schema to use. Why
did you chose to have a separate table for host, plugin,
plugin_instance, type, and type_instance? I currently think that just
having an "identifier" table with five columns would be a better
trade- off between normalization and complexity.

The idea was to save space in `data` table and improve querying in it.
For now, I've 800MB `data` table (MEMORY type) just for the last value
of
each metrics (4207 servers / 462,395 metrics) using REPLACE instead of
INSERT
statement. I think that SQL join are faster on numeric id than varchar.
But all of this things are questionable and dependent of the final use.
I think about a customizable SQL statement in plugin configuration.

ATE TABLE `dataset` (

Why do you store datasets in the database (in addition to the
types.db file)? Having the definition in more than one place is bound
to create inconsistencies at some point

This table is for readonly purpose and to make SQL query independent of
number
of metrics in a type. It also contain type added on the way by a
plugin.

diff --git a/src/write_mysql.c b/src/w

/blockquote> typedef struct host_s host_t; +struct host_s { + char

name[DATA_MAX_NAME_LEN]; + int id; + host_t *next_host; +};

You will be doing a lot of lookups in this cache(es). I think using a
binary search tree (an implementation if provided in
src/utils_avltree.h) would improve performance here.

Thanks for the tips, Il will do that.

int port =

tatic int write_mysql_config (const char *key, const char *value) {
[...] + } else if (strcasecmp ("Port", key) == 0) { + port = value; You

a char* to an int. This won't produce what you expect. Use the
function service_name_to_port_number() from src/common.h. +static int wri

Your'e right I will fix it.

t (void) { + conn = mysql_init(NULL); + if (mysql_real_connect(conn,
host, user, passwd, database, port, NULL, 0) == NULL) { The conn

be (re-)established in the write() callback, when it is needed.
Otherwise you will never recover from connection failures. + if
(!mysql_th

Your'e right, I just issued it. I will fix it too.

{ + ERROR("write_mysql plugin: mysqlclient Thread Safe OFF"); I wasn't

s function. Good to know, thanks! +static int add_host_id (

me) { [...] + len = ssnprintf (query, sizeof (query), "SELECT id
FROM host WHERE name = '%s'", hostname); I think we should

statement with bound arguments (or whatever the terminology). This
way (a) MySQL doesn't have to parse and optimize the query each time
we need to insert a host and (b) quoting is done for us by the mysql
library. +static int add_host_id (char *

I will looking for. I never used it before.

[...] + pthread_mutex_lock(&mutex); [...] + if (row =
mysql_fetch_row(result)) { + id = atoi(row[0]); +
mysql_free_result(result); + pthread_mutex_unlock(&mutex); [...] +
pthread_mutex_unlock(&mutex); + return id; +} Mutex is released twice

+static int add_plugin_id (char

I'll fix that.

{ +static int add_type_id (char *typename) { A *lot* of duplicate code

ite a function like: int add_string (const char *table, const char
*str); /* returns id on success, less than zero on failure. */
+static int write_mysql_write(con

t *ds, const value_list_t *vl, + user_data_t __attribute__((unused))
*user_data) { [...] + len = ssnprintf (tmpquery, sizeof (tmpquery),
"INSERT INTO data " +

"(timestamp,host_id,plugin_id,plugin_instance,type_id,typeinstance,dataset_id,value)"

+ "VALUES (%.3f,%d,%d,'%s',%d,'%s',%d,%%lf)", CDTIME_T_TO_DOUBLE
(vl->time), host_id, + plugin_id, vl->plugin_instance, type_id,
vl->type_instance, dataset_id );

You really should prepare this statement. Parsing this several
thousand times per second will be a major performance problem.

+if (dso->type == DS_TYPE_GAUGE) { + len = ssnprintf (query, sizeof
(query), tmpquery, vl->values[i].gauge); +} else { + if (rates ==
NULL) { + rates = uc_get_rate (ds, vl); + } + if (isnan(rates[i])) {
continue; } + len = ssnprintf (query, sizeof (query), tmpquery,
rates[i]); +} +//INFO("toto: %d", toto);
+pthread_mutex_lock(&mutex); +DEBUG("write_mysql plugin: %s",
query); +mysql_real_query(conn, query, len);
+pthread_mutex_unlock(&mutex);

So you're inserting each data source in a element? How can a client
distinguish between the different data sources? At least, you should
add a "ds_index" column or so. Thanks again and best

There are dataset_id and type_id in `dataset` for that. Here a query
sample to get
all metric in a type in right order :

select data.timestamp AS date, dataset.name AS dataset_name, data.value
AS value
from data, host, plugin, type, dataset
where host.id = data.host_id
and plugin.id = data.plugin_id
and type.id = data.type_id
and data.dataset_id = dataset.id
and host.name = 'eqds3pcold001'
and plugin.name = 'interface'
and data.plugin_instance = 'eth0'
and type.name = 'if_octets'

In my case, write_mysql is used to make, for example, sum of cpu used
on several thousand
of server and re-inject it in collectd to graph it.

I've made a more flexible than Ganglia application based on Collectd at
work. My compagny
allow me to share all my work on collectd in GPL like licence. I will
Release a beta of
my unamed application soon a possible.

Regards,

Cyril Feraudet



_______________________________________________
collectd mailing list
[email protected]  <mailto:[email protected]>
http://mailman.verplant.org/listinfo/collectd




_______________________________________________
collectd mailing list
[email protected]
http://mailman.verplant.org/listinfo/collectd


Hello, I'm trying to compile collectd 5.0.2 with write_mysql pluing

I've done this: http://pastebin.com/dPc6zgUH
I'm on rhel 5.5 x86_64

Any idea?

--
Follow the white rabbit!

_______________________________________________
collectd mailing list
[email protected]
http://mailman.verplant.org/listinfo/collectd

Re: [collectd] New plugin for collectd : write_mysql

Reply via email to