Hi Cyril, thank you very much for your patches! I'm very sorry, but I can't give the plugin the thorough review it deserves right now. But maybe you find the following comments helpful.
On Tue, Dec 20, 2011 at 03:03:43PM +0200, Cyril Feraudet wrote: > - with_mysql_libs=`$with_mysql_config --libs 2>/dev/null` > + with_mysql_libs=`$with_mysql_config --libs_r 2>/dev/null` This looks like a bug fix. Would it make sense to add this change to the 4.10 branch? If so, would you be willing to send a separate patch for this? > +CREATE TABLE `data` ( > + `id` bigint(20) NOT NULL auto_increment, > + `timestamp` double NOT NULL, > + `host_id` int(11) NOT NULL, > + `plugin_id` int(11) NOT NULL, > + `plugin_instance` varchar(255) default NULL, > + `type_id` int(11) NOT NULL, > + `typeinstance` varchar(255) default NULL, > + `dataset_id` int(11) NOT NULL, > + `value` double NOT NULL, One of the reasons why I didn't write a "write_dbi" or "write_mysql" plugin before is because I was uncertain about the schema to use. Why did you chose to have a separate table for host, plugin, plugin_instance, type, and type_instance? I currently think that just having an "identifier" table with five columns would be a better trade- off between normalization and complexity. > +CREATE TABLE `dataset` ( Why do you store datasets in the database (in addition to the types.db file)? Having the definition in more than one place is bound to create inconsistencies at some point ... > diff --git a/src/write_mysql.c b/src/write_mysql.c > +typedef struct host_s host_t; > +struct host_s { > + char name[DATA_MAX_NAME_LEN]; > + int id; > + host_t *next_host; > +}; You will be doing a lot of lookups in this cache(es). I think using a binary search tree (an implementation if provided in src/utils_avltree.h) would improve performance here. > int port = 0; > [...] > +static int write_mysql_config (const char *key, const char *value) { > [...] > + } else if (strcasecmp ("Port", key) == 0) { > + port = value; You're assining a char* to an int. This won't produce what you expect. Use the function service_name_to_port_number() from src/common.h. > +static int write_mysql_init (void) { > + conn = mysql_init(NULL); > + if (mysql_real_connect(conn, host, user, passwd, database, port, NULL, > 0) == NULL) { The connection should be (re-)established in the write() callback, when it is needed. Otherwise you will never recover from connection failures. > + if (!mysql_thread_safe()) { > + ERROR("write_mysql plugin: mysqlclient Thread Safe OFF"); I wasn't aware of this function. Good to know, thanks! > +static int add_host_id (char *hostname) { > [...] > + len = ssnprintf (query, sizeof (query), "SELECT id FROM host WHERE name > = '%s'", hostname); I think we should prepare that statement with bound arguments (or whatever the terminology). This way (a) MySQL doesn't have to parse and optimize the query each time we need to insert a host and (b) quoting is done for us by the mysql library. > +static int add_host_id (char *hostname) { > [...] > + pthread_mutex_lock(&mutex); > [...] > + if (row = mysql_fetch_row(result)) { > + id = atoi(row[0]); > + mysql_free_result(result); > + pthread_mutex_unlock(&mutex); > [...] > + pthread_mutex_unlock(&mutex); > + return id; > +} Mutex is released twice in this path. > +static int add_plugin_id (char *pluginname) { > +static int add_type_id (char *typename) { A *lot* of duplicate code. Can't we write a function like: int add_string (const char *table, const char *str); /* returns id on success, less than zero on failure. */ > +static int write_mysql_write(const data_set_t *ds, const value_list_t *vl, > + user_data_t __attribute__((unused)) *user_data) { > [...] > + len = ssnprintf (tmpquery, sizeof (tmpquery), "INSERT INTO data > " > + > "(timestamp,host_id,plugin_id,plugin_instance,type_id,typeinstance,dataset_id,value)" > + "VALUES (%.3f,%d,%d,'%s',%d,'%s',%d,%%lf)", CDTIME_T_TO_DOUBLE > (vl->time), host_id, > + plugin_id, vl->plugin_instance, type_id, vl->type_instance, > dataset_id ); You really should prepare this statement. Parsing this several thousand times per second will be a major performance problem. > + if (dso->type == DS_TYPE_GAUGE) { > + len = ssnprintf (query, sizeof (query), tmpquery, > vl->values[i].gauge); > + } else { > + if (rates == NULL) { > + rates = uc_get_rate (ds, vl); > + } > + if (isnan(rates[i])) { continue; } > + len = ssnprintf (query, sizeof (query), tmpquery, > rates[i]); > + } > + //INFO("toto: %d", toto); > + pthread_mutex_lock(&mutex); > + DEBUG("write_mysql plugin: %s", query); > + mysql_real_query(conn, query, len); > + pthread_mutex_unlock(&mutex); So you're inserting each data source in a separate statement? How can a client distinguish between the different data sources? At least, you should add a "ds_index" column or so. Thanks again and best regards, —octo -- Florian octo Forster Hacker in training GnuPG: 0x0C705A15 http://octo.it/
signature.asc
Description: Digital signature
_______________________________________________ collectd mailing list collectd@verplant.org http://mailman.verplant.org/listinfo/collectd