Re: [collectd] New plugin for collectd : write_mysql

Daniel Hilst Mon, 07 May 2012 13:46:55 -0700

On 05/04/2012 07:07 AM, Cyril Feraudet wrote:

Le 04.05.2012 08:52, Cyril Feraudet a écrit :

Le 03.05.2012 20:05, Daniel Hilst a écrit :

    On 05/03/2012 03:46 PM, Daniel Hilst wrote:

        On 02/14/2012 09:54 AM, Cyril Feraudet wrote:

            Hi all, Here the write_mysql output plugin patch (from
            5.0.2) with modifications suggested by Octo. By the way,
            notification are handled, all memory used is freed on
            shutdown (thx Valgrind). Regards, Cyril Le 04.01.2012
            10:49, Cyril Feraudet a écrit :

                Hi Octo, Hi all, Thanks for your time. It will permit
                to me to improve this plugin and my knowledge of C. Le
                22.12.2011 01:06, Florian Forster aécrit :

                    Hi Cyril, thank you very much for your patches!
                    I'm very sorry, but I can't give the plugin the
                    thorough review it deserves right now. But maybe
                    you find the following comments helpful. On Tue,
                    Dec 20, 2011 at 03:03:43PM +0200, Cyril Feraudet
                    wrote:

                        - with_mysql_libs=`$with_mysql_config --libs
                        2>/dev/null` +
                        with_mysql_libs=`$with_mysql_config --libs_r
                        2>/dev/null`

                    This looks like a bug fix. Would it make sense to
                    add this change to the 4.10 branch? If so, would
                    you be willing to send a separate patch for this?

                You are right, I will submit a patch soon. write_mysql
                is often called than mysql plugin and non thread safe
                library make collectd crashing quickly.

                        +CREATE TABLE `data` ( + `id` bigint(20) NOT
                        NULL auto_increment, + `timestamp` double NOT
                        NULL, + `host_id` int(11) NOT NULL, +
                        `plugin_id` int(11) NOT NULL, +
                        `plugin_instance` varchar(255) default NULL, +
                        `type_id` int(11) NOT NULL, + `typeinstance`
                        varchar(255) default NULL, + `dataset_id`
                        int(11) NOT NULL, + `value` double NOT NULL,

                    One of the reasons why I didn't write a
                    "write_dbi" or "write_mysql" plugin before is
                    because I was uncertain about the schema to use.
                    Why did you chose to have a separate table for
                    host, plugin, plugin_instance, type, and
                    type_instance? I currently think that just having
                    an "identifier" table with five columns would be a
                    better trade- off between normalization and
                    complexity.

                The idea was to save space in `data` table and improve
                querying in it. For now, I've 800MB `data` table
                (MEMORY type) just for the last value of each metrics
                (4207 servers / 462,395 metrics) using REPLACE instead
                of INSERT statement. I think that SQL join are faster
                on numeric id than varchar. But all of this things are
                questionable and dependent of the final use. I think
                about a customizable SQL statement in plugin
                configuration.

                        ATE TABLE `dataset` (

                    Why do you store datasets in the database (in
                    addition to the types.db file)? Having the
                    definition in more than one place is bound to
                    create inconsistencies at some point

                This table is for readonly purpose and to make SQL
                query independent of number of metrics in a type. It
                also contain type added on the way by a plugin.

                        diff --git a/src/write_mysql.c b/src/w

                    /blockquote> typedef struct host_s host_t; +struct
                    host_s { + char

                        name[DATA_MAX_NAME_LEN]; + int id; + host_t
                        *next_host; +};

                    You will be doing a lot of lookups in this
                    cache(es). I think using a binary search tree (an
                    implementation if provided in src/utils_avltree.h)
                    would improve performance here.

                Thanks for the tips, Il will do that.

                    int port =

                        tatic int write_mysql_config (const char *key,
                        const char *value) { [...] + } else if
                        (strcasecmp ("Port", key) == 0) { + port =
                        value; You

                    a char* to an int. This won't produce what you
                    expect. Use the function
                    service_name_to_port_number() from src/common.h.
                    +static int wri

                Your'e right I will fix it.

                        t (void) { + conn = mysql_init(NULL); + if
                        (mysql_real_connect(conn, host, user, passwd,
                        database, port, NULL, 0) == NULL) { The conn

                    be (re-)established in the write() callback, when
                    it is needed. Otherwise you will never recover
                    from connection failures. + if (!mysql_th

                Your'e right, I just issued it. I will fix it too.

                        { + ERROR("write_mysql plugin: mysqlclient
                        Thread Safe OFF"); I wasn't

                    s function. Good to know, thanks! +static int
                    add_host_id (

                        me) { [...] + len = ssnprintf (query, sizeof
                        (query), "SELECT id FROM host WHERE name =
                        '%s'", hostname); I think we should

                    statement with bound arguments (or whatever the
                    terminology). This way (a) MySQL doesn't have to
                    parse and optimize the query each time we need to
                    insert a host and (b) quoting is done for us by
                    the mysql library. +static int add_host_id (char *

                I will looking for. I never used it before.

                        [...] + pthread_mutex_lock(&mutex); [...] + if
                        (row = mysql_fetch_row(result)) { + id =
                        atoi(row[0]); + mysql_free_result(result); +
                        pthread_mutex_unlock(&mutex); [...] +
                        pthread_mutex_unlock(&mutex); + return id; +}
                        Mutex is released twice

                    +static int add_plugin_id (char

                I'll fix that.

                        { +static int add_type_id (char *typename) { A
                        *lot* of duplicate code

                    ite a function like: int add_string (const char
                    *table, const char *str); /* returns id on
                    success, less than zero on failure. */ +static int
                    write_mysql_write(con

                        t *ds, const value_list_t *vl, + user_data_t
                        __attribute__((unused)) *user_data) { [...] +
                        len = ssnprintf (tmpquery, sizeof (tmpquery),
                        "INSERT INTO data " +

                
"(timestamp,host_id,plugin_id,plugin_instance,type_id,typeinstance,dataset_id,value)"


                        + "VALUES (%.3f,%d,%d,'%s',%d,'%s',%d,%%lf)",
                        CDTIME_T_TO_DOUBLE (vl->time), host_id, +
                        plugin_id, vl->plugin_instance, type_id,
                        vl->type_instance, dataset_id );

                    You really should prepare this statement. Parsing
                    this several thousand times per second will be a
                    major performance problem.

                        +if (dso->type == DS_TYPE_GAUGE) { + len =
                        ssnprintf (query, sizeof (query), tmpquery,
                        vl->values[i].gauge); +} else { + if (rates ==
                        NULL) { + rates = uc_get_rate (ds, vl); + } +
                        if (isnan(rates[i])) { continue; } + len =
                        ssnprintf (query, sizeof (query), tmpquery,
                        rates[i]); +} +//INFO("toto: %d", toto);
                        +pthread_mutex_lock(&mutex);
                        +DEBUG("write_mysql plugin: %s", query);
                        +mysql_real_query(conn, query, len);
                        +pthread_mutex_unlock(&mutex);

                    So you're inserting each data source in a element?
                    How can a client distinguish between the different
                    data sources? At least, you should add a
                    "ds_index" column or so. Thanks again and best

                There are dataset_id and type_id in `dataset` for
                that. Here a query sample to get all metric in a type
                in right order : select data.timestamp AS date,
                dataset.name AS dataset_name, data.value AS value from
                data, host, plugin, type, dataset where host.id =
                data.host_id and plugin.id = data.plugin_id and
                type.id = data.type_id and data.dataset_id =
                dataset.id and host.name = 'eqds3pcold001' and
                plugin.name = 'interface' and data.plugin_instance =
                'eth0' and type.name = 'if_octets' In my case,
                write_mysql is used to make, for example, sum of cpu
                used on several thousand of server and re-inject it in
                collectd to graph it. I've made a more flexible than
                Ganglia application based on Collectd at work. My
                compagny allow me to share all my work on collectd in
                GPL like licence. I will Release a beta of my unamed
                application soon a possible. Regards, Cyril Feraudet
                _______________________________________________
                collectd mailing list [email protected]
                <mailto:[email protected]> [email protected]>
                http://mailman.verplant.org/listinfo/collectd

            _______________________________________________ collectd
            mailing list [email protected]
            <mailto:[email protected]>
            http://mailman.verplant.org/listinfo/collectd

        Hello, I'm trying to compile collectd 5.0.2 with write_mysql
        pluing I've done this: http://pastebin.com/dPc6zgUH I'm on
        rhel 5.5 x86_64 Any idea?

    I have to run libtoolize --force before compile

    here is:
    tar vxf collectd-5.0.2.tar.bz2
    cd collectd-5.0.2
    patch -p1 -i ../0001-Adding-write_mysql-output-plugin.patch
    aclocal
    autoconf
    automake
    libtoolize  --force
    ./configure
    make
    make install all


    Now I'm facing this error:
    error] write_mysql plugin: Failed to bind param to statement : Statement
    not prepared / INSERT INTO data
    
(date,host_id,plugin_id,plugin_instance,type_id,type_instance,dataset_id,value)VALUES
    (?,?,?,?,?,?,?,?)

    What that means????

    I tried Replace "false" without success

    Thanks in advance,

    Hilst

This is an old patch, please use this one : git clone -b
cf/perfwatcher http://github.com/feraudet/collectd.git

Regards,

Cyril

Here a ready to build one
https://github.com/downloads/feraudet/collectd/collectd-5.1.0-perfwatcher.tgz

You have to re-create database from collectd/contrib/write_mysql.sql and
take care to the new config block :

  <Plugin write_mysql>
      <Database master4example>
           Host "localhost"
           User "root"
           Passwd ""
           Database "collectd"
           Port "3306"
           Replace true
       </Database>
  </Plugin>

Regards,

Cyril


Hi, Cyril

The plugin is working with `Replace true' thanks!!

But when I change Replace to false it got this error:

[2012-05-07 17:04:31] [error] write_mysql plugin: Failed to executere-prepared statement : Duplicate entry '1-1--1-free-1' for key 2 /INSERT INTO data(date,host_id,plugin_id,plugin_instance,type_id,type_instance,dataset_id,value)VALUES (?,?,?,?,?,?,?,?)

To by pass this I had to drop the data table and create it again with aforeign key in place of unique one, here is:


DROP TABLE IF EXISTS `data`;
/*!40101 SET @saved_cs_client     = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `data` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `date` datetime NOT NULL,
  `host_id` int(11) NOT NULL,
  `plugin_id` int(11) NOT NULL,
  `plugin_instance` varchar(255) DEFAULT NULL,
  `type_id` int(11) NOT NULL,
  `type_instance` varchar(255) DEFAULT NULL,
  `dataset_id` int(11) NOT NULL,
  `value` double NOT NULL,
  PRIMARY KEY (`id`),
  FOREIGN KEY `host_id_3` (`host_id`) REFERENCES `host` (`id`),
  FOREIGN KEY `plugin_id_3` (`plugin_id`) REFERENCES `plugin` (`id`),
  FOREIGN KEY `type_id_3` (`type_id`) REFERENCES `type` (`id`),
  FOREIGN KEY `dataset_id_3` (`dataset_id`) REFERENCES `dataset` (`id`)
) ENGINE=MEMORY AUTO_INCREMENT=0 DEFAULT CHARSET=utf8;

Would that ruin something later?

Thanks in advance..

Hilst,

--
Follow the white rabbit!

_______________________________________________
collectd mailing list
[email protected]
http://mailman.verplant.org/listinfo/collectd

Re: [collectd] New plugin for collectd : write_mysql

Reply via email to