[rrd-developers] Patch: rrdtool+libdbi-Database backend - patch for rrdtool 1.3

Martin Sperl Mon, 03 Nov 2008 11:28:57 -0800

Hi!

As some of you may know that I have created a patch for rrdtool 1.2 afew years ago, so that a database could be queried for values for graphing.

I have created an updated patch for rrdtool version 1.3 (against SVNversion 1644).

The patch has been mostly rewritten and the following changes have beenmade:


   * high dependency on mysql has been reduced by avoiding the
     temporary tables (which was bad for mysql replication)
   * The number of executed SQL-Statements for one CDEF has been
     reduced to 1 compared to 11 SQLs (including CREATE TEMPORARY
     TABLE) - for patch against version 1.2
   * All consolidation is done in rrdtool itself (MIN,MAX,AVERAGE)
   * Additional consolidation functions are COUNT and SIGMA, which give
     information on statistics on a per "time-bin" basis.
   * All these consolidation values are always returned as separate
     columns, that are returned by RRD and the consolidation function
     given as Argument is ignored.
     Main reason is that this way there is only one call to rrd_fetcht
     and thus the database even if we need to fetch for example min,
     avg and max. Compare this to 3 calls in case of different
     consolidation functions - and if you want to get SIGMA and COUNT
     as well it is still only one call to the backend and the database.
   * Some previous existing features have been taken out at the moment
     to allow for this reduced set of SQL queries.
         o prediction using the values from the last X days at the same
           time
         o the corresponding sigma calculation
   * The idea is to create generic CDEF's that will do the same thing,
     but that is also available when using RRD-files (similar to TREND,
     but with another scope)
     This will get posted as a separate patch.
   * Overall performance should be much better and the patch as a whole
     simpler.
   * The patch also includes modifications to the configuration
     infrastructure, to make libdbi support optional.

As requested by Tobi in a private email-thread, the patch should behighly localized (and configurable via configure), to allow a possiblemerge with minimal shared code touched.


Please review the code and comment...

Thanks,
            Martin

P.s: I know that there is the "new" approach of rrdcached to improvewrite-performance with massive updates on rrd. This patch was an initialattempt to get a similar problem solved.Here for comparison: our production system (running with the 1.2 patchfor graphing) is updating on average 150k different values every 300seconds or approximately 500 values/s on a HP-DL360G3 with Dual Xenon3.0GHz, 4GB RAM connected via ISCSI (over GBit network) to our storagesystem. (data-gathering Application and the mysql DB is done on the samehost)

Index: configure.ac
===================================================================
--- configure.ac	(revision 1644)
+++ configure.ac	(working copy)
@@ -468,6 +468,25 @@
 
 CONFIGURE_PART(Find 3rd-Party Libraries)
 
+AC_ARG_ENABLE(libdbi,[  --disable-libdbi        do not build in support for libdbi],[have_libdbi=no],[
+  XXX=$LIBS
+  LIBS="$LIBS -ldbi -ldl"
+  AC_MSG_CHECKING(for libdbi)
+  AC_LINK_IFELSE(
+    [AC_LANG_PROGRAM([[#include <dbi/dbi.h>]], 
+                     [[dbi_initialize(NULL)]]
+                    )
+    ],[AC_DEFINE(HAVE_LIBDBI,[1],[have got libdbi installed])
+       AC_MSG_RESULT([yes])
+       have_libdbi=yes
+    ],[LIBS=$XXX
+       AC_MSG_RESULT([no])
+       have_libdbi=no
+       exit 1
+    ]
+  )
+])
+AM_CONDITIONAL(BUILD_LIBDBI,[test $have_libdbi != no])
 
 AM_CONDITIONAL(BUILD_RRDCGI,[test $enable_rrdcgi != no])
 
@@ -926,6 +945,7 @@
 echo "          Build rrdcgi: $enable_rrdcgi"
 echo "       Build librrd MT: $enable_pthread"
 echo "     Link with libintl: $enable_libintl"
+echo "           With libDBI: $have_libdbi"
 echo
 echo "             Libraries: $ALL_LIBS"
 echo
Index: src/Makefile.am
===================================================================
--- src/Makefile.am	(revision 1644)
+++ src/Makefile.am	(working copy)
@@ -58,6 +58,10 @@
 	fnv.h rrd_graph.h \
 	rrd_is_thread_safe.h
 
+if BUILD_LIBDBI
+RRD_C_FILES += rrd_fetch_libdbi.c
+endif
+
 if BUILD_GETOPT
 noinst_HEADERS += rrd_getopt.h
 UPD_C_FILES += rrd_getopt.c rrd_getopt1.c
Index: src/rrd_tool.h
===================================================================
--- src/rrd_tool.h	(revision 1644)
+++ src/rrd_tool.h	(working copy)
@@ -92,6 +92,16 @@
             char ***ds_namv,
             rrd_value_t **data);
 
+
+#ifdef HAVE_LIBDBI
+int rrd_fetch_fn_libdbi(char *filename, enum cf_en cf_idx,
+ 			time_t *start,time_t *end,
+ 			unsigned long *step,
+ 			unsigned long *ds_cnt,
+ 			char        ***ds_namv,
+ 			rrd_value_t **data);
+#endif
+
 #define RRD_READONLY    (1<<0)
 #define RRD_READWRITE   (1<<1)
 #define RRD_CREAT       (1<<2)
Index: src/rrd_fetch.c
===================================================================
--- src/rrd_fetch.c	(revision 1644)
+++ src/rrd_fetch.c	(working copy)
@@ -235,6 +235,15 @@
             *start, *end, *step);
 #endif
 
+#ifdef HAVE_LIBDBI
+    /* handle libdbi datasources */
+    if (strncmp("sql",filename,3)==0) {
+      if (filename[3]==filename[4]) {
+	return rrd_fetch_fn_libdbi(filename,cf_idx,start,end,step,ds_cnt,ds_namv,data);
+      }
+    }
+#endif
+
     rrd_init(&rrd);
     rrd_file = rrd_open(filename, &rrd, RRD_READONLY);
     if (rrd_file == NULL)
Index: src/rrd_fetch_libdbi.c
===================================================================
--- src/rrd_fetch_libdbi.c	(revision 0)
+++ src/rrd_fetch_libdbi.c	(revision 0)
@@ -0,0 +1,533 @@
+#include "rrd_tool.h"
+#include <dbi/dbi.h>
+#include <time.h>
+
+/* the structures */
+struct sql_table_helper {
+  dbi_conn conn;
+  int connected;
+  dbi_result result;
+  char const* filename;
+  char const* dbdriver;
+  char* table_start;
+  char* table_next;
+  char const* where;
+  char * timestamp;
+  char * value;
+};
+
+/* the prototypes */
+int _sql_setparam(struct sql_table_helper* th,char* key, char* value);
+int _sql_fetchrow(struct sql_table_helper* th,time_t *timestamp, rrd_value_t *value);
+char* _find_next_separator(char* start,char separator);
+char* _find_next_separator_twice(char*start,char separator);
+char _hexcharhelper(char c);
+int _inline_unescape (char* string);
+double rrd_fetch_dbi_double(dbi_result *result,int idx);
+long rrd_fetch_dbi_long(dbi_result *result,int idx);
+
+/* the real code */
+
+/* helpers to get correctly converted values from DB*/
+long rrd_fetch_dbi_long(dbi_result *result,int idx) {
+  char *ptmp="";
+  switch (dbi_result_get_field_type_idx(result,idx)) {
+  case DBI_TYPE_STRING:
+    ptmp=(char*)dbi_result_get_string_idx(result,idx);
+    return atoi(ptmp);
+  default:
+    return dbi_result_get_longlong_idx(result,idx);
+  }
+}
+
+double rrd_fetch_dbi_double(dbi_result *result,int idx) {
+  char *ptmp="";
+  /* return NAN if NULL */
+  if(dbi_result_field_is_null_idx(result,idx)) { return DNAN; }
+  /* do some conversions */
+  switch (dbi_result_get_field_type_idx(result,idx)) {
+  case DBI_TYPE_STRING:
+    ptmp=(char*)dbi_result_get_string_idx(result,idx);
+    return strtod(ptmp,NULL);
+  default:
+    return dbi_result_get_double_idx(result,idx);
+  }
+}
+
+int _sql_close(struct sql_table_helper* th) {
+  /* close only if connected */
+  if (th->conn) {
+    /* shutdown dbi */
+    dbi_conn_close(th->conn);
+    dbi_shutdown();
+    /* and assign empty */
+    th->conn=NULL;
+  }
+}
+
+int _sql_setparam(struct sql_table_helper* th,char* key, char* value) {
+  char* dbi_errstr=NULL;
+  /* if not connected */
+  if (! th->conn) {
+    /* initialize some stuff */
+    th->table_next=th->table_start;
+    th->result=NULL;
+    th->connected=0;
+    /* initialize db */
+    dbi_initialize(NULL);
+    th->conn=dbi_conn_new(th->dbdriver);
+    /* and handle errors */
+    if (! th->conn) { 
+      dbi_conn_error(th->conn,(const char**)&dbi_errstr);
+      rrd_set_error( "libdbi - no such driver: %s (possibly a dynamic link problem of the driver being linked without -ldbi)",dbi_errstr); 
+      dbi_shutdown();
+      return -1; 
+    }
+  }
+  if (th->connected) {
+    rrd_set_error( "we are already connected - can not set parameter %s=%s",key,value);
+    _sql_close(th);
+    return -1; 
+  }
+  if (dbi_conn_set_option(th->conn,key,value)) {
+    dbi_conn_error(th->conn,(const char**)&dbi_errstr);
+    rrd_set_error( "libdbi: problems setting %s to %s - %s",key,value,dbi_errstr);
+    _sql_close(th);
+    return -1;
+  }
+  return 0;
+}
+
+int _sql_fetchrow(struct sql_table_helper* th,time_t *timestamp, rrd_value_t *value) {
+  char* dbi_errstr=NULL;
+  char sql[10240];
+  time_t startt=0,endt=0;
+  /*connect to the database if needed */
+  if (! th->conn) {
+      rrd_set_error( "libdbi no parameters set for libdbi",th->filename,dbi_errstr);
+      return -1;
+  }
+  if (! th->connected) {
+    /* and now connect */
+    if (dbi_conn_connect(th->conn) <0) {
+      dbi_conn_error(th->conn,(const char**)&dbi_errstr);
+      rrd_set_error( "libdbi: problems connecting to db with connect string %s - error: %s",th->filename,dbi_errstr);
+      _sql_close(th);
+      return -1;
+    }
+    th->connected=1;
+  }
+  /* now find out regarding an existing result-set */
+  if (! th->result) {
+    /* return if table_next is NULL */
+    if (th->table_next==NULL) { 
+      /* but first close connection */
+      _sql_close(th);
+      /* and return with end of data */
+      return 0;
+    }
+    /* calculate the table to use next */
+    th->table_start=th->table_next;
+    th->table_next=_find_next_separator(th->table_start,'+');
+    _inline_unescape(th->table_start);
+    /* and prepare FULL SQL Statement */
+    snprintf(sql,sizeof(sql)-1,"SELECT %s as rrd_time, %s as rrd_value FROM %s WHERE %s GROUP BY rrd_time",
+	     th->timestamp,th->value,th->table_start,th->where);
+    /* and execute sql */
+    if (getenv("RRDDEBUGSQL")) { startt=time(NULL); fprintf(stderr,"RRDDEBUGSQL: %li: executing %s\n",startt,sql); }
+    th->result=dbi_conn_query(th->conn,sql);
+    if (startt) { endt=time(NULL);fprintf(stderr,"RRDDEBUGSQL: %li: timing %li\n",endt,endt-startt); }
+    /* handle error case */
+    if (! th->result) {
+      dbi_conn_error(th->conn,(const char**)&dbi_errstr);      
+      if (startt) { fprintf(stderr,"RRDDEBUGSQL: %li: error %s\n",endt,dbi_errstr); }
+      rrd_set_error("libdbi: problems with query: %s - errormessage: %s",sql,dbi_errstr);
+      _sql_close(th);
+      return -1;
+    }
+  }
+  /* and now fetch key and value */
+  if (! dbi_result_next_row(th->result)) {
+    /* free result */
+    dbi_result_free(th->result);
+    th->result=NULL;
+    /* and call recursively - this will open the next table or close connection as a whole*/
+    return _sql_fetchrow(th,timestamp,value);
+  } 
+  /* and return with flag for one value */
+  *timestamp=rrd_fetch_dbi_long(th->result,1);
+  *value=rrd_fetch_dbi_double(th->result,2);
+  return 1;
+}
+
+char* _find_next_separator(char* start,char separator) {
+  char* found=strchr(start,separator);
+  /* have we found it */
+  if (found) {
+    /* then 0 terminate current string */
+    *found=0; 
+    /* and return the pointer past the separator */
+    return (found+1);
+  }
+  /* not found, so return NULL */
+  return NULL;
+}
+
+char* _find_next_separator_twice(char*start,char separator) {
+  char *found=start;
+  /* find next separator in string*/
+  while (found) {
+    /* if found and the next one is also a separator */
+    if (found[1] == separator) {
+      /* then 0 terminate current string */
+      *found=0;
+      /* and return the pointer past the current one*/
+      return (found+2);
+    }
+    /* find next occurance */
+    found=strchr(found+1,separator);
+  }
+  /* not found, so return NULL */
+  return NULL;
+}
+
+char _hexcharhelper(char c) {
+  switch (c) {
+  case '0': return 0 ; break;
+  case '1': return 1 ; break;
+  case '2': return 2 ; break;
+  case '3': return 3 ; break;
+  case '4': return 4 ; break;
+  case '5': return 5 ; break;
+  case '6': return 6 ; break;
+  case '7': return 7 ; break;
+  case '8': return 8 ; break;
+  case '9': return 9 ; break;
+  case 'a': return 10 ; break;
+  case 'b': return 11 ; break;
+  case 'c': return 12 ; break;
+  case 'd': return 13 ; break;
+  case 'e': return 14 ; break;
+  case 'f': return 15 ; break;
+  case 'A': return 10 ; break;
+  case 'B': return 11 ; break;
+  case 'C': return 12 ; break;
+  case 'D': return 13 ; break;
+  case 'E': return 14 ; break;
+  case 'F': return 15 ; break;
+  }
+  return -1;
+}
+
+int _inline_unescape (char* string) {
+  char *src=string;
+  char *dst=string;
+  char c,h1,h2;
+  while((c= *src)) {
+    src++;
+    if (c == '%') {
+      if (*src == '%') { 
+	/* increase src pointer by 1 skiping second % */
+	src+=1;
+      } else {
+	/* try to calculate hex value from the next 2 values*/
+	h1=_hexcharhelper(*src);
+	if (h1<0) { rrd_set_error( "string escape error at: %s\n",string);return(1); }
+	h2=_hexcharhelper(*(src+1));
+	if (h2<0) { rrd_set_error( "string escape error at: %s\n",string);return(1); }
+	c=h2+(h1<<4);
+	/* increase src pointer by 2 skiping 2 chars */
+	src+=2;
+      } 
+    }
+    *dst=c;
+    dst++;
+  }
+  *dst=0;
+  return 0;
+}
+
+int
+rrd_fetch_fn_libdbi(
+    char           *filename,  /* name of the rrd */
+    enum cf_en     cf_idx,     /* which consolidation function ?*/
+    time_t         *start,
+    time_t         *end,       /* which time frame do you want ?
+			        * will be changed to represent reality */
+    unsigned long  *step,      /* which stepsize do you want? 
+				* will be changed to represent reality */
+    unsigned long  *ds_cnt,    /* number of data sources in file */
+    char           ***ds_namv, /* names of data_sources */
+    rrd_value_t    **data)     /* two dimensional array containing the data */
+{
+  /* the separator used */
+  char separator='/';
+  /* a local copy of the filename - used for copying plus some pointer variables */
+  char filenameworkcopy[10240];
+  char *tmpptr=filenameworkcopy;
+  char *nextptr=NULL;
+  char *libdbiargs=NULL;
+  char *sqlargs=NULL;
+  /* the settings for the "works" of rrd */
+  int fillmissing=0;
+  unsigned long minstepsize=300;
+  /* the result-set */
+  long r_timestamp,l_timestamp,d_timestamp;
+  double r_value,l_value,d_value;
+  int r_status;
+  int rows;
+  long idx;
+  int derive=0;
+  /* the libdbi connection data and the table_help structure */
+  struct sql_table_helper table_help;
+  char where[10240];
+  table_help.conn=NULL;
+  table_help.where=where;
+
+  /* some loop variables */
+  int i=0;
+
+  /* check header */
+  if (strncmp("sql",filename,3)!=0) { 
+    rrd_set_error( "formatstring wrong - %s",filename );return -1; 
+  }
+  if (filename[3]!=filename[4]) { 
+    rrd_set_error( "formatstring wrong - %s",filename );return -1; 
+  }
+
+  /* now make this the separator */
+  separator=filename[3];
+
+  /* copy filename for local modifications during parsing */
+  strncpy(filenameworkcopy,filename+5,sizeof(filenameworkcopy));
+
+  /* get the driver */
+  table_help.dbdriver=tmpptr;
+  libdbiargs=_find_next_separator(tmpptr,separator);
+  if (! libdbiargs) { 
+    /* error in argument */
+    rrd_set_error( "formatstring wrong as we did not find \"%c\"- %s",separator,table_help.dbdriver);
+    return -1; 
+  }
+
+  /* now find the next double separator - this defines the args to the database */
+  sqlargs=_find_next_separator_twice(libdbiargs,separator);
+  if (!sqlargs) {
+    rrd_set_error( "formatstring wrong for db arguments as we did not find \"%c%c\" in \"%s\"",separator,separator,libdbiargs);
+    return 1;
+  }
+
+  /* now we can start with the SQL Statement - best to start with this first, 
+     as then the error-handling is easier, as we do not have to handle libdbi shutdown as well */
+
+  /* parse the table(s) */
+  table_help.table_start=sqlargs;
+  nextptr=_find_next_separator(table_help.table_start,separator);
+  if (! nextptr) { 
+    /* error in argument */
+    rrd_set_error( "formatstring wrong - %s",tmpptr);
+    return -1; 
+  }
+  /* hex-unescape the value */
+  if(_inline_unescape(table_help.table_start)) { return -1; }
+
+  /* parse the unix timestamp column */
+  table_help.timestamp=nextptr;
+  nextptr=_find_next_separator(nextptr,separator);
+  if (! nextptr) { 
+    /* error in argument */
+    rrd_set_error( "formatstring wrong - %s",tmpptr);
+    return -1; 
+  }
+  /* hex-unescape the value */
+  if(_inline_unescape(table_help.timestamp)) { return -1; }
+
+  /* parse the value column */
+  table_help.value=nextptr;
+  nextptr=_find_next_separator(nextptr,separator);
+  if (! nextptr) { 
+    /* error in argument */
+    rrd_set_error( "formatstring wrong - %s",tmpptr);
+    return -1; 
+  }
+  /* hex-unescape the value */
+  if(_inline_unescape(table_help.value)) { return -1; }
+  
+  /* now prepare WHERE clause as empty string*/
+  where[0]=0;
+
+  /* and the where clause */
+  sqlargs=nextptr;
+  while(sqlargs) {
+    /* find next separator */
+    nextptr=_find_next_separator(sqlargs,separator);
+    /* now handle fields */
+    if (strcmp(sqlargs,"derive")==0) { /* the derive option with the default allowed max delta */
+      derive=600;
+    } else if (strcmp(sqlargs,"prediction")==0) {
+      rrd_set_error("argument prediction is no longer supported in a VDEF - use new generic CDEF-functions instead");
+      return -1;
+    } else if (strcmp(sqlargs,"sigma")==0) {
+      rrd_set_error("argument sigma is no longer supported in a VDEF - use new generic CDEF-functions instead");
+      return -1;
+    } else if (*sqlargs==0) { /* ignore empty */
+    } else { /* else add to where string */
+      if (where[0]) {strcat(where," AND ");}
+      strcat(where,sqlargs);
+    }
+    /* and continue loop with next pointer */
+    sqlargs=nextptr;
+  }
+  /* and unescape */
+  if(_inline_unescape(where)) { return -1; }
+
+  /* now parse LIBDBI options - this start initializing libdbi and beyond this point we need to reset the db as well in case of errors*/
+  while (libdbiargs) {
+    /* find separator */
+    nextptr=_find_next_separator(libdbiargs,separator);
+    /* now find =, separating key from value*/
+    tmpptr=_find_next_separator(libdbiargs,'=');
+    if (! tmpptr) { 
+      rrd_set_error( "formatstring wrong for db arguments as we did not find \"=\" in \"%s\"",libdbiargs);
+      _sql_close(&table_help);
+      return 1;
+    }
+    /* hex-unescape the value */
+    if(_inline_unescape(tmpptr)) { return -1; }
+    /* now handle the key/value pair */
+    if (strcmp(libdbiargs,"rrdminstepsize")==0) { /* allow override for minstepsize */
+      i=atoi(tmpptr);if (i>0) { minstepsize=i; }
+    } else if (strcmp(libdbiargs,"rrdfillmissing")==0) { /* allow override for minstepsize */
+      i=atoi(tmpptr);if (i>0) { fillmissing=i; }
+    } else if (strcmp(libdbiargs,"rrdderivemaxstep")==0) { /* allow override for minstepsize */
+      i=atoi(tmpptr);if (i>0) { derive=i; }
+    } else { /* store in libdbi, as these are parameters */
+      if (_sql_setparam(&table_help,libdbiargs,tmpptr)) { 
+	_sql_close(&table_help);
+	return -1; 
+      }
+    }
+    /* and continue loop with next pointer */
+    libdbiargs=nextptr;
+  }
+  
+  /* and modify step if given */
+  if (*step<minstepsize) {*step=minstepsize;}
+  *start-=(*start)%(*step);
+  *end-=(*end)%(*step);
+
+  /* and append the SQL WHERE Clause for the timeframe calculated above (adding AND if required) */
+  if (where[0]) {strcat(where," AND ");}
+  i=strlen(where);
+  snprintf(where+i,sizeof(where)-1-i,"%li < %s AND %s < %li",*start,table_help.timestamp,table_help.timestamp,*end);
+
+  /* and now calculate the number of rows in the resultset... */
+  rows=((*end)-(*start))/(*step)+2;
+  
+  /* define the result set variables/columns returned */
+  *ds_cnt=5;
+  *ds_namv=(char**)malloc((*ds_cnt)*sizeof(char*));
+  for (i=0;i<(int)(*ds_cnt);i++) {
+    tmpptr=(char*)malloc(sizeof(char) * DS_NAM_SIZE);
+    (*ds_namv)[i]=tmpptr;
+    /* now copy what is required */
+    switch (i) {
+    case 0: strncpy(tmpptr,"min",DS_NAM_SIZE-1); break;
+    case 1: strncpy(tmpptr,"avg",DS_NAM_SIZE-1); break;
+    case 2: strncpy(tmpptr,"max",DS_NAM_SIZE-1); break;
+    case 3: strncpy(tmpptr,"count",DS_NAM_SIZE-1); break;
+    case 4: strncpy(tmpptr,"sigma",DS_NAM_SIZE-1); break;
+    }
+  }
+
+  /* allocate memory for resultset (with the following columns: min,avg,max,count,sigma) */
+  i=rows * sizeof(rrd_value_t)*(*ds_cnt);
+  if (((*data) = malloc(i))==NULL){
+    /* and return error */
+    rrd_set_error("malloc failed for %i bytes",i);
+    return(-1);
+  }
+  /* and fill with NAN */
+  for(i=0;i<rows;i++) {
+    (*data)[i*(*ds_cnt)+0]=DNAN; /* MIN */
+    (*data)[i*(*ds_cnt)+1]=DNAN; /* AVG */
+    (*data)[i*(*ds_cnt)+2]=DNAN; /* MAX */
+    (*data)[i*(*ds_cnt)+3]=0;    /* COUNT */
+    (*data)[i*(*ds_cnt)+4]=DNAN; /* SIGMA */
+  }
+  /* and assign undefined values for last - in case of derived calculation */
+  l_value=DNAN;l_timestamp=0;
+  /* here goes the real work processing all data */
+  while((r_status=_sql_fetchrow(&table_help,&r_timestamp,&r_value))>0) {
+    /* processing of value */
+    /* calculate index for the timestamp */
+    idx=(r_timestamp-(*start))/(*step);
+    /* some out of bounds checks on idx */
+    if (idx<0) { idx=0;}
+    if (idx>rows) { idx=rows;}
+    /* and calculate derivative if necessary */
+    if (derive) {
+      /* calc deltas */
+      d_timestamp=r_timestamp-l_timestamp;
+      d_value=r_value-l_value;
+      /* assign current as last values */
+      l_timestamp=r_timestamp;
+      l_value=r_value;
+      /* assign DNAN by default for value */
+      r_value=DNAN;
+      /* check for timestamp delta to be within an acceptable range */
+      if ((d_timestamp>0)&&(d_timestamp<2*derive)) {
+	/* only handle positive delta - avoid wrap-arrounds/counter resets showing up as spikes */
+	if (d_value>0) {
+	  /* and normalize to per second */
+	  r_value=d_value/d_timestamp;
+	}
+      }
+    }
+    /* only add value if we have a value that is not NAN */
+    if (! isnan(r_value)) {
+      if ((*data)[idx*(*ds_cnt)+3]==0) { /* count is 0 so assign to overwrite DNAN */
+	(*data)[idx*(*ds_cnt)+0]=r_value; /* MIN */
+	(*data)[idx*(*ds_cnt)+1]=r_value; /* AVG */
+	(*data)[idx*(*ds_cnt)+2]=r_value; /* MAX */
+	(*data)[idx*(*ds_cnt)+3]=1;       /* COUNT */
+	(*data)[idx*(*ds_cnt)+4]=r_value; /* SIGMA */
+      } else {
+	/* MIN */
+	if ((*data)[idx*(*ds_cnt)+0]>r_value) { (*data)[idx*(*ds_cnt)+0]=r_value; }
+        /* AVG - at this moment still sum - corrected in post processing */
+	(*data)[idx*(*ds_cnt)+1]+=r_value;
+        /* MAX */
+	if ((*data)[idx*(*ds_cnt)+2]<r_value) { (*data)[idx*(*ds_cnt)+2]=r_value; }
+        /* COUNT */
+	(*data)[idx*(*ds_cnt)+3]++;
+        /* SIGMA - at this moment still sum of squares - corrected in post processing */
+	(*data)[idx*(*ds_cnt)+4]+=r_value*r_value;
+      }
+    }
+  }
+  /* and check for negativ status, pass back immediately */
+  if (r_status==-1) { return -1; }
+
+  /* post processing */
+  for(idx=0;idx<rows;idx++) {
+    long count=(*data)[idx*(*ds_cnt)+3];
+    if (count>0) {
+      /* calc deviation first */
+      if (count>2) {
+	r_value=count*(*data)[idx*(*ds_cnt)+4]-(*data)[idx*(*ds_cnt)+1]*(*data)[idx*(*ds_cnt)+1];
+	if (r_value<0) { 
+	  r_value=DNAN; 
+	} else {
+	  r_value=sqrt(r_value/(count*(count-1)));
+	}
+      }
+      (*data)[idx*(*ds_cnt)+4]=r_value;
+      /* now the average */
+      (*data)[idx*(*ds_cnt)+1]/=i;
+    }
+  }
+
+  /* and return OK */
+  return 0;
+}
Index: doc/rrdgraph_libdbi.pod
===================================================================
--- doc/rrdgraph_libdbi.pod	(revision 0)
+++ doc/rrdgraph_libdbi.pod	(revision 0)
@@ -0,0 +1,150 @@
+=head1 NAME
+
+rrdgraph_libdbi - fetching data for graphing in rrdtool graph via libdbi
+
+=head1 SYNOPSIS
+
+E<lt>rrdfileE<gt> = B<sql//E<lt>libdbi driverE<gt>/E<lt>driver-option-nameE<gt>=E<lt>driver-option-valueE<gt>/...[/rrdminstepsize=E<lt>stepsizeE<gt>][/rrdfillmissing=E<lt>fill missing n samplesE<gt>]//E<lt>tableE<gt>/E<lt>unixtimestamp columnE<gt>/E<lt>data value columnE<gt>[/derive]/E<lt>where clause 1E<gt>/.../E<lt>where clause nE<gt>>
+
+=head1 DESCRIPTION
+
+This pseudo-rrd-filename defines a sql datasource:
+
+=over 8
+
+=item B<sql//>
+
+  magic cookie-prefix for a libdbi type datasource
+
+=item B<E<lt>libdbi driverE<gt>>
+
+  which libdbi driver to use (e.g: mysql)
+
+=item B<E<lt>driver-option-nameE<gt>>=B<E<lt>driver-option-valueE<gt>>
+
+  defines the parameters that are required to connect to the database with the given libdbi driver
+  (These drivers are libdbi dependent - for details please look at the driver documentation of libdbi!)
+
+=item B</rrdminstepsize>=B<E<lt>minimum step sizeE<gt>>
+
+  defines the minimum number of the step-length used for graphing (default: 300 seconds)
+
+=item B</rrdfillmissing>=B<E<lt>fill missing stepsE<gt>>
+
+  defines the number of steps to fill with the last value to avoid NaN boxes due to data-insertation jitter (default: 0 steps)
+
+=item B<E<lt>tableE<gt>>
+
+  defines the table from which to fetch the resultset.
+
+  If there is a need to fetch data from several tables, these tables can be defined by separating the tablenames with a "+"
+
+  hex-type-encoding via %xx are translated to the actual value, use %% to use %
+
+=item B<E<lt>unixtimestamp columnE<gt>>
+
+  defines the column of E<lt>tableE<gt> which contains the unix timestamp
+
+  hex-type-encoding via %xx are translated to the actual value, use %% to use %
+
+=item B<E<lt>data value columnE<gt>>
+
+  defines the column of E<lt>tableE<gt> which contains the value column, which should be graphed
+
+  hex-type-encoding via %xx are translated to the actual value, use %% to use %
+
+=item B</derive>
+
+  defines that the data value used should be the delta of the 2 consecutive values (to simulate COUNTER or DERIVE type datasources)
+
+=item B</E<lt>where clause(s)E<gt>>
+
+  defines one (ore more) where clauses that are joined with AND to filter the entries in the <lt>table<gt>
+
+  hex-type-encoding via %xx are translated to the actual value, use %% to use %
+
+=back
+
+the returned value column-names, which can be used as ds-names, are:
+
+=over 8
+
+=item B<min>, B<avg>, B<max>, B<count> and B<sigma>
+
+  are returned to be used as ds-names in your DS definition.
+  The reason for using this is that if the consolidation function is used for min/avg and max, then the engine is used several times.
+  And this results in the same SQL Statements used several times
+
+=back
+
+=head1 EXAMPLES
+
+Here an example of a table in a mysql database:
+
+  DB connect information
+    dbhost=127.0.0.1
+    user=rrd
+    password=secret
+    database=rrd
+
+  here the table:
+    CREATE TABLE RRDValue (
+      RRDKeyID      bigint(20) NOT NULL,
+      UnixTimeStamp int(11) NOT NULL,
+      value         double default NOT NULL,
+      PRIMARY KEY  (RRDKeyID,UnixTimeStamp)
+    );
+
+and the RRDKeyID we want to graph for is: 1141942900757789274
+
+The pseudo rrd-filename to access this is:
+"sql//mysql/host=127.0.0.1/dbname=rrd/username=rrd/password=secret//RRDValue/UnixTimeStamp/value/RRDKeyID=1141464142203608274"
+
+To illustrate this here a command to create a graph that contains the actual values.
+
+  DS_BASE="sql//mysql/host=127.0.0.1/dbname=rrd/username=rrd/password=passwd//RRDValue/UnixTimeStamp/value/RRDKeyID=1141942900757789274"
+  rrdtool graph test.png --imgformat=PNG --start=-1day --end=+3hours --width=1000 --height=600 \
+    "DEF:min=$DS_BASE:min:AVERAGE" \
+    "LINE1:min#FF0000:value" \
+    "DEF:avg=$DS_BASE:avg:AVERAGE" \
+    "LINE1:avg#00FF00:average" \
+    "DEF:max=$DS_BASE:max:AVERAGE" \
+    "LINE1:max#FF0000:max" \
+    "DEF:sigma=$DS_BASE:sigma:AVERAGE" \
+    "CDEF:upper=avg,4,sigma,*,+" \
+    "LINE1:upper#0000FF:+4 sigma" \
+    "CDEF:lower=avg,4,sigma,*,-" \
+    "LINE1:lower#0000FF:-4 sigma"
+
+=head1 NOTES
+
+* Naturally you can also use any other kind of driver that libdbi supports - e.g postgress,...
+
+* From the way the datasource is joined, it should also be possible to do joins over different tables 
+  (separate tables with "," in table and add in the WHERE Clauses the table equal joins. 
+  This has not been tested!!!)
+
+* It should also be relatively simple to add to the database using the same datasource string.
+  This has not been implemented...
+
+* The aggregation functions are ignored and several data columns are used instead 
+  to avoid querying the same SQL several times when minimum, average and maximum are needed for graphing...
+
+* for DB efficiency you should think of having 2 tables, one containing historic values and the other containing the latest data.
+  This second table should be kept small to allow for the least ammount of blocking SQL statements.
+  Whith mysql you can even use myisam table-type for the first and InnoDB for the second. 
+  This is especially interresting as with tables with +100M rows myisam is much smaller then InnoDB.
+
+* To debug the SQL statements set the environment variable RRDDEBUGSQL and the actual SQL statements and the timing is printed to stderr.
+
+=head1 BUGS
+
+* at least on Linux please make sure that the libdbi driver is explicitly linked against libdbi.so.0 
+  check via ldd /usr/lib/dbd/libmysql.so, that there is a line with libdbi.so.0. 
+  otherwise at least the perl module RRDs will fail because the dynamic linker can not find some symbols from libdbi.so.
+  (this only happens when the libdbi driver is actually used the first time!)
+  This is KNOWN to be the case with RHEL4 and FC4 and FC5! (But actually this is a bug with libdbi make files!)
+
+=head1 AUTHOR
+
+Martin Sperl <[EMAIL PROTECTED]>

_______________________________________________
rrd-developers mailing list
[email protected]
https://lists.oetiker.ch/cgi-bin/listinfo/rrd-developers

[rrd-developers] Patch: rrdtool+libdbi-Database backend - patch for rrdtool 1.3

Reply via email to