[rrd-developers] src/rrd_open.c: Check arguments in `rrd_dontneed'.

2008-10-07 Thread Florian Forster
From: Florian Forster <[EMAIL PROTECTED]>

Daniel Pocock reported that the argument may be NULL in low-diskspace
situations, so check for that here to prevent a segmentation fault.
---
 src/rrd_open.c |7 +++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/src/rrd_open.c b/src/rrd_open.c
index 2796506..f262413 100644
--- a/src/rrd_open.c
+++ b/src/rrd_open.c
@@ -364,6 +364,13 @@ void rrd_dontneed(
 unsigned long i;
 ssize_t   _page_size = sysconf(_SC_PAGESIZE);
 
+if (rrd_file == NULL) {
+#if defined DEBUG && DEBUG
+   fprintf (stderr, "rrd_dontneed: Argument 'rrd_file' is NULL.\n");
+#endif
+   return;
+}
+
 #if defined DEBUG && DEBUG > 1
 mincore_print(rrd_file, "before");
 #endif
-- 
1.5.6.3

___
rrd-developers mailing list
rrd-developers@lists.oetiker.ch
https://lists.oetiker.ch/cgi-bin/listinfo/rrd-developers
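
The guard added by this patch can be shown in isolation. The following is a minimal sketch, not the rrdtool code: `struct sketch_file` and `sketch_dontneed` are hypothetical stand-ins, and the sketch returns an int (the real rrd_dontneed() returns void) only so that the early return is observable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for rrdtool's file handle; only the field that
 * the reported backtrace shows being dereferenced is modelled. */
struct sketch_file {
    size_t header_len;
};

/* Returns 0 if the handle is NULL and nothing was done, 1 otherwise.
 * The return value is an addition for illustration only. */
static int sketch_dontneed(const struct sketch_file *file)
{
    size_t rra_start;

    if (file == NULL) {
        fprintf(stderr, "sketch_dontneed: argument 'file' is NULL\n");
        return 0;
    }

    /* From here on it is safe to dereference the handle. */
    rra_start = file->header_len;
    (void) rra_start;
    return 1;
}
```

A caller that failed to open or create the file (for example on a full disk) can then pass its NULL handle through without crashing.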


[rrd-developers] crash - full hard disk

2008-10-07 Thread Daniel.Pocock




As part of the scalability tests I've been doing, I regularly fill up my
hard disk with RRDs.

I've noticed that rrdtool (trunk, linked with Ganglia 3.1) creates one
or more files with size 0 or with other unusual sizes when the disk
fills up, and shortly after, there is a seg fault (gdb output below)

I wanted to create a ticket for this on the Trac system, but I couldn't
find the link for creating an account, and the account published on the
welcome page doesn't have permissions.



Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 46913030207808 (LWP 31151)]
0x003eac43052c in rrd_dontneed (rrd_file=0x0, rrd=0x2aaaca7fff10)
at rrd_open.c:335
335 rra_start = rrd_file->header_len;

(gdb) bt
#0  0x003eac43052c in rrd_dontneed (rrd_file=0x0,
rrd=0x2aaaca7fff10)
at rrd_open.c:335
#1  0x003eac408afd in rrd_create_fn (
file_name=0x2aaaca800710 "/.../cpu_system.rrd", 
rrd=0x2aaaca800090) at rrd_create.c:807
#2  0x003eac407d3f in rrd_create_r (
filename=0x2aaaca800710 "/.../cpu_system.rrd", pdp_step=10, 
last_up=1223381383, argc=7, argv=0x2aaaca8002b0) at rrd_create.c:548
#3  0x003eac4065fb in rrd_create (argc=13, argv=0x2aaaca800280)
at rrd_create.c:103



Re: [rrd-developers] src/rrd_open.c: Check arguments in `rrd_dontneed'.

2008-10-07 Thread Tobias Oetiker

Today Florian Forster wrote:

> From: Florian Forster <[EMAIL PROTECTED]>
>
> Daniel Pocock reported that the argument may be NULL in low-diskspace
> situations, so check for that here to prevent a segmentation fault.
> ---
Thanks,

applied
tobi

-- 
Tobi Oetiker, OETIKER+PARTNER AG, Aarweg 15 CH-4600 Olten, Switzerland
http://it.oetiker.ch [EMAIL PROTECTED] ++41 62 775 9902 / sb: -9900



[rrd-developers] [PATCH] improved journal sanity checks, cleanup

2008-10-07 Thread kevin brintnall
This patch introduces some extra safety checks in journal processing,
and cleans up the code a little bit.

 * moved journal initialization to its own function; main() is cleaner

 * any time we process a file, log the results
   (previous code only logged if there was a valid entry)

 * After reading journals at startup, only trigger full flush out to disk
   if the user specified -F.  Avoids unnecessary IO on startup unless the
   user also wants unnecessary IO on shutdown.

 * journal_replay is much more careful about files it will open
 * must be a regular file
 * must be owned by daemon user
 * must not be group/other writable

 * Ensure that the journal gets created with the right permissions.
... even when the daemon is invoked with a permissive umask.
equivalent to "chmod a-x,go-w"
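
The three journal_replay checks listed above (regular file, owned by the daemon user, not group/other writable) can be sketched as a standalone function. This is an illustration under assumptions, not the patch itself: `journal_file_ok` is a hypothetical name, and unlike the patch the sketch returns at the first failing check, so a later check cannot overwrite an earlier reason.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 if `path` passes all checks, otherwise
 * an errno value; *reason then names the check that failed. */
static int journal_file_ok(const char *path, uid_t owner, const char **reason)
{
    struct stat sb;

    assert(path != NULL && reason != NULL);
    *reason = NULL;

    if (stat(path, &sb) != 0) {
        *reason = "stat error";
        return errno;
    }
    if (!S_ISREG(sb.st_mode)) {
        *reason = "not a regular file";
        return EPERM;
    }
    if (sb.st_uid != owner) {
        *reason = "not owned by daemon user";
        return EACCES;
    }
    if (sb.st_mode & (S_IWGRP | S_IWOTH)) {
        *reason = "must not be group/other writable";
        return EACCES;
    }
    return 0;
}
```

A daemon would typically pass geteuid() as `owner`, as the patch does with its daemon_uid variable.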

---
diff --git a/src/rrd_daemon.c b/src/rrd_daemon.c
index e2726e3..ea607d8 100644
--- a/src/rrd_daemon.c
+++ b/src/rrd_daemon.c
@@ -170,6 +170,7 @@ typedef enum queue_side_e queue_side_t;
  * Variables
  */
 static int stay_foreground = 0;
+static uid_t daemon_uid;
 
 static listen_socket_t *listen_fds = NULL;
 static size_t listen_fds_num = 0;
@@ -1446,6 +1447,7 @@ static int handle_request (listen_socket_t *sock, /* {{{ */
 static void journal_rotate(void) /* {{{ */
 {
   FILE *old_fh = NULL;
+  int new_fd;
 
   if (journal_cur == NULL || journal_old == NULL)
 return;
@@ -1460,11 +1462,20 @@ static void journal_rotate(void) /* {{{ */
   if (journal_fh != NULL)
   {
 old_fh = journal_fh;
+journal_fh = NULL;
 rename(journal_cur, journal_old);
 ++stats_journal_rotate;
   }
 
-  journal_fh = fopen(journal_cur, "a");
+  new_fd = open(journal_cur, O_WRONLY|O_CREAT|O_APPEND,
+S_IRUSR|S_IWUSR|S_IRGRP|S_IROTH);
+  if (new_fd >= 0)
+  {
+journal_fh = fdopen(new_fd, "a");
+if (journal_fh == NULL)
+  close(new_fd);
+  }
+
   pthread_mutex_unlock(&journal_lock);
 
   if (old_fh != NULL)
@@ -1542,6 +1553,44 @@ static int journal_replay (const char *file) /* {{{ */
 
   if (file == NULL) return 0;
 
+  {
+char *reason;
+int status = 0;
+struct stat statbuf;
+
+memset(&statbuf, 0, sizeof(statbuf));
+if (stat(file, &statbuf) != 0)
+{
+  if (errno == ENOENT)
+return 0;
+
+  reason = "stat error";
+  status = errno;
+}
+else if (!S_ISREG(statbuf.st_mode))
+{
+  reason = "not a regular file";
+  status = EPERM;
+}
+if (statbuf.st_uid != daemon_uid)
+{
+  reason = "not owned by daemon user";
+  status = EACCES;
+}
+if (statbuf.st_mode & (S_IWGRP|S_IWOTH))
+{
+  reason = "must not be group/other writable";
+  status = EACCES;
+}
+
+if (status != 0)
+{
+  RRDD_LOG(LOG_ERR, "journal_replay: %s : %s (%s)",
+   file, rrd_strerror(status), reason);
+  return 0;
+}
+  }
+
   fh = fopen(file, "r");
   if (fh == NULL)
   {
@@ -1582,17 +1631,36 @@ static int journal_replay (const char *file) /* {{{ */
 
   fclose(fh);
 
-  if (entry_cnt > 0)
-  {
-RRDD_LOG(LOG_INFO, "Replayed %d entries (%d failures)",
- entry_cnt, fail_cnt);
-return 1;
-  }
-  else
-return 0;
+  RRDD_LOG(LOG_INFO, "Replayed %d entries (%d failures)",
+   entry_cnt, fail_cnt);
 
+  return entry_cnt > 0 ? 1 : 0;
 } /* }}} static int journal_replay */
 
+static void journal_init(void) /* {{{ */
+{
+  int had_journal = 0;
+
+  if (journal_cur == NULL) return;
+
+  pthread_mutex_lock(&journal_lock);
+
+  RRDD_LOG(LOG_INFO, "checking for journal files");
+
+  had_journal += journal_replay(journal_old);
+  had_journal += journal_replay(journal_cur);
+
+  /* it must have been a crash.  start a flush */
+  if (had_journal && config_flush_at_shutdown)
+flush_old_values(-1);
+
+  pthread_mutex_unlock(&journal_lock);
+  journal_rotate();
+
+  RRDD_LOG(LOG_INFO, "journal processing complete");
+
+} /* }}} static void journal_init */
+
 static void close_connection(listen_socket_t *sock)
 {
   close(sock->fd) ;  sock->fd   = -1;
@@ -2075,6 +2143,8 @@ static int daemonize (void) /* {{{ */
   int fd;
   char *base_dir;
 
+  daemon_uid = geteuid();
+
   fd = open_pidfile();
   if (fd < 0) return fd;
 
@@ -2399,25 +2469,7 @@ int main (int argc, char **argv)
 return (1);
   }
 
-  if (journal_cur != NULL)
-  {
-int had_journal = 0;
-
-pthread_mutex_lock(&journal_lock);
-
-RRDD_LOG(LOG_INFO, "checking for journal files");
-
-had_journal += journal_replay(journal_old);
-had_journal += journal_replay(journal_cur);
-
-if (had_journal)
-  flush_old_values(-1);
-
-pthread_mutex_unlock(&journal_lock);
-journal_rotate();
-
-RRDD_LOG(LOG_INFO, "journal processing complete");
-  }
+  journal_init();
 
   /* start the queue thread */
   memset (&queue_thread, 0, sizeof (queue_thread));
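
The journal_rotate change above replaces fopen() with open() plus fdopen(). A minimal sketch of why that matters: fopen() requests mode 0666 and relies on the umask to narrow it, whereas open() lets the caller request 0644 directly, and the umask can only remove bits from that. The function name `open_journal` is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: open `path` for appending, creating it with at
 * most rw-r--r-- permissions regardless of how permissive the umask is.
 * Returns NULL on failure. */
static FILE *open_journal(const char *path)
{
    FILE *fh;
    int fd;

    fd = open(path, O_WRONLY | O_CREAT | O_APPEND,
              S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);
    if (fd < 0)
        return NULL;

    fh = fdopen(fd, "a");
    if (fh == NULL)
        close(fd);          /* do not leak the descriptor */

    return fh;
}
```

Even with umask 0, a file created this way ends up with mode 0644, which is what the journal_replay ownership and permission checks expect at the next startup.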


Re: [rrd-developers] [PATCH] improved journal sanity checks, cleanup

2008-10-07 Thread Tobias Oetiker
Hi Kevin,

great ... applied

tobi


-- 
Tobi Oetiker, OETIKER+PARTNER AG, Aarweg 15 CH-4600 Olten, Switzerland
http://it.oetiker.ch [EMAIL PROTECTED] ++41 62 775 9902 / sb: -9900



Re: [rrd-developers] crash - full hard disk

2008-10-07 Thread Bernard Li
Hi Daniel:

On Tue, Oct 7, 2008 at 5:18 AM,  <[EMAIL PROTECTED]> wrote:

> I wanted to create a ticket for this on the Trac system, but I couldn't
> find the link for creating an account, and the account published on the
> welcome page doesn't have permissions.

http://oss.oetiker.ch/rrdtool-trac/

The username and password is in that page under "Editing".

Cheers,

Bernard



[rrd-developers] [PATCH] rrdcached: better permissions handling

2008-10-07 Thread kevin brintnall
This patch moves the permission handling code around a bit.

 * moved privilege checks into the command handler functions
   (possible now that we pass the sock data structures around)

 * on UPDATE, delay journal_write until after check_file_access().
   previously, it was possible for a high-priv socket to introduce
   commands into the journal that could be replayed if they were
   still in the journal at next startup.

 * moved has_privilege() further up in the file to avoid need
   for prototype.
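
The ordering described above (privilege check first, then the file access check, and only then the journal write) can be sketched with stand-in types. Everything suffixed `_sk` is hypothetical; in particular the return values of the error path are an assumption, since the patch only shows that a status <= 0 stops processing.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { PRIV_LOW_SK, PRIV_HIGH_SK } priv_sk;

struct sock_sk {
    priv_sk privilege;
    int errors_sent;        /* counts error responses, for illustration */
};

static int journal_writes_sk = 0;   /* stands in for journal_write() */

/* Stand-in for send_response(sock, RESP_ERR, ...); assumed to return 0. */
static int send_error_sk(struct sock_sk *s)
{
    s->errors_sent++;
    return 0;
}

/* Mirrors the shape of has_privilege(): NULL means journal replay. */
static int has_privilege_sk(struct sock_sk *s, priv_sk need)
{
    if (s == NULL)
        return 1;
    if (s->privilege >= need)
        return 1;
    return send_error_sk(s);
}

/* Returns 1 if the update was accepted, 0 if it was rejected. */
static int handle_update_sk(struct sock_sk *s, int file_access_ok)
{
    int status = has_privilege_sk(s, PRIV_HIGH_SK);
    if (status <= 0)
        return status;

    if (!file_access_ok)
        return (s != NULL) ? send_error_sk(s) : 0;

    /* Journal only after every check passed, and never during replay. */
    if (s != NULL)
        journal_writes_sk++;

    return 1;
}
```

With this ordering a rejected command never reaches the journal, so it cannot be replayed at the next startup.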

---
diff --git a/src/rrd_daemon.c b/src/rrd_daemon.c
index ea607d8..30cf748 100644
--- a/src/rrd_daemon.c
+++ b/src/rrd_daemon.c
@@ -943,6 +943,20 @@ err:
   return 0;
 } /* }}} static int check_file_access */
 
+/* returns 1 if we have the required privilege level,
+ * otherwise issue an error to the user on sock */
+static int has_privilege (listen_socket_t *sock, /* {{{ */
+  socket_privilege priv)
+{
+  if (sock == NULL) /* journal replay */
+return 1;
+
+  if (sock->privilege >= priv)
+return 1;
+
+  return send_response(sock, RESP_ERR, "%s\n", rrd_strerror(EACCES));
+} /* }}} static int has_privilege */
+
 static int flush_file (const char *filename) /* {{{ */
 {
   cache_item_t *ci;
@@ -1169,6 +1183,11 @@ static int handle_request_flush (listen_socket_t *sock, /* {{{ */
 
 static int handle_request_flushall(listen_socket_t *sock) /* {{{ */
 {
+  int status;
+
+  status = has_privilege(sock, PRIV_HIGH);
+  if (status <= 0)
+return status;
 
   RRDD_LOG(LOG_DEBUG, "Received FLUSHALL");
 
@@ -1185,12 +1204,20 @@ static int handle_request_update (listen_socket_t *sock, /* {{{ */
   char *file;
   int values_num = 0;
   int status;
+  char orig_buf[CMD_MAX];
 
   time_t now;
   cache_item_t *ci;
 
   now = time (NULL);
 
+  status = has_privilege(sock, PRIV_HIGH);
+  if (status <= 0)
+return status;
+
+  /* save it for the journal later */
+  strncpy(orig_buf, buffer, sizeof(orig_buf)-1);
+
   status = buffer_get_field (&buffer, &buffer_size, &file);
   if (status != 0)
 return send_response(sock, RESP_ERR,
@@ -1258,6 +1285,10 @@ static int handle_request_update (listen_socket_t *sock, /* {{{ */
   } /* }}} */
   assert (ci != NULL);
 
+  /* don't re-write updates in replay mode */
+  if (sock != NULL)
+journal_write("update", orig_buf);
+
   while (buffer_size > 0)
   {
 char **temp;
@@ -1366,19 +1397,6 @@ static int batch_done (listen_socket_t *sock) /* {{{ */
   return send_response(sock, RESP_OK, "errors\n");
 } /* }}} static int batch_done */
 
-/* returns 1 if we have the required privilege level */
-static int has_privilege (listen_socket_t *sock, /* {{{ */
-  socket_privilege priv)
-{
-  if (sock == NULL) /* journal replay */
-return 1;
-
-  if (sock->privilege >= priv)
-return 1;
-
-  return send_response(sock, RESP_ERR, "%s\n", rrd_strerror(EACCES));
-} /* }}} static int has_privilege */
-
 /* if sock==NULL, we are in journal replay mode */
 static int handle_request (listen_socket_t *sock, /* {{{ */
char *buffer, size_t buffer_size)
@@ -1402,17 +1420,7 @@ static int handle_request (listen_socket_t *sock, /* {{{ */
 sock->batch_cmd++;
 
   if (strcasecmp (command, "update") == 0)
-  {
-status = has_privilege(sock, PRIV_HIGH);
-if (status <= 0)
-  return status;
-
-/* don't re-write updates in replay mode */
-if (sock != NULL)
-  journal_write(command, buffer_ptr);
-
 return (handle_request_update (sock, buffer_ptr, buffer_size));
-  }
   else if (strcasecmp (command, "wrote") == 0 && sock == NULL)
   {
 /* this is only valid in replay mode */
@@ -1421,13 +1429,7 @@ static int handle_request (listen_socket_t *sock, /* {{{ */
   else if (strcasecmp (command, "flush") == 0)
 return (handle_request_flush (sock, buffer_ptr, buffer_size));
   else if (strcasecmp (command, "flushall") == 0)
-  {
-status = has_privilege(sock, PRIV_HIGH);
-if (status <= 0)
-  return status;
-
 return (handle_request_flushall(sock));
-  }
   else if (strcasecmp (command, "stats") == 0)
 return (handle_request_stats (sock));
   else if (strcasecmp (command, "help") == 0)
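
One detail of the orig_buf copy in the patch above is worth a note: strncpy() writes no terminating NUL when the source is at least as long as the count, and orig_buf is an uninitialized stack array. Whether a command can actually be that long is not visible in this diff, so the following is only a defensive sketch; `copy_command` is a hypothetical helper.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: bounded copy that always leaves `dst` terminated,
 * truncating `src` if it does not fit. */
static void copy_command(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);

    strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0';   /* strncpy does not do this on truncation */
}
```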



[rrd-developers] [PATCH] rrdcached: "PENDING" and "FORGET" for cache management

2008-10-07 Thread kevin brintnall
This patch introduces two new commands for cache management:

 PENDING: shows any un-written updates for a file
 FORGET : remove a file completely from cache

This applies cleanly on top of my previous patch ("better permissions
handling").
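
What FORGET has to do (find the entry, unlink it, free every pending value, then free the entry) can be sketched without GLib. This is an illustration only: it uses a singly linked list instead of the daemon's g_tree, omits locking and the condition variable, and all `_sk` names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

struct item_sk {
    char *file;
    char **values;          /* pending, not yet written updates */
    int values_num;
    struct item_sk *next;
};

static char *dup_sk(const char *s)
{
    char *p = malloc(strlen(s) + 1);
    if (p != NULL)
        strcpy(p, s);
    return p;
}

/* Adds an entry with a single pending value; returns 0 or ENOMEM. */
static int add_sk(struct item_sk **head, const char *file, const char *value)
{
    struct item_sk *ci = calloc(1, sizeof *ci);
    if (ci == NULL)
        return ENOMEM;
    ci->file = dup_sk(file);
    ci->values = calloc(1, sizeof *ci->values);
    if (ci->values != NULL)
        ci->values[0] = dup_sk(value);
    if (ci->file == NULL || ci->values == NULL || ci->values[0] == NULL) {
        free(ci->values);
        free(ci->file);
        free(ci);
        return ENOMEM;
    }
    ci->values_num = 1;
    ci->next = *head;
    *head = ci;
    return 0;
}

/* Removes `file` and frees all its resources; returns 0 or ENOENT. */
static int forget_sk(struct item_sk **head, const char *file)
{
    struct item_sk **pp;

    for (pp = head; *pp != NULL; pp = &(*pp)->next) {
        struct item_sk *ci = *pp;
        if (strcmp(ci->file, file) != 0)
            continue;
        *pp = ci->next;                     /* unlink first */
        for (int i = 0; i < ci->values_num; i++)
            free(ci->values[i]);
        free(ci->values);
        free(ci->file);
        free(ci);
        return 0;
    }
    return ENOENT;
}
```

When the result is checked with assert(), it is safer to store it first (`int rc = forget_sk(&cache, name); assert(rc == 0);`), because an expression placed inside assert() is not evaluated at all when the program is built with NDEBUG.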

---
diff --git a/doc/rrdcached.pod b/doc/rrdcached.pod
index 4bd1bb8..b01165e 100644
--- a/doc/rrdcached.pod
+++ b/doc/rrdcached.pod
@@ -363,6 +363,15 @@ sent B the node has been dequeued.
 Causes the daemon to start flushing ALL pending values to disk.  This
 returns immediately, even though the writes may take a long time.
 
+=item B<PENDING> I<filename>
+
+Shows any "pending" updates for a file, in order.  The updates shown have
+not yet been written to the underlying RRD file.
+
+=item B<FORGET> I<filename>
+
+Removes I<filename> from the cache.  Any pending updates B<WILL BE LOST>.
+
 =item B<HELP> [I<command>]
 
 Returns a short usage message. If no command is given, or I<command> is
diff --git a/src/rrd_daemon.c b/src/rrd_daemon.c
index 30cf748..9c8847d 100644
--- a/src/rrd_daemon.c
+++ b/src/rrd_daemon.c
@@ -540,6 +540,34 @@ static void remove_from_queue(cache_item_t *ci) /* {{{ */
   ci->flags &= ~CI_FLAGS_IN_QUEUE;
 } /* }}} static void remove_from_queue */
 
+/* remove an entry from the tree and free all its resources.
+ * must hold 'cache lock' while calling this.
+ * returns 0 on success, otherwise errno */
+static int forget_file(const char *file)
+{
+  cache_item_t *ci;
+
+  ci = g_tree_lookup(cache_tree, file);
+  if (ci == NULL)
+return ENOENT;
+
+  g_tree_remove (cache_tree, file);
+  remove_from_queue(ci);
+
+  for (int i=0; i < ci->values_num; i++)
+free(ci->values[i]);
+
+  free (ci->values);
+  free (ci->file);
+
+  /* in case anyone is waiting */
+  pthread_cond_broadcast(&ci->flushed);
+
+  free (ci);
+
+  return 0;
+} /* }}} static int forget_file */
+
 /*
  * enqueue_cache_item:
  * `cache_lock' must be acquired before calling this function!
@@ -674,26 +702,10 @@ static int flush_old_values (int max_age)
 
   for (k = 0; k < cfd.keys_num; k++)
   {
-cache_item_t *ci;
-
-/* This must not fail. */
-ci = (cache_item_t *) g_tree_lookup (cache_tree, cfd.keys[k]);
-assert (ci != NULL);
-
-/* If we end up here with values available, something's seriously
- * messed up. */
-assert (ci->values_num == 0);
-
-/* Remove the node from the tree */
-g_tree_remove (cache_tree, cfd.keys[k]);
-cfd.keys[k] = NULL;
-
-/* Now free and clean up `ci'. */
-free (ci->file);
-ci->file = NULL;
-free (ci);
-ci = NULL;
-  } /* for (k = 0; k < cfd.keys_num; k++) */
+/* should never fail, since we have held the cache_lock
+ * the entire time */
+assert( forget_file(cfd.keys[k]) == 0 );
+  }
 
   if (cfd.keys != NULL)
   {
@@ -977,6 +989,9 @@ static int flush_file (const char *filename) /* {{{ */
 pthread_cond_wait(&ci->flushed, &cache_lock);
   }
 
+  /* DO NOT DO ANYTHING WITH ci HERE!!  The entry
+   * may have been purged during our cond_wait() */
+
   pthread_mutex_unlock(&cache_lock);
 
   return (0);
@@ -993,9 +1008,11 @@ static int handle_request_help (listen_socket_t *sock, /* {{{ */
   {
 "Command overview\n"
 ,
+"HELP [<command>]\n"
 "FLUSH <filename>\n"
 "FLUSHALL\n"
-"HELP [<command>]\n"
+"PENDING <filename>\n"
+"FORGET <filename>\n"
 "UPDATE   [ ...]\n"
 "BATCH\n"
 "STATS\n"
@@ -1020,6 +1037,26 @@ static int handle_request_help (listen_socket_t *sock, /* {{{ */
 "Triggers writing of all pending updates.  Returns immediately.\n"
   };
 
+  char *help_pending[2] =
+  {
+"Help for PENDING\n"
+,
+"Usage: PENDING <filename>\n"
+"\n"
+"Shows any 'pending' updates for a file, in order.\n"
+"The updates shown have not yet been written to the underlying RRD file.\n"
+  };
+
+  char *help_forget[2] =
+  {
+"Help for FORGET\n"
+,
+"Usage: FORGET <filename>\n"
+"\n"
+"Removes the file completely from the cache.\n"
+"Any pending updates for the file will be lost.\n"
+  };
+
   char *help_update[2] =
   {
 "Help for UPDATE\n"
@@ -1078,6 +1115,10 @@ static int handle_request_help (listen_socket_t *sock, /* {{{ */
   help_text = help_flush;
 else if (strcasecmp (command, "flushall") == 0)
   help_text = help_flushall;
+else if (strcasecmp (command, "pending") == 0)
+  help_text = help_pending;
+else if (strcasecmp (command, "forget") == 0)
+  help_text = help_forget;
 else if (strcasecmp (command, "stats") == 0)
   help_text = help_stats;
 else if (strcasecmp (command, "batch") == 0)
@@ -1198,6 +1239,73 @@ static int handle_request_flushall(listen_socket_t *sock) /* {{{ */
   return send_response(sock, RESP_OK, "Started flush.\n");
 } /* }}} static int handle_request_flushall */
 
+static int handle_request_pending(listen_socket_t *sock, /* {{{ */
+  char *buffer, size_t buffer_size)
+{
+  int status;
+  char *file;
+  cache_item_t *ci;
+
+  status = buffer_get_field(&buffer, &buffer_size, &file);
+  if (status != 0)
+return send_response(sock, RESP_ERR,
+

Re: [rrd-developers] [PATCH] rrdcached: "PENDING" and "FORGET" for cache management

2008-10-07 Thread Tobias Oetiker
Today kevin brintnall wrote:

> This patch introduces two new commands for cache management:
>
>  PENDING: shows any un-written updates for a file
>  FORGET : remove a file completely from cache
>
> This applies cleanly on top of my previous patch ("better permissions
> handling").

thanks
tobi

>

-- 
Tobi Oetiker, OETIKER+PARTNER AG, Aarweg 15 CH-4600 Olten, Switzerland
http://it.oetiker.ch [EMAIL PROTECTED] ++41 62 775 9902 / sb: -9900



Re: [rrd-developers] [patch] rrdcached init script and spec file

2008-10-07 Thread Bernard Li
Hi all:

I have some comments regarding the rrdtool spec file that is in trunk
now (which includes changes to incorporate rrdcached).

First of all, thanks to Daniel for putting this together (saves me the
work, heh).

However, I have two comments:

1) I think we should break this out as a separate subpackage such as
rrdtool-rrdcached as I don't think rrdcached is needed by your
everyday installation (only large installations).  Incorporating it in
the main rrdtool package and especially by including an init script
gives users the impression that this is something that is needed by
everybody, which I don't think is the case.

2) By default, the rrdcached daemon is started once you install the
RPM.  While I think it is fine to add rrdcached as a service, I don't
think it's a good idea to start up the daemon by default especially
when one might want to make some configuration changes prior to starting.
 It should be left to the user to start/stop the daemon as they like.

If you guys agree, I can go ahead and create a patch for the above two
points.  I may have additional comments after I've had some time to
play with the new code.

Thanks,

Bernard



[rrd-developers] [PATCH] Update spec file to include librrd.pc file

2008-10-07 Thread Bernard Li
Hi Tobi:

This patch updates the spec file and includes the librrd.pc file in
the -devel subpackage so that you can build the RPM again.

Thanks,

Bernard
Index: rrdtool.spec
===
--- rrdtool.spec	(revision 1588)
+++ rrdtool.spec	(working copy)
@@ -312,6 +312,7 @@
 %{_includedir}/*.h
 %exclude %{_libdir}/*.la
 %{_libdir}/*.so
+%{_libdir}/pkgconfig/librrd.pc
 
 %files doc
 %defattr(-,root,root,-)
@@ -357,6 +358,9 @@
 %endif
 
 %changelog
+* Tue Oct 07 2008 Bernard Li <[EMAIL PROTECTED]>
+- Include librrd.pc file in -devel package
+
 * Sun Jun 08 2008 Jarod Wilson <[EMAIL PROTECTED]> 1.3-0.20.rc9
 - Update to rrdtool 1.3 rc9
 - Minor spec tweaks to permit building on older EL


Re: [rrd-developers] [patch] rrdcached init script and spec file

2008-10-07 Thread Bernard Li
Hi Daniel:

The init script does not work on my system (CentOS 4.x) as is, because
the `daemon` function which I have does not support --pidfile -- is
that argument necessary?

Also, as discussed previously, I think it would be a good idea to
create a 'rrdcached' user and group and start the daemon as that user
instead of nobody.  For application-specific (eg. Ganglia)
implementations, we can just put the necessary users (such as nobody,
apache, ganglia) in the rrdcached group.

Thanks,

Bernard




[rrd-developers] [BUG] Typo with rrdcached manpage

2008-10-07 Thread Bernard Li
With rrdtool r1588, the manpage for rrdcached's "ERROR REPORTING" reads:

---cut---
Once this has happened, the daemon will send log messages to the
system logging daemon using syslog(3). The facility used it
"LOG_DAEMON".
---cut---

There is probably a typo in the last sentence.

Thanks,

Bernard



[rrd-developers] rrdcached crashed with no logging

2008-10-07 Thread Bernard Li
Hi all:

I'm currently working with rrdcached from rrdtool r1588 and am having
problems getting it to integrate with Ganglia.

This has worked in the past (about 2 weeks ago).  Right now I'm trying
to figure out what's wrong.

It seems that the daemon crashed without logging to syslog.  I straced
the rrdcached process and here's what I got:

---cut---
accept(3, {sa_family=AF_FILE, [EMAIL PROTECTED], [2]) = 6
mmap2(NULL, 10489856, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS,
-1, 0) = 0xb613d000
mprotect(0xb613d000, 4096, PROT_NONE)   = 0
clone(child_stack=0xb6b3d4c4,
flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID|CLONE_DETACHED,
parent_tidptr=0xb6b3dbe8, {entry_number:6, base_addr:0xb6b3dba0,
limit:1048575, seg_32bit:1, contents:0, read_exec_only:0,
limit_in_pages:1, seg_not_present:0, useable:1},
child_tidptr=0xb6b3dbe8) = 10556
poll([{fd=3, events=POLLIN|POLLPRI, revents=POLLIN}], 1, 1000) = 1
accept(3, {sa_family=AF_FILE, [EMAIL PROTECTED], [2]) = 7
mmap2(NULL, 10489856, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS,
-1, 0) = 0xb573c000
mprotect(0xb573c000, 4096, PROT_NONE)   = 0
clone(child_stack=0xb613c4c4,
flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID|CLONE_DETACHED,
parent_tidptr=0xb613cbe8, {entry_number:6, base_addr:0xb613cba0,
limit:1048575, seg_32bit:1, contents:0, read_exec_only:0,
limit_in_pages:1, seg_not_present:0, useable:1},
child_tidptr=0xb613cbe8) = 10560
poll([{fd=3, events=POLLIN|POLLPRI, revents=POLLIN}], 1, 1000) = 1
brk(0x906f000)  = 0x906f000
futex(0x5ad820, FUTEX_WAKE, 1)  = 1
accept(3, {sa_family=AF_FILE, [EMAIL PROTECTED], [2]) = 8
mmap2(NULL, 10489856, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS,
-1, 0) = 0xb4bff000
mprotect(0xb4bff000, 4096, PROT_NONE)   = 0
clone(child_stack=0xb55ff4c4,
flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID|CLONE_DETACHED,
parent_tidptr=0xb55ffbe8, {entry_number:6, base_addr:0xb55ffba0,
limit:1048575, seg_32bit:1, contents:0, read_exec_only:0,
limit_in_pages:1, seg_not_present:0, useable:1},
child_tidptr=0xb55ffbe8) = 10561
poll([{fd=3, events=POLLIN|POLLPRI, revents=POLLIN}], 1, 1000) = 1
accept(3, {sa_family=AF_FILE, [EMAIL PROTECTED], [2]) = 9
mmap2(NULL, 10489856, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS,
-1, 0) = 0xb41fe000
mprotect(0xb41fe000, 4096, PROT_NONE)   = 0
clone(child_stack=0xb4bfe4c4,
flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID|CLONE_DETACHED,
parent_tidptr=0xb4bfebe8, {entry_number:6, base_addr:0xb4bfeba0,
limit:1048575, seg_32bit:1, contents:0, read_exec_only:0,
limit_in_pages:1, seg_not_present:0, useable:1},
child_tidptr=0xb4bfebe8) = 10562
poll([{fd=3, events=POLLIN|POLLPRI}], 1, 1000) = -1 EINTR (Interrupted
system call)
+++ killed by SIGABRT +++
---cut---

Ganglia was running `rrdtool graph - --daemon
unix:/var/run/rrdcached/rrdcached.sock ...` command when it crashed.

Thanks,

Bernard



Re: [rrd-developers] rrdcached crashed with no logging

2008-10-07 Thread kevin brintnall
On Tue, Oct 07, 2008 at 06:01:02PM -0700, Bernard Li wrote:
> It seems that the daemon crashed without logging to syslog.  I straced
> the rrdcached process and here's what I got:

Bernard,

Do you have a backtrace?  Also, what OS are you using?

The interrupted poll() system call is in listen_thread_main (you can tell
by the timeout of 1sec).

I would not expect to catch a SIGABRT.  Possibly an assertion is being
violated.  A backtrace would be very helpful.
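
For reference, the usual way to treat an interrupted poll() as a non-event is to retry on EINTR and report only other errors. A minimal sketch, not the daemon's code; `wait_for_input` is a hypothetical name, and the 1000 ms timeout seen in the strace would be passed in by the caller.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if poll() reported an event on `fd`,
 * 0 on timeout, -1 on a real error. An EINTR simply restarts the wait
 * with the full timeout. */
static int wait_for_input(int fd, int timeout_ms)
{
    struct pollfd pfd;

    pfd.fd = fd;
    pfd.events = POLLIN | POLLPRI;
    pfd.revents = 0;

    for (;;) {
        int status = poll(&pfd, 1, timeout_ms);

        if (status > 0)
            return 1;
        if (status == 0)
            return 0;
        if (errno == EINTR)
            continue;       /* interrupted by a signal: not an error */
        return -1;
    }
}
```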

-- 
 kevin brintnall =~ /[EMAIL PROTECTED]/



Re: [rrd-developers] rrdcached crashed with no logging

2008-10-07 Thread Bernard Li
Hi Kevin:

On Tue, Oct 7, 2008 at 7:46 PM, kevin brintnall <[EMAIL PROTECTED]> wrote:

> Do you have a backtrace?  Also, what OS are you using?

Here's the backtrace:

(gdb) bt
#0  0x002e47a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1  0x004a2815 in raise () from /lib/tls/libc.so.6
#2  0x004a4279 in abort () from /lib/tls/libc.so.6
#3  0x004d6cca in __libc_message () from /lib/tls/libc.so.6
#4  0x004dd55f in _int_free () from /lib/tls/libc.so.6
#5  0x004dd93a in free () from /lib/tls/libc.so.6
#6  0x0804ca34 in close_connection (sock=0x94a4d80) at rrd_daemon.c:1784
#7  0x0804ccde in connection_thread_main (args=0x94a4d80) at rrd_daemon.c:1888
#8  0x0046e3cc in start_thread () from /lib/tls/libpthread.so.0
#9  0x005441ae in clone () from /lib/tls/libc.so.6

I'm using CentOS 4.x i386.

Please let me know if you need additional info.

This is the output when I ran rrdcached in the foreground:

---cut---
rrdcached -p /var/run/rrdcached/rrdcached.pid -l
//var/run/rrdcached/rrdcached.sock -g
*** glibc detected *** double free or corruption (!prev): 0x094a4d80 ***
Aborted (core dumped)
---cut---
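
The abort above comes from free() being handed the same address twice: the
pointer passed to close_connection() in frame #6 (0x94a4d80) is the one glibc
names in its message. A minimal sketch of one defensive pattern follows. The
conn_t type and the conn_new()/conn_close() helpers are hypothetical and are
not taken from rrd_daemon.c; the idea is only that a release function which
clears the caller's pointer turns a second call into a no-op instead of a
double free.

```c
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical connection object; NOT the structure used in rrd_daemon.c. */
typedef struct {
    int fd;
} conn_t;

/* Allocate a wrapper for an already-open descriptor (or -1 for none). */
conn_t *conn_new(int fd)
{
    conn_t *c = malloc(sizeof(*c));
    if (c != NULL)
        c->fd = fd;
    return c;
}

/* Release the connection and clear the caller's pointer.
 * Returns 1 if something was released, 0 if it was already released. */
int conn_close(conn_t **cp)
{
    assert(cp != NULL);      /* passing no handle at all is a caller bug */

    if (*cp == NULL)
        return 0;            /* second call: nothing left to free */

    if ((*cp)->fd >= 0)
        close((*cp)->fd);
    free(*cp);
    *cp = NULL;              /* makes any later conn_close() a no-op */
    return 1;
}
```

With this shape, calling conn_close(&c) twice in a row is harmless; the second
call returns 0. It does not help when two separate copies of the pointer
exist, so removing the duplicate call is still the cleaner fix.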

Thanks a lot,

Bernard



[rrd-developers] rrdtool --daemon option

2008-10-07 Thread Bernard Li
Hi all:

Would the developers consider renaming the --daemon option for rrdtool
to something like --cache(d)?  The uninitiated might think this is the
option to start rrdtool in daemon mode.

Thanks,

Bernard



[rrd-developers] [PATCH] connection_thread_main: avoid double calls to close_connection

2008-10-07 Thread kevin brintnall
---
 src/rrd_daemon.c |9 ++---
 1 files changed, 2 insertions(+), 7 deletions(-)

diff --git a/src/rrd_daemon.c b/src/rrd_daemon.c
index 9c8847d..36d418b 100644
--- a/src/rrd_daemon.c
+++ b/src/rrd_daemon.c
@@ -1844,23 +1844,18 @@ static void *connection_thread_main (void *args) /* {{{ */
 else if (status < 0) /* error */
 {
   status = errno;
-  if (status == EINTR)
-continue;
-  RRDD_LOG (LOG_ERR, "connection_thread_main: poll(2) failed.");
+  if (status != EINTR)
+RRDD_LOG (LOG_ERR, "connection_thread_main: poll(2) failed.");
   continue;
 }
 
 if ((pollfd.revents & POLLHUP) != 0) /* normal shutdown */
-{
-  close_connection(sock);
   break;
-}
 else if ((pollfd.revents & (POLLIN | POLLPRI)) == 0)
 {
   RRDD_LOG (LOG_WARNING, "connection_thread_main: "
   "poll(2) returned something unexpected: %#04hx",
   pollfd.revents);
-  close_connection(sock);
   break;
 }
 
-- 
1.6.0.2
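
The loop shape this change aims for can be sketched as follows: every exit
path only breaks out of the loop, and the cleanup runs once, after the loop
(presumably where the remaining close_connection() call sits). This is an
illustration under assumptions, not the rrd_daemon.c code: serve_connection(),
cleanup_connection() and the cleanup_calls counter are hypothetical names, the
sketch reads and discards data instead of parsing commands, and it leaves the
loop on poll(2) errors other than EINTR, whereas the patch logs and continues.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stdio.h>
#include <unistd.h>

/* Counts cleanup calls so a caller can check that it ran exactly once. */
int cleanup_calls = 0;

/* Hypothetical stand-in for close_connection(); only closes the descriptor. */
void cleanup_connection(int fd)
{
    cleanup_calls++;
    close(fd);
}

/* Serve one connection until the peer hangs up or poll(2) reports something
 * unexpected. Returns the number of bytes read from the peer. */
long serve_connection(int fd)
{
    long total = 0;
    char buf[256];

    assert(fd >= 0);

    for (;;) {
        struct pollfd pfd = { .fd = fd, .events = POLLIN | POLLPRI, .revents = 0 };
        int status = poll(&pfd, 1, 1000);

        if (status == 0)                 /* timeout: just poll again */
            continue;
        if (status < 0) {
            if (errno == EINTR)          /* interrupted by a signal: retry */
                continue;
            perror("poll");
            break;                       /* no cleanup here, see below */
        }

        if ((pfd.revents & (POLLIN | POLLPRI)) != 0) {
            ssize_t n = read(fd, buf, sizeof(buf));
            if (n <= 0)                  /* EOF or read error */
                break;
            total += n;
            continue;
        }

        break;                           /* POLLHUP, POLLERR or anything else */
    }

    cleanup_connection(fd);              /* the single cleanup point */
    return total;
}
```

Because no branch inside the loop releases the connection, adding a new exit
condition later cannot reintroduce a second release.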



Re: [rrd-developers] rrdcached crashed with no logging

2008-10-07 Thread Bernard Li
Hi all:

The patch Kevin provided solved my issue:

http://www.mail-archive.com/rrd-developers@lists.oetiker.ch/msg02651.html

Thanks,

Bernard

On Tue, Oct 7, 2008 at 8:35 PM, Bernard Li <[EMAIL PROTECTED]> wrote:
> Hi Kevin:
>
> On Tue, Oct 7, 2008 at 7:46 PM, kevin brintnall <[EMAIL PROTECTED]> wrote:
>
>> Do you have a backtrace?  Also, what OS are you using?
>
> Here's the backtrace:
>
> (gdb) bt
> #0  0x002e47a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
> #1  0x004a2815 in raise () from /lib/tls/libc.so.6
> #2  0x004a4279 in abort () from /lib/tls/libc.so.6
> #3  0x004d6cca in __libc_message () from /lib/tls/libc.so.6
> #4  0x004dd55f in _int_free () from /lib/tls/libc.so.6
> #5  0x004dd93a in free () from /lib/tls/libc.so.6
> #6  0x0804ca34 in close_connection (sock=0x94a4d80) at rrd_daemon.c:1784
> #7  0x0804ccde in connection_thread_main (args=0x94a4d80) at rrd_daemon.c:1888
> #8  0x0046e3cc in start_thread () from /lib/tls/libpthread.so.0
> #9  0x005441ae in clone () from /lib/tls/libc.so.6
>
> I'm using CentOS 4.x i386.
>
> Please let me know if you need additional info.
>
> This is the output when I ran rrdcached in the foreground:
>
> ---cut---
> rrdcached -p /var/run/rrdcached/rrdcached.pid -l
> //var/run/rrdcached/rrdcached.sock -g
> *** glibc detected *** double free or corruption (!prev): 0x094a4d80 ***
> Aborted (core dumped)
> ---cut---
>
> Thanks a lot,
>
> Bernard
>



Re: [rrd-developers] [patch] rrdcached init script and spec file

2008-10-07 Thread Tobias Oetiker
Hi Bernard,

Yesterday Bernard Li wrote:

> Hi Daniel:
>
> The init script does not work on my system (CentOS 4.x) as is, because
> the `daemon` function which I have does not support --pidfile -- is
> that argument necessary?
>
> Also, as discussed previously, I think it would be a good idea to
> create a 'rrdcached' user and group and start the daemon as that user
> instead of nobody.  For application-specific (eg. Ganglia)
> implementations, we can just put the necessary users (such as nobody,
> apache, ganglia) in the rrdcached group.

I think that, coupled with a split of the package into a cached and a
normal one, this would be a sensible thing. As I said before, the
daemon should NOT run as nobody since it writes files, and there
must never be any files owned by nobody ... (hence the name).
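
A minimal sketch of what switching to a dedicated account could look like at
daemon start-up, assuming an account named rrdcached exists on the system.
drop_privileges() is a hypothetical helper written for this illustration; it
is not an rrdcached function or option. It relies only on the standard
getpwnam(3), initgroups(3), setgid(2) and setuid(2) calls.

```c
#define _DEFAULT_SOURCE   /* for initgroups() on glibc */
#include <assert.h>
#include <grp.h>
#include <pwd.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Switch the calling process to the given account.
 * Returns 0 on success and -1 on failure. */
int drop_privileges(const char *user)
{
    struct passwd *pw;

    assert(user != NULL);

    pw = getpwnam(user);
    if (pw == NULL) {
        fprintf(stderr, "unknown user: %s\n", user);
        return -1;
    }

    /* Order matters: supplementary groups and gid first, uid last.
     * Once the uid has changed, the process may no longer be allowed
     * to change its groups. */
    if (initgroups(pw->pw_name, pw->pw_gid) != 0)
        return -1;
    if (setgid(pw->pw_gid) != 0)
        return -1;
    if (setuid(pw->pw_uid) != 0)
        return -1;

    return 0;
}
```

A daemon started as root would call drop_privileges("rrdcached") after
opening its socket and pid file and before touching any RRD files, so that
every file it creates belongs to that account rather than to nobody.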

cheers
tobi


>
> Thanks,
>
> Bernard
>
> On Tue, Oct 7, 2008 at 3:58 PM, Bernard Li <[EMAIL PROTECTED]> wrote:
> > Hi all:
> >
> > I have some comments regarding the rrdtool spec file that is in trunk
> > now (which includes changes to incorporate rrdcached).
> >
> > First of all, thanks to Daniel for putting this together (saves me the
> > work, heh).
> >
> > However, I have two comments:
> >
> > 1) I think we should break this out as a separate subpackage such as
> > rrdtool-rrdcached as I don't think rrdcached is needed by your
> > everyday installation (only large installations).  Incorporating it in
> > the main rrdtool package and especially by including an init script
> > gives users the impression that this is something that is needed by
> > everybody, which I don't think is the case.
> >
> > 2) By default, the rrdcached daemon is started once you install the
> > RPM.  While I think it is fine to add rrdcached as a service, I don't
> > think it's a good idea to start up the daemon by default especially
> > when one might want to make some configuration changes prior to starting.
> >  It should be left to the user to start/stop the daemon as they like.
> >
> > If you guys agree, I can go ahead and create a patch for the above two
> > points.  I may have additional comments after I've had some time to
> > play with the new code.
> >
> > Thanks,
> >
> > Bernard
> >
>
>

-- 
Tobi Oetiker, OETIKER+PARTNER AG, Aarweg 15 CH-4600 Olten, Switzerland
http://it.oetiker.ch [EMAIL PROTECTED] ++41 62 775 9902 / sb: -9900
