Re: [Patch 1/8] CLD: cleanup: add cld_msg_rpc.x

2010-04-17 Thread Jeff Garzik

On 04/16/2010 10:18 PM, Pete Zaitcev wrote:

On Wed, 14 Apr 2010 15:55:01 -0400
Jeff Garzikj...@garzik.org  wrote:


+++ b/lib/Makefile.am
@@ -27,6 +27,7 @@ libcldc_la_SOURCES=   \
common.c\
libtimer.c  \
pkt.c   \
+   cld_msg_rpc.x   \
cld_msg_rpc_xdr.c


that's quite strange, because I built an official rawhide copy just fine
without this...


Strange indeed, I re-checked and it went away now. Oh well.


I wonder if it's a problem with the 'clean' functionality.  The 
EXTRA_DIST line contains a list of things forced to be included in the 
tarball, typically used for things not contained in *_SOURCES.  AFAICT 
from the autoconf/automake docs, that is where sources for generated 
sources[1] should reside.  So I still wonder how it disappeared for you...


Jeff




[1] Brought to you by the Department of Redundant Redundancies


--
To unsubscribe from this list: send the line unsubscribe hail-devel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Trivial Q about chunkd's main_loop

2010-04-17 Thread Pete Zaitcev
Is there a reason why the main_loop in chunkd uses a naked
g_hash_table_lookup instead of srv_poll_lookup? Performance?

@@ -1681,8 +1681,7 @@ static int main_loop(void)
 
fired++;
 
-   sp = g_hash_table_lookup(chunkd_srv.fd_info,
-   GINT_TO_POINTER(pfd-fd));
+   sp = srv_poll_lookup(pfd-fd);
if (G_UNLIKELY(!sp)) {
/* BUG! */

-- Pete
--
To unsubscribe from this list: send the line unsubscribe hail-devel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Trivial Q about chunkd's main_loop

2010-04-17 Thread Jeff Garzik

On 04/17/2010 09:36 PM, Pete Zaitcev wrote:

Is there a reason why the main_loop in chunkd uses a naked
g_hash_table_lookup instead of srv_poll_lookup? Performance?

@@ -1681,8 +1681,7 @@ static int main_loop(void)

fired++;

-   sp = g_hash_table_lookup(chunkd_srv.fd_info,
-   GINT_TO_POINTER(pfd-fd));
+   sp = srv_poll_lookup(pfd-fd);
if (G_UNLIKELY(!sp)) {


Looks like it should be changed to call srv_poll_lookup(), indeed. 
srv_poll_lookup() is marked 'static', so there should not be any 
performance difference after the compiler's optimizer passes get 
finished with it.


Jeff



--
To unsubscribe from this list: send the line unsubscribe hail-devel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Patch 01/12] CLD: fix crash in retransmissions

2010-04-17 Thread Pete Zaitcev
For a longest time I was plagued by (very infrequent) crashes like this:

Program received signal SIGSEGV, Segmentation fault.
sess_retry_output (timer=0x92070c0) at session.c:532
532 if (!next_retry || (op-next_retry  next_retry))
(gdb) info threads
* 1 Thread 0xb72f96c0 (LWP 22417)  sess_retry_output (timer=0x92070c0) at 
session.c:532
(gdb) where
#0  sess_retry_output (timer=0x92070c0) at session.c:532
#1  session_retry (timer=0x92070c0) at session.c:565
#2  0x08049aee in cld_timers_run (tlist=0x8056630) at ../lib/libtimer.c:95
#3  0x0804e9cc in main_loop (argc=5, argv=0xbff70bd4) at server.c:983
#4  main (argc=5, argv=0xbff70bd4) at server.c:1138

The crash happens because op is NULL. As it turned out, this happens
if a packet retransmit and a session expiration occur simultaneously
(in the same pass of timers_run). The scenario is:
 - timers_run collects expired timers at exec list
 - timers_run expires session
 - two timer_del are called, but one of them is on exec list already,
   so it's ineffective
 - session is freed, this zeroes -data in lists (later op)
 - timers_run continues along the exec list, invokes the retransmission
   callback, and that crashes with NULL op.

The proposed solution is to rework the timers_run, again. But this
time, we'll make it simpler by observing that timers are ordered by
expiration time. Therefore, we can pull next timer off the list,
expire it, and loop until expiration time is greater than the current
time. No execution list is kept. The integrity of the main list
is assured by never walking it and always referring to the head
anew at each iteration.

This patch appears to fix the problem and stands up to use that
crashed the old code.

Signed-off-by: Pete Zaitcev zait...@redhat.com

---
 include/cld_common.h |   10 ++
 lib/libtimer.c   |   41 -
 2 files changed, 26 insertions(+), 25 deletions(-)

I am absurdly proud of this. But the long history of timer problem
in CLD makes me cautious. I already thought that we fixed everything
in the past.

commit c1b4a0c4dda14d9fbc4d5896e9e0dc139c31798b
Author: Master zait...@lembas.zaitcev.lan
Date:   Sat Apr 17 19:02:35 2010 -0600

Fix persistent CLD crashes in retransmits (op == NULL).

diff --git a/include/cld_common.h b/include/cld_common.h
index d269e6c..379bd9e 100644
--- a/include/cld_common.h
+++ b/include/cld_common.h
@@ -24,6 +24,7 @@
 #include stdbool.h
 #include string.h
 #include time.h
+#include glib.h
 #include openssl/sha.h
 #include cld_msg_rpc.h
 
@@ -31,10 +32,6 @@
 
 struct hail_log;
 
-struct cld_timer_list {
-   void *list;
-};
-
 struct cld_timer {
boolfired;
boolon_list;
@@ -44,6 +41,11 @@ struct cld_timer {
charname[32];
 };
 
+struct cld_timer_list {
+   GList *list;/* of struct cld_timer */
+   time_t  runmark;
+};
+
 extern void cld_timer_add(struct cld_timer_list *tlist, struct cld_timer 
*timer,
  time_t expires);
 extern void cld_timer_del(struct cld_timer_list *tlist, struct cld_timer 
*timer);
diff --git a/lib/libtimer.c b/lib/libtimer.c
index c6f5241..75514f0 100644
--- a/lib/libtimer.c
+++ b/lib/libtimer.c
@@ -44,6 +44,17 @@ void cld_timer_add(struct cld_timer_list *tlist, struct 
cld_timer *timer,
if (timer-on_list)
timer_list = g_list_remove(timer_list, timer);
 
+   /*
+* This additional resiliency is required by the invocations from
+* session_retry(). For some reason the computations in it result
+* in attempts to add timers in the past sometimes, and then we loop
+* when trying to run those. FIXME: maybe fix that one day.
+*
+* Even if we fix the callers, we probably should keep this.
+*/
+   if (expires  tlist-runmark + 1)
+   expires = tlist-runmark + 1;
+
timer-on_list = true;
timer-fired = false;
timer-expires = expires;
@@ -66,38 +77,26 @@ time_t cld_timers_run(struct cld_timer_list *tlist)
struct cld_timer *timer;
time_t now = time(NULL);
time_t next_timeout = 0;
-   GList *tmp, *cur;
-   GList *timer_list = tlist-list;
-   GList *exec_list = NULL;
-
-   tmp = timer_list;
-   while (tmp) {
-   timer = tmp-data;
-   cur = tmp;
-   tmp = tmp-next;
+   GList *cur;
 
+   tlist-runmark = now;
+   for (;;) {
+   cur = tlist-list;
+   if (!cur)
+   break;
+   timer = cur-data;
if (timer-expires  now)
break;
 
-   timer_list = g_list_remove_link(timer_list, cur);
-   exec_list = g_list_concat(exec_list, cur);
-
+   tlist-list = g_list_delete_link(tlist-list, cur);
timer-on_list = false;
-   }
-   

[Patch 03/12] CLD: fix commentary

2010-04-17 Thread Pete Zaitcev
Add and fix some comments regarding the reasons behind the pipe etc.
No code changes.

Signed-off-by: Pete Zaitcev zait...@redhat.com

---
 lib/cldc.c |   27 ---
 1 file changed, 24 insertions(+), 3 deletions(-)

commit e675f2f316bbb24ca84c1bc23e4d1c6d53b029de
Author: Master zait...@lembas.zaitcev.lan
Date:   Sat Apr 17 19:14:30 2010 -0600

Fix commentary.

diff --git a/lib/cldc.c b/lib/cldc.c
index c1e8993..6ab2fe4 100644
--- a/lib/cldc.c
+++ b/lib/cldc.c
@@ -1372,7 +1372,7 @@ enum {
 
 /*
  * All the error printouts are likely to be lost for daemons, but it's
- * not a big deal. We abort instead of exist to indicate that something
+ * not a big deal. We abort instead in order to indicate that something
  * went wrong, so system features should report it (usualy as a core).
  * When debugging, strace or -F mode will capture the output.
  */
@@ -1503,13 +1503,29 @@ static void ncld_p_event(void *priv, struct 
cldc_session *csp,
if (!nsess-is_up)
return;
nsess-is_up = false;
-   /* XXX wake up all I/O waiters here */
+
+   /*
+* The cldc layer must deliver the callbacks for all pending
+* CLD operations of the failed session. If we force-wake their
+* waiters, all sorts of funny things happen with the lifetimes
+* of related structures.
+*/
+   // 
+   // g_cond_broadcast(nsess-cond);
+
/*
 * This is a trick. As a direct callback from clcd layer,
 * we are running under nsess-mutex, so we cannot call back
 * into a user of ncld. If we do, it may invoke another
 * ncld operation and deadlock. So, bump session callbacks
 * into the part of the helper thread that runs unlocked.
+*
+* Notice that we are already running on the context of the
+* thread that will deliver the event, so pipe really is not
+* needed: could as well set a flag and test it right after
+* the call to cldc_udp_receive_pkt(). But pipe also provides
+* a queue of events, just in case. It's not like these events
+* are super-performance critical.
 */
cmd = NCLD_CMD_SESEV;
write(nsess-to_thread[1], cmd, 1);
@@ -1820,7 +1836,12 @@ int ncld_del(struct ncld_sess *nsess, const char *fname)
g_mutex_unlock(nsess-mutex);
return -rc;
}
-   /* XXX A delete operation is not accounted for end-session */
+   /* 
+* A delete operation is not accounted for end-session (e.g. nios).
+* This means: do not call ncld_del and ncld_close together.
+* The ncld_close can be invoked by ncld_sess_close, so don't
+* do that either.
+*/
g_mutex_unlock(nsess-mutex);
 
rc = ncld_wait_del(dpb);
--
To unsubscribe from this list: send the line unsubscribe hail-devel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Patch 06/12] Chunk: do not call timer_add from event context

2010-04-17 Thread Pete Zaitcev
No matter if timer or cld_timer is used, this was not valid.
Obviously, locking is missing, so only one thread can access a
certain tlist. But the actual hang was more interesting than a
race and crash. Suppose that we add the first timer. In that
case, main thread invokes poll() with no timeout. If we add
the timer from the event callback, nobody wakes up the poll,
so the newly-added timer is not checked for expiration and never
fires, causing the hang.

Note that if we look at the whole stack, the session failure event
is bounced through two pipes: one in ncld (loopback), and another
here. The pipe in ncld is not really needed for this, but we use
it for convenience. Seems weird and inefficient, but it's short
code and session events are not performance-critical.

P.S. There's also an important bugfix in this patch: the is_dead
should not be cleared if a retry timeout is scheduled.

Signed-off-by: Pete Zaitcev zait...@redhat.com

---
 server/cldu.c |   77 +++-
 1 file changed, 69 insertions(+), 8 deletions(-)

commit d30028b4c681e99f7934ca264a175548c827ff04
Author: Master zait...@lembas.zaitcev.lan
Date:   Sat Apr 17 19:46:35 2010 -0600

Calling timer_add from event thread is invalid. Use a pipe instead.

diff --git a/server/cldu.c b/server/cldu.c
index 957bd81..fafcc3b 100644
--- a/server/cldu.c
+++ b/server/cldu.c
@@ -47,6 +47,7 @@ struct cld_session {
struct cld_host cldv[N_CLD];
 
struct cld_timer timer;
+   int event_pipe[2];
 
char *ffname;
struct ncld_fh *ffh;/* keep open for lock */
@@ -95,9 +96,8 @@ static void cldu_saveargs(struct cld_session *sp, char 
*infopath,
sp-ploc = loc;
 }
 
-static void cldu_timer_event(struct cld_timer *timer)
+static void cldu_sess_proc(struct cld_session *cs)
 {
-   struct cld_session *cs = timer-userdata;
int newactive;
 
if (cs-is_dead) {
@@ -106,29 +106,58 @@ static void cldu_timer_event(struct cld_timer *timer)
if (debugging)
applog(LOG_DEBUG, Reopening Chunk in %s, cs-ffname);
 
-   ncld_sess_close(cs-nsess);
-   cs-nsess = NULL;
+   if (cs-nsess) {
+   ncld_sess_close(cs-nsess);
+   cs-nsess = NULL;
+   }
cs-ffh = NULL; /* closed automatically */
-   cs-is_dead = false;
newactive = cldu_nextactive(cs);
if (cldu_set_cldc(cs, newactive)) {
/* Oops, should not happen. Just loop, then... */
timer_add(cs-timer, time(NULL) + 30);
return;
}
+   cs-is_dead = false;
+   } else {
+   /*
+* We want to see if this ever happens.
+* Probably harmless, but... let's print it.
+*/
+   applog(LOG_WARNING, Event on non-dead session);
}
 }
 
+static void cldu_timer_event(struct cld_timer *timer)
+{
+   struct cld_session *cs = timer-userdata;
+
+   cldu_sess_proc(cs);
+}
+
+static bool cldu_pipe_event(int fd, short events, void *userdata)
+{
+   struct cld_session *cs = userdata;
+   unsigned char cmd;
+   ssize_t rc;
+
+   rc = read(fd, cmd, 1);
+   if (rc  0)
+   cldu_sess_proc(cs);
+   else
+   applog(LOG_WARNING, Stray CLD event pipe poll);
+   return true;
+}
+
 static void cldu_sess_event(void *priv, uint32_t what)
 {
struct cld_session *cs = priv;
+   unsigned char cmd;
 
if (what == CE_SESS_FAILED) {
/*
 * In ncld, we are not allowed to free the session structures
 * from an event (it's wages of all-conquering 100% reliable
-* ncld_close_sess), so we bounce that off to a thread. Which
-* we do not have, so we steal a timer for an erzatz thread.
+* ncld_close_sess), so we bounce that off to the main thread.
 */
if (cs-nsess) {
applog(LOG_ERR, Session failed, sid  SIDFMT,
@@ -137,7 +166,10 @@ static void cldu_sess_event(void *priv, uint32_t what)
applog(LOG_ERR, Session open failed);
}
cs-is_dead = true;
-   timer_add(cs-timer, time(NULL) + 1);
+   cmd = 1;
+   if (write(cs-event_pipe[1], cmd, 1)  1) {
+   applog(LOG_ERR, Pipe write failed: %d, errno);
+   }
} else {
if (cs)
applog(LOG_INFO, cldc event 0x%x sid  SIDFMT,
@@ -438,6 +470,7 @@ int cld_begin(const char *thishost, uint32_t nid, char 
*infopath,
  struct geo *locp, void (*cb)(enum st_cld))
 {
static struct cld_session *cs = ses;
+   struct server_poll *sp;
 
if (!nid)
return 

[Patch 07/12] Chunk: retry initial CLD session open

2010-04-17 Thread Pete Zaitcev
This was an error in the conversion to ncld. In the cldc code, we
kick the state machine and the natural retries do the rest. Any
failures occure there. But in ncld the original kick can fail too.

Five retries give CLD server time to reboot. If it's down, then
clients refuse to start. This may be a bad idea, or may be not.
We may yet change the retries to be infinite, but for now it's
better if builds terminate somehow in case of unexpected problems.

Signed-off-by: Pete Zaitcev zait...@redhat.com

---
 server/cldu.c |   12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

commit 44cdb98d2cceb2f4e081db2ee38ec60f1c1a8d8d
Author: Master zait...@lembas.zaitcev.lan
Date:   Sat Apr 17 19:50:06 2010 -0600

Retry the initial connection to the CLD server.

diff --git a/server/cldu.c b/server/cldu.c
index fafcc3b..58edf4b 100644
--- a/server/cldu.c
+++ b/server/cldu.c
@@ -471,6 +471,8 @@ int cld_begin(const char *thishost, uint32_t nid, char 
*infopath,
 {
static struct cld_session *cs = ses;
struct server_poll *sp;
+   int retry_cnt;
+   int newactive;
 
if (!nid)
return 0;
@@ -540,9 +542,15 @@ int cld_begin(const char *thishost, uint32_t nid, char 
*infopath,
 * -- Actually, it only works when recovering from CLD failure.
 *Thereafter, any slave CLD redirects us to the master.
 */
-   if (cldu_set_cldc(cs, 0)) {
+   newactive = 0;
+   retry_cnt = 0;
+   for (;;) {
+   if (!cldu_set_cldc(cs, newactive))
+   break;
/* Already logged error */
-   goto err_net;
+   if (++retry_cnt == 5)
+   goto err_net;
+   newactive = cldu_nextactive(cs);
}
 
return 0;
--
To unsubscribe from this list: send the line unsubscribe hail-devel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Patch 08/12] Chunk: fix wrong message

2010-04-17 Thread Pete Zaitcev
The message makes no sense. It was a carry-over from cldc where there
were many failure modes (fh is NULL, fh-valid false, etc.).

Signed-off-by: Pete Zaitcev zait...@redhat.com

---
 server/cldu.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

commit b723269c703c60560e18923ef49fdbe2a46be133
Author: Master zait...@lembas.zaitcev.lan
Date:   Sat Apr 17 19:52:40 2010 -0600

Bogus message.

diff --git a/server/cldu.c b/server/cldu.c
index 58edf4b..fd7e9a1 100644
--- a/server/cldu.c
+++ b/server/cldu.c
@@ -384,7 +384,7 @@ static int cldu_set_cldc(struct cld_session *cs, int 
newactive)
error, 0 /* CE_MASTER_FAILOVER | CE_SESS_FAILED */,
NULL, NULL);
if (cs-ffh == NULL) {
-   applog(LOG_ERR, CLD open(%s) failed: NULL fh, cs-ffname);
+   applog(LOG_ERR, CLD open(%s) failed: %d, cs-ffname, error);
goto err_fopen;
}
 
--
To unsubscribe from this list: send the line unsubscribe hail-devel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Patch 09/12] tabled: drop double prefixing

2010-04-17 Thread Pete Zaitcev
On Fedora 14, the following is seen in syslog:

Apr 17 19:58:52 niphredil tabled: tabled: connecting to site
 hitlain.zaitcev.lan:8083: No route to host
Apr 17 19:58:56 niphredil tabled: tabled: DB_ENV-rep_elect:WARNING:
 nvotes (1) is sub-majority with nsites (2)

Drop the extra prefix, it only wastes screen space.

Signed-off-by: Pete Zaitcev zait...@redhat.com

---
 lib/tdb.c |7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

commit eb60e6e5c97fe316d23b9b21ba020bca924e879e
Author: Master zait...@lembas.zaitcev.lan
Date:   Sat Apr 17 20:31:56 2010 -0600

Fix double tagging in syslog.

diff --git a/lib/tdb.c b/lib/tdb.c
index cd65371..12ff231 100644
--- a/lib/tdb.c
+++ b/lib/tdb.c
@@ -38,7 +38,12 @@ enum {
 
 static void db4syslog(const DB_ENV *dbenv, const char *errpfx, const char *msg)
 {
-   syslog(LOG_WARNING, %s: %s, errpfx, msg);
+   /*
+* Since we use syslog, we discard the prefix set in tdb_init,
+* because syslog adds our own prefix too. The errpfx would be
+* useful if we weren't dumping to syslog here.
+*/
+   syslog(LOG_WARNING, %s, msg);
 }
 
 static int buckets_owner_idx(DB *secondary, const DBT *pkey, const DBT *pdata,
--
To unsubscribe from this list: send the line unsubscribe hail-devel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Patch 11/12] tabled: check argument of -D better

2010-04-17 Thread Pete Zaitcev
The atoi() really does not cut it, as I discovered when I supplied
-D -E to tabled. Other arguments may benefit from such checking too,
but -D is unique in that nothing gets logged in case of this mistake.
So let's just add it here for now; others will at least report something.

Signed-off-by: Pete Zaitcev zait...@redhat.com

---
 server/server.c |4 
 1 file changed, 4 insertions(+)

commit b340bd6bbf9d7a82b69ad620f43799b616348a45
Author: Master zait...@lembas.zaitcev.lan
Date:   Sat Apr 17 20:40:12 2010 -0600

Deconfuse -D -E.

diff --git a/server/server.c b/server/server.c
index e1b0dbe..a28965c 100644
--- a/server/server.c
+++ b/server/server.c
@@ -184,6 +184,10 @@ static error_t parse_opt (int key, char *arg, struct 
argp_state *state)
tabled_srv.config = arg;
break;
case 'D':
+   if (arg[0] == '-') {
+   fprintf(stderr, Option -D requires an argument\n);
+   argp_usage(state);
+   }
v = atoi(arg);
if (v  0 || v  2) {
fprintf(stderr, invalid debug level: '%s'\n, arg);
--
To unsubscribe from this list: send the line unsubscribe hail-devel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Patch 12/12] tabled: print hostname always

2010-04-17 Thread Pete Zaitcev
This code clearly was obsolete and wishful thinking. Let's just be
simple. Most importantly print something that tells the sysadmin
what node is the problem.

Signed-off-by: Pete Zaitcev zait...@redhat.com

---
 server/storage.c |   19 +++
 server/tabled.h  |2 +-
 2 files changed, 4 insertions(+), 17 deletions(-)

commit 31dc52b7d177bd18a765a9fc407c2afdd103613e
Author: Master zait...@lembas.zaitcev.lan
Date:   Sat Apr 17 20:42:24 2010 -0600

Print host name in storage retries.

diff --git a/server/storage.c b/server/storage.c
index 1793fa0..a63012e 100644
--- a/server/storage.c
+++ b/server/storage.c
@@ -489,26 +489,13 @@ void stor_add_node(uint32_t nid, const char *hostname, 
const char *portstr,
 int stor_node_check(struct storage_node *stn)
 {
struct st_client *stc;
-   char host[41];
-   char port[6];
int rc;
 
rc = stor_new_stc(stn, stc);
if (rc  0) {
-   if (rc == -EINVAL) {
-   if (getnameinfo((struct sockaddr *) stn-addr,
-   stn-alen, host, sizeof(host),
-   port, sizeof(port),
-   NI_NUMERICHOST|NI_NUMERICSERV) == 0) {
-   applog(LOG_INFO, Error connecting to chunkd
-   on host %s port %s,
-  host, port);
-   } else {
-   applog(LOG_INFO, Error connecting to chunkd);
-   }
-   } else {
-   applog(LOG_INFO, Error %d connecting to chunkd, rc);
-   }
+   applog(LOG_INFO,
+  Error %d connecting to chunkd on host %s,
+  rc, stn-hostname);
return -1;
}
 
diff --git a/server/tabled.h b/server/tabled.h
index eced4b6..75fa147 100644
--- a/server/tabled.h
+++ b/server/tabled.h
@@ -96,7 +96,7 @@ struct storage_node {
 
unsignedalen;
struct sockaddr_in6 addr;
-   char *hostname; /* Only used because stc_new is overly smart. */
+   char*hostname;
 
int ref;/* number of open_chunk or other */
 };
--
To unsubscribe from this list: send the line unsubscribe hail-devel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html