Found the problem. The include file lib/llist.h does not match what the
newest libe package (from 2004) uses. It looks like ganglia has changed
the _llist_entry structure (actually only the order of its members).
This means that gexec cannot work together with ganglia-3.0.3.
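
To illustrate why a pure reordering is enough to break things, here is a
minimal sketch. The two layouts below are hypothetical (the member order is
not copied from either tree, and the names entry_a, entry_b and
read_val_as_a are made up for the example); only the member names val and
next appear in the gexec code quoted further down.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical layout A: the order one side was compiled against. */
struct entry_a {
    void           *val;
    struct entry_a *prev;
    struct entry_a *next;
};

/* Hypothetical layout B: same members, different order. */
struct entry_b {
    struct entry_b *prev;
    struct entry_b *next;
    void           *val;
};

/* Returns 1 if both layouts place `val` at the same byte offset. */
static int val_offsets_match(void)
{
    return offsetof(struct entry_a, val) == offsetof(struct entry_b, val);
}

/*
 * Mimics code compiled against layout A: it fetches the pointer stored at
 * A's offset of `val`, no matter how the entry was really laid out.
 * memcpy is used so the sketch itself stays well-defined C.
 */
static void *read_val_as_a(const void *entry)
{
    void *p;
    memcpy(&p, (const char *)entry + offsetof(struct entry_a, val), sizeof p);
    return p;
}
```

If an entry is filled in as struct entry_b with prev == NULL and a valid
val, read_val_as_a() hands back the prev slot, i.e. NULL, instead of val.
That would be consistent with host being NULL in the gdb session below, but
the real member order has to be checked in both headers.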

Since the structure was changed on the ganglia side, maybe somebody remembers
why this happened. If there was a good reason to do this, we should try to
change libe. Otherwise we should probably revert the change in the ganglia
tree.

Regards,
Erich

On Thursday 23 November 2006 10:59, Erich Focht wrote:
> On Thursday 23 November 2006 03:24, michael chang wrote:
> > What OS and compiler were used?
> 
> CentOS4.2 i386, gcc 3.4.4, glibc 2.3.4.
> 
> Regards,
> Erich
> 
> > On 11/22/06, Erich Focht <[EMAIL PROTECTED]> wrote:
> > > Hi,
> > >
> > > I'm trying to run gexec with ganglia-3.0.3. Built ganglia with
> > > --enable-gexec, built and installed gexec. gexec runs fine if executed
> > > standalone, but when I try it together with ganglia, gexec segfaults.
> > >
> > > Does anybody have gexec (version 0.3.6) running with ganglia-3.0.3? Did
> > > anything change with ganglia-3.x which could lead to trouble with gexec?
> > >
> > > gdb shows the problem is in gexec.c:219
> > >
> > > 214         lli = cluster.gexec_hosts;
> > > 215         for (i = 0; i < *nhosts; i++) {
> > > 216             e_assert(lli != NULL);
> > > 217             (*ips)[i] = (char *)xmalloc(IP_STRLEN);
> > > 218             host = (gexec_host_t *)lli->val;
> > > 219             e_assert(strlen(host->ip) < IP_STRLEN);
> > > 220             strcpy((*ips)[i], host->ip);
> > > 221             lli = lli->next;
> > > 222         }
> > >
> > > The host variable is NULL.
> > >
> > > Any ideas?
> > >
> > > Erich

