Hello,
to follow up on this: I reinstalled Ganglia 3.1.7 (I was using 3.1.2
before). It works a lot better now: I get regular updates from all the
nodes. Still, gmond or gmetad crashes after a while, so I am now logging
in debug mode to see what's happening.
Since reinstalling Ganglia 3.1.7, I get an error message when starting
gmond. Loading the custom SGE metric module fails with:
$ gmond -d 10
loaded module: core_metrics
loaded module: cpu_module
loaded module: disk_module
loaded module: load_module
loaded module: mem_module
loaded module: net_module
loaded module: proc_module
loaded module: sys_module
loaded module: python_module
[PYTHON] Can't import the metric module [sge].
Traceback (most recent call last):
  File "/opt/ganglia/lib/ganglia/python_modules/sge.py", line 79, in ?
    import gmon.events
  File "/opt/rocks/lib/python2.4/site-packages/gmon/events.py", line 147, in ?
    class Event:
  File "/opt/rocks/lib/python2.4/site-packages/gmon/events.py", line 177, in Event
    def which(self, filename, path = os.environ['PATH']):
  File "/usr/lib/python2.4/UserDict.py", line 17, in __getitem__
    def __getitem__(self, key): return self.data[key]
KeyError: 'PATH'
udp_send_channel mcast_join=224.0.0.4 mcast_if=NULL host=10.1.1.1 port=8649
Unable to find the metric information for 'queue-state'. Possible that the module has not been loaded.
Any idea where this might come from? Did any interfaces change between
3.1.2 and 3.1.7? Do I need to adjust the SGE module somewhere?
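
Looking at the traceback, the KeyError seems to come from the default
argument `path = os.environ['PATH']` in gmon/events.py. Python evaluates
default arguments when the class body is executed, i.e. at import time.
If gmond runs its embedded Python in an environment where PATH is unset
(an assumption, I have not verified it), the import fails before the
module can register its metrics. Below is a minimal sketch of a more
defensive lookup. It is a standalone, hypothetical function for
illustration, not the actual Rocks method:

```python
import os


def which(filename, path=None):
    """Return the first executable called `filename` on `path`, or None.

    Hypothetical stand-in for Event.which() in gmon/events.py.
    """
    if path is None:
        # Read PATH at call time and fall back to os.defpath, so an
        # unset PATH no longer raises KeyError while importing.
        path = os.environ.get('PATH', os.defpath)
    for directory in path.split(os.pathsep):
        candidate = os.path.join(directory, filename)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None
```

Alternatively, calling os.environ.setdefault('PATH', os.defpath) in
sge.py before the `import gmon.events` line might work around the
problem without touching the Rocks package.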
Cheers,
Arne
On Mon, 2010-11-15 at 11:24 -0600, Bernard Li wrote:
> Hi Arne:
>
> On Mon, Nov 15, 2010 at 11:13 AM, Arne Brutschy <[email protected]>
> wrote:
>
> > I am currently in the process of doing so. It does not seem actually to
> > hang, it seems like an endless loop of this:
> >
> > --- SIGCHLD (Child exited) @ 0 (0) ---
> > rt_sigaction(SIGINT, {0x1, [], 0}, {0x804f779, [], SA_INTERRUPT}, 8) = 0
> > rt_sigaction(SIGQUIT, {0x1, [], 0}, {SIG_DFL, [], 0}, 8) = 0
> > rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
> > clone(child_stack=0, flags=CLONE_PARENT_SETTID|SIGCHLD, parent_tidptr=0xbfd8ac14) = 30550
> > waitpid(30550, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0) = 30550
> > rt_sigaction(SIGINT, {0x804f779, [], SA_INTERRUPT}, NULL, 8) = 0
> > rt_sigaction(SIGQUIT, {SIG_DFL, [], 0}, NULL, 8) = 0
> > rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
>
> And `gmond -d 2` gives you no additional information? Any other
> system-level issues on the server? Assuming you are using multicast,
> would it be possible for you to setup gmetad to poll _another_ gmond
> to see if the issue persists?
>
> What OS and arch are you running Ganglia on?
>
> Cheers,
>
> Bernard
_______________________________________________
Ganglia-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/ganglia-general