Hello there,
I am trying to configure some custom metrics for our cluster.  As a first 
step, I am reproducing the Python module example described on the Ganglia wiki ( 
http://sourceforge.net/apps/trac/ganglia/wiki/ganglia_gmond_python_modules).  
The only differences from the documentation are the directories used for the 
Python modules and the .pyconf files (I am using the ones provided by Rocks 
Clusters 5.3) and the name of the metric, which I have changed to tempHost.  
When I restart the gmond daemon to load the graphs, I get the following error 
message:

Apr 12 12:18:32 rocks /usr/sbin/gmond[16637]: Unable to find the metric information for 'tempHost'. Possible that the module has not been loaded.

Looking for the loaded modules, I have:

[r...@rocks ~]# lsof -p `pidof gmond` | grep ganglia
gmond   16637 nobody  txt    REG              104,1    80173 1173831 /opt/ganglia/sbin/gmond
gmond   16637 nobody  mem    REG              104,1   117066 1076176 /opt/ganglia/lib64/libganglia-3.1.2.so.0.0.0
gmond   16637 nobody  mem    REG              104,1   117933 1076166 /opt/ganglia/lib64/ganglia/modcpu.so
gmond   16637 nobody  mem    REG              104,1   115507 1076167 /opt/ganglia/lib64/ganglia/moddisk.so
gmond   16637 nobody  mem    REG              104,1   115475 1076168 /opt/ganglia/lib64/ganglia/modload.so
gmond   16637 nobody  mem    REG              104,1   116749 1076169 /opt/ganglia/lib64/ganglia/modmem.so
gmond   16637 nobody  mem    REG              104,1   115853 1076171 /opt/ganglia/lib64/ganglia/modnet.so
gmond   16637 nobody  mem    REG              104,1   115219 1076172 /opt/ganglia/lib64/ganglia/modproc.so
gmond   16637 nobody  mem    REG              104,1   116461 1076174 /opt/ganglia/lib64/ganglia/modsys.so
gmond   16637 nobody  mem    REG              104,1    26760 1076173 /opt/ganglia/lib64/ganglia/modpython.so

So I don’t understand why the module is loaded by the OS but the host is not 
sending any data to generate the corresponding graph (with its corresponding 
RRD information).  I have deleted the RRD files and regenerated them, with 
the same result.  We are running RHEL 5.4 with Rocks Clusters 5.3 and Ganglia 
v3.1.2:

[r...@rocks ~]# uname -a
Linux rocks.local 2.6.18-164.el5 #1 SMP Tue Aug 18 15:51:48 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux

Any help in solving this problem would be greatly appreciated.

Regards,
-Hugo


[r...@rocks ~]# python /opt/ganglia/lib64/ganglia/python_modules/hostTemp.py
value for tempHost is 8

/opt/ganglia/lib64/ganglia/python_modules/hostTemp.py
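def temp_handler(name):
    '''Return the ACPI temperature as an int, or 0 if it cannot be read.'''
    acpi_file = "/proc/acpi/thermal_zone/THM0/temperature"

    try:
        f = open(acpi_file, 'r')
    except IOError:
        return 0

    # Keep the fields of the last line; the value is expected in field 1.
    line = []
    try:
        for l in f:
            line = l.split()
    finally:
        f.close()

    # An empty or unexpected file yields 0 instead of raising an exception.
    try:
        return int(line[1])
    except (IndexError, ValueError):
        return 0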

def metric_init(params):
    global descriptors

    d1 = {'name': 'tempHost',
        'call_back': temp_handler,
        'time_max': 90,
        'value_type': 'uint',
        'units': 'C',
        'slope': 'both',
        'format': '%u',
        'description': 'Temperature of host',
        'groups': 'health'}

    descriptors = [d1]

    return descriptors

def metric_cleanup():
    '''Clean up the metric module.'''
    pass

#This code is for debugging and unit testing
if __name__ == '__main__':
    metric_init(None)
    for d in descriptors:
        v = d['call_back'](d['name'])
        print('value for %s is %u' % (d['name'], v))


/opt/ganglia/etc/conf.d/temp.pyconf
modules {
  module {
    name = "tempHost"
    language = "python"
    # The following params are examples only
    #  They are not actually used by the temp module
    param RandomMax {
      value = 600
    }
    param ConstantValue {
      value = 112
    }
  }
}

collection_group {
  collect_every = 10
  time_threshold = 50
  metric {
    name = "tempHost"
    title = "Temperature"
    value_threshold = 70
  }
}
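
One check that may be worth running against the two files above: as far as I 
understand the Ganglia Python module documentation, the "name" inside a 
module { } block is matched against a Python file name (without ".py") in the 
python_modules directory, not against a metric name.  The sketch below is only 
an illustration under that assumption.  The helper names module_names and 
missing_modules are hypothetical (they are not part of Ganglia), the parsing 
assumes "name" is the first entry in each module block, and it targets a 
current Python 3 interpreter rather than the one shipped with RHEL 5.4.

```python
import re


def module_names(pyconf_text):
    """Return the names declared in module { ... } blocks of a .pyconf file.

    Assumption: "name" is the first entry inside each module block, as in
    the temp.pyconf shown in this post. Metric names are not matched.
    """
    return re.findall(r'\bmodule\s*\{\s*name\s*=\s*"([^"]+)"', pyconf_text)


def missing_modules(pyconf_text, available_files):
    """Return declared module names that have no matching <name>.py file.

    Assumption (not confirmed by this post): gmond resolves each module
    name to <name>.py inside its python_modules directory.
    """
    available = set(f[:-3] for f in available_files if f.endswith(".py"))
    return [n for n in module_names(pyconf_text) if n not in available]


# Example usage with the paths from this post (run on the gmond host):
#   import os
#   conf = open("/opt/ganglia/etc/conf.d/temp.pyconf").read()
#   files = os.listdir("/opt/ganglia/lib64/ganglia/python_modules")
#   print(missing_modules(conf, files))
```

An empty list would mean every declared module name has a matching file.  Any 
name that is printed would be a candidate explanation for the "Unable to find 
the metric information" message.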


_______________________________________________
Ganglia-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/ganglia-general