Help,
We have a mixed grid environment spread across 3 geographic regions: Europe,
America and Asia. Each region has its own grid, which consists of Linux-only
clusters, Solaris-only clusters, and clusters containing both Solaris and
Linux machines. When each grid runs in isolation everything looks fine.
However, we want to be able to see all grids from any of the other grids'
locations, so each of the 3 grids includes the other 2 grids as data sources.
This works up to a point: once each grid could see the other grids, the
summary figures for each grid became very odd. See an example grid summary
below:
CPUs Total: 1838400076
Hosts up: 4208660701
Hosts down: 4156806781
Avg Load (15, 5, 1m):
1207516708371981500000000000000000000000000000000000000000000000000000000%,
1295207231413906700000000000000000000000000000000000000000000000000000000%,
1342248779446190700000000000000000000000000000000000000000000000000000000%
Localtime: 2006-12-19 03:59
The CPUs Total, Hosts up and Hosts down numbers keep increasing over time, and
the load averages are clearly not real either.
The only deviation from a standard Ganglia configuration that I can see is
that we run some gmetrics on a subset of the clusters in each grid.
Any suggestions?
What I have tried so far is stopping gmetad in each region, clearing out the
SummaryInfo directories for all grids in all regions, and then restarting
gmetad. When I do this everything appears as expected initially. However, as
soon as the grids start updating each other, figures like the ones above
reappear.
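
The reset described above could be sketched roughly as follows. This is only
an illustration under assumptions that are not stated in this message: the
RRD root /var/lib/ganglia/rrds, the on-disk directory name __SummaryInfo__,
and the init script path are all guesses based on a default install, and the
helper name clear_summary_info is made up for this example.

```shell
#!/bin/sh
# Sketch only. RRD_ROOT and the __SummaryInfo__ directory name are
# assumptions based on a default Ganglia install; adjust to the local layout.
RRD_ROOT="${RRD_ROOT:-/var/lib/ganglia/rrds}"

# Remove every summary directory below the given RRD root and print each
# path as it is removed. Per-host RRD directories are left untouched.
clear_summary_info() {
    root="$1"
    # -depth visits a directory's contents before the directory itself,
    # so find never tries to descend into a directory already removed.
    find "$root" -depth -type d -name '__SummaryInfo__' -print -exec rm -rf {} \;
}

# Intended sequence on each gmetad host (init script path is an assumption):
#   /etc/init.d/gmetad stop
#   clear_summary_info "$RRD_ROOT"
#   /etc/init.d/gmetad start
```

The stop and start commands are left as comments so that sourcing the file
does not touch a running gmetad; only the cleanup step is an actual function.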
Regards,
Alexander M. Robinson
EPDTW Support
Shell Information Technology International B.V.
PO Box 60, 2280 AB Rijswijk-ZH, The Netherlands
Tel: +31 70 447 2682 2682 Other Tel: +31 610972682
Email: [EMAIL PROTECTED] / [EMAIL PROTECTED]
Internet: http://www.shell.com