Re: [Ganglia-developers] gmetad and rrdtool scalability

2009-12-13 Thread Vladimir Vuksan
I think you guys are overcomplicating this :-). Can't you simply have
multiple gmetads in different sites poll a single gmond? That way, if one
gmetad fails, the data is still available and still being updated on the
other gmetads. That is what we used to do.
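
For reference, a minimal sketch of what that looks like in gmetad.conf,
with the same data_source line on every gmetad host and each host keeping
its own local RRDs (hostnames and cluster name below are made up):

# gmetad.conf on BOTH gmetad-a.example.com and gmetad-b.example.com
data_source "webfarm" 15 gmond-head.example.com:8649
# each gmetad writes its own local copy of the RRDs
rrd_rootdir "/var/lib/ganglia/rrds"

If the head gmond itself is a worry, the same data_source line can list
more than one gmond address and gmetad will try them in order.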

Vladimir

On Sun, 13 Dec 2009, Spike Spiegel wrote:

> Indeed, OS resource usage for caching should be tightly controlled.
> RRD does a pretty good job at that; for example, I know people who
> use collectd (which supports multiple output streams) to send data
> to a remote collector and also keep a local copy with different
> retention policies, which solves that problem.
>
>> This would be addressed by the use of a SAN - there would only be one
>> RRD file, and the gmetad servers would need some agreement so that they
>> don't both try to write to the same file at the same time.
>
> sure, but even with a SAN you'd have to add some intelligence to
> gmetad, which from my point of view is more than half of the work
> needed to achieve gmetad reliability and redundancy while keeping its
> current distributed design.



Re: [Ganglia-developers] gmetad and rrdtool scalability

2009-12-13 Thread Spike Spiegel
On Fri, Dec 11, 2009 at 1:34 PM, Daniel Pocock  wrote:
> Thanks for sharing this - could you comment on the total number of RRDs per
> gmetad, and do you use rrdcached?

The largest colo has 140,175 RRDs, and we use the tmpfs + cron hack; no rrdcached.
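
For anyone who hasn't seen the hack, it is roughly the following (paths,
size and interval here are illustrative, not our exact setup):

# /etc/fstab: keep the live RRDs on tmpfs so the write load stays in RAM
tmpfs  /var/lib/ganglia/rrds  tmpfs  size=2g  0 0

# /etc/crontab: periodically sync the in-memory RRDs back to real disk
*/10 * * * *  root  rsync -a --delete /var/lib/ganglia/rrds/ /var/lib/ganglia/rrds.disk/

# at boot, before gmetad starts, restore the last snapshot into the tmpfs:
#   rsync -a /var/lib/ganglia/rrds.disk/ /var/lib/ganglia/rrds/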

> I was thinking about gmetads attached to the same SAN, not a remote FS over
> IP.  In a SAN, each gmetad has a physical path to the disk (over fibre
> channel) and there are some filesystems (e.g. GFS) and locking systems (DLM)
> that would allow concurrent access to the raw devices.  If two gmetads mount
> the filesystem concurrently, you could tell one gmetad `stop monitoring
> cluster A, sync the RRDs' and then tell the other gmetad to start monitoring
> cluster A.
>
> DLM is quite a heavyweight locking system (cluster manager and heartbeat
> system required); some enterprises have solutions like Apache Zookeeper
> (Google has one called Chubby), and these could potentially allow the
> gmetad servers to agree on who is polling each cluster.

I see, and while I'm sure this solution works for many people and might
be popular in HPC environments, I'm not really keen on it being something
we'd want to go with ourselves. We tend to stick to a "share nothing"
design, which I realize has its cons too, but as always it's a matter of
tradeoffs, and even an implementation of Paxos like Chubby is no silver
bullet.

The other thing is, of course, cost. SANs aren't free, and if I'm a small
gig but for some reason actually have a clue and recognize the importance
of instrumenting everything, I wouldn't want to be forced to add shared
storage just to avoid losing data.

>> I see two possible solutions:
>> 1. client caching
>> 2. built-in sync feature
>>
>> In 1, gmond would cache data locally if it could not contact the
>> remote end. This, IMHO, is the best solution because it not only helps
>> with head failures and maintenance, but possibly addresses a whole
>> bunch of other failure modes too.
>>
>
> The problem with that is that the XML is just a snapshot.  Maybe the XML
> could contain multiple values for each metric, e.g. all values since the
> last poll?  There would need to be some way of limiting memory usage too, so
> that an agent doesn't kill the machine if nothing is polling it.

Indeed, OS resource usage for caching should be tightly controlled.
RRD does a pretty good job at that; for example, I know people who
use collectd (which supports multiple output streams) to send data
to a remote collector and also keep a local copy with different
retention policies, which solves that problem.
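
For the curious, a minimal collectd.conf sketch of that setup (the
collector address and the retention values below are illustrative):

LoadPlugin network
LoadPlugin rrdtool

<Plugin network>
  # stream all metrics to the central collector
  Server "metrics-central.example.com" "25826"
</Plugin>

<Plugin rrdtool>
  # ...and keep a local copy with its own, shorter retention
  DataDir "/var/lib/collectd/rrd"
  RRATimespan 3600
  RRATimespan 86400
</Plugin>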

> This would be addressed by the use of a SAN - there would only be one
> RRD file, and the gmetad servers would need some agreement so that they
> don't both try to write to the same file at the same time.

sure, but even with a SAN you'd have to add some intelligence to
gmetad, which from my point of view is more than half of the work
needed to achieve gmetad reliability and redundancy while keeping its
current distributed design.


-- 
"Behind every great man there's a great backpack" - B.



Re: [Ganglia-developers] [RFC] two step gmond initialization

2009-12-13 Thread Carlo Marcelo Arenas Belon
On Sun, Dec 13, 2009 at 10:49:00AM +, Daniel Pocock wrote:
> Carlo Marcelo Arenas Belon wrote:
>> On Fri, Dec 11, 2009 at 01:31:22PM -0600, Brooks Davis wrote:
>>   
>>> On Fri, Dec 11, 2009 at 04:56:51PM +, Carlo Marcelo Arenas Belon wrote:
>>> 
>>>> I presume the reason you haven't seen this show up on the APR list is
>>>> that it probably makes more sense for the apache httpd list instead,
>>>> for help understanding how apache is able to "work around" the
>>>> leakiness of apr_poll, and that also requires some reading of apache's
>>>> code (which I am not that familiar with, nor really interested in)
>>> Looking at the prefork mpm, the pollsets are created and used only
>>> in child_main() and thus are created after the fork.  I suspect that
>>> changing the ganglia code to open all the sockets, but defer creation of
>>> the pollset until after fork is the right way to go.
>>
>> That is the way we did the initialization before r2025, so I guess that
>> could explain why we weren't affected, just as apache is not.
>>   
> Not quite - pre-r2025, we did this:
>
> a) detach
> b) socket init
> c) pollset init
>
> Post r2025:
>
> a) socket init
> b) pollset init
> c) detach
>
> Brooks' solution:
>
> a) socket init
> b) detach
> c) pollset init
>
> I could accept Brooks' solution, because it means that after detaching
> gmond would only fail for something like out-of-memory, while any
> configuration failure, port already in use, etc. would cause it to fail
> before detaching.

If gmond can still fail silently in some cases, then you have not
accomplished the objective you were trying to achieve with r2025 anyway.

The solution I proposed addresses the problem of reporting to the OS any
failure during initialization (which was the original bug to fix anyway)
in a straightforward way, and is therefore, IMHO, the right way to correct
this without introducing any regressions by changing long-relied-upon
semantics.

> Basically, we would have to split the code in  
> setup_listen_channels_pollset() into two functions, one that gets called  
> before detaching, and one that is called after detaching.

Why make the code more complicated? And do you really expect that to be
in scope for getting it backported into 3.1.6, considering how intrusive
it would be?

Also be aware that there are bugfixes on that code that haven't yet been
backported, so you are going to have to either certify all of those fixes
as well or cherry-pick just the changes needed and test all the different
combinations.

Carlo



Re: [Ganglia-developers] [RFC] two step gmond initialization

2009-12-13 Thread Daniel Pocock
Carlo Marcelo Arenas Belon wrote:
> On Fri, Dec 11, 2009 at 01:31:22PM -0600, Brooks Davis wrote:
>   
>> On Fri, Dec 11, 2009 at 04:56:51PM +, Carlo Marcelo Arenas Belon wrote:
>>
>> 
>>> I presume the reason you haven't seen this show up on the APR list is
>>> that it probably makes more sense for the apache httpd list instead,
>>> for help understanding how apache is able to "work around" the
>>> leakiness of apr_poll, and that also requires some reading of apache's
>>> code (which I am not that familiar with, nor really interested in)
>>>   
>> Looking at the prefork mpm, the pollsets are created and used only
>> in child_main() and thus are created after the fork.  I suspect that
>> changing the ganglia code to open all the sockets, but defer creation of
>> the pollset until after fork is the right way to go.
>> 
>
> That is the way we did the initialization before r2025, so I guess that
> could explain why we weren't affected, just as apache is not.
>   
Not quite - pre-r2025, we did this:

a) detach
b) socket init
c) pollset init

Post r2025:

a) socket init
b) pollset init
c) detach

Brooks' solution:

a) socket init
b) detach
c) pollset init

I could accept Brooks' solution, because it means that after detaching
gmond would only fail for something like out-of-memory, while any
configuration failure, port already in use, etc. would cause it to fail
before detaching.

Basically, we would have to split the code in 
setup_listen_channels_pollset() into two functions, one that gets called 
before detaching, and one that is called after detaching.
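
To make the ordering concrete, here is a rough, self-contained sketch of
Brooks' sequence (sockets first, then detach, then the pollset). The
function names and the single UDP listen socket are made up for
illustration; this is not the actual gmond code:

#include <stdlib.h>
#include <unistd.h>
#include <sys/socket.h>
#include <apr_general.h>
#include <apr_network_io.h>
#include <apr_poll.h>

static apr_socket_t *listen_sock;

/* a) socket init: anything like "port already in use" fails here,
 *    while we are still attached to the terminal */
static apr_status_t open_listen_socket(apr_pool_t *p)
{
    apr_sockaddr_t *sa;
    apr_status_t rv;

    rv = apr_sockaddr_info_get(&sa, NULL, APR_INET, 8649, 0, p);
    if (rv != APR_SUCCESS)
        return rv;
    rv = apr_socket_create(&listen_sock, sa->family, SOCK_DGRAM,
                           APR_PROTO_UDP, p);
    if (rv != APR_SUCCESS)
        return rv;
    return apr_socket_bind(listen_sock, sa);
}

/* c) pollset init: deferred until after the fork, mirroring what
 *    httpd's prefork mpm does in child_main() */
static apr_status_t build_pollset(apr_pool_t *p, apr_pollset_t **ps)
{
    apr_pollfd_t pfd;
    apr_status_t rv;

    rv = apr_pollset_create(ps, 8, p, 0);
    if (rv != APR_SUCCESS)
        return rv;

    pfd.p = p;
    pfd.desc_type = APR_POLL_SOCKET;
    pfd.reqevents = APR_POLLIN;
    pfd.rtnevents = 0;
    pfd.desc.s = listen_sock;
    pfd.client_data = NULL;
    return apr_pollset_add(*ps, &pfd);
}

int main(void)
{
    apr_pool_t *pool;
    apr_pollset_t *pollset;
    pid_t pid;

    apr_initialize();
    apr_pool_create(&pool, NULL);

    if (open_listen_socket(pool) != APR_SUCCESS)
        exit(1);            /* reported to the OS before detaching */

    pid = fork();           /* b) detach (simplified daemonization) */
    if (pid < 0)
        exit(1);
    if (pid > 0)
        exit(0);            /* parent exits 0 only if the sockets came up */

    if (build_pollset(pool, &pollset) != APR_SUCCESS)
        exit(1);            /* only resource-style failures (e.g. OOM) left */

    /* event loop over the pollset would go here */
    apr_terminate();
    return 0;
}

The point is simply that anything a sysadmin can get wrong (bad config,
port already taken) still turns into a nonzero exit status from the
foreground process, while the pollset is only built after the fork.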
