Vladimir Vuksan wrote:
> On Mon, 21 Dec 2009, Spike Spiegel wrote:
>
>>> a. Get all the rrds (rsync) from gmetad2 before you restart gmetad1
>>
>> which unless you have a small amount of data or a fast network between
>> the two nodes won't complete before the next write is initiated, meaning
>> they won't be identical.
>
> Granted, they will never be identical. Even on the fastest networks there
> will be a window of data lost. On fast networks / smaller numbers of nodes
> it will be small. On bigger networks a larger window, etc.
>
>> how do you tell which one has the most up-to-date data?
>
> This is in no respect an automatic process (even though I could make it
> one if I really wanted to). Point the proxy to your primary node; if it
> fails, point it to the secondary or tertiary.
>
>> if you really mean "most recent" then both would, because both would
>> have fetched the last reading assuming they are both functional, but
>> gmetad1 would have a hole in its graphs. To me that does not really
>> count as up to date. Up to date would be the one with the most complete
>> data set, which you have no way to identify programmatically.
>>
>> Also, assume now gmetad2 fails and both have holes; which one is the
>> most up to date?
>
> That is up to you to decide. This is in no way perfect.
>
>> I guess it does if I look at it from your perspective, which if I
>> understood it correctly implies that:
>> * some data loss doesn't matter
>> * manual interaction to fix things is ok
>>
>> But that isn't my perspective. Scalable (distributed) applications
>> should be able to guarantee by design no data loss in as many cases as
>> possible and not force you into centralized designs or hackery in order
>> to do so.
>>
>> There are ways to make this possible without changes to the current
>> gmetad code by adding a helper webservice that proxies the access to
>> rrd. This way it's perfectly fine to have different locations with
>> different data, and the webservice will take care of interrogating one
>> or more gmetads/backends to retrieve the full set and present it to the
>> user. Fully distributed, no data loss. This could of course be built
>> into gmetad by making something like port 8652 access the rrds, but to
>> me that's the wrong path: it makes gmetad's code more complicated, and
>> it's potentially functionality that has nothing to do with ganglia and
>> is backend dependent.
>
> The issue is the value of this data. If these were financial transactions
> then no loss would be acceptable; however, these are not. They are
> performance/trending data which get "averaged" down as time goes by, so
> the loss of a couple of hours or even days of data is not tragic.
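To make the helper-webservice idea quoted above a little more concrete, here
is a minimal sketch of the merge step such a proxy would need. It is purely
illustrative: how the series are actually fetched (rrdtool fetch over NFS, an
HTTP export, whatever) is left as a stand-in, and the names and the
gap-filling rule are my own assumptions, not anything gmetad or rrdtool
provides today.

# Illustrative only: merge the same metric fetched from two gmetad/rrd
# backends, preferring whichever backend actually has a sample for each
# timestamp.  Replace the hard-coded dicts with however the proxy talks
# to each backend.

from typing import Dict, Optional

Series = Dict[int, Optional[float]]   # timestamp -> value (None = gap)

def merge_series(primary: Series, secondary: Series) -> Series:
    """Fill holes in the primary backend's data with the secondary's."""
    merged: Series = {}
    for ts in sorted(set(primary) | set(secondary)):
        a, b = primary.get(ts), secondary.get(ts)
        merged[ts] = a if a is not None else b
    return merged

if __name__ == "__main__":
    # gmetad1 missed two polls while it was restarting; gmetad2 has them.
    gmetad1 = {100: 0.4, 110: None, 120: None, 130: 0.7}
    gmetad2 = {100: 0.4, 110: 0.5, 120: 0.6, 130: None}
    print(merge_series(gmetad1, gmetad2))
    # -> {100: 0.4, 110: 0.5, 120: 0.6, 130: 0.7}

The point is simply that the proxy, not gmetad, decides how to reconcile the
two copies, so each gmetad can keep writing its own RRDs independently.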
I agree - it doesn't have to be perfect.

To come back to my own requirement though, it is about horizontal
scalability. Let's say a hypothetical big enterprise has just decided to
adopt Ganglia as a universal solution on every node in every data center
globally, including subsidiary companies, etc. Nobody really wants to
manually map individual servers to clusters and gmetad servers; they want
plug-and-play. They just want to allocate some storage and gmetad hardware in
each main data center, plug it in, and watch the graphs appear. If the CPU or
IO load gets too high on some of the gmetad servers in a particular location,
they want to redistribute the load over the others in that location. When the
IO load gets too high on all of the gmetads, they want to be able to scale
horizontally: add an extra one or two gmetad servers and see the load
distributed between them.

Maybe this sounds a little like a Christmas wish-list, but does anyone else
feel that this is a valid requirement? Imagine something even bigger: if a
state or national government decided to deploy the gmond agent throughout all
their departments in an effort to gather utilization data, would it scale?
Would it be easy enough for a diverse range of IT departments to just plug it
in?

Carlo also made some comments about RDBMS instead of RRD. This raises a few
discussion points:

a) An RDBMS shared by multiple gmetads could provide a suitable locking
mechanism for each gmetad to show which clusters it is polling, thereby
co-ordinating access to RRD files on a SAN. The list of clusters would be
kept in a table, and if one gmetad could no longer poll a particular cluster
(maybe to reduce IO load), it would lock the table, remove its name from that
row, and unlock the table. Another gmetad could then lock the table, update
the row with its name, and unlock it again (see the first sketch at the end
of this message).

b) As for metric storage in RRD, I personally believe the RRD algorithm and
API are quite appropriate for this type of data. The question then is: should
gmetad write metrics directly to an RDBMS, or should rrdtool be modified to
use an RDBMS as a back end? The whole RRD file structure could be represented
in a series of database tables. The individual RRA regions of the file could
be mapped to BLOBs, or the data samples could be stored on a row-by-row
basis. In the latter case, views could be constructed to easily see the data
in time order, so you wouldn't need to know which row was current (see the
second sketch at the end of this message).
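To illustrate point (a), here is a very rough sketch of the lock-table idea,
using sqlite3 purely as a stand-in for a shared RDBMS. The table and column
names (clusters, polled_by) are made up for the example; a real deployment
would rely on whatever locking or transaction isolation the chosen database
offers.

# Sketch of a shared "who polls which cluster" table.  Illustrative only.

import sqlite3

def setup(db):
    db.execute("""CREATE TABLE IF NOT EXISTS clusters (
                      name      TEXT PRIMARY KEY,
                      polled_by TEXT          -- NULL = nobody is polling it
                  )""")
    db.executemany("INSERT OR IGNORE INTO clusters(name) VALUES (?)",
                   [("web",), ("db",), ("batch",)])
    db.commit()

def claim(db, gmetad, cluster):
    """Atomically take over polling of a cluster if nobody else owns it."""
    cur = db.execute(
        "UPDATE clusters SET polled_by = ? WHERE name = ? AND polled_by IS NULL",
        (gmetad, cluster))
    db.commit()
    return cur.rowcount == 1        # True = we now own it

def release(db, gmetad, cluster):
    """Give the cluster up, e.g. to shed IO load onto another gmetad."""
    db.execute(
        "UPDATE clusters SET polled_by = NULL WHERE name = ? AND polled_by = ?",
        (cluster, gmetad))
    db.commit()

if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    setup(db)
    print(claim(db, "gmetad1", "web"))   # True  - gmetad1 now polls "web"
    print(claim(db, "gmetad2", "web"))   # False - already taken
    release(db, "gmetad1", "web")
    print(claim(db, "gmetad2", "web"))   # True  - handed over

Because the UPDATE only succeeds while polled_by is NULL, two gmetads racing
for the same cluster cannot both believe they own it.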
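And for point (b), one way the row-by-row option could look: each RRA becomes
a set of (metric, rra, slot, timestamp, value) rows, and a view presents them
in time order so the reader never needs to know which slot is "current".
Again this is only a sketch with invented names and a fixed 15-second step,
not a proposal for rrdtool's actual on-disk format.

# Row-by-row RRA storage with a time-ordered view.  Illustrative only.

import sqlite3, time

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE rra_samples (
    metric    TEXT    NOT NULL,   -- e.g. 'web01/load_one'
    rra       INTEGER NOT NULL,   -- which consolidation archive
    slot      INTEGER NOT NULL,   -- position in the circular buffer
    ts        INTEGER NOT NULL,   -- sample time (unix seconds)
    value     REAL,
    PRIMARY KEY (metric, rra, slot)
);

-- Readers see samples in time order, newest last, regardless of slot.
CREATE VIEW rra_timeseries AS
    SELECT metric, rra, ts, value
    FROM rra_samples
    ORDER BY metric, rra, ts;
""")

def store(metric, rra, slots, ts, value):
    """Overwrite the circular-buffer slot this timestamp maps to."""
    db.execute("INSERT OR REPLACE INTO rra_samples VALUES (?,?,?,?,?)",
               (metric, rra, (ts // 15) % slots, ts, value))

now = int(time.time())
for i in range(5):
    store("web01/load_one", 0, slots=4, ts=now + 15 * i, value=0.1 * i)
db.commit()
print(db.execute("SELECT ts, value FROM rra_timeseries").fetchall())

The BLOB variant would instead store each RRA region as a single binary
column and leave the circular-buffer arithmetic to rrdtool.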
