Jason A. Smith wrote:

If you really want long term storage of the raw or nearly raw data then
rrdtool is probably not the right tool to use.  You would be better off
writing your own ganglia frontend client that would collect the xml data
from gmetad at the interval you need, parse it and store it into some
other database or archive.  This could also be done from another
computer so it would have a negligible impact on the gmetad host.

~Jason

I have thought about this too.

The problem is that if I go to something SQL-ish or similar, I will have to store roughly 25 billion rows (<43 metrics> * <275 hosts> * <1 year of 15-second samples>), because I want to keep about a year's worth of metrics at the detailed view, meaning a new value every 15 seconds per host per metric.
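To make that estimate easy to check, here is a quick back-of-the-envelope sketch. The helper name rows_per_year is only illustrative, and it assumes a 365-day year and exactly one stored row per host per metric per sample.

```python
# Rough row-count estimate for one year of raw samples.
# Assumptions: 365-day year, one row per (host, metric, sample).
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000


def rows_per_year(metrics, hosts, interval_seconds):
    """Return the number of rows needed to keep every sample for a year."""
    samples_per_year = SECONDS_PER_YEAR // interval_seconds
    return metrics * hosts * samples_per_year


rows = rows_per_year(metrics=43, hosts=275, interval_seconds=15)
print(f"{rows:,} rows per year")
```

With a 15-second interval this comes out just under 25 billion rows. Storing one row per second instead would multiply that by 15.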

I am already having nightmares about working with a SQL database of 25 billion rows, and I doubt it would ever work on the hardware I have available for the project.

It would almost be more usable (performance- and storage-wise) to just write additional .rrd files in the same manner gmetad does, and perhaps use a ramdisk for this.

I agree a SQL database would be much more desirable, but I am very tempted to just write a tool that grabs the XML and stores it in additional RRDs. Then again, using round robin databases to archive data goes against the whole concept of a round robin database.
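The collection half of such a tool could look roughly like the sketch below. It is only a sketch under assumptions: it assumes gmetad serves its XML dump on TCP port 8651 (the usual xml_port default, so check your gmetad.conf), that the dump contains HOST elements with NAME and REPORTED attributes and METRIC elements with NAME and VAL attributes, and the function names and the sample document are made up for illustration. Where the samples end up (RRD files or anything else) is deliberately left out.

```python
import socket
import xml.etree.ElementTree as ET


def fetch_gmetad_xml(host="localhost", port=8651, timeout=10.0):
    """Read the XML dump gmetad sends to a connecting client.

    Assumes gmetad writes the whole document and then closes the
    connection; host and port are placeholders for your own setup.
    """
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:  # peer closed the connection: dump is complete
                break
            chunks.append(data)
    return b"".join(chunks)


def extract_samples(xml_bytes):
    """Return (reported_time, host_name, metric_name, value) tuples.

    Values are kept as strings because metrics can be numeric or text.
    """
    root = ET.fromstring(xml_bytes)
    samples = []
    for host in root.iter("HOST"):
        reported = int(host.get("REPORTED", "0"))
        for metric in host.iter("METRIC"):
            samples.append(
                (reported, host.get("NAME"), metric.get("NAME"), metric.get("VAL"))
            )
    return samples


# Tiny hand-written document in the assumed shape, for trying the parser
# without a running gmetad.
SAMPLE = (
    b'<GANGLIA_XML VERSION="3.0" SOURCE="gmetad"><GRID NAME="g">'
    b'<CLUSTER NAME="c"><HOST NAME="node1" REPORTED="1100000000">'
    b'<METRIC NAME="load_one" VAL="0.25" TYPE="float"/>'
    b"</HOST></CLUSTER></GRID></GANGLIA_XML>"
)

for sample in extract_samples(SAMPLE):
    print(sample)
```

Run every 15 seconds from another machine, extract_samples(fetch_gmetad_xml("your-gmetad-host")) would yield one tuple per host per metric, which is exactly the stream that would have to be written somewhere.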

If you have a good idea or suggestion on how to store this amount of data efficiently, without needing an extra cluster just to store and use the values, I would love to hear it.

- Ramon.
