Currently my organisation uses Graphite for metrics monitoring. We have ~1600 instances, each pushing ~40 metrics to Graphite every minute. This causes a constant ~600 IOPS on our SAN, which the infrastructure managers deemed too expensive for what amounts to 3 MB of information.
Synthetic tests I did with InfluxDB show that if we use InfluxDB in the same way, the behavior is pretty much identical to Graphite:

- Sending 1600 HTTP requests with 40 metrics each over a persistent HTTP connection pool of 8 threads results in a spike of ~600 IOPS lasting 8 seconds, every minute.
- Spreading the 1600 HTTP requests evenly over a minute results in a consistent 80 IOPS.
- Sending all 64K metrics in 1 HTTP request causes a single 'spike' of 15 IOPS for 800 ms per minute.

First question: where do these discrepancies in IOPS come from? Is it purely the number of file access switches, is it related to database transactions / locking, or does it have another cause? I can imagine that each InfluxDB HTTP call needs to be handled / flushed as one transaction, because an HTTP status code has to be returned, whereas with a UDP / TCP stream the flush happens only every x points.

As our production metrics would be pushed from 1600 different sources, the only way we could merge them is by setting up Telegraf as a proxy in front of InfluxDB, with the http-listener plugin as input and the output configured with a buffer of 64K points. The synthetic test with 1600 HTTP requests sent to the Telegraf proxy showed pretty much the same IOPS behavior on InfluxDB as merging them into one HTTP request and sending that to InfluxDB directly.

Second question: is this in any way an idiomatic way to set up InfluxDB, and are there any drawbacks to this method? The only ones I can think of are a few MB of memory, an extra process running, and perhaps losing a minute of metrics, which is a good trade-off for the reduced load. However, I still have a feeling this is something that could be handled or configured in InfluxDB itself.
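
For reference, the "all metrics in one HTTP request" case can be sketched roughly as below. This is only a minimal illustration, not the test harness that produced the numbers above. It assumes the InfluxDB 1.x HTTP `/write` endpoint and line protocol; the function names, the measurement and tag layout, and the URL and database name are all made up for the example, and it only handles float fields without escaping special characters.

```python
import urllib.request
from typing import Iterable, Mapping, Tuple

# (measurement, tags, fields, unix timestamp in seconds) -- hypothetical layout
Point = Tuple[str, Mapping[str, str], Mapping[str, float], int]


def to_line(measurement: str, tags: Mapping[str, str],
            fields: Mapping[str, float], ts: int) -> str:
    """Render one point as line protocol: measurement,tags fields timestamp.

    Simplification: no escaping of spaces/commas, all fields written as floats.
    """
    tag_part = "".join(f",{k}={v}" for k, v in sorted(tags.items()))
    field_part = ",".join(f"{k}={float(v)}" for k, v in fields.items())
    return f"{measurement}{tag_part} {field_part} {ts}"


def build_batch(points: Iterable[Point]) -> str:
    """Join many points into one newline-separated request body."""
    return "\n".join(to_line(*p) for p in points)


def write_batch(base_url: str, database: str, body: str) -> int:
    """POST the whole batch in a single request; returns the HTTP status.

    base_url and database are assumptions, e.g. "http://localhost:8086", "metrics".
    precision=s matches the second-resolution timestamps used above.
    """
    url = f"{base_url}/write?db={database}&precision=s"
    req = urllib.request.Request(url, data=body.encode("utf-8"), method="POST")
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

Collecting the points from all 1600 sources for one minute, passing them through `build_batch`, and calling `write_batch` once corresponds to the third test case; calling `write_batch` once per source corresponds to the first.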
