Re: [influxdb] InfluxDB restarts every 24 hours and some data is missing

Sean Beckett Fri, 14 Oct 2016 10:59:04 -0700

It looks like your CQ does lead to RAM spikes close to the capacity of the
box. Your shard durations are what's tipping the issue, I believe. With 1
day shards in a 90 day retention policy, there are a lot of housekeeping
tasks to do each night at midnight UTC. When each shard expires, the series
index has to be updated and a series of compactions kicks off. Compactions
are RAM and CPU intensive.


First recommendation: use ALTER RETENTION POLICY to raise the shard
duration for `three_months` to at least a week, though even a month would be
good. That will reduce the frequency of the TSM compactions, and with fewer
files the compactions will be less resource intensive.
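
As a rough sketch of that first recommendation: assuming the database is
`macdb` (taken from the SHOW RETENTION POLICIES output quoted further down)
and a server version whose ALTER RETENTION POLICY accepts a SHARD DURATION
clause, the statements would look something like the following. The 1w value
is only one choice in the "at least a week" range. As far as I know, the new
duration applies only to shard groups created after the change, so the
existing 1 day shards stay as they are until they expire.

```sql
-- Sketch only: assumes database "macdb" and SHARD DURATION support in
-- ALTER RETENTION POLICY. Drop these comment lines if your client rejects them.
ALTER RETENTION POLICY "three_months" ON "macdb" SHARD DURATION 1w

-- Check the result: shardGroupDuration for three_months should read 168h0m0s.
SHOW RETENTION POLICIES ON "macdb"
```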

Also, queries should touch as few shards as possible. If you are often
querying for more than 12 hours of data then raising the shard duration
will reduce the RAM needs of those queries.

On Fri, Oct 14, 2016 at 3:29 AM, <[email protected]> wrote:

> Hi Sean,
>
> here is the graph from our NMS about memory usage
>
> https://s18.postimg.org/a6buyzna1/memory.png
>
> and I would say we have spikes, but they occur every 30 minutes rather
> than every 24h, and I guess that's because of the CQ we use for
> downsampling. I can post that CQ if that would help.
>
> We were aware of cardinality when we designed our solution, and currently we
> have 81761 series, which I guess is quite OK for this amount of RAM.
>
> > SHOW RETENTION POLICIES ON macdb
> name            duration        shardGroupDuration      replicaN        default
> default         0               168h0m0s                1               false
> seven_days      168h0m0s        24h0m0s                 1               true
> three_months    2160h0m0s       24h0m0s                 1               false
>
>
> Yes, our writes always succeed and InfluxDB returns a 204 response. By
> missing the whole measurement I mean that, for example, we have one
> measurement at 10:00 pm both in InfluxDB and in our file (we also write the
> data to a file for debugging), then the next measurement at 10:05 pm is in
> both InfluxDB and the file, but the measurement at 10:10 pm is missing in
> InfluxDB even though it is in the file and InfluxDB still returned a 204
> response for it.
>
> --
> Remember to include the version number!
> ---
> You received this message because you are subscribed to the Google Groups
> "InfluxData" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at https://groups.google.com/group/influxdb.
> To view this discussion on the web visit https://groups.google.com/d/
> msgid/influxdb/ba384d32-335f-4db8-b4f9-f43b584dba25%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
>



-- 
Sean Beckett
Director of Support and Professional Services
InfluxDB
