Re: [influxdb] Influxdb 0.13 runs out of memory

Sean Beckett Fri, 19 Aug 2016 12:23:39 -0700

The latter query is much more tightly bounded. By restricting the time range
and the series to query, it samples far fewer points.
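
As a rough illustration of that kind of bounding, here is a minimal Python
sketch that builds the restricted query as a string. The helper name
`bounded_query` and its parameters are hypothetical, made up for this example;
it only formats InfluxQL text and does not talk to a server.

```python
def bounded_query(measurement, tag_key, tag_value, lookback, limit=1):
    """Build an InfluxQL SELECT limited to one tag value and a recent time window."""
    # Keep the sketch simple: refuse values that would need quote escaping.
    if "'" in tag_value or "\\" in tag_value:
        raise ValueError("tag value must not contain quotes or backslashes")
    template = (
        "SELECT * FROM \"{m}\" WHERE \"{k}\" = '{v}' "
        "AND time > now() - {w} LIMIT {n}"
    )
    return template.format(m=measurement, k=tag_key, v=tag_value, w=lookback, n=limit)


if __name__ == "__main__":
    # Same shape as the query that returned quickly in the report below.
    print(bounded_query("bars", "Symbol", "AAPL", "10d"))
```

The tag filter narrows the series and the time predicate narrows the points,
which is why the second query in the report comes back quickly.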


I suspect there is a major inefficiency in LIMIT, and have opened
https://github.com/influxdata/influxdb/issues/7182 to prompt the developers
to investigate that. If the query is in fact trying to sample 1.3 billion
points, that could quickly overwhelm 32GB of RAM. It shouldn't need to,
thus the open issue.
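
For a sense of scale, a back-of-the-envelope estimate in Python. The 16 bytes
per point is purely an assumption for this sketch (an 8-byte timestamp plus one
8-byte float value, ignoring any per-point overhead); it is not a measured
figure for InfluxDB's internals.

```python
# Assumption for illustration only: 8-byte timestamp + one 8-byte float value.
ASSUMED_BYTES_PER_POINT = 16


def decoded_size_gb(points, bytes_per_point=ASSUMED_BYTES_PER_POINT):
    """Rough size in decimal gigabytes of `points` fully decoded values."""
    return points * bytes_per_point / 1e9


if __name__ == "__main__":
    # 1.3 billion points, one field value each, under the assumption above.
    print(decoded_size_gb(1_300_000_000))
```

Under that assumption a single field already accounts for roughly 20 GB, and
points that carry several fields would multiply it, so exhausting 32GB is
plausible if the query really does materialize everything.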

On Fri, Aug 19, 2016 at 12:53 PM, John Jelinek <[email protected]> wrote:

> This must be something specific to my query. `SELECT * FROM bars LIMIT 1`
> is what maxes out the RAM and restarts influx. If I try another query,
> `SELECT * FROM bars WHERE Symbol = 'AAPL' AND time > now() - 10d LIMIT 1`, I
> get a result back quickly.
>
>
> On Friday, August 19, 2016 at 1:45:47 PM UTC-5, John Jelinek wrote:
>>
>> Cardinality has jumped to 14041 after all the points were uploaded, but
>> `SELECT * FROM bars LIMIT 1` still maxes out the RAM and reboots the influx
>> service.
>>
>> On Friday, August 19, 2016 at 1:41:45 PM UTC-5, John Jelinek wrote:
>>>
>>> Also, I don't have any other processes running on the latest environment
>>> (ubuntu w/ 32GB of RAM), I built that box just for this test.
>>>
>>> On Friday, August 19, 2016 at 1:38:59 PM UTC-5, John Jelinek wrote:
>>>>
>>>> I did the query while uploading 14288591 points (at 5000
>>>> points/second). Here's a CSV sample of the kind of data:
>>>>
>>>> ```
>>>> "Symbol","Date","Open","High","Low","Close","Volume","Ex-Dividend","Split Ratio","Adj. Open","Adj. High","Adj. Low","Adj. Close","Adj. Volume"
>>>> A,1999-11-18,45.5,50.0,40.0,44.0,44739900.0,0.0,1.0,43.471809559155,47.771219295775,38.21697543662,42.038672980282,44739900.0
>>>> A,1999-11-19,42.94,43.0,39.81,40.38,10897100.0,0.0,1.0,41.025923131212,41.083248594367,38.035444803296,38.580036703268,10897100.0
>>>> A,1999-11-22,41.31,44.0,40.06,44.0,4705200.0,0.0,1.0,39.468581382169,42.038672980282,38.274300899775,42.038672980282,4705200.0
>>>> A,1999-11-23,42.5,43.63,40.25,40.25,4274400.0,0.0,1.0,40.605536401409,41.685165957493,38.455831533099,38.455831533099,4274400.0
>>>> A,1999-11-24,40.13,41.94,40.0,41.06,3464400.0,0.0,1.0,38.341180606789,40.070498745296,38.21697543662,39.22972528569,3464400.0
>>>> A,1999-11-26,40.88,41.5,40.75,41.19,1237100.0,0.0,1.0,39.057748896226,39.650112015493,38.933543726057,39.353930455859,1237100.0
>>>> A,1999-11-29,41.0,42.44,40.56,42.13,2914700.0,0.0,1.0,39.172399822536,40.548210938254,38.752013092733,40.25202937862,2914700.0
>>>> A,1999-11-30,42.0,42.94,40.94,42.19,3083000.0,0.0,1.0,40.127824208451,41.025923131212,39.115074359381,40.309354841775,3083000.0
>>>> A,1999-12-01,42.19,43.44,41.88,42.94,2115400.0,0.0,1.0,40.309354841775,41.503635324169,40.013173282141,41.025923131212,2115400.0
>>>> ```
>>>>
>>>> I have only one database, and it contains just this one measurement.
>>>> I've tested this in 3 different environments: docker on 8GB RAM,
>>>> influx running directly on a macbook pro w/ 16GB RAM, and influx
>>>> running directly on an ubuntu 16.04 server with 32GB of RAM. In all
>>>> environments, the RAM maxed out and then swap maxed out. I'm using
>>>> https://github.com/jpillora/csv-to-influxdb to upload the CSV into
>>>> influx 0.13. This is the dataset I'm uploading into influx:
>>>> https://www.quandl.com/data/WIKI. This is the
>>>> command I'm using to get it into influx: `csv-to-influxdb -m bars -t Symbol
>>>> -ts Date -tf 2006-01-02 -d eodbars WIKI_20160818.csv`. Let me know if you
>>>> need any other details.
>>>>
>>>> On Friday, August 19, 2016 at 12:13:59 PM UTC-5, Sean Beckett wrote:
>>>>>
>>>>> On further consideration, an unbounded query on 1.6 billion points is
>>>>> a lot to sample. Presumably if you put a time boundary on that query it
>>>>> doesn't OOM?
>>>>>
>>>>> On Fri, Aug 19, 2016 at 11:12 AM, Sean Beckett <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> That is not expected behavior. 5000 points per second is a light
>>>>>> workload, unless each of those points has 10-100 fields. Even 500k values
>>>>>> per second is a sustainable workload on a multi-core machine.
>>>>>>
>>>>>> A series cardinality less than 10k is also fairly trivial. That
>>>>>> shouldn't require more than a gig or two of RAM.
>>>>>>
>>>>>> Do you have long strings in your database? Is there something else
>>>>>> running on the system that needs RAM?
>>>>>>
>>>>>> Do you have many many databases or measurements?
>>>>>>
>>>>>> On Fri, Aug 19, 2016 at 10:45 AM, John Jelinek <[email protected]> wrote:
>>>>>>
>>>>>>> I have a cardinality of `9876` from this query: `SELECT
>>>>>>> sum(numSeries) AS "total_series" FROM "_internal".."database" WHERE
>>>>>>> time > now() - 10s`. When I query one of my measurements with something
>>>>>>> like `SELECT * FROM bars LIMIT 1`, the RAM instantly spikes up to 32GB,
>>>>>>> maxes out swap, and the influxdb service restarts. Note that this
>>>>>>> measurement is getting writes of 5000 points per second. The total
>>>>>>> number of points is about 1.6GB. Is this to be expected?
>>>>>>>
>>>>>>>
>>>>>>> On Wednesday, August 10, 2016 at 8:04:16 AM UTC-5, whille zg wrote:
>>>>>>>>
>>>>>>>> I'm having an OOM issue, posted at
>>>>>>>> https://github.com/influxdata/influxdb/issues/7134
>>>>>>>> It seems RAM usage slowly drops to a small amount if no further
>>>>>>>> queries run, but I need to read recent data several times continuously.
>>>>>>>> I'm trying v1.0beta on a 32G machine, but it's been killed; I will try
>>>>>>>> 256G RAM.
>>>>>>>> Or would v0.12 be OK with the RAM problem?
>>>>>>>>
>>>>>>>> On Wednesday, July 13, 2016 at 12:19:25 AM UTC+8, Sean Beckett wrote:
>>>>>>>>>
>>>>>>>>> Currently InfluxDB must load the entire series index into RAM.
>>>>>>>>> We're working on a caching mechanism so that only recently written or
>>>>>>>>> queried series need to be kept in RAM. It's a complex feature to
>>>>>>>>> implement while maintaining performance, but we hope to have a first
>>>>>>>>> version in some months.
>>>>>>>>>
>>>>>>>>> On Tue, Jul 12, 2016 at 3:36 AM, Jan Kis <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Sean, nice guess, we have 91 786 506 series :) To understand
>>>>>>>>>> this a bit better: does the high memory consumption come from the
>>>>>>>>>> fact that influx loads the index into memory for faster writes and
>>>>>>>>>> querying?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I will dive into the individual measurements to see where exactly
>>>>>>>>>> we have such a large tag cardinality, so that we can reduce the
>>>>>>>>>> number of series.
>>>>>>>>>>
>>>>>>>>>> Thank you
>>>>>>>>>>
>>>>>>>>>> On Monday, July 11, 2016 at 6:51:52 PM UTC+2, Sean Beckett wrote:
>>>>>>>>>>>
>>>>>>>>>>> High RAM usage usually correlates with high series cardinality
>>>>>>>>>>> <https://docs.influxdata.com/influxdb/v0.13/concepts/glossary/#series-cardinality>.
>>>>>>>>>>>
>>>>>>>>>>> You can run "SELECT sum(numSeries) AS "total_series" FROM
>>>>>>>>>>> "_internal".."database" WHERE time > now() - 10s" to determine your
>>>>>>>>>>> series cardinality, assuming you haven't altered the default sample
>>>>>>>>>>> rate for the _internal database. If you have, change the WHERE time
>>>>>>>>>>> clause to grab only one sample, or use "SELECT last(numSeries) FROM
>>>>>>>>>>> "_internal".."database" GROUP BY "database"" and sum the results.
>>>>>>>>>>>
>>>>>>>>>>> With 100GB of RAM in use, I'm going to guess you have 5+ million
>>>>>>>>>>> series.
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Jul 11, 2016 at 10:21 AM, Jan Kis <[email protected]>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> we are using influxdb 0.13 on Fedora 23. We see influx
>>>>>>>>>>>> consuming more than 100GB of RAM. At some point it eventually runs
>>>>>>>>>>>> out of memory and dies. There are no errors in the logs. Our
>>>>>>>>>>>> configuration is below.
>>>>>>>>>>>>
>>>>>>>>>>>> Is there a way to control how much memory influx is consuming?
>>>>>>>>>>>> What can we do to figure out why influx is consuming so much
>>>>>>>>>>>> memory?
>>>>>>>>>>>>
>>>>>>>>>>>> Thank you
>>>>>>>>>>>>
>>>>>>>>>>>> reporting-disabled = false
>>>>>>>>>>>> bind-address = ":8088"
>>>>>>>>>>>> hostname = ""
>>>>>>>>>>>> join = ""
>>>>>>>>>>>>
>>>>>>>>>>>> [meta]
>>>>>>>>>>>>   dir = "/data/influxdb/meta"
>>>>>>>>>>>>   retention-autocreate = true
>>>>>>>>>>>>   logging-enabled = true
>>>>>>>>>>>>   pprof-enabled = false
>>>>>>>>>>>>   lease-duration = "1m0s"
>>>>>>>>>>>>
>>>>>>>>>>>> [data]
>>>>>>>>>>>>   dir = "/data/influxdb/data"
>>>>>>>>>>>>   engine = "tsm1"
>>>>>>>>>>>>   wal-dir = "/data/influxdb/wal"
>>>>>>>>>>>>   wal-logging-enabled = true
>>>>>>>>>>>>   query-log-enabled = true
>>>>>>>>>>>>   cache-max-memory-size = 524288000
>>>>>>>>>>>>   cache-snapshot-memory-size = 26214400
>>>>>>>>>>>>   cache-snapshot-write-cold-duration = "1h0m0s"
>>>>>>>>>>>>   compact-full-write-cold-duration = "24h0m0s"
>>>>>>>>>>>>   max-points-per-block = 0
>>>>>>>>>>>>   data-logging-enabled = true
>>>>>>>>>>>>
>>>>>>>>>>>> [cluster]
>>>>>>>>>>>>   force-remote-mapping = false
>>>>>>>>>>>>   write-timeout = "10s"
>>>>>>>>>>>>   shard-writer-timeout = "5s"
>>>>>>>>>>>>   max-remote-write-connections = 3
>>>>>>>>>>>>   shard-mapper-timeout = "5s"
>>>>>>>>>>>>   max-concurrent-queries = 0
>>>>>>>>>>>>   query-timeout = "0"
>>>>>>>>>>>>   log-queries-after = "0"
>>>>>>>>>>>>   max-select-point = 0
>>>>>>>>>>>>   max-select-series = 0
>>>>>>>>>>>>   max-select-buckets = 0
>>>>>>>>>>>>
>>>>>>>>>>>> [retention]
>>>>>>>>>>>>   enabled = true
>>>>>>>>>>>>   check-interval = "30m0s"
>>>>>>>>>>>>
>>>>>>>>>>>> [shard-precreation]
>>>>>>>>>>>>   enabled = true
>>>>>>>>>>>>   check-interval = "10m0s"
>>>>>>>>>>>>   advance-period = "30m0s"
>>>>>>>>>>>>
>>>>>>>>>>>> [admin]
>>>>>>>>>>>>   enabled = true
>>>>>>>>>>>>   bind-address = ":8083"
>>>>>>>>>>>>   https-enabled = false
>>>>>>>>>>>>   https-certificate = "/etc/ssl/influxdb.pem"
>>>>>>>>>>>>   Version = ""
>>>>>>>>>>>>
>>>>>>>>>>>> [monitor]
>>>>>>>>>>>>   store-enabled = true
>>>>>>>>>>>>   store-database = "_internal"
>>>>>>>>>>>>   store-interval = "10s"
>>>>>>>>>>>>
>>>>>>>>>>>> [subscriber]
>>>>>>>>>>>>   enabled = true
>>>>>>>>>>>>
>>>>>>>>>>>> [http]
>>>>>>>>>>>>   enabled = true
>>>>>>>>>>>>   bind-address = ":8086"
>>>>>>>>>>>>   auth-enabled = false
>>>>>>>>>>>>   log-enabled = true
>>>>>>>>>>>>   write-tracing = false
>>>>>>>>>>>>   pprof-enabled = false
>>>>>>>>>>>>   https-enabled = false
>>>>>>>>>>>>   https-certificate = "/etc/ssl/influxdb.pem"
>>>>>>>>>>>>   max-row-limit = 10000
>>>>>>>>>>>>
>>>>>>>>>>>> [[graphite]]
>>>>>>>>>>>>   enabled = true
>>>>>>>>>>>>   bind-address = ":2003"
>>>>>>>>>>>>   database = "graphite"
>>>>>>>>>>>>   protocol = "udp"
>>>>>>>>>>>>   batch-size = 5000
>>>>>>>>>>>>   batch-pending = 10
>>>>>>>>>>>>   batch-timeout = "1s"
>>>>>>>>>>>>   consistency-level = "one"
>>>>>>>>>>>>   separator = "."
>>>>>>>>>>>>   udp-read-buffer = 0
>>>>>>>>>>>>
>>>>>>>>>>>> [[collectd]]
>>>>>>>>>>>>   enabled = false
>>>>>>>>>>>>   bind-address = ":25826"
>>>>>>>>>>>>   database = "collectd"
>>>>>>>>>>>>   retention-policy = ""
>>>>>>>>>>>>   batch-size = 5000
>>>>>>>>>>>>   batch-pending = 10
>>>>>>>>>>>>   batch-timeout = "10s"
>>>>>>>>>>>>   read-buffer = 0
>>>>>>>>>>>>   typesdb = "/usr/share/collectd/types.db"
>>>>>>>>>>>>
>>>>>>>>>>>> [[opentsdb]]
>>>>>>>>>>>>   enabled = false
>>>>>>>>>>>>   bind-address = ":4242"
>>>>>>>>>>>>   database = "opentsdb"
>>>>>>>>>>>>   retention-policy = ""
>>>>>>>>>>>>   consistency-level = "one"
>>>>>>>>>>>>   tls-enabled = false
>>>>>>>>>>>>   certificate = "/etc/ssl/influxdb.pem"
>>>>>>>>>>>>   batch-size = 1000
>>>>>>>>>>>>   batch-pending = 5
>>>>>>>>>>>>   batch-timeout = "1s"
>>>>>>>>>>>>   log-point-errors = true
>>>>>>>>>>>>
>>>>>>>>>>>> [[udp]]
>>>>>>>>>>>>   enabled = false
>>>>>>>>>>>>   bind-address = ":8089"
>>>>>>>>>>>>   database = "udp"
>>>>>>>>>>>>   retention-policy = ""
>>>>>>>>>>>>   batch-size = 5000
>>>>>>>>>>>>   batch-pending = 10
>>>>>>>>>>>>   read-buffer = 0
>>>>>>>>>>>>   batch-timeout = "1s"
>>>>>>>>>>>>   precision = ""
>>>>>>>>>>>>
>>>>>>>>>>>> [continuous_queries]
>>>>>>>>>>>>   log-enabled = true
>>>>>>>>>>>>   enabled = true
>>>>>>>>>>>>   run-interval = "1s"
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Remember to include the InfluxDB version number with all issue
>>>>>>>>>>>> reports
>>>>>>>>>>>> ---
>>>>>>>>>>>> You received this message because you are subscribed to the
>>>>>>>>>>>> Google Groups "InfluxDB" group.
>>>>>>>>>>>> To unsubscribe from this group and stop receiving emails from
>>>>>>>>>>>> it, send an email to [email protected].
>>>>>>>>>>>> To post to this group, send email to [email protected].
>>>>>>>>>>>> Visit this group at https://groups.google.com/group/influxdb.
>>>>>>>>>>>> To view this discussion on the web visit
>>>>>>>>>>>> https://groups.google.com/d/msgid/influxdb/770d4dc6-8a9b-449
>>>>>>>>>>>> e-ad43-fa558e53a16d%40googlegroups.com
>>>>>>>>>>>> <https://groups.google.com/d/msgid/influxdb/770d4dc6-8a9b-449e-ad43-fa558e53a16d%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>>>>>>>> .
>>>>>>>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Sean Beckett
>>>>>>>>>>> Director of Support and Professional Services
>>>>>>>>>>> InfluxDB
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Sean Beckett
>>>>>>>>> Director of Support and Professional Services
>>>>>>>>> InfluxDB
>>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Sean Beckett
>>>>>> Director of Support and Professional Services
>>>>>> InfluxDB
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Sean Beckett
>>>>> Director of Support and Professional Services
>>>>> InfluxDB
>>>>>
>



-- 
Sean Beckett
Director of Support and Professional Services
InfluxDB

