Should a query like this work, then?

```
SELECT MAX(Open) AS Open, MAX(Close) AS Close, MAX(Volume) AS Volume,
       (MAX(Close) - MAX(Open)) / MAX(Open) * 100 AS PctChg
INTO newBars
FROM bars
WHERE time < now()
GROUP BY time(1d), Symbol
```

That one OOM'd too.
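Per Sean's advice below, the workaround is to bound the INTO query in time and backfill in slices. A minimal sketch, assuming the WIKI data starts around late 1999 and that one year per pass keeps each query in the millions of points (the date bounds here are illustrative, not tested against this dataset):

```
SELECT MAX(Open) AS Open, MAX(Close) AS Close, MAX(Volume) AS Volume,
       (MAX(Close) - MAX(Open)) / MAX(Open) * 100 AS PctChg
INTO newBars
FROM bars
WHERE time >= '1999-01-01T00:00:00Z' AND time < '2000-01-01T00:00:00Z'
GROUP BY time(1d), Symbol
```

Then repeat with the WHERE window advanced one year at a time until it reaches now().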
On Friday, August 19, 2016 at 4:16:44 PM UTC-5, Sean Beckett wrote:
>
> INTO queries definitely must be bounded in time to be performant. The
> system doesn't have an internal query scheduler that can break them into
> smaller chunks. It literally queries every point, processes it if needed,
> and then writes it back. It's not streamed or chunked (yet), so you must
> supply explicit time intervals to keep the number of points queried in the
> millions.
>
> On Fri, Aug 19, 2016 at 3:08 PM, John Jelinek <[email protected]> wrote:
>
>> The LIMIT query still OOMs if the import is not running. Also, this query
>> OOMs: `SELECT Close / Open INTO pctChg FROM bars`
>>
>> On Friday, August 19, 2016 at 2:23:46 PM UTC-5, Sean Beckett wrote:
>>>
>>> I don't know anything about the CSV tool or what performance impact it
>>> might have. If that import is not running, does the LIMIT query succeed,
>>> or does it still OOM?
>>>
>>> On Fri, Aug 19, 2016 at 12:38 PM, John Jelinek <[email protected]> wrote:
>>>
>>>> I ran the query while uploading 14,288,591 points (at 5,000
>>>> points/second). Here's a CSV sample of the kind of data:
>>>>
>>>> ```
>>>> "Symbol","Date","Open","High","Low","Close","Volume","Ex-Dividend","Split Ratio","Adj. Open","Adj. High","Adj. Low","Adj. Close","Adj. Volume"
>>>> A,1999-11-18,45.5,50.0,40.0,44.0,44739900.0,0.0,1.0,43.471809559155,47.771219295775,38.21697543662,42.038672980282,44739900.0
>>>> A,1999-11-19,42.94,43.0,39.81,40.38,10897100.0,0.0,1.0,41.025923131212,41.083248594367,38.035444803296,38.580036703268,10897100.0
>>>> A,1999-11-22,41.31,44.0,40.06,44.0,4705200.0,0.0,1.0,39.468581382169,42.038672980282,38.274300899775,42.038672980282,4705200.0
>>>> A,1999-11-23,42.5,43.63,40.25,40.25,4274400.0,0.0,1.0,40.605536401409,41.685165957493,38.455831533099,38.455831533099,4274400.0
>>>> A,1999-11-24,40.13,41.94,40.0,41.06,3464400.0,0.0,1.0,38.341180606789,40.070498745296,38.21697543662,39.22972528569,3464400.0
>>>> A,1999-11-26,40.88,41.5,40.75,41.19,1237100.0,0.0,1.0,39.057748896226,39.650112015493,38.933543726057,39.353930455859,1237100.0
>>>> A,1999-11-29,41.0,42.44,40.56,42.13,2914700.0,0.0,1.0,39.172399822536,40.548210938254,38.752013092733,40.25202937862,2914700.0
>>>> A,1999-11-30,42.0,42.94,40.94,42.19,3083000.0,0.0,1.0,40.127824208451,41.025923131212,39.115074359381,40.309354841775,3083000.0
>>>> A,1999-12-01,42.19,43.44,41.88,42.94,2115400.0,0.0,1.0,40.309354841775,41.503635324169,40.013173282141,41.025923131212,2115400.0
>>>> ```
>>>>
>>>> I have only the one database, created for just this measurement. I've
>>>> tested this in three different environments: Docker with 8GB RAM, Influx
>>>> running directly on a MacBook Pro with 16GB RAM, and Influx running
>>>> directly on an Ubuntu 16.04 server with 32GB of RAM. In all environments,
>>>> RAM maxes out and then swap maxes out. I'm using this tool to upload the
>>>> CSV into Influx 0.13: https://github.com/jpillora/csv-to-influxdb. This is
>>>> the dataset I'm uploading into Influx: https://www.quandl.com/data/WIKI.
>>>> This is the command I'm using to get it into Influx: `csv-to-influxdb -m
>>>> bars -t Symbol -ts Date -tf 2006-01-02 -d eodbars WIKI_20160818.csv`. Let
>>>> me know if you need any other details.
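The same slicing should apply to the `SELECT Close / Open INTO pctChg FROM bars` query quoted above. A sketch, with an explicit time window and `GROUP BY *` added so the Symbol tag is preserved as a tag in the target measurement (the alias and window are illustrative assumptions):

```
SELECT Close / Open AS pctChg INTO pctChg FROM bars
WHERE time >= '1999-01-01T00:00:00Z' AND time < '2000-01-01T00:00:00Z'
GROUP BY *
```

If a one-year window still OOMs, narrow it to a month, and advance the window on each pass.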
>>>>
>>>> On Friday, August 19, 2016 at 12:13:59 PM UTC-5, Sean Beckett wrote:
>>>>>
>>>>> On further consideration, an unbounded query on 1.6 billion points is
>>>>> a lot to sample. Presumably if you put a time boundary on that query
>>>>> it doesn't OOM?
>>>>>
>>>>> On Fri, Aug 19, 2016 at 11:12 AM, Sean Beckett <[email protected]> wrote:
>>>>>
>>>>>> That is not expected behavior. 5,000 points per second is a light
>>>>>> workload, unless each of those points has 10-100 fields. Even 500k
>>>>>> values per second is a sustainable workload on a multi-core machine.
>>>>>>
>>>>>> A series cardinality of less than 10k is also fairly trivial. That
>>>>>> shouldn't require more than a gig or two of RAM.
>>>>>>
>>>>>> Do you have long strings in your database? Is there something else
>>>>>> running on the system that needs RAM? Do you have very many databases
>>>>>> or measurements?
>>>>>>
>>>>>> On Fri, Aug 19, 2016 at 10:45 AM, John Jelinek <[email protected]> wrote:
>>>>>>
>>>>>>> I have a cardinality of 9876 from this query: `SELECT sum(numSeries)
>>>>>>> AS "total_series" FROM "_internal".."database" WHERE time > now() - 10s`,
>>>>>>> and when I query one of my measurements with something like `SELECT *
>>>>>>> FROM bars LIMIT 1`, RAM instantly spikes to 32GB, swap maxes out, and
>>>>>>> the influxdb service restarts. Note that this measurement is receiving
>>>>>>> writes at 5,000 points per second. The total number of points is about
>>>>>>> 1.6 billion. Is this to be expected?
>>>>>>>
>>>>>>> On Wednesday, August 10, 2016 at 8:04:16 AM UTC-5, whille zg wrote:
>>>>>>>>
>>>>>>>> I'm having an OOM issue; I posted it at
>>>>>>>> https://github.com/influxdata/influxdb/issues/7134. It seems RAM will
>>>>>>>> slowly drop back to a small amount if no more queries come in, but I
>>>>>>>> need to read recent data several times continuously. I'm trying
>>>>>>>> v1.0beta on a 32GB machine, but it's been killed; I will try 256GB of
>>>>>>>> RAM. Or would v0.12 be OK with the RAM problem?
>>>>>>>>
>>>>>>>> On Wednesday, July 13, 2016 at 12:19:25 AM UTC+8, Sean Beckett wrote:
>>>>>>>>>
>>>>>>>>> Currently InfluxDB must load the entire series index into RAM.
>>>>>>>>> We're working on a caching mechanism so that only recently written
>>>>>>>>> or queried series need to be kept in RAM. It's a complex feature to
>>>>>>>>> implement while maintaining performance, but we hope to have a first
>>>>>>>>> version in some months.
>>>>>>>>>
>>>>>>>>> On Tue, Jul 12, 2016 at 3:36 AM, Jan Kis <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Sean, nice guess, we have 91,786,506 series :) To understand
>>>>>>>>>> this a bit better: does the high memory consumption come from the
>>>>>>>>>> fact that Influx loads the index into memory for faster writes and
>>>>>>>>>> querying?
>>>>>>>>>>
>>>>>>>>>> I will dive into the individual measurements to see where exactly
>>>>>>>>>> we have such a large tag cardinality, so that we can reduce the
>>>>>>>>>> number of series.
>>>>>>>>>>
>>>>>>>>>> Thank you
>>>>>>>>>>
>>>>>>>>>> On Monday, July 11, 2016 at 6:51:52 PM UTC+2, Sean Beckett wrote:
>>>>>>>>>>>
>>>>>>>>>>> High RAM usage usually correlates with high series cardinality
>>>>>>>>>>> <https://docs.influxdata.com/influxdb/v0.13/concepts/glossary/#series-cardinality>.
>>>>>>>>>>>
>>>>>>>>>>> You can run `SELECT sum(numSeries) AS "total_series" FROM
>>>>>>>>>>> "_internal".."database" WHERE time > now() - 10s` to determine
>>>>>>>>>>> your series cardinality, assuming you haven't altered the default
>>>>>>>>>>> sample rate for the _internal database. If you have, change the
>>>>>>>>>>> WHERE time clause to grab only one sample, or use
>>>>>>>>>>> `SELECT last(numSeries) FROM "_internal".."database" GROUP BY
>>>>>>>>>>> "database"` and sum the results.
>>>>>>>>>>>
>>>>>>>>>>> With 100GB of RAM in use, I'm going to guess you have 5+ million
>>>>>>>>>>> series.
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Jul 11, 2016 at 10:21 AM, Jan Kis <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> we are using InfluxDB 0.13 on Fedora 23. We see Influx consuming
>>>>>>>>>>>> more than 100GB of RAM. At some point it eventually runs out of
>>>>>>>>>>>> memory and dies. There are no errors in the logs. Our
>>>>>>>>>>>> configuration is below.
>>>>>>>>>>>>
>>>>>>>>>>>> Is there a way to control how much memory Influx consumes? What
>>>>>>>>>>>> can we do to figure out why Influx is consuming so much memory?
>>>>>>>>>>>>
>>>>>>>>>>>> Thank you
>>>>>>>>>>>>
>>>>>>>>>>>> reporting-disabled = false
>>>>>>>>>>>> bind-address = ":8088"
>>>>>>>>>>>> hostname = ""
>>>>>>>>>>>> join = ""
>>>>>>>>>>>>
>>>>>>>>>>>> [meta]
>>>>>>>>>>>> dir = "/data/influxdb/meta"
>>>>>>>>>>>> retention-autocreate = true
>>>>>>>>>>>> logging-enabled = true
>>>>>>>>>>>> pprof-enabled = false
>>>>>>>>>>>> lease-duration = "1m0s"
>>>>>>>>>>>>
>>>>>>>>>>>> [data]
>>>>>>>>>>>> dir = "/data/influxdb/data"
>>>>>>>>>>>> engine = "tsm1"
>>>>>>>>>>>> wal-dir = "/data/influxdb/wal"
>>>>>>>>>>>> wal-logging-enabled = true
>>>>>>>>>>>> query-log-enabled = true
>>>>>>>>>>>> cache-max-memory-size = 524288000
>>>>>>>>>>>> cache-snapshot-memory-size = 26214400
>>>>>>>>>>>> cache-snapshot-write-cold-duration = "1h0m0s"
>>>>>>>>>>>> compact-full-write-cold-duration = "24h0m0s"
>>>>>>>>>>>> max-points-per-block = 0
>>>>>>>>>>>> data-logging-enabled = true
>>>>>>>>>>>>
>>>>>>>>>>>> [cluster]
>>>>>>>>>>>> force-remote-mapping = false
>>>>>>>>>>>> write-timeout = "10s"
>>>>>>>>>>>> shard-writer-timeout = "5s"
>>>>>>>>>>>> max-remote-write-connections = 3
>>>>>>>>>>>> shard-mapper-timeout = "5s"
>>>>>>>>>>>> max-concurrent-queries = 0
>>>>>>>>>>>> query-timeout = "0"
>>>>>>>>>>>> log-queries-after = "0"
>>>>>>>>>>>> max-select-point = 0
>>>>>>>>>>>> max-select-series = 0
>>>>>>>>>>>> max-select-buckets = 0
>>>>>>>>>>>>
>>>>>>>>>>>> [retention]
>>>>>>>>>>>> enabled = true
>>>>>>>>>>>> check-interval = "30m0s"
>>>>>>>>>>>>
>>>>>>>>>>>> [shard-precreation]
>>>>>>>>>>>> enabled = true
>>>>>>>>>>>> check-interval = "10m0s"
>>>>>>>>>>>> advance-period = "30m0s"
>>>>>>>>>>>>
>>>>>>>>>>>> [admin]
>>>>>>>>>>>> enabled = true
>>>>>>>>>>>> bind-address = ":8083"
>>>>>>>>>>>> https-enabled = false
>>>>>>>>>>>> https-certificate = "/etc/ssl/influxdb.pem"
>>>>>>>>>>>> Version = ""
>>>>>>>>>>>>
>>>>>>>>>>>> [monitor]
>>>>>>>>>>>> store-enabled = true
>>>>>>>>>>>> store-database = "_internal"
>>>>>>>>>>>> store-interval = "10s"
>>>>>>>>>>>>
>>>>>>>>>>>> [subscriber]
>>>>>>>>>>>> enabled = true
>>>>>>>>>>>>
>>>>>>>>>>>> [http]
>>>>>>>>>>>> enabled = true
>>>>>>>>>>>> bind-address = ":8086"
>>>>>>>>>>>> auth-enabled = false
>>>>>>>>>>>> log-enabled = true
>>>>>>>>>>>> write-tracing = false
>>>>>>>>>>>> pprof-enabled = false
>>>>>>>>>>>> https-enabled = false
>>>>>>>>>>>> https-certificate = "/etc/ssl/influxdb.pem"
>>>>>>>>>>>> max-row-limit = 10000
>>>>>>>>>>>>
>>>>>>>>>>>> [[graphite]]
>>>>>>>>>>>> enabled = true
>>>>>>>>>>>> bind-address = ":2003"
>>>>>>>>>>>> database = "graphite"
>>>>>>>>>>>> protocol = "udp"
>>>>>>>>>>>> batch-size = 5000
>>>>>>>>>>>> batch-pending = 10
>>>>>>>>>>>> batch-timeout = "1s"
>>>>>>>>>>>> consistency-level = "one"
>>>>>>>>>>>> separator = "."
>>>>>>>>>>>> udp-read-buffer = 0
>>>>>>>>>>>>
>>>>>>>>>>>> [[collectd]]
>>>>>>>>>>>> enabled = false
>>>>>>>>>>>> bind-address = ":25826"
>>>>>>>>>>>> database = "collectd"
>>>>>>>>>>>> retention-policy = ""
>>>>>>>>>>>> batch-size = 5000
>>>>>>>>>>>> batch-pending = 10
>>>>>>>>>>>> batch-timeout = "10s"
>>>>>>>>>>>> read-buffer = 0
>>>>>>>>>>>> typesdb = "/usr/share/collectd/types.db"
>>>>>>>>>>>>
>>>>>>>>>>>> [[opentsdb]]
>>>>>>>>>>>> enabled = false
>>>>>>>>>>>> bind-address = ":4242"
>>>>>>>>>>>> database = "opentsdb"
>>>>>>>>>>>> retention-policy = ""
>>>>>>>>>>>> consistency-level = "one"
>>>>>>>>>>>> tls-enabled = false
>>>>>>>>>>>> certificate = "/etc/ssl/influxdb.pem"
>>>>>>>>>>>> batch-size = 1000
>>>>>>>>>>>> batch-pending = 5
>>>>>>>>>>>> batch-timeout = "1s"
>>>>>>>>>>>> log-point-errors = true
>>>>>>>>>>>>
>>>>>>>>>>>> [[udp]]
>>>>>>>>>>>> enabled = false
>>>>>>>>>>>> bind-address = ":8089"
>>>>>>>>>>>> database = "udp"
>>>>>>>>>>>> retention-policy = ""
>>>>>>>>>>>> batch-size = 5000
>>>>>>>>>>>> batch-pending = 10
>>>>>>>>>>>> read-buffer = 0
>>>>>>>>>>>> batch-timeout = "1s"
>>>>>>>>>>>> precision = ""
>>>>>>>>>>>>
>>>>>>>>>>>> [continuous_queries]
>>>>>>>>>>>> log-enabled = true
>>>>>>>>>>>> enabled = true
>>>>>>>>>>>> run-interval = "1s"
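For the "what can we do to figure out why" question above, a cleaned-up version of the cardinality checks Sean quotes earlier may help. This sketch assumes the default 10s store-interval shown in the [monitor] section of the config above:

```
-- Total series across all databases (grabs one _internal sample):
SELECT sum(numSeries) AS "total_series" FROM "_internal".."database" WHERE time > now() - 10s

-- Or per database; sum the per-database results by hand:
SELECT last(numSeries) FROM "_internal".."database" GROUP BY "database"
```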
>
> --
> Sean Beckett
> Director of Support and Professional Services
> InfluxDB
