This must be something specific to my query. With `SELECT * FROM bars LIMIT 1`, the RAM maxes out and influx restarts. If I try another query, `SELECT * FROM bars WHERE Symbol = 'AAPL' AND time > now() - 10d LIMIT 1`, I get a result back quickly.
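Since only the unbounded scan OOMs and the time-bounded query returns quickly, one workaround on my side (my own sketch, not an official InfluxDB feature) is to page ad-hoc queries through bounded time windows instead of ever issuing an unbounded `SELECT *`. The helper below only generates the windowed InfluxQL strings; the measurement name and window size are illustrative:

```python
from datetime import datetime, timedelta

def windowed_queries(measurement, start, stop, window):
    """Yield time-bounded InfluxQL queries covering [start, stop) in chunks,
    so no single query has to sample the whole measurement."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    cur = start
    while cur < stop:
        nxt = min(cur + window, stop)
        yield (f"SELECT * FROM {measurement} "
               f"WHERE time >= '{cur.strftime(fmt)}' AND time < '{nxt.strftime(fmt)}'")
        cur = nxt

# Three 10-day windows cover this 30-day range.
queries = list(windowed_queries("bars",
                                datetime(1999, 11, 18),
                                datetime(1999, 12, 18),
                                timedelta(days=10)))
```

Each windowed query only touches the shards inside its time range, which is presumably why the bounded query above returns quickly.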
On Friday, August 19, 2016 at 1:45:47 PM UTC-5, John Jelinek wrote:
> Cardinality has jumped to 14041 after all the points were uploaded, but `SELECT * FROM bars LIMIT 1` still maxes out the RAM and reboots the influx service.
>
> On Friday, August 19, 2016 at 1:41:45 PM UTC-5, John Jelinek wrote:
>> Also, I don't have any other processes running on the latest environment (Ubuntu w/ 32GB of RAM); I built that box just for this test.
>>
>> On Friday, August 19, 2016 at 1:38:59 PM UTC-5, John Jelinek wrote:
>>> I ran the query while uploading 14288591 points (at 5000 points/second). Here's a CSV sample of the kind of data:
>>>
>>> ```
>>> "Symbol","Date","Open","High","Low","Close","Volume","Ex-Dividend","Split Ratio","Adj. Open","Adj. High","Adj. Low","Adj. Close","Adj. Volume"
>>> A,1999-11-18,45.5,50.0,40.0,44.0,44739900.0,0.0,1.0,43.471809559155,47.771219295775,38.21697543662,42.038672980282,44739900.0
>>> A,1999-11-19,42.94,43.0,39.81,40.38,10897100.0,0.0,1.0,41.025923131212,41.083248594367,38.035444803296,38.580036703268,10897100.0
>>> A,1999-11-22,41.31,44.0,40.06,44.0,4705200.0,0.0,1.0,39.468581382169,42.038672980282,38.274300899775,42.038672980282,4705200.0
>>> A,1999-11-23,42.5,43.63,40.25,40.25,4274400.0,0.0,1.0,40.605536401409,41.685165957493,38.455831533099,38.455831533099,4274400.0
>>> A,1999-11-24,40.13,41.94,40.0,41.06,3464400.0,0.0,1.0,38.341180606789,40.070498745296,38.21697543662,39.22972528569,3464400.0
>>> A,1999-11-26,40.88,41.5,40.75,41.19,1237100.0,0.0,1.0,39.057748896226,39.650112015493,38.933543726057,39.353930455859,1237100.0
>>> A,1999-11-29,41.0,42.44,40.56,42.13,2914700.0,0.0,1.0,39.172399822536,40.548210938254,38.752013092733,40.25202937862,2914700.0
>>> A,1999-11-30,42.0,42.94,40.94,42.19,3083000.0,0.0,1.0,40.127824208451,41.025923131212,39.115074359381,40.309354841775,3083000.0
>>> A,1999-12-01,42.19,43.44,41.88,42.94,2115400.0,0.0,1.0,40.309354841775,41.503635324169,40.013173282141,41.025923131212,2115400.0
>>> ```
>>>
>>> I only have the one database created for this measurement, and just this measurement. I've tested this in 3 different environments: docker on 8GB RAM, influx running directly on a MacBook Pro w/ 16GB RAM, and influx running directly on an Ubuntu 16.04 server with 32GB of RAM. In all environments, the RAM maxed out and then swap maxed out. I'm using this tool to upload the CSV into influx 0.13: https://github.com/jpillora/csv-to-influxdb. This is the dataset I'm uploading into influx: https://www.quandl.com/data/WIKI. This is the command I'm using to get it into influx: `csv-to-influxdb -m bars -t Symbol -ts Date -tf 2006-01-02 -d eodbars WIKI_20160818.csv`. Let me know if you need any other details.
>>>
>>> On Friday, August 19, 2016 at 12:13:59 PM UTC-5, Sean Beckett wrote:
>>>> On further consideration, an unbounded query on 1.6 billion points is a lot to sample. Presumably if you put a time boundary on that query it doesn't OOM?
>>>>
>>>> On Fri, Aug 19, 2016 at 11:12 AM, Sean Beckett <[email protected]> wrote:
>>>>> That is not expected behavior. 5000 points per second is a light workload, unless each of those points has 10-100 fields. Even 500k values per second is a sustainable workload on a multi-core machine.
>>>>>
>>>>> A series cardinality of less than 10k is also fairly trivial. That shouldn't require more than a gig or two of RAM.
>>>>>
>>>>> Do you have long strings in your database? Is there something else running on the system that needs RAM?
>>>>>
>>>>> Do you have many, many databases or measurements?
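For reference, the upload step above maps each CSV row to one InfluxDB line-protocol point, with `Symbol` as the only tag and `Date` as the timestamp. A minimal sketch of that mapping (my own illustration of what the tool does, not the actual csv-to-influxdb code; the column set is trimmed for brevity):

```python
import csv
import io
from datetime import datetime, timezone

SAMPLE = """Symbol,Date,Open,Close
A,1999-11-18,45.5,44.0
A,1999-11-19,42.94,40.38
"""

def to_line_protocol(csv_text, measurement="bars", tag_col="Symbol", ts_col="Date"):
    """One line-protocol point per CSV row: tag_col becomes a tag,
    remaining columns become fields, ts_col becomes a nanosecond timestamp."""
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        ts = datetime.strptime(row.pop(ts_col), "%Y-%m-%d").replace(tzinfo=timezone.utc)
        tag = row.pop(tag_col)
        fields = ",".join(f"{k}={float(v)}" for k, v in row.items())
        lines.append(f"{measurement},{tag_col}={tag} {fields} {int(ts.timestamp()) * 10**9}")
    return lines

lines = to_line_protocol(SAMPLE)
# lines[0] -> "bars,Symbol=A Open=45.5,Close=44.0 942883200000000000"
```

With `Symbol` as the sole tag, the series count is bounded by the number of distinct tickers, which is consistent with the reported cardinality in the low thousands.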
>>>>> On Fri, Aug 19, 2016 at 10:45 AM, John Jelinek <[email protected]> wrote:
>>>>>> I have a cardinality of `9876` from this query: `SELECT sum(numSeries) AS "total_series" FROM "_internal".."database" WHERE time > now() - 10s`, and when I query one of my measurements with something like `SELECT * FROM bars LIMIT 1`, the RAM instantly spikes up to 32GB, maxes out swap, and the influxdb service restarts. Note, this measurement is getting writes of 5000 points per second. The total number of points is about 1.6 billion. Is this to be expected?
>>>>>>
>>>>>> On Wednesday, August 10, 2016 at 8:04:16 AM UTC-5, whille zg wrote:
>>>>>>> I'm having an OOM issue, posted at https://github.com/influxdata/influxdb/issues/7134. It seems RAM will slowly drop to a small amount if no queries continue, but I need to read recent data several times continuously. I'm trying v1.0beta on a 32GB machine, but it's been killed; I will try 256GB RAM. Or would v0.12 be OK with the RAM problem?
>>>>>>>
>>>>>>> On Wednesday, July 13, 2016 at 12:19:25 AM UTC+8, Sean Beckett wrote:
>>>>>>>> Currently InfluxDB must load the entire series index into RAM. We're working on a caching mechanism so that only recently written or queried series need to be kept in RAM. It's a complex feature to implement while maintaining performance, but we hope to have a first version in some months.
>>>>>>>>
>>>>>>>> On Tue, Jul 12, 2016 at 3:36 AM, Jan Kis <[email protected]> wrote:
>>>>>>>>> Hi Sean, nice guess, we have 91,786,506 series :) To understand this a bit better: does the high memory consumption come from the fact that influx loads the index into memory for faster writes and querying?
>>>>>>>>>
>>>>>>>>> I will dive into the individual measurements to see where exactly we have such a large tag cardinality, so that we can reduce the number of series.
>>>>>>>>>
>>>>>>>>> Thank you
>>>>>>>>>
>>>>>>>>> On Monday, July 11, 2016 at 6:51:52 PM UTC+2, Sean Beckett wrote:
>>>>>>>>>> High RAM usage usually correlates with high series cardinality <https://docs.influxdata.com/influxdb/v0.13/concepts/glossary/#series-cardinality>.
>>>>>>>>>>
>>>>>>>>>> You can run `SELECT sum(numSeries) AS "total_series" FROM "_internal".."database" WHERE time > now() - 10s` to determine your series cardinality, assuming you haven't altered the default sample rate for the _internal database. If you have, change the WHERE time clause to grab only one sample, or use `SELECT last(numSeries) FROM "_internal".."database" GROUP BY "database"` and sum the results.
>>>>>>>>>>
>>>>>>>>>> With 100GB of RAM in use, I'm going to guess you have 5+ million series.
>>>>>>>>>>
>>>>>>>>>> On Mon, Jul 11, 2016 at 10:21 AM, Jan Kis <[email protected]> wrote:
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> we are using influxdb 0.13 on Fedora 23. We see influx consuming more than 100GB of RAM. At some point it eventually runs out of memory and dies. There are no errors in the logs. Our configuration is below.
>>>>>>>>>>>
>>>>>>>>>>> Is there a way to control how much memory influx is consuming? What can we do to figure out why influx is consuming so much memory?
>>>>>>>>>>> Thank you
>>>>>>>>>>>
>>>>>>>>>>> reporting-disabled = false
>>>>>>>>>>> bind-address = ":8088"
>>>>>>>>>>> hostname = ""
>>>>>>>>>>> join = ""
>>>>>>>>>>>
>>>>>>>>>>> [meta]
>>>>>>>>>>> dir = "/data/influxdb/meta"
>>>>>>>>>>> retention-autocreate = true
>>>>>>>>>>> logging-enabled = true
>>>>>>>>>>> pprof-enabled = false
>>>>>>>>>>> lease-duration = "1m0s"
>>>>>>>>>>>
>>>>>>>>>>> [data]
>>>>>>>>>>> dir = "/data/influxdb/data"
>>>>>>>>>>> engine = "tsm1"
>>>>>>>>>>> wal-dir = "/data/influxdb/wal"
>>>>>>>>>>> wal-logging-enabled = true
>>>>>>>>>>> query-log-enabled = true
>>>>>>>>>>> cache-max-memory-size = 524288000
>>>>>>>>>>> cache-snapshot-memory-size = 26214400
>>>>>>>>>>> cache-snapshot-write-cold-duration = "1h0m0s"
>>>>>>>>>>> compact-full-write-cold-duration = "24h0m0s"
>>>>>>>>>>> max-points-per-block = 0
>>>>>>>>>>> data-logging-enabled = true
>>>>>>>>>>>
>>>>>>>>>>> [cluster]
>>>>>>>>>>> force-remote-mapping = false
>>>>>>>>>>> write-timeout = "10s"
>>>>>>>>>>> shard-writer-timeout = "5s"
>>>>>>>>>>> max-remote-write-connections = 3
>>>>>>>>>>> shard-mapper-timeout = "5s"
>>>>>>>>>>> max-concurrent-queries = 0
>>>>>>>>>>> query-timeout = "0"
>>>>>>>>>>> log-queries-after = "0"
>>>>>>>>>>> max-select-point = 0
>>>>>>>>>>> max-select-series = 0
>>>>>>>>>>> max-select-buckets = 0
>>>>>>>>>>>
>>>>>>>>>>> [retention]
>>>>>>>>>>> enabled = true
>>>>>>>>>>> check-interval = "30m0s"
>>>>>>>>>>>
>>>>>>>>>>> [shard-precreation]
>>>>>>>>>>> enabled = true
>>>>>>>>>>> check-interval = "10m0s"
>>>>>>>>>>> advance-period = "30m0s"
>>>>>>>>>>>
>>>>>>>>>>> [admin]
>>>>>>>>>>> enabled = true
>>>>>>>>>>> bind-address = ":8083"
>>>>>>>>>>> https-enabled = false
>>>>>>>>>>> https-certificate = "/etc/ssl/influxdb.pem"
>>>>>>>>>>> Version = ""
>>>>>>>>>>>
>>>>>>>>>>> [monitor]
>>>>>>>>>>> store-enabled = true
>>>>>>>>>>> store-database = "_internal"
>>>>>>>>>>> store-interval = "10s"
>>>>>>>>>>>
>>>>>>>>>>> [subscriber]
>>>>>>>>>>> enabled = true
>>>>>>>>>>>
>>>>>>>>>>> [http]
>>>>>>>>>>> enabled = true
>>>>>>>>>>> bind-address = ":8086"
>>>>>>>>>>> auth-enabled = false
>>>>>>>>>>> log-enabled = true
>>>>>>>>>>> write-tracing = false
>>>>>>>>>>> pprof-enabled = false
>>>>>>>>>>> https-enabled = false
>>>>>>>>>>> https-certificate = "/etc/ssl/influxdb.pem"
>>>>>>>>>>> max-row-limit = 10000
>>>>>>>>>>>
>>>>>>>>>>> [[graphite]]
>>>>>>>>>>> enabled = true
>>>>>>>>>>> bind-address = ":2003"
>>>>>>>>>>> database = "graphite"
>>>>>>>>>>> protocol = "udp"
>>>>>>>>>>> batch-size = 5000
>>>>>>>>>>> batch-pending = 10
>>>>>>>>>>> batch-timeout = "1s"
>>>>>>>>>>> consistency-level = "one"
>>>>>>>>>>> separator = "."
>>>>>>>>>>> udp-read-buffer = 0
>>>>>>>>>>>
>>>>>>>>>>> [[collectd]]
>>>>>>>>>>> enabled = false
>>>>>>>>>>> bind-address = ":25826"
>>>>>>>>>>> database = "collectd"
>>>>>>>>>>> retention-policy = ""
>>>>>>>>>>> batch-size = 5000
>>>>>>>>>>> batch-pending = 10
>>>>>>>>>>> batch-timeout = "10s"
>>>>>>>>>>> read-buffer = 0
>>>>>>>>>>> typesdb = "/usr/share/collectd/types.db"
>>>>>>>>>>>
>>>>>>>>>>> [[opentsdb]]
>>>>>>>>>>> enabled = false
>>>>>>>>>>> bind-address = ":4242"
>>>>>>>>>>> database = "opentsdb"
>>>>>>>>>>> retention-policy = ""
>>>>>>>>>>> consistency-level = "one"
>>>>>>>>>>> tls-enabled = false
>>>>>>>>>>> certificate = "/etc/ssl/influxdb.pem"
>>>>>>>>>>> batch-size = 1000
>>>>>>>>>>> batch-pending = 5
>>>>>>>>>>> batch-timeout = "1s"
>>>>>>>>>>> log-point-errors = true
>>>>>>>>>>>
>>>>>>>>>>> [[udp]]
>>>>>>>>>>> enabled = false
>>>>>>>>>>> bind-address = ":8089"
>>>>>>>>>>> database = "udp"
>>>>>>>>>>> retention-policy = ""
>>>>>>>>>>> batch-size = 5000
>>>>>>>>>>> batch-pending = 10
>>>>>>>>>>> read-buffer = 0
>>>>>>>>>>> batch-timeout = "1s"
>>>>>>>>>>> precision = ""
>>>>>>>>>>>
>>>>>>>>>>> [continuous_queries]
>>>>>>>>>>> log-enabled = true
>>>>>>>>>>> enabled = true
>>>>>>>>>>> run-interval = "1s"
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Remember to include the InfluxDB version number with all issue reports
>>>>>>>>>>> ---
>>>>>>>>>>> You received this message because you are subscribed to the Google Groups "InfluxDB" group.
>>>>>>>>>>> To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
>>>>>>>>>>> To post to this group, send email to [email protected].
>>>>>>>>>>> Visit this group at https://groups.google.com/group/influxdb.
>>>>>>>>>>> To view this discussion on the web visit https://groups.google.com/d/msgid/influxdb/770d4dc6-8a9b-449e-ad43-fa558e53a16d%40googlegroups.com.
>>>>>>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Sean Beckett
>>>>>>>>>> Director of Support and Professional Services
>>>>>>>>>> InfluxDB
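Sean's explanation above is that RAM usage tracks series cardinality, the number of distinct measurement + tag-set combinations, because this version of InfluxDB keeps the whole series index in memory. A toy illustration of how cardinality is counted, and why write volume alone doesn't grow it (my own sketch, not InfluxDB internals):

```python
def series_cardinality(points):
    """Count distinct (measurement, tag set) pairs. Each point is
    (measurement, tags_dict); field values and timestamps don't matter."""
    return len({(m, frozenset(t.items())) for m, t in points})

# 6 points but only 3 distinct Symbol tag values -> 3 series,
# no matter how many more points are written to those series.
pts = [("bars", {"Symbol": s}) for s in ("A", "AAPL", "MSFT")] * 2
three = series_cardinality(pts)

# Adding a second tag multiplies the worst case: every (Symbol, exchange)
# combination that actually appears becomes its own series.
pts2 = [("bars", {"Symbol": s, "exchange": e})
        for s in ("A", "AAPL", "MSFT") for e in ("NYSE", "NASDAQ")]
six = series_cardinality(pts2)
```

This is why Jan's 91 million series blow up RAM while John's ~10k series should not: cardinality is driven by tag combinations, not by point count.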
