I only have ~15M points; should this query cause 32GB of RAM to max out?

On Friday, August 19, 2016 at 4:57:56 PM UTC-5, John Jelinek wrote:
>
> Should a query like this work then?
>
> ```
> SELECT MAX(Open) AS Open, MAX(Close) AS Close, MAX(Volume) AS Volume,
> (MAX(Close) - MAX(Open)) / MAX(Open) * 100 AS PctChg INTO newBars FROM bars
> WHERE time < now() GROUP BY time(1d), Symbol
> ```
>
> because that OOM'd too.
>
> On Friday, August 19, 2016 at 4:16:44 PM UTC-5, Sean Beckett wrote:
>>
>> INTO queries definitely must be bounded in time to be performant. The
>> system doesn't have an internal query scheduler that can translate that
>> into smaller chunks. It literally tries to query every point, process them
>> if needed, and then write them back. It's not streamed or chunked (yet), so
>> you must supply explicit time intervals to keep the number of points
>> queried in the millions.
>>
>> On Fri, Aug 19, 2016 at 3:08 PM, John Jelinek wrote:
>>>
>>> The LIMIT query still OOMs if the import is not running. Also, this
>>> query OOMs: `SELECT Close / Open INTO pctChg FROM bars`
>>>
>>> On Friday, August 19, 2016 at 2:23:46 PM UTC-5, Sean Beckett wrote:
>>>>
>>>> I don't know anything about the CSV tool or what performance impact it
>>>> might have. If that import is not running, does the LIMIT query succeed,
>>>> or does it still OOM?
>>>>
>>>> On Fri, Aug 19, 2016 at 12:38 PM, John Jelinek wrote:
>>>>>
>>>>> I ran the query while uploading 14,288,591 points (at 5,000
>>>>> points/second). Here's a CSV sample of the kind of data:
>>>>>
>>>>> ```
>>>>> "Symbol","Date","Open","High","Low","Close","Volume","Ex-Dividend","Split Ratio","Adj. Open","Adj. High","Adj. Low","Adj. Close","Adj. Volume"
>>>>> A,1999-11-18,45.5,50.0,40.0,44.0,44739900.0,0.0,1.0,43.471809559155,47.771219295775,38.21697543662,42.038672980282,44739900.0
>>>>> A,1999-11-19,42.94,43.0,39.81,40.38,10897100.0,0.0,1.0,41.025923131212,41.083248594367,38.035444803296,38.580036703268,10897100.0
>>>>> A,1999-11-22,41.31,44.0,40.06,44.0,4705200.0,0.0,1.0,39.468581382169,42.038672980282,38.274300899775,42.038672980282,4705200.0
>>>>> A,1999-11-23,42.5,43.63,40.25,40.25,4274400.0,0.0,1.0,40.605536401409,41.685165957493,38.455831533099,38.455831533099,4274400.0
>>>>> A,1999-11-24,40.13,41.94,40.0,41.06,3464400.0,0.0,1.0,38.341180606789,40.070498745296,38.21697543662,39.22972528569,3464400.0
>>>>> A,1999-11-26,40.88,41.5,40.75,41.19,1237100.0,0.0,1.0,39.057748896226,39.650112015493,38.933543726057,39.353930455859,1237100.0
>>>>> A,1999-11-29,41.0,42.44,40.56,42.13,2914700.0,0.0,1.0,39.172399822536,40.548210938254,38.752013092733,40.25202937862,2914700.0
>>>>> A,1999-11-30,42.0,42.94,40.94,42.19,3083000.0,0.0,1.0,40.127824208451,41.025923131212,39.115074359381,40.309354841775,3083000.0
>>>>> A,1999-12-01,42.19,43.44,41.88,42.94,2115400.0,0.0,1.0,40.309354841775,41.503635324169,40.013173282141,41.025923131212,2115400.0
>>>>> ```
>>>>>
>>>>> I have only the one database, created for this measurement, and just
>>>>> this measurement. I've tested this in 3 different environments: Docker
>>>>> with 8GB RAM, influx running directly on a MacBook Pro with 16GB RAM,
>>>>> and influx running directly on an Ubuntu 16.04 server with 32GB of RAM.
>>>>> In all environments, RAM maxes out and then swap maxes out. I'm using
>>>>> this tool to upload the CSV into influx 0.13:
>>>>> https://github.com/jpillora/csv-to-influxdb. This is the dataset I'm
>>>>> uploading into influx: https://www.quandl.com/data/WIKI. This is the
>>>>> command I'm using to get it into influx: `csv-to-influxdb -m bars
>>>>> -t Symbol -ts Date -tf 2006-01-02 -d eodbars WIKI_20160818.csv`. Let me
>>>>> know if you need any other details.
>>>>>
>>>>> On Friday, August 19, 2016 at 12:13:59 PM UTC-5, Sean Beckett wrote:
>>>>>>
>>>>>> On further consideration, an unbounded query on 1.6 billion points is
>>>>>> a lot to sample. Presumably if you put a time boundary on that query it
>>>>>> doesn't OOM?
>>>>>>
>>>>>> On Fri, Aug 19, 2016 at 11:12 AM, Sean Beckett wrote:
>>>>>>>
>>>>>>> That is not expected behavior. 5,000 points per second is a light
>>>>>>> workload, unless each of those points has 10-100 fields. Even 500k
>>>>>>> values per second is a sustainable workload on a multi-core machine.
>>>>>>>
>>>>>>> A series cardinality below 10k is also fairly trivial. That
>>>>>>> shouldn't require more than a gig or two of RAM.
>>>>>>>
>>>>>>> Do you have long strings in your database? Is there something else
>>>>>>> running on the system that needs RAM?
>>>>>>>
>>>>>>> Do you have many, many databases or measurements?
>>>>>>>
>>>>>>> On Fri, Aug 19, 2016 at 10:45 AM, John Jelinek wrote:
>>>>>>>>
>>>>>>>> I have a cardinality of `9876` from this query: `SELECT
>>>>>>>> sum(numSeries) AS "total_series" FROM "_internal".."database" WHERE
>>>>>>>> time > now() - 10s`, and when I query one of my measurements with
>>>>>>>> something like `SELECT * FROM bars LIMIT 1`, RAM instantly spikes to
>>>>>>>> 32GB, maxes out swap, and the influxdb service restarts. Note, this
>>>>>>>> measurement is getting writes of 5,000 points per second. The total
>>>>>>>> data size is about 1.6GB. Is this to be expected?
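Sean's advice above, keeping INTO queries bounded so only a few million points are scanned per query, can be scripted from the client side. The sketch below is not from the thread: it splits a backfill into fixed-size windows and emits one time-bounded INTO statement per window. The 30-day window size and the `newBars` target are assumptions to tune for your data density.

```python
from datetime import datetime, timedelta

def chunked_backfill_queries(start, end, window=timedelta(days=30)):
    """Yield time-bounded SELECT ... INTO statements covering [start, end).

    Each emitted query scans only one window of data, so none of them has
    to materialize the whole measurement at once. Run them one at a time
    against the HTTP API or the influx CLI.
    """
    ts = "'%Y-%m-%dT%H:%M:%SZ'"  # RFC3339 timestamp, quoted for InfluxQL
    cur = start
    while cur < end:
        nxt = min(cur + window, end)
        yield (
            "SELECT MAX(Open) AS Open, MAX(Close) AS Close, "
            "MAX(Volume) AS Volume INTO newBars FROM bars "
            f"WHERE time >= {cur.strftime(ts)} AND time < {nxt.strftime(ts)} "
            "GROUP BY time(1d), Symbol"
        )
        cur = nxt
```

Each window then supplies the "explicit time intervals" Sean describes, and a failed chunk can be retried without redoing the whole backfill.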
>>>>>>>>
>>>>>>>> On Wednesday, August 10, 2016 at 8:04:16 AM UTC-5, whille zg wrote:
>>>>>>>>>
>>>>>>>>> I'm having an OOM issue, posted at
>>>>>>>>> https://github.com/influxdata/influxdb/issues/7134
>>>>>>>>> It seems RAM will slowly drop to a small amount if no query
>>>>>>>>> continues, but I need to read recent data several times continuously.
>>>>>>>>> I'm trying v1.0 beta on a 32GB machine, but it's been killed; I will
>>>>>>>>> try 256GB RAM. Or should v0.12 be OK with the RAM problem?
>>>>>>>>>
>>>>>>>>> On Wednesday, July 13, 2016 at 12:19:25 AM UTC+8, Sean Beckett wrote:
>>>>>>>>>>
>>>>>>>>>> Currently InfluxDB must load the entire series index into RAM.
>>>>>>>>>> We're working on a caching mechanism so that only recently written
>>>>>>>>>> or queried series need to be kept in RAM. It's a complex feature to
>>>>>>>>>> implement while maintaining performance, but we hope to have a first
>>>>>>>>>> version in some months.
>>>>>>>>>>
>>>>>>>>>> On Tue, Jul 12, 2016 at 3:36 AM, Jan Kis wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi Sean, nice guess, we have 91,786,506 series :) To understand
>>>>>>>>>>> this a bit better: does the high memory consumption come from the
>>>>>>>>>>> fact that influx loads the index into memory for faster writes and
>>>>>>>>>>> querying?
>>>>>>>>>>>
>>>>>>>>>>> I will dive into the individual measurements to see where exactly
>>>>>>>>>>> we have such a large tag cardinality, so that we can reduce the
>>>>>>>>>>> number of series.
>>>>>>>>>>>
>>>>>>>>>>> Thank you
>>>>>>>>>>>
>>>>>>>>>>> On Monday, July 11, 2016 at 6:51:52 PM UTC+2, Sean Beckett wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> High RAM usage usually correlates with high series cardinality
>>>>>>>>>>>> <https://docs.influxdata.com/influxdb/v0.13/concepts/glossary/#series-cardinality>.
>>>>>>>>>>>>
>>>>>>>>>>>> You can run `SELECT sum(numSeries) AS "total_series" FROM
>>>>>>>>>>>> "_internal".."database" WHERE time > now() - 10s` to determine
>>>>>>>>>>>> your series cardinality, assuming you haven't altered the default
>>>>>>>>>>>> sample rate for the _internal database. If you have, change the
>>>>>>>>>>>> WHERE time clause to grab only one sample, or use `SELECT
>>>>>>>>>>>> last(numSeries) FROM "_internal".."database" GROUP BY "database"`
>>>>>>>>>>>> and sum the results.
>>>>>>>>>>>>
>>>>>>>>>>>> With 100GB of RAM in use, I'm going to guess you have 5+ million
>>>>>>>>>>>> series.
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Jul 11, 2016 at 10:21 AM, Jan Kis wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> we are using influxdb 0.13 on Fedora 23. We see influx consuming
>>>>>>>>>>>>> more than 100GB of RAM. At some point it eventually runs out of
>>>>>>>>>>>>> memory and dies. There are no errors in the logs. Our
>>>>>>>>>>>>> configuration is below.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Is there a way to control how much memory influx is consuming?
>>>>>>>>>>>>> What can we do to figure out why influx is consuming so much
>>>>>>>>>>>>> memory?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thank you
>>>>>>>>>>>>>
>>>>>>>>>>>>> reporting-disabled = false
>>>>>>>>>>>>> bind-address = ":8088"
>>>>>>>>>>>>> hostname = ""
>>>>>>>>>>>>> join = ""
>>>>>>>>>>>>>
>>>>>>>>>>>>> [meta]
>>>>>>>>>>>>> dir = "/data/influxdb/meta"
>>>>>>>>>>>>> retention-autocreate = true
>>>>>>>>>>>>> logging-enabled = true
>>>>>>>>>>>>> pprof-enabled = false
>>>>>>>>>>>>> lease-duration = "1m0s"
>>>>>>>>>>>>>
>>>>>>>>>>>>> [data]
>>>>>>>>>>>>> dir = "/data/influxdb/data"
>>>>>>>>>>>>> engine = "tsm1"
>>>>>>>>>>>>> wal-dir = "/data/influxdb/wal"
>>>>>>>>>>>>> wal-logging-enabled = true
>>>>>>>>>>>>> query-log-enabled = true
>>>>>>>>>>>>> cache-max-memory-size = 524288000
>>>>>>>>>>>>> cache-snapshot-memory-size = 26214400
>>>>>>>>>>>>> cache-snapshot-write-cold-duration = "1h0m0s"
>>>>>>>>>>>>> compact-full-write-cold-duration = "24h0m0s"
>>>>>>>>>>>>> max-points-per-block = 0
>>>>>>>>>>>>> data-logging-enabled = true
>>>>>>>>>>>>>
>>>>>>>>>>>>> [cluster]
>>>>>>>>>>>>> force-remote-mapping = false
>>>>>>>>>>>>> write-timeout = "10s"
>>>>>>>>>>>>> shard-writer-timeout = "5s"
>>>>>>>>>>>>> max-remote-write-connections = 3
>>>>>>>>>>>>> shard-mapper-timeout = "5s"
>>>>>>>>>>>>> max-concurrent-queries = 0
>>>>>>>>>>>>> query-timeout = "0"
>>>>>>>>>>>>> log-queries-after = "0"
>>>>>>>>>>>>> max-select-point = 0
>>>>>>>>>>>>> max-select-series = 0
>>>>>>>>>>>>> max-select-buckets = 0
>>>>>>>>>>>>>
>>>>>>>>>>>>> [retention]
>>>>>>>>>>>>> enabled = true
>>>>>>>>>>>>> check-interval = "30m0s"
>>>>>>>>>>>>>
>>>>>>>>>>>>> [shard-precreation]
>>>>>>>>>>>>> enabled = true
>>>>>>>>>>>>> check-interval = "10m0s"
>>>>>>>>>>>>> advance-period = "30m0s"
>>>>>>>>>>>>>
>>>>>>>>>>>>> [admin]
>>>>>>>>>>>>> enabled = true
>>>>>>>>>>>>> bind-address = ":8083"
>>>>>>>>>>>>> https-enabled = false
>>>>>>>>>>>>> https-certificate = "/etc/ssl/influxdb.pem"
>>>>>>>>>>>>> Version = ""
>>>>>>>>>>>>>
>>>>>>>>>>>>> [monitor]
>>>>>>>>>>>>> store-enabled = true
>>>>>>>>>>>>> store-database = "_internal"
>>>>>>>>>>>>> store-interval = "10s"
>>>>>>>>>>>>>
>>>>>>>>>>>>> [subscriber]
>>>>>>>>>>>>> enabled = true
>>>>>>>>>>>>>
>>>>>>>>>>>>> [http]
>>>>>>>>>>>>> enabled = true
>>>>>>>>>>>>> bind-address = ":8086"
>>>>>>>>>>>>> auth-enabled = false
>>>>>>>>>>>>> log-enabled = true
>>>>>>>>>>>>> write-tracing = false
>>>>>>>>>>>>> pprof-enabled = false
>>>>>>>>>>>>> https-enabled = false
>>>>>>>>>>>>> https-certificate = "/etc/ssl/influxdb.pem"
>>>>>>>>>>>>> max-row-limit = 10000
>>>>>>>>>>>>>
>>>>>>>>>>>>> [[graphite]]
>>>>>>>>>>>>> enabled = true
>>>>>>>>>>>>> bind-address = ":2003"
>>>>>>>>>>>>> database = "graphite"
>>>>>>>>>>>>> protocol = "udp"
>>>>>>>>>>>>> batch-size = 5000
>>>>>>>>>>>>> batch-pending = 10
>>>>>>>>>>>>> batch-timeout = "1s"
>>>>>>>>>>>>> consistency-level = "one"
>>>>>>>>>>>>> separator = "."
>>>>>>>>>>>>> udp-read-buffer = 0
>>>>>>>>>>>>>
>>>>>>>>>>>>> [[collectd]]
>>>>>>>>>>>>> enabled = false
>>>>>>>>>>>>> bind-address = ":25826"
>>>>>>>>>>>>> database = "collectd"
>>>>>>>>>>>>> retention-policy = ""
>>>>>>>>>>>>> batch-size = 5000
>>>>>>>>>>>>> batch-pending = 10
>>>>>>>>>>>>> batch-timeout = "10s"
>>>>>>>>>>>>> read-buffer = 0
>>>>>>>>>>>>> typesdb = "/usr/share/collectd/types.db"
>>>>>>>>>>>>>
>>>>>>>>>>>>> [[opentsdb]]
>>>>>>>>>>>>> enabled = false
>>>>>>>>>>>>> bind-address = ":4242"
>>>>>>>>>>>>> database = "opentsdb"
>>>>>>>>>>>>> retention-policy = ""
>>>>>>>>>>>>> consistency-level = "one"
>>>>>>>>>>>>> tls-enabled = false
>>>>>>>>>>>>> certificate = "/etc/ssl/influxdb.pem"
>>>>>>>>>>>>> batch-size = 1000
>>>>>>>>>>>>> batch-pending = 5
>>>>>>>>>>>>> batch-timeout = "1s"
>>>>>>>>>>>>> log-point-errors = true
>>>>>>>>>>>>>
>>>>>>>>>>>>> [[udp]]
>>>>>>>>>>>>> enabled = false
>>>>>>>>>>>>> bind-address = ":8089"
>>>>>>>>>>>>> database = "udp"
>>>>>>>>>>>>> retention-policy = ""
>>>>>>>>>>>>> batch-size = 5000
>>>>>>>>>>>>> batch-pending = 10
>>>>>>>>>>>>> read-buffer = 0
>>>>>>>>>>>>> batch-timeout = "1s"
>>>>>>>>>>>>> precision = ""
>>>>>>>>>>>>>
>>>>>>>>>>>>> [continuous_queries]
>>>>>>>>>>>>> log-enabled = true
>>>>>>>>>>>>> enabled = true
>>>>>>>>>>>>> run-interval = "1s"
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>> Remember to include the InfluxDB version number with all issue reports
>>>>>>>>>>>>> ---
>>>>>>>>>>>>> You received this message because you are subscribed to the
>>>>>>>>>>>>> Google Groups "InfluxDB" group.
>>>>>>>>>>>>> To unsubscribe from this group and stop receiving emails from
>>>>>>>>>>>>> it, send an email to [email protected].
>>>>>>>>>>>>> To post to this group, send email to [email protected].
>>>>>>>>>>>>> Visit this group at https://groups.google.com/group/influxdb.
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Sean Beckett
>>>>>>>>>>>> Director of Support and Professional Services
>>>>>>>>>>>> InfluxDB
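For anyone reproducing John's setup without the Go tool, the transformation `csv-to-influxdb` performs (Symbol as a tag, Date as the timestamp, the price columns as fields) can be sketched in a few lines. This is an illustrative sketch of the idea, not the tool's actual code, and the field list is an assumption:

```python
import csv
import io
from datetime import datetime, timezone

def csv_rows_to_line_protocol(text, measurement="bars"):
    """Turn WIKI-style CSV rows into InfluxDB line protocol points.

    Symbol becomes a tag, Date the nanosecond timestamp, and the OHLCV
    columns become float fields -- mirroring the `-m bars -t Symbol -ts Date`
    flags used in the thread.
    """
    fields = ("Open", "High", "Low", "Close", "Volume")
    for row in csv.DictReader(io.StringIO(text)):
        when = datetime.strptime(row["Date"], "%Y-%m-%d").replace(tzinfo=timezone.utc)
        ns = int(when.timestamp()) * 10**9  # line protocol defaults to ns precision
        vals = ",".join(f"{name}={float(row[name])}" for name in fields)
        yield f"{measurement},Symbol={row['Symbol']} {vals} {ns}"
```

The yielded lines would be POSTed to the `/write?db=eodbars` endpoint in batches of a few thousand points per request rather than one request per point, which is roughly how the tool sustains the 5,000 points/second rate mentioned above.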
