Should a query like this work, then?

```
SELECT MAX(Open) AS Open, MAX(Close) AS Close, MAX(Volume) AS Volume,
       (MAX(Close) - MAX(Open)) / MAX(Open) * 100 AS PctChg
INTO newBars
FROM bars
WHERE time < now()
GROUP BY time(1d), Symbol
```

That one OOM'd too.
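Per Sean's advice below, the workaround is to bound the INTO query in time and backfill in slices. A minimal sketch, assuming the WIKI data starts around late 1999 and that one year per pass keeps each query in the millions of points (the date bounds here are illustrative, not tested against this dataset):

```
SELECT MAX(Open) AS Open, MAX(Close) AS Close, MAX(Volume) AS Volume,
       (MAX(Close) - MAX(Open)) / MAX(Open) * 100 AS PctChg
INTO newBars
FROM bars
WHERE time >= '1999-01-01T00:00:00Z' AND time < '2000-01-01T00:00:00Z'
GROUP BY time(1d), Symbol
```

Then repeat with the WHERE window advanced one year at a time until it reaches now().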
On Friday, August 19, 2016 at 4:16:44 PM UTC-5, Sean Beckett wrote:
>
> INTO queries definitely must be bounded in time to be performant. The
> system doesn't have an internal query scheduler that can break them into
> smaller chunks. It literally queries every point, processes it if needed,
> and then writes it back. It's not streamed or chunked (yet), so you must
> supply explicit time intervals to keep the number of points queried in the
> millions.
>
> On Fri, Aug 19, 2016 at 3:08 PM, John Jelinek <[email protected]> wrote:
>
>> The LIMIT query still OOMs if the import is not running. Also, this query
>> OOMs: `SELECT Close / Open INTO pctChg FROM bars`
>>
>> On Friday, August 19, 2016 at 2:23:46 PM UTC-5, Sean Beckett wrote:
>>>
>>> I don't know anything about the CSV tool or what performance impact it
>>> might have. If that import is not running, does the LIMIT query succeed,
>>> or does it still OOM?
>>>
>>> On Fri, Aug 19, 2016 at 12:38 PM, John Jelinek <[email protected]> wrote:
>>>
>>>> I ran the query while uploading 14,288,591 points (at 5,000
>>>> points/second). Here's a CSV sample of the kind of data:
>>>>
>>>> ```
>>>> "Symbol","Date","Open","High","Low","Close","Volume","Ex-Dividend","Split Ratio","Adj. Open","Adj. High","Adj. Low","Adj. Close","Adj. Volume"
>>>> A,1999-11-18,45.5,50.0,40.0,44.0,44739900.0,0.0,1.0,43.471809559155,47.771219295775,38.21697543662,42.038672980282,44739900.0
>>>> A,1999-11-19,42.94,43.0,39.81,40.38,10897100.0,0.0,1.0,41.025923131212,41.083248594367,38.035444803296,38.580036703268,10897100.0
>>>> A,1999-11-22,41.31,44.0,40.06,44.0,4705200.0,0.0,1.0,39.468581382169,42.038672980282,38.274300899775,42.038672980282,4705200.0
>>>> A,1999-11-23,42.5,43.63,40.25,40.25,4274400.0,0.0,1.0,40.605536401409,41.685165957493,38.455831533099,38.455831533099,4274400.0
>>>> A,1999-11-24,40.13,41.94,40.0,41.06,3464400.0,0.0,1.0,38.341180606789,40.070498745296,38.21697543662,39.22972528569,3464400.0
>>>> A,1999-11-26,40.88,41.5,40.75,41.19,1237100.0,0.0,1.0,39.057748896226,39.650112015493,38.933543726057,39.353930455859,1237100.0
>>>> A,1999-11-29,41.0,42.44,40.56,42.13,2914700.0,0.0,1.0,39.172399822536,40.548210938254,38.752013092733,40.25202937862,2914700.0
>>>> A,1999-11-30,42.0,42.94,40.94,42.19,3083000.0,0.0,1.0,40.127824208451,41.025923131212,39.115074359381,40.309354841775,3083000.0
>>>> A,1999-12-01,42.19,43.44,41.88,42.94,2115400.0,0.0,1.0,40.309354841775,41.503635324169,40.013173282141,41.025923131212,2115400.0
>>>> ```
>>>>
>>>> I have only the one database, created for just this measurement. I've
>>>> tested this in three different environments: Docker with 8GB RAM, Influx
>>>> running directly on a MacBook Pro with 16GB RAM, and Influx running
>>>> directly on an Ubuntu 16.04 server with 32GB of RAM. In all environments,
>>>> RAM maxes out and then swap maxes out. I'm using this tool to upload the
>>>> CSV into Influx 0.13: https://github.com/jpillora/csv-to-influxdb. This is
>>>> the dataset I'm uploading into Influx: https://www.quandl.com/data/WIKI.
>>>> This is the command I'm using to get it into Influx: `csv-to-influxdb -m
>>>> bars -t Symbol -ts Date -tf 2006-01-02 -d eodbars WIKI_20160818.csv`. Let
>>>> me know if you need any other details.
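The same slicing should apply to the `SELECT Close / Open INTO pctChg FROM bars` query quoted above. A sketch, with an explicit time window and `GROUP BY *` added so the Symbol tag is preserved as a tag in the target measurement (the alias and window are illustrative assumptions):

```
SELECT Close / Open AS pctChg INTO pctChg FROM bars
WHERE time >= '1999-01-01T00:00:00Z' AND time < '2000-01-01T00:00:00Z'
GROUP BY *
```

If a one-year window still OOMs, narrow it to a month, and advance the window on each pass.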
>>>>
>>>> On Friday, August 19, 2016 at 12:13:59 PM UTC-5, Sean Beckett wrote:
>>>>>
>>>>> On further consideration, an unbounded query on 1.6 billion points is
>>>>> a lot to sample. Presumably if you put a time boundary on that query
>>>>> it doesn't OOM?
>>>>>
>>>>> On Fri, Aug 19, 2016 at 11:12 AM, Sean Beckett <[email protected]> wrote:
>>>>>
>>>>>> That is not expected behavior. 5,000 points per second is a light
>>>>>> workload, unless each of those points has 10-100 fields. Even 500k
>>>>>> values per second is a sustainable workload on a multi-core machine.
>>>>>>
>>>>>> A series cardinality of less than 10k is also fairly trivial. That
>>>>>> shouldn't require more than a gig or two of RAM.
>>>>>>
>>>>>> Do you have long strings in your database? Is there something else
>>>>>> running on the system that needs RAM? Do you have very many databases
>>>>>> or measurements?
>>>>>>
>>>>>> On Fri, Aug 19, 2016 at 10:45 AM, John Jelinek <[email protected]> wrote:
>>>>>>
>>>>>>> I have a cardinality of 9876 from this query: `SELECT sum(numSeries)
>>>>>>> AS "total_series" FROM "_internal".."database" WHERE time > now() - 10s`,
>>>>>>> and when I query one of my measurements with something like `SELECT *
>>>>>>> FROM bars LIMIT 1`, RAM instantly spikes to 32GB, swap maxes out, and
>>>>>>> the influxdb service restarts. Note that this measurement is receiving
>>>>>>> writes at 5,000 points per second. The total number of points is about
>>>>>>> 1.6 billion. Is this to be expected?
>>>>>>>
>>>>>>> On Wednesday, August 10, 2016 at 8:04:16 AM UTC-5, whille zg wrote:
>>>>>>>>
>>>>>>>> I'm having an OOM issue; I posted it at
>>>>>>>> https://github.com/influxdata/influxdb/issues/7134. It seems RAM will
>>>>>>>> slowly drop back to a small amount if no more queries come in, but I
>>>>>>>> need to read recent data several times continuously. I'm trying
>>>>>>>> v1.0beta on a 32GB machine, but it's been killed; I will try 256GB of
>>>>>>>> RAM. Or would v0.12 be OK with the RAM problem?
>>>>>>>>
>>>>>>>> On Wednesday, July 13, 2016 at 12:19:25 AM UTC+8, Sean Beckett wrote:
>>>>>>>>>
>>>>>>>>> Currently InfluxDB must load the entire series index into RAM.
>>>>>>>>> We're working on a caching mechanism so that only recently written
>>>>>>>>> or queried series need to be kept in RAM. It's a complex feature to
>>>>>>>>> implement while maintaining performance, but we hope to have a first
>>>>>>>>> version in some months.
>>>>>>>>>
>>>>>>>>> On Tue, Jul 12, 2016 at 3:36 AM, Jan Kis <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Sean, nice guess, we have 91,786,506 series :) To understand
>>>>>>>>>> this a bit better: does the high memory consumption come from the
>>>>>>>>>> fact that Influx loads the index into memory for faster writes and
>>>>>>>>>> querying?
>>>>>>>>>>
>>>>>>>>>> I will dive into the individual measurements to see where exactly
>>>>>>>>>> we have such a large tag cardinality, so that we can reduce the
>>>>>>>>>> number of series.
>>>>>>>>>>
>>>>>>>>>> Thank you
>>>>>>>>>>
>>>>>>>>>> On Monday, July 11, 2016 at 6:51:52 PM UTC+2, Sean Beckett wrote:
>>>>>>>>>>>
>>>>>>>>>>> High RAM usage usually correlates with high series cardinality
>>>>>>>>>>> <https://docs.influxdata.com/influxdb/v0.13/concepts/glossary/#series-cardinality>.
>>>>>>>>>>>
>>>>>>>>>>> You can run `SELECT sum(numSeries) AS "total_series" FROM
>>>>>>>>>>> "_internal".."database" WHERE time > now() - 10s` to determine
>>>>>>>>>>> your series cardinality, assuming you haven't altered the default
>>>>>>>>>>> sample rate for the _internal database. If you have, change the
>>>>>>>>>>> WHERE time clause to grab only one sample, or use
>>>>>>>>>>> `SELECT last(numSeries) FROM "_internal".."database" GROUP BY
>>>>>>>>>>> "database"` and sum the results.
>>>>>>>>>>>
>>>>>>>>>>> With 100GB of RAM in use, I'm going to guess you have 5+ million
>>>>>>>>>>> series.
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Jul 11, 2016 at 10:21 AM, Jan Kis <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> we are using InfluxDB 0.13 on Fedora 23. We see Influx consuming
>>>>>>>>>>>> more than 100GB of RAM. At some point it eventually runs out of
>>>>>>>>>>>> memory and dies. There are no errors in the logs. Our
>>>>>>>>>>>> configuration is below.
>>>>>>>>>>>>
>>>>>>>>>>>> Is there a way to control how much memory Influx consumes? What
>>>>>>>>>>>> can we do to figure out why Influx is consuming so much memory?
>>>>>>>>>>>>
>>>>>>>>>>>> Thank you
>>>>>>>>>>>>
>>>>>>>>>>>> reporting-disabled = false
>>>>>>>>>>>> bind-address = ":8088"
>>>>>>>>>>>> hostname = ""
>>>>>>>>>>>> join = ""
>>>>>>>>>>>>
>>>>>>>>>>>> [meta]
>>>>>>>>>>>> dir = "/data/influxdb/meta"
>>>>>>>>>>>> retention-autocreate = true
>>>>>>>>>>>> logging-enabled = true
>>>>>>>>>>>> pprof-enabled = false
>>>>>>>>>>>> lease-duration = "1m0s"
>>>>>>>>>>>>
>>>>>>>>>>>> [data]
>>>>>>>>>>>> dir = "/data/influxdb/data"
>>>>>>>>>>>> engine = "tsm1"
>>>>>>>>>>>> wal-dir = "/data/influxdb/wal"
>>>>>>>>>>>> wal-logging-enabled = true
>>>>>>>>>>>> query-log-enabled = true
>>>>>>>>>>>> cache-max-memory-size = 524288000
>>>>>>>>>>>> cache-snapshot-memory-size = 26214400
>>>>>>>>>>>> cache-snapshot-write-cold-duration = "1h0m0s"
>>>>>>>>>>>> compact-full-write-cold-duration = "24h0m0s"
>>>>>>>>>>>> max-points-per-block = 0
>>>>>>>>>>>> data-logging-enabled = true
>>>>>>>>>>>>
>>>>>>>>>>>> [cluster]
>>>>>>>>>>>> force-remote-mapping = false
>>>>>>>>>>>> write-timeout = "10s"
>>>>>>>>>>>> shard-writer-timeout = "5s"
>>>>>>>>>>>> max-remote-write-connections = 3
>>>>>>>>>>>> shard-mapper-timeout = "5s"
>>>>>>>>>>>> max-concurrent-queries = 0
>>>>>>>>>>>> query-timeout = "0"
>>>>>>>>>>>> log-queries-after = "0"
>>>>>>>>>>>> max-select-point = 0
>>>>>>>>>>>> max-select-series = 0
>>>>>>>>>>>> max-select-buckets = 0
>>>>>>>>>>>>
>>>>>>>>>>>> [retention]
>>>>>>>>>>>> enabled = true
>>>>>>>>>>>> check-interval = "30m0s"
>>>>>>>>>>>>
>>>>>>>>>>>> [shard-precreation]
>>>>>>>>>>>> enabled = true
>>>>>>>>>>>> check-interval = "10m0s"
>>>>>>>>>>>> advance-period = "30m0s"
>>>>>>>>>>>>
>>>>>>>>>>>> [admin]
>>>>>>>>>>>> enabled = true
>>>>>>>>>>>> bind-address = ":8083"
>>>>>>>>>>>> https-enabled = false
>>>>>>>>>>>> https-certificate = "/etc/ssl/influxdb.pem"
>>>>>>>>>>>> Version = ""
>>>>>>>>>>>>
>>>>>>>>>>>> [monitor]
>>>>>>>>>>>> store-enabled = true
>>>>>>>>>>>> store-database = "_internal"
>>>>>>>>>>>> store-interval = "10s"
>>>>>>>>>>>>
>>>>>>>>>>>> [subscriber]
>>>>>>>>>>>> enabled = true
>>>>>>>>>>>>
>>>>>>>>>>>> [http]
>>>>>>>>>>>> enabled = true
>>>>>>>>>>>> bind-address = ":8086"
>>>>>>>>>>>> auth-enabled = false
>>>>>>>>>>>> log-enabled = true
>>>>>>>>>>>> write-tracing = false
>>>>>>>>>>>> pprof-enabled = false
>>>>>>>>>>>> https-enabled = false
>>>>>>>>>>>> https-certificate = "/etc/ssl/influxdb.pem"
>>>>>>>>>>>> max-row-limit = 10000
>>>>>>>>>>>>
>>>>>>>>>>>> [[graphite]]
>>>>>>>>>>>> enabled = true
>>>>>>>>>>>> bind-address = ":2003"
>>>>>>>>>>>> database = "graphite"
>>>>>>>>>>>> protocol = "udp"
>>>>>>>>>>>> batch-size = 5000
>>>>>>>>>>>> batch-pending = 10
>>>>>>>>>>>> batch-timeout = "1s"
>>>>>>>>>>>> consistency-level = "one"
>>>>>>>>>>>> separator = "."
>>>>>>>>>>>> udp-read-buffer = 0
>>>>>>>>>>>>
>>>>>>>>>>>> [[collectd]]
>>>>>>>>>>>> enabled = false
>>>>>>>>>>>> bind-address = ":25826"
>>>>>>>>>>>> database = "collectd"
>>>>>>>>>>>> retention-policy = ""
>>>>>>>>>>>> batch-size = 5000
>>>>>>>>>>>> batch-pending = 10
>>>>>>>>>>>> batch-timeout = "10s"
>>>>>>>>>>>> read-buffer = 0
>>>>>>>>>>>> typesdb = "/usr/share/collectd/types.db"
>>>>>>>>>>>>
>>>>>>>>>>>> [[opentsdb]]
>>>>>>>>>>>> enabled = false
>>>>>>>>>>>> bind-address = ":4242"
>>>>>>>>>>>> database = "opentsdb"
>>>>>>>>>>>> retention-policy = ""
>>>>>>>>>>>> consistency-level = "one"
>>>>>>>>>>>> tls-enabled = false
>>>>>>>>>>>> certificate = "/etc/ssl/influxdb.pem"
>>>>>>>>>>>> batch-size = 1000
>>>>>>>>>>>> batch-pending = 5
>>>>>>>>>>>> batch-timeout = "1s"
>>>>>>>>>>>> log-point-errors = true
>>>>>>>>>>>>
>>>>>>>>>>>> [[udp]]
>>>>>>>>>>>> enabled = false
>>>>>>>>>>>> bind-address = ":8089"
>>>>>>>>>>>> database = "udp"
>>>>>>>>>>>> retention-policy = ""
>>>>>>>>>>>> batch-size = 5000
>>>>>>>>>>>> batch-pending = 10
>>>>>>>>>>>> read-buffer = 0
>>>>>>>>>>>> batch-timeout = "1s"
>>>>>>>>>>>> precision = ""
>>>>>>>>>>>>
>>>>>>>>>>>> [continuous_queries]
>>>>>>>>>>>> log-enabled = true
>>>>>>>>>>>> enabled = true
>>>>>>>>>>>> run-interval = "1s"
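For the "what can we do to figure out why" question above, a cleaned-up version of the cardinality checks Sean quotes earlier may help. This sketch assumes the default 10s store-interval shown in the [monitor] section of the config above:

```
-- Total series across all databases (grabs one _internal sample):
SELECT sum(numSeries) AS "total_series" FROM "_internal".."database" WHERE time > now() - 10s

-- Or per database; sum the per-database results by hand:
SELECT last(numSeries) FROM "_internal".."database" GROUP BY "database"
```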
>
> --
> Sean Beckett
> Director of Support and Professional Services
> InfluxDB
