On Fri, Oct 7, 2016 at 10:20 AM, <[email protected]> wrote:

> Hi Sean,
>
> Thanks for answering my questions.
> BTW, I am using version 1.0.
>
> Regarding Model 2, I also added a tag called deviceGroup, so deviceId
> between 1 and 999 will have deviceGroup='1000', deviceId between 1000 and
> 1999 will have deviceGroup='2000', ...
>
> So I can run queries like this:
> select * from "data" where deviceGroup = '???' and SYSTEM_MEMORY > 80
>
> Regarding the Continuous Query:
>
> I did a quick test processing all devices 35 times (5 seconds between each
> run in my test). No continuous query was created. I tested some queries
> that I might use, using the deviceGroup tag, without problems.
>
> After that, I executed the following query:
> select MEAN(*) from monitorData group by deviceId, *
>
> which is what the Continuous Query will use (I will group by time once it
> is in PROD or I have more data in my test), but it never gave me a response
> back.
>
Meaning the query timed out, the process OOM'd, or the query returned
immediately with no results?

With no time restriction on that query, that might be a lot of points.
That's taking the mean of every field from all time, grouped by every tag.
If there are more than a few hundred thousand points, that could be quite
expensive.

> I started processing all devices again (10 more times) and ran the above
> query again. I got the message "Batch could not be sent. Data will be lost"
> in the application that feeds the data points, and the query never returned
> results.

I don't understand. Did that error come from InfluxDB? Can you share actual
error output rather than descriptions of the output?

> I am afraid that the Continuous Query won't be able to handle the volume
> of data and will make the whole system slow down to the point that it is
> not usable.

How many values per second are you writing to the system? How many series?
What are the machine specs? RAM, CPU, and IOPS in particular.

> In PROD, we want to process all devices every 5 minutes, keep that data
> for a day or two, then have a weekly retention policy with data aggregated
> per hour using the Continuous Query mentioned above.

That's the canonical use case for CQs and RPs.

> Thanks in advance for your time.
>
> Best regards,
>
> Carlo
>
> On Thursday, September 29, 2016 at 11:18:15 PM UTC-4, Sean Beckett wrote:
>
> > I would recommend Model 2. Store each metric as a field on the same
> > measurement, with a Device ID tag.
> >
> > > Issues:
> > > - Queries by value are slow (more than a minute).
> > >   Example: select * from "data" where SYSTEM_MEMORY > 80
> >
> > Queries that filter by field values are always slow. Field values are
> > not indexed. Running a query unbounded in time forces the system to scan
> > every single point to evaluate the condition.
> >
> > > - Continuous Query takes so much time:
> > >   ... BEGIN SELECT mean(SYSTEM_MEMORY) as SYSTEM_MEMORY_mean INTO ....
> > >   FROM data GROUP BY time(5m), deviceId
> >
> > What does "so much time" mean? There doesn't appear to be anything
> > inefficient in the query.
> >
> > You never talked about your data density. How many metrics are written
> > every five minutes?
> >
> > On Tue, Sep 27, 2016 at 7:02 AM, Carlo Vargas <[email protected]> wrote:
> >
> > > BTW, for model 1 there will be 6 million data points for database
> > > "devices" (600K data points per database/measurement).
> > > For model 3, there will be 600K data points per database (each database
> > > has its own metric).
> > >
> > > On Monday, September 26, 2016 at 8:52:19 PM UTC-4, Carlo Vargas wrote:
> > >
> > > > Currently I am evaluating different time series databases and I have
> > > > some questions regarding data modelling and query performance in
> > > > InfluxDB.
> > > >
> > > > Context: We have 200 000 devices and 10 metrics per device (for
> > > > instance: SYSTEM_MEMORY).
> > > >
> > > > Devices were processed 3 times, so we ended up with 600K data points.
> > > >
> > > > Here are the 3 models that were used:
> > > >
> > > > Model 1: One database named "devices", 10 measurements (one for each
> > > > metric), and the tag deviceId.
> > > >
> > > > Issues:
> > > > - Queries by value are not responding. Example of this query:
> > > >   select * from SYSTEM_MEMORY where value > 80
> > > > - It uses a lot of RAM; the server crashes when the above query is
> > > >   executed or when the following Continuous Query is executed:
> > > >   ... BEGIN SELECT mean(value) as mean_value INTO
> > > >   devices."<current_policy>".:MEASUREMENT FROM
> > > >   devices."<new_policy>"./.*/ GROUP BY time(5m), deviceId
> > > >
> > > > Model 2: One database named "devices", one measurement named "data",
> > > > deviceId tag and each metric as a field.
> > > >
> > > > Issues:
> > > > - Queries by value are slow (more than a minute).
> > > >   Example: select * from "data" where SYSTEM_MEMORY > 80
> > > > - Continuous Query takes so much time:
> > > >   ...
> > > >   BEGIN SELECT mean(SYSTEM_MEMORY) as SYSTEM_MEMORY_mean INTO ....
> > > >   FROM data GROUP BY time(5m), deviceId
> > > >
> > > > Model 3: One database per metric, one measurement "data", and deviceId
> > > > tag.
> > > >
> > > > Issues:
> > > > - Queries by value take around 25 seconds. Example:
> > > >   select * from "data" where value > 80 (this query is run in the
> > > >   SYSTEM_MEMORY database)
> > > > - A Continuous Query needs to be created for each database, and they
> > > >   are slow.
> > > > - Adding data points is slower than in the previous two models.
> > > >
> > > > Any advice/suggestion would be really appreciated.
> > > >
> > > > Thanks in advance.
> > >
> > > --
> > > Remember to include the InfluxDB version number with all issue reports
> > > ---
> > > You received this message because you are subscribed to the Google
> > > Groups "InfluxDB" group.
> > > To unsubscribe from this group and stop receiving emails from it, send
> > > an email to [email protected].
> > > To post to this group, send email to [email protected].
> > > Visit this group at https://groups.google.com/group/influxdb.
> > > To view this discussion on the web visit
> > > https://groups.google.com/d/msgid/influxdb/4e0d07d0-fcda-4948-9bb0-ec8f93f703af%40googlegroups.com.
> > > For more options, visit https://groups.google.com/d/optout.
> >
> > --
> > Sean Beckett
> > Director of Support and Professional Services
> > InfluxDB

--
Sean Beckett
Director of Support and Professional Services
InfluxDB
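
For reference, here is a rough sketch of the write load implied by the figures quoted in this thread (200 000 devices, 10 metrics per device, one run every 5 minutes in PROD). It addresses the questions about values per second and series count. The function and key names are illustrative only. The series counts assume that deviceId is the only tag that varies independently, since deviceGroup is derived from deviceId.

```python
# Back-of-the-envelope write load for the PROD plan described in the thread.
# Inputs are the figures quoted in the thread; all names here are hypothetical.


def write_load(devices: int, metrics_per_device: int, interval_seconds: int) -> dict:
    """Estimate field values written per run and per second, plus series counts."""
    values_per_run = devices * metrics_per_device
    return {
        "values_per_run": values_per_run,
        "values_per_second": values_per_run / interval_seconds,
        # Model 2: one measurement, metrics stored as fields,
        # so roughly one series per deviceId (assumption: no other free tags).
        "series_model_2": devices,
        # Model 1: one measurement per metric,
        # so roughly one series per (metric, deviceId) pair.
        "series_model_1": devices * metrics_per_device,
    }


load = write_load(devices=200_000, metrics_per_device=10, interval_seconds=5 * 60)
for name, value in load.items():
    print(f"{name}: {value:,.0f}")
```

These are estimates from the stated device and metric counts only, not measurements; the actual load also depends on batching and on the machine specs asked about above.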
