Hi Sean,
Thanks for answering my questions.
BTW, I am using InfluxDB version 1.0.
Regarding Model 2, I also added a tag called deviceGroup: deviceId between 1
and 999 gets deviceGroup='1000', deviceId between 1000 and 1999 gets
deviceGroup='2000', and so on.
That lets me run queries like this:
select * from "data" where deviceGroup = '???' and SYSTEM_MEMORY > 80
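As a rough illustration of the deviceGroup rule described above, here is a
minimal Python sketch. The helper names device_group() and build_query() are
hypothetical, and extending the rule past deviceId 1999 in the same
1000-wide steps is an assumption; only the first two buckets are spelled out.

```python
# Sketch of the deviceGroup bucketing: 1..999 -> '1000', 1000..1999 -> '2000'.
# device_group() and build_query() are hypothetical helper names.


def device_group(device_id: int, bucket_size: int = 1000) -> str:
    """Return the deviceGroup tag value for a device ID."""
    if device_id < 1:
        raise ValueError("device_id must be >= 1")
    # The tag value is the exclusive upper bound of the bucket, stored as a
    # string because tag values are strings.
    return str((device_id // bucket_size + 1) * bucket_size)


def build_query(group: str, threshold: float) -> str:
    """Build the query shown above for one deviceGroup.

    Values are interpolated without escaping, so this is only suitable for
    trusted, internally generated inputs.
    """
    return (
        'select * from "data" '
        f"where deviceGroup = '{group}' and SYSTEM_MEMORY > {threshold}"
    )


if __name__ == "__main__":
    for device_id in (1, 999, 1000, 1999):
        print(device_id, "->", device_group(device_id))
    print(build_query(device_group(1500), 80))
```

Because deviceGroup is a tag, the filter on it can use the index, and only
the points in that group are scanned for the SYSTEM_MEMORY condition.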
Regarding the Continuous Query:
I did a quick test, processing all devices 35 times (5 seconds between runs)
with no continuous query created. The queries I expect to use, filtered by the
deviceGroup tag, ran without problems.
Then I executed the following query:
select MEAN(*) from monitorData group by deviceId, *
This is what the Continuous Query will use (I will also group by time once it
is in PROD or once I have more data in my test), but it never returned a
response.
I started processing all devices again (10 more times) and ran the query above
again. The application that feeds the data points reported "Batch could not be
sent. Data will be lost", and the query never returned results.
I am afraid that the Continuous Query won't be able to handle the volume of
data and will slow the whole system down to the point of being unusable.
In PROD, we want to process all devices every 5 minutes, keep that data for a
day or two, then have a weekly retention policy with data aggregated per hour
using the Continuous Query mentioned above.
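To give an idea of the data density this plan implies, here is a
back-of-the-envelope calculation in Python. It assumes Model 2 (one point per
device per run, with the 10 metrics as fields), takes 2 days as the upper
bound of "a day or two", and assumes the hourly aggregation writes one point
per device per hour. The constant names are only for illustration.

```python
# Rough volume estimate for the PROD plan, assuming Model 2
# (one point per device per run, 10 metric fields per point).
DEVICES = 200_000          # from the original question
METRICS_PER_DEVICE = 10    # from the original question
RUN_INTERVAL_MIN = 5       # all devices processed every 5 minutes
RAW_RETENTION_DAYS = 2     # assumed upper bound of "a day or two"

runs_per_day = 24 * 60 // RUN_INTERVAL_MIN
points_per_run = DEVICES
field_values_per_run = DEVICES * METRICS_PER_DEVICE
raw_points_retained = points_per_run * runs_per_day * RAW_RETENTION_DAYS
# Assumption: one aggregated point per device per hour, kept for one week.
hourly_points_per_week = DEVICES * 24 * 7

if __name__ == "__main__":
    print(f"runs per day:              {runs_per_day:,}")
    print(f"points per run:            {points_per_run:,}")
    print(f"field values per run:      {field_values_per_run:,}")
    print(f"raw points retained:       {raw_points_retained:,}")
    print(f"hourly points in one week: {hourly_points_per_week:,}")
```

Under these assumptions, each 5-minute run writes 200,000 points (2,000,000
field values), the raw retention window holds about 115 million points, and
the weekly hourly rollup holds about 33.6 million points.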
Thanks in advance for your time.
Best regards,
Carlo
On Thursday, September 29, 2016 at 11:18:15 PM UTC-4, Sean Beckett wrote:
> I would recommend Model 2. Store each metric as a field on the same
> measurement, with a Device ID tag.
>
>
> > Issues:
> > - Queries by value are slow (more than a minute). Example:
> > select * from "data" where SYSTEM_MEMORY > 80
>
>
> Queries that filter by field values are always slow. Field values are not
> indexed. Running a query unbounded in time forces the system to scan every
> single point to evaluate the condition.
>
>
> > - Continuous Query takes so much time:
> > ... BEGIN SELECT mean(SYSTEM_MEMORY) as
> > SYSTEM_MEMORY_mean INTO .... FROM data GROUP BY time(5m), deviceId
>
>
> What does "so much time" mean? There doesn't appear to be anything
> inefficient in the query.
>
>
> You never talked about your data density. How many metrics are written every
> five minutes?
>
> On Tue, Sep 27, 2016 at 7:02 AM, Carlo Vargas <[email protected]> wrote:
>
> BTW, for model 1 there will be 6 million data points for database "devices"
> (600K data points per database/measurement).
> For model 3, there will be 600K data points per database (each database has
> its own metric).
>
> On Monday, September 26, 2016 at 8:52:19 PM UTC-4, Carlo Vargas wrote:
>
>
> Currently I am evaluating different time series databases and I have some
> questions regarding data modelling and query performance in InfluxDB.
>
> Context: We have 200 000 devices and 10 metrics per device (for instance:
> SYSTEM_MEMORY).
>
>
> Devices were processed 3 times, so we ended up with 600K data points.
>
> Here are the 3 models that were used:
>
> Model 1: One database named "devices", 10 measurements (one for each metric),
> and the tag deviceId.
>
>
>
>
> Issues:
>
> - Queries by value do not respond. Example:
> select * from SYSTEM_MEMORY where value > 80
> - It uses a lot of RAM; the server crashes when the above query is executed
> or when the following Continuous Query runs:
> ... BEGIN SELECT mean(value) as mean_value INTO
> devices."<current_policy>".:MEASUREMENT FROM devices."<new_policy>"./.*/
> GROUP BY time(5m), deviceId
>
> Model 2: One database named "devices", one measurement named "data", deviceId
> tag and each metric as a field.
>
>
>
> Issues:
> - Queries by value are slow (more than a minute). Example:
> select * from "data" where SYSTEM_MEMORY > 80
> - Continuous Query takes so much time:
> ... BEGIN SELECT mean(SYSTEM_MEMORY) as SYSTEM_MEMORY_mean
> INTO .... FROM data GROUP BY time(5m), deviceId
>
> Model 3: One database per metric, one measurement "data", and deviceId tag.
>
>
>
> Issues:
> - Queries by value take around 25 seconds. Example: select *
> from "data" where value > 80 (this query is run in the SYSTEM_MEMORY database)
> - A Continuous Query needs to be created for each database, and
> they are slow.
> - Adding data points is slower than in the previous two models.
>
>
> Any advice/suggestion would be really appreciated.
>
> Thanks in advance.
>
> --
>
> Remember to include the InfluxDB version number with all issue reports
>
> ---
>
> You received this message because you are subscribed to the Google Groups
> "InfluxDB" group.
>
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
>
> To post to this group, send email to [email protected].
>
> Visit this group at https://groups.google.com/group/influxdb.
>
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/influxdb/4e0d07d0-fcda-4948-9bb0-ec8f93f703af%40googlegroups.com.
>
>
>
> For more options, visit https://groups.google.com/d/optout.
>
>
>
>
>
> --
>
>
> Sean Beckett
> Director of Support and Professional Services
> InfluxDB