On Thursday, June 16, 2016 at 6:16:46 PM UTC-4, Sean Beckett wrote:
> On Thu, Jun 16, 2016 at 2:14 PM, UW <[email protected]> wrote:
> Hello, we are collecting data from hundreds of clients and each data 
> collection period lasts several weeks. After three months we would like to 
> move older data to the archive.
> 
> Questions:
> 
> 1. Since there will be no queries across clients, does it make sense to set 
> up a separate DB to encapsulate data for each client?
> 
> 
> 
> Probably not. That will lead to a lot of duplicated index data for little
> gain. It sounds to me like client should be a tag, not a database or a
> measurement.
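
To illustrate the client-as-tag layout, here is a minimal Python sketch that formats one point in InfluxDB line protocol with the client carried as a tag. The measurement name `readings`, the field name `value`, the client name and the timestamp are placeholders chosen for the example, not anything from this thread, and the measurement name is assumed to need no escaping.

```python
def escape_tag(text):
    # Line protocol requires commas, equals signs and spaces in tag keys
    # and tag values to be escaped with a backslash.
    for ch in (",", "=", " "):
        text = text.replace(ch, "\\" + ch)
    return text


def to_line(measurement, tags, value, timestamp_ns):
    # Build "measurement[,tag=value...] value=<float> <timestamp>".
    # Tags are sorted by key; "value" is a hypothetical field name.
    tag_part = "".join(
        f",{escape_tag(k)}={escape_tag(v)}" for k, v in sorted(tags.items())
    )
    return f"{measurement}{tag_part} value={float(value)} {timestamp_ns}"


# All clients share one database and one measurement; only the tag differs.
print(to_line("readings", {"client": "acme"}, 21.5, 1466115406000000000))
print(to_line("readings", {"client": "big co"}, 19.0, 1466115406000000000))
```

With this layout a query for one client is just a filter on the `client` tag.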
>  
> 2. What is the performance impact of having hundreds or thousands of DBs and 
> are there any scalability guidelines for this?
> 
> 
> 
> Each database must have its own index. If you have one database with 5 tags,
> each with 2 fully independent values, you've got 2^5 = 32 unique series. If
> you have 100 databases, each with the identical 5 tags, 2 values each, now
> you have 3200 unique series. Since you're talking about thousands of
> databases, that leads to a much higher total series cardinality, which
> strongly impacts RAM needs. In addition, the query engine has to maintain
> some working RAM for every database. That adds up when you're talking about
> thousands of DBs.
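
The series arithmetic can be checked directly: when tags are fully independent, the number of unique series is the product of the distinct value counts per tag, so 5 tags with 2 values each give 2^5 = 32 series per database. A small Python sketch under that assumption (one measurement per database, every tag combination actually occurring; the function names are made up for the example):

```python
from math import prod


def series_per_database(values_per_tag):
    # Upper bound for one measurement: every combination of tag values occurs.
    return prod(values_per_tag)


def total_series(values_per_tag, databases=1):
    # Each database keeps its own index, so the counts simply add up.
    return databases * series_per_database(values_per_tag)


print(total_series([2] * 5))                 # one database
print(total_series([2] * 5, databases=100))  # the same tags in 100 databases
```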
> 
> 
> Also, points are stored in files per retention policy, per database. If you 
> have 1000 databases with just 1 point each, that still means 1000 shard files 
> on disk. It's not a huge issue but it does affect performance.
>  
> 3. If a DB is moved to archive, how quickly can it be re-mounted if data 
> needs to be analyzed again?
> 
> 
> 
> InfluxDB does not yet support multiple file paths for data. All the data is 
> stored in one place, and there is no concept of warm or cold storage. All 
> data is always accessible.
> 
> 
> InfluxDB does support automated expiry of data, and also automated
> downsampling of high-precision data into lower-precision data. Does that
> meet your needs?
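
As a rough sketch of what expiry and downsampling look like in practice, the Python helpers below only build InfluxQL statement strings (a retention policy and a continuous query); they do not talk to a server. The database, policy, measurement and field names are hypothetical, and 90d is just an approximation of the three months mentioned in the question.

```python
def retention_policy_statement(database, name, duration, replication=1, default=False):
    # Data older than `duration` in this policy is dropped automatically.
    stmt = (
        f'CREATE RETENTION POLICY "{name}" ON "{database}" '
        f"DURATION {duration} REPLICATION {replication}"
    )
    return stmt + " DEFAULT" if default else stmt


def downsample_statement(database, cq_name, measurement, field, target_rp, interval):
    # A continuous query that writes per-interval means into another
    # retention policy, keeping all tags (GROUP BY ..., *).
    return (
        f'CREATE CONTINUOUS QUERY "{cq_name}" ON "{database}" BEGIN '
        f'SELECT mean("{field}") INTO "{database}"."{target_rp}"."{measurement}" '
        f'FROM "{measurement}" GROUP BY time({interval}), * END'
    )


# Hypothetical names: raw data kept for 90 days, hourly means kept in "long_term".
print(retention_policy_statement("clients", "three_months", "90d", default=True))
print(downsample_statement("clients", "cq_hourly", "readings", "value", "long_term", "1h"))
```

The target retention policy (`long_term` here) would have to exist before the continuous query is created.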
> 
> 
> InfluxDB is very space efficient, so I'm not sure there's any reason to want 
> to archive the data. Each numeric value takes less than 2 bytes on disk when 
> fully compacted, so unless each client is storing billions of points you 
> should be fine for space on even a small SSD.
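
For a back-of-the-envelope check of the space claim, the sketch below multiplies a value count by the 2-bytes-per-value figure quoted above. It is only an estimate: it assumes fully compacted data and ignores the index, the write-ahead log and any filesystem overhead, and the function name is made up for the example.

```python
def estimated_bytes(points, fields_per_point=1, bytes_per_value=2):
    # Upper-bound estimate using the "less than 2 bytes per value" figure.
    return points * fields_per_point * bytes_per_value


# One billion single-field points, expressed in decimal gigabytes.
print(estimated_bytes(1_000_000_000) / 1e9, "GB")
```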
>  
> Thank you.
> 
> Mark
> 
> --
> 
> Remember to include the InfluxDB version number with all issue reports
> 
> ---
> 
> You received this message because you are subscribed to the Google Groups 
> "InfluxDB" group.
> 
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to [email protected].
> 
> To post to this group, send email to [email protected].
> 
> Visit this group at https://groups.google.com/group/influxdb.
> 
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/influxdb/7316123c-846f-4c4b-ba5d-4c7ccf09bc39%40googlegroups.com.
> 
> For more options, visit https://groups.google.com/d/optout.
> 
> 
> 
> 
> 
> -- 
> 
> 
> Sean Beckett
> Director of Support and Professional Services
> InfluxDB

Sean, thank you for the quick response. I have three follow-up questions:

1. How much RAM is required for each working database?
2. How does the number of shard files affect performance?
3. If we use a single database, how difficult would it be to purge data 
relating to a particular tag (e.g. client)?

Thank you.

Mark
