[influxdb] schema design advice

Zoltan Szalai Mon, 20 Jun 2016 06:26:02 -0700

Hello,

I would like to ask for schema design advice for the following use case:


I want to collect time series data of trips of cars.

I collect about 20-30 different metrics per trip (like speed, rpm, location, temperature etc.) at different frequencies, ranging from 3 seconds to 1 minute.

A car generates about 150-300 trips a month, 2000-3000 trips a year.

I have some data about the trips stored in an rdbms, like trip_id, car, driver, start datetime, end datetime and many more, and I want to store only the raw, collected time series data in influxdb. Besides the raw data collected from the cars, I have to calculate additional time series data derived from the collected data, like fuel consumption, and also single values like distance, average speed etc., typically when a trip ends.
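As a rough illustration of the single values mentioned above, here is a minimal Python sketch that derives distance and average speed from a trip's speed samples once the trip has ended. The function name trip_summary, the (unix_seconds, speed_kmh) sample layout and the trapezoidal integration are all assumptions made for the example, not something I have settled on.

```python
def trip_summary(samples):
    """Return (distance_km, avg_speed_kmh) for one finished trip.

    samples: list of (unix_seconds, speed_kmh) tuples sorted by time.
    This layout is hypothetical; real points would come from influxdb.
    """
    if len(samples) < 2:
        return 0.0, 0.0

    distance_km = 0.0
    # Trapezoidal rule: mean speed of two neighbouring samples
    # multiplied by the elapsed time between them (in hours).
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        distance_km += (v0 + v1) / 2.0 * (t1 - t0) / 3600.0

    duration_h = (samples[-1][0] - samples[0][0]) / 3600.0
    avg_speed_kmh = distance_km / duration_h if duration_h > 0 else 0.0
    return distance_km, avg_speed_kmh
```

Because the sampling interval varies between 3 seconds and 1 minute, the sketch weights each pair of samples by the actual time between them instead of assuming a fixed step.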


I have the following ideas in mind:

1. My first naive approach was to simply store every trip's data in its own measurement (trip_<trip_id>) without any tags, but that would generate a huge number of series sooner or later. Since then I've learned that series cardinality matters the most in terms of memory usage and performance, so it's obviously not a good choice. For the same reason, any option that uses trip ids as tag values sounds like a really bad idea, because I'll have a lot of trips.

2. Store all the data in a single measurement and use, for example, the car id as a tag. That would generate a giant number of points in a single measurement (is that a problem?) and would make horizontal partitioning of the data difficult if I need to use multiple hosts / databases later.

3. One measurement per car, no tags. This seems like the best approach to me at the moment. Horizontal partitioning would be relatively trivial using the car id as the shard key if the number of cars in the system increases. Series cardinality would equal car cardinality. Obviously, in order to get only the data of a single trip (which is important), I'd have to rely on the metadata (the trip's start and end time) stored in the rdbms. That is true for 2. as well.
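To make option 3 concrete, here is a minimal Python sketch of what I have in mind: one point per sample written to a per-car measurement in line protocol, and an InfluxQL query that selects a single trip by the start and end time taken from the rdbms. The measurement name car_<car_id>, the field names and the helper functions are hypothetical and only for illustration; the sketch assumes float field values and timezone-aware datetimes.

```python
from datetime import datetime, timezone


def to_line(car_id, fields, ts):
    """Format one point in line protocol for a per-car measurement.

    No tags are used, so the line is "<measurement> <fields> <timestamp>".
    Field values are assumed to be floats; the timestamp is in nanoseconds.
    """
    field_str = ",".join(f"{name}={value}" for name, value in sorted(fields.items()))
    ns = int(ts.timestamp()) * 1_000_000_000
    return f"car_{car_id} {field_str} {ns}"


def trip_query(car_id, start, end):
    """Build an InfluxQL query bounded by the trip's start and end time."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    start_s = start.astimezone(timezone.utc).strftime(fmt)
    end_s = end.astimezone(timezone.utc).strftime(fmt)
    return (
        f'SELECT * FROM "car_{car_id}" '
        f"WHERE time >= '{start_s}' AND time <= '{end_s}'"
    )


if __name__ == "__main__":
    start = datetime(2016, 6, 20, 13, 0, 0, tzinfo=timezone.utc)
    end = datetime(2016, 6, 20, 13, 45, 0, tzinfo=timezone.utc)
    print(to_line(42, {"speed": 52.5, "rpm": 2100.0}, start))
    print(trip_query(42, start, end))
```

With option 2 the only difference would be a fixed measurement name plus a car tag in the line, and an extra condition on that tag in the WHERE clause.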


I'm open to new ideas as well. What do you think?

Thanks
Zoltan
