I would not use different measurements, as InfluxDB does not let you do cross-measurement analytics. If you go the multi-measurement route, you won't be able to crunch your storage metrics together with your network metrics, because they would live in two different measurements.
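To make the single-measurement idea concrete, here is a minimal sketch in Python that builds InfluxDB line protocol strings by hand. The tenant name `XYZ`, the `subsystem` and `host` tags, the `throughput` field, and the helper names are all made up for illustration; only the line protocol layout itself (`measurement,tags fields timestamp`) comes from InfluxDB.

```python
def escape_measurement(name: str) -> str:
    """Escape commas and spaces in a measurement name."""
    return name.replace(",", "\\,").replace(" ", "\\ ")


def escape_key(value: str) -> str:
    """Escape commas, spaces and equals signs in tag keys, tag values and field keys."""
    return value.replace(",", "\\,").replace(" ", "\\ ").replace("=", "\\=")


def format_field(value) -> str:
    """Format a numeric or boolean field value (string fields are left out of this sketch)."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"  # integers carry an 'i' suffix in line protocol
    if isinstance(value, float):
        return repr(value)
    raise TypeError(f"unsupported field type: {type(value).__name__}")


def to_line(measurement: str, tags: dict, fields: dict, timestamp_ns: int) -> str:
    """Build one line protocol point: measurement,tag=val field=val timestamp."""
    tag_part = "".join(
        f",{escape_key(k)}={escape_key(v)}" for k, v in sorted(tags.items())
    )
    field_part = ",".join(
        f"{escape_key(k)}={format_field(v)}" for k, v in fields.items()
    )
    return f"{escape_measurement(measurement)}{tag_part} {field_part} {timestamp_ns}"


# Hypothetical tenant "XYZ": storage and network points share ONE measurement
# and are told apart by the (assumed) "subsystem" tag.
ts = 1474590000000000000
points = [
    to_line("XYZ", {"subsystem": "storage", "host": "db01"}, {"throughput": 1250.5}, ts),
    to_line("XYZ", {"subsystem": "network", "host": "db01"}, {"throughput": 980.0}, ts),
]
for point in points:
    print(point)
```

With both sub-systems in the same measurement, a single InfluxQL query along the lines of `SELECT MEAN("throughput") FROM "XYZ" WHERE time > now() - 1h GROUP BY "subsystem"` can look at them side by side, which is not possible if they are split across two measurements.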
On Friday, September 23, 2016 at 2:42:49 AM UTC+2, [email protected] wrote:
>
> We are also in the process of figuring out how to store data for a
> multi-tenant system so this is very helpful. I have a couple of follow up
> questions along similar lines (caveat -- we're very new to InfluxDB so
> I may have misused the terms).
>
> When thinking about our schema (ignoring tenancy) we were considering
> having measurements which stored data from a specific sub-system of our
> application. So for example we might have one measurement for the data
> related to our storage sub-system (mongodb) and another for the rest layer
> and so forth. These would have fields which stored data for specific
> metrics and we'd use tags to segment the data. So we might have a
> measurement for rest with fields for throughput and response time and tags
> for HTTP method and relative URI.
>
> When laying tenancy into this our first inclination was to do so by adding
> the tenant id to the measurement name (so we'd have XYZ-mongodb and
> XYZ-rest). However, based on this discussion it sounds like you'd advise
> against that in favor of simply having a measurement per tenant and putting
> all of the metrics in that as fields (which begs the question of why not
> have 1 measurement in the non-tenant schema). One issue that arises with
> that approach is what to do about overlapping field keys (both mongodb and
> rest have throughput for example). It seems like we could use either
> stylized field keys (mongo_throughput and rest_throughput) or we could use
> tags. Any thoughts on which would be preferable?
>
> Even with using tags to resolve metric overlap I think we'd end up with
> 1000's of fields and if we used prefixing we'd have 10,000's. Also, if I
> understand how writes work we'd have some points that are extremely sparse
> (those for sub-systems with more specialized data, such as the JVM) and
> some points with a large number of field values (on the order of 100's to
> 1000's).
> Is this going to cause issues? We'd also end up doing lots of queries
> which pull out only a small sub-set of the fields, any concern there?
>
> Anyway, thanks in advance for any advice. I'm looking forward to trying
> this out and seeing how it works.
>
> Sean Fitts
>
> On Friday, August 19, 2016 at 12:27:13 PM UTC-7, Sean Beckett wrote:
> > There's not much performance gain from segmenting the data. It will all
> > live on the filesystem organized by time first, and series second. As
> > long as your queries are bounded to particular times and series, the
> > measurement schema won't make too much difference.
> >
> > However, DROP MEASUREMENT is more performant than DROP SERIES, so I
> > would think scoping each customer to a measurement (Schema #2) would be
> > beneficial for overall organization.
> >
> > Schema #3 is not a great schema, as it puts important metadata in the
> > measurement name. Typically that's an anti-pattern. Additionally, there
> > are no JOINs across measurements, so you wouldn't be able to query for
> > the COUNT() of all events across a customer if each page ID meant a new
> > measurement.
> >
> > On Tue, Aug 2, 2016 at 11:05 PM, <[email protected]> wrote:
> > > Hi there,
> > >
> > > I'm not quite sure which schema design would be better and hoping
> > > someone could help:
> > >
> > > (1)
> > > Measurement = PageViews
> > > Tags = OrganisationId=XYZ, PageId=123
> > > Values = BrowserAgent=Chrome, URL=test.com
> > >
> > > Measurement = Clicks
> > > Tags = OrganisationId=XYZ, PageId=123
> > > Values = BrowserAgent=Chrome, URL=test.com
> > >
> > > or
> > >
> > > (2)
> > > Measurement = XYZ (OrganisationID)
> > > Tags = PageId=123, Event=PageView
> > > Values = BrowserAgent=Chrome, URL=test.com
> > >
> > > Measurement = XYZ (OrganisationID)
> > > Tags = PageId=123, Event=Click
> > > Values = BrowserAgent=Chrome, URL=test.com
> > >
> > > or
> > >
> > > (3)
> > > Measurement = XYZ-123 (OrganisationId-PageId)
> > > Tags = Event=PageView
> > > Values = BrowserAgent=Chrome, URL=test.com
> > >
> > > Measurement = XYZ-123 (OrganisationId-PageId)
> > > Tags = Event=Click
> > > Values = BrowserAgent=Chrome, URL=test.com
> > >
> > > This would be used in a multi-tenant environment where each customer
> > > (organisation) has their own data. Does the use of an orgid-pageid
> > > measurement help the underlying database?
> > >
> > > IE, with SQL having a table name as the OrgId-PageId would restrict the
> > > indexing storage/speed to just that of that scope, however I'm not sure
> > > it would be the same with InfluxDB as perhaps indexes are based on
> > > series (which includes measurement and tags).
> > >
> > > So then in theory it would be much of a muchness - no performance gain
> > > by segmenting data?
> > >
> > > Ryan
> > >
> > > --
> > > Remember to include the InfluxDB version number with all issue reports
> > > ---
> > > You received this message because you are subscribed to the Google
> > > Groups "InfluxDB" group.
> > > To unsubscribe from this group and stop receiving emails from it, send
> > > an email to [email protected].
> > > To post to this group, send email to [email protected].
> > > Visit this group at https://groups.google.com/group/influxdb.
> > > To view this discussion on the web visit
> > > https://groups.google.com/d/msgid/influxdb/0100a133-7b23-4d29-aa7a-33e2666991d7%40googlegroups.com.
> > > For more options, visit https://groups.google.com/d/optout.
> >
> > --
> > Sean Beckett
> > Director of Support and Professional Services
> > InfluxDB
