Based on my experience, the approach you are currently taking looks like the best one. I believe that writing locally chronological batches with few series per batch is faster than writing globally chronological batches with many series in each batch.
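To make "locally chronological batches with few series per batch" concrete, here is a minimal sketch. It assumes the query result has already been unpacked into a list of series, each a list of point dicts with a sortable "time" key. The function name `series_batches`, the batch size, and the demo data are hypothetical and do not come from the thread.

```python
from typing import Any, Dict, Iterator, List

Point = Dict[str, Any]


def series_batches(series_list: List[List[Point]],
                   batch_size: int = 5000) -> Iterator[List[Point]]:
    """Yield batches that each hold points from one series, oldest first."""
    for points in series_list:
        # Sort within the series only: each batch is chronological
        # locally, and the same time range is revisited once per series.
        ordered = sorted(points, key=lambda p: p["time"])
        for start in range(0, len(ordered), batch_size):
            yield ordered[start:start + batch_size]


if __name__ == "__main__":
    # Hypothetical demo data: two series covering the same period.
    demo = [
        [{"time": 20, "fields": {"value": 2.0}},
         {"time": 10, "fields": {"value": 1.0}}],
        [{"time": 10, "fields": {"value": 7.0}},
         {"time": 20, "fields": {"value": 8.0}}],
    ]
    for batch in series_batches(demo, batch_size=2):
        # With influxdb-python, each batch could be handed to
        # InfluxDBClient.write_points(batch) at this point.
        print([p["time"] for p in batch])
```

Whether this really beats interleaving all series by time is worth checking by profiling against your own data.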
On Mon, Jul 11, 2016 at 6:37 AM, Ryan January <[email protected]> wrote:

> Thanks Sean, I appreciate your help.
>
> I did write a quick python script as a proof of concept yesterday, and it
> seems to work ok. I was a little surprised by how long this approach will
> take, but I haven't yet profiled the code to see where my time is spent.
> Before I go much further I'd like to verify exactly how data is stored on
> disk and how the data is sharded.
>
> During this migration I'm consolidating a number of measurements, with
> part of the metric name becoming an additional tag. The influxdb-python
> result set returns query results as a list of series. Due to that
> structure I am writing batches of points in chronological order, but
> repeating that same chronological period roughly 50 times across different
> series within the measurement. Is this a valid approach, or should I
> interleave the points chronologically, only passing over the same time
> period once?
>
>
> On Jul 10, 2016, at 9:52 PM, Sean Beckett <[email protected]> wrote:
>
> When writing older data, as long as you submit batches of points with
> timestamps in chronological order, it should still be fairly performant.
> Backfilling the downsampling with the INTO clause should also be performant
> provided you restrict the time ranges to perhaps a week at a time.
>
> We have an internal tool for querying data and replaying it into another
> instance. I'll see if that's something we can release, or if we can point
> you to the right code.
>
> On Sat, Jul 9, 2016 at 4:20 PM, Ryan January <[email protected]> wrote:
>
>> I currently have data I'm trying to migrate from one InfluxDB instance to
>> a newer installation and not sure of the proper way to do so.
>> As a bit of background: We're sending app metrics to statsd, which are
>> passing through telegraf to be persisted in InfluxDB on 10 second
>> intervals.
>> During the migration to the new server I'm taking the
>> opportunity to restructure the schema to account for some shortcomings
>> during the first install. Due to this restructuring I can't perform a
>> simple backup/restore.
>>
>> The old server is .11, new is currently using .13, both using RHEL
>>
>> I'm writing a fairly consistent 20 points per second (total) across 50
>> measurements, each having roughly 130 series.
>> Each measurement has 3 tags and 6 fields.
>> No retention policy is currently in place.
>> The database is approximately 2 gb and covers a timespan of roughly 1
>> year.
>>
>> The new server will store 10s resolution in a 60 day retention policy,
>> the data will also be downsampled to 1m resolution into a 1 year retention
>> policy. I'd like to migrate the current 10s samples into the new server,
>> and 1 year retention policy.
>> I'm planning to backfill the old data after the apps have begun sending
>> metrics to the new server.
>>
>> My current plan is to write a script to query 1 minute aggregate data
>> from the old DB in 1 hour chunks, rewriting that data to the new DB.
>> The little I know about the TSM engine is from videos posted on the
>> influxdata site. It stated that InfluxDB was optimized around the thought
>> that we're only modifying recent data, and that there may be a write
>> penalty as older shards are rewritten.
>>
>> How much of an issue is this if no data currently exists in those
>> timeframes? Are there other methods of backfill that I should consider?
>>
>> Thank you,
>> Ryan
>>
>> --
>> Remember to include the InfluxDB version number with all issue reports
>> ---
>> You received this message because you are subscribed to the Google Groups
>> "InfluxDB" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To post to this group, send email to [email protected].
>> Visit this group at https://groups.google.com/group/influxdb.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/influxdb/50701a8b-5a8e-4abd-bf60-7dcecb2cf65d%40googlegroups.com
>> For more options, visit https://groups.google.com/d/optout.
>
> --
> Sean Beckett
> Director of Support and Professional Services
> InfluxDB

--
Sean Beckett
Director of Support and Professional Services
InfluxDB
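The advice quoted above, to restrict INTO backfills to perhaps a week at a time, could be scripted along these lines. This is only a sketch under stated assumptions: the database name "metrics", the retention policies "sixty_days" and "one_year", the measurement "app", and the field "value" are all hypothetical placeholders, and the query text should be checked against the InfluxQL documentation for the version in use.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterator, Tuple


def week_windows(start: datetime, end: datetime,
                 step: timedelta = timedelta(days=7)
                 ) -> Iterator[Tuple[datetime, datetime]]:
    """Yield consecutive [lower, upper) windows covering start..end."""
    cursor = start
    while cursor < end:
        upper = min(cursor + step, end)
        yield cursor, upper
        cursor = upper


def into_query(lower: datetime, upper: datetime) -> str:
    """Build one downsampling query for a single window.

    All database, retention policy, measurement and field names
    below are hypothetical placeholders.
    """
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    return (
        'SELECT mean("value") INTO "metrics"."one_year"."app" '
        'FROM "metrics"."sixty_days"."app" '
        f"WHERE time >= '{lower.strftime(fmt)}' "
        f"AND time < '{upper.strftime(fmt)}' "
        "GROUP BY time(1m), *"
    )


if __name__ == "__main__":
    first = datetime(2016, 1, 1, tzinfo=timezone.utc)
    last = datetime(2016, 1, 20, tzinfo=timezone.utc)
    for lower, upper in week_windows(first, last):
        # With influxdb-python, each statement could be sent with
        # InfluxDBClient.query(...); here it is only printed.
        print(into_query(lower, upper))
```

Running the windows oldest first keeps the backfill chronological, in line with the advice earlier in the thread.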
