I have a (common) scenario in which measurement data comes from multiple remote sources, some of which may lose connectivity, so data may arrive late (as much as a few days late in some cases). Since the data is cached, this is not a big deal: once the data is sent, the series are populated with the proper data and the proper timestamps (which are sent along with the data).
We are now looking into aggregating this data using continuous queries (CQs) and retention policies (or alternatively using Kapacitor to do the same), and I have a concern about late data. If I understand this correctly, a CQ that aggregates data in, say, 10-minute increments really only checks the last 10 minutes of data each time it runs. If a point is not there when the CQ runs (i.e. it is more than 10 minutes late), it will still arrive in the raw measurement in InfluxDB, but it will never be aggregated unless I manually run a catch-up "SELECT ... INTO" statement.
First of all, am I understanding this correctly? Is this really a problem, or will it just magically work? And if it is a problem, what is the proper way to aggregate data that may be recorded late?
Thanks,
-M
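P.S. To make the concern concrete, here is a minimal Python sketch of the behaviour I am worried about. It is a toy model, not InfluxDB or Kapacitor code: `run_aggregation` and `backfill` are hypothetical names standing in for one scheduled CQ run and for a manual catch-up "SELECT ... INTO", and it assumes each scheduled run only looks at the single 10-minute bucket that just closed.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=10)
EPOCH = datetime(1970, 1, 1)


def window_start(ts):
    """Floor a timestamp to the start of its 10-minute bucket."""
    return EPOCH + ((ts - EPOCH) // WINDOW) * WINDOW


def run_aggregation(raw, aggregated, now):
    """Hypothetical scheduled run: average only the bucket that just closed."""
    end = window_start(now)
    start = end - WINDOW
    values = [v for ts, v in raw if start <= ts < end]
    if values:
        aggregated[start] = sum(values) / len(values)


def backfill(raw, aggregated, start, end):
    """Hypothetical manual catch-up: recompute every bucket in [start, end)."""
    buckets = {}
    for ts, v in raw:
        if start <= ts < end:
            buckets.setdefault(window_start(ts), []).append(v)
    for bucket, values in buckets.items():
        aggregated[bucket] = sum(values) / len(values)


t0 = datetime(2017, 1, 1, 12, 0)
raw, aggregated = [], {}

# A point stamped 12:03 arrives on time and is picked up by the 12:10 run.
raw.append((t0 + timedelta(minutes=3), 10.0))
run_aggregation(raw, aggregated, now=t0 + timedelta(minutes=10))

# A point stamped 12:05 arrives late, after the 12:00 bucket was aggregated.
raw.append((t0 + timedelta(minutes=5), 20.0))
run_aggregation(raw, aggregated, now=t0 + timedelta(minutes=30))
before = aggregated[t0]  # the late point is in raw but not in the aggregate

# Only an explicit catch-up over the older range folds the late point in.
backfill(raw, aggregated, t0, t0 + timedelta(minutes=30))
after = aggregated[t0]
```

In this model the late point lands in the raw data with the right timestamp, but the aggregate for its bucket only changes after the explicit catch-up over the older time range.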
