On Thu, May 29, 2014 at 11:03 PM, Sean Pringle <[email protected]> wrote:
> On Fri, May 30, 2014 at 3:28 PM, Ori Livneh <[email protected]> wrote:
>
>> On Wed, May 28, 2014 at 11:26 PM, Steven Walling <[email protected]>
>> wrote:
>>
>>> My main question is what the rationale is. Is it to improve query
>>> performance on analytics dbs?
>>>
>>
>> I imagine it will help, but it's probably not the primary reason. I
>> imagine Sean would like to have the database in a state of equilibrium such
>> that there are no looming dangers, and no reason in principle why things
>> couldn't just keep running. At the moment the clip of incoming events is
>> prone to sharp fluctuations and there is no protocol in place for handling
>> exhausted server capacity.
>>
>
> Correct.
>
> It's not really about performance since the dataset will be larger than
> $memory regardless.
>
> Of course, if you guys decide that specific data needs to stay around for
> ever, that's fine; it helps with capacity planning and we just bite the
> bullet and ensure sufficient storage space is available. Having a default
> purge-after-X-months policy for new tables would be the baseline.
>

Thanks for the explanation guys. This makes perfect sense to me. I'd much
rather have old data be something we have to dig a little harder for, than
worry if current schemas are going to be accessible or not.

--
Steven Walling, Product Manager
https://wikimediafoundation.org/
_______________________________________________
Analytics mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/analytics
