On Thu, May 29, 2014 at 11:03 PM, Sean Pringle <[email protected]>
wrote:

> On Fri, May 30, 2014 at 3:28 PM, Ori Livneh <[email protected]> wrote:
>
>> On Wed, May 28, 2014 at 11:26 PM, Steven Walling <[email protected]>
>> wrote:
>>
>>> My main question is what the rationale is. Is it to improve query
>>> performance on analytics dbs?
>>>
>>
>> I imagine it will help, but it's probably not the primary reason. I
>> imagine Sean would like to have the database in a state of equilibrium such
>> that there are no looming dangers, and no reason in principle why things
>> couldn't just keep running. At the moment the clip of incoming events is
>> prone to sharp fluctuations and there is no protocol in place for handling
>> exhausted server capacity.
>>
>
> Correct.
>
> It's not really about performance since the dataset will be larger than
> $memory regardless.
>
> Of course, if you guys decide that specific data needs to stay around
> forever, that's fine; it helps with capacity planning and we just bite the
> bullet and ensure sufficient storage space is available. Having a default
> purge-after-X-months policy for new tables would be the baseline.
>

Thanks for the explanation, guys. This makes perfect sense to me. I'd much
rather have old data be something we have to dig a little harder for than
worry about whether current schemas are going to be accessible.

-- 
Steven Walling,
Product Manager
https://wikimediafoundation.org/
_______________________________________________
Analytics mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/analytics
