+1 to Dario's mention of the many schemas that just capture production DB stuff in a better way.
Re. growth: Old growth experiment schemas continue to be a great resource for checking old work and sometimes even new hypotheses. When Dario and Kevin get around to us, I'll have a complete list of schemas that should not be purged.

Re. storage parameters in the Schema, I agree with Ori, but I'd still like to have them on the wiki somehow. If we were a bunch of Wikipedia editors, I'd suggest making a template for the talk page of a schema that captures this metadata. Given that a template would probably not be best and we'd probably like to stick to JSON, maybe a subpage would be in order. E.g.

 - Schema:Foo == data type JSON
 - Schema:Foo/restrictions == storage restrictions JSON (sampling, pruning, indexing, etc.)
 - Schema_talk:Foo == Discussion of Schema:Foo

Such a pattern would allow for changes to storage restrictions without changing the rev_id of the schema page (data type).

-Aaron

On Thu, May 29, 2014 at 1:26 AM, Steven Walling <[email protected]> wrote:

>
> On Wed, May 28, 2014 at 10:50 AM, Dan Andreescu <[email protected]>
> wrote:
>
>> I just announced this potential change in Scrum of Scrums and the Mobile
>> team said they also would like to keep old data, but not for all of their
>> schemas. They're cleaning up their graphs and we should check with them
>> when we start deleting.
>
>
> Following up on this from the Growth perspective...
>
> My main question is what the rationale is. Is it to improve query
> performance on analytics dbs?
>
> I do know there are many older schemas for Growth-related experiments that
> are only really useful for historical analysis, which is kind of hard to
> reconstruct anyway. If there are sound technical reasons to chuck stuff
> from the relational dbs and retain it only in the raw JSON logs, then I'm
> potentially okay with helping figure out a list of schemas to retain and
> schemas to purge. Aaron, thoughts?
>
> --
> Steven Walling,
> Product Manager
> https://wikimediafoundation.org/
>
> _______________________________________________
> Analytics mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/analytics
>
