Oops, I may have caused a bigger reaction than I intended. Sean and team know about this and are trying to fix it. But there are other nuances. We've been working with Sean on designing a new schema to replicate data to. The purpose is to make analysis work easier and to isolate the specific optimization and capacity problems and needs that analytics products have. I can only be vague as we're in the very early stages [1], but I just wanted to point out that Sean is by no means ignoring this problem. Not only is he working on the production cluster (!), he's also currently addressing the data integrity problems I mentioned, and *also* working with us to fix the root causes and higher-level architecture.
Things like this take a long time, and two months of bad data does not imply any kind of catastrophe for us at this point. I'll also point out that these problems are mostly due to bizarre bugs and database problems that even the db engine authors don't seem to understand yet. The products impacted by these problems are products we care about, but not at the same level as our tier 1 or tier 2 stuff. Vital Signs is the most impacted, and it's a relatively new project that still has not replaced the reportcard. As we promote these projects to "ready", we're planning better around them, as mentioned above.

[1] https://gerrit.wikimedia.org/r/#/c/167839/
