I'm a little confused as to which DB server we are talking about. I
need access to

enwiki-p.db.toolserver.org
hap-s1-user.esi.toolserver.org.

Is that sql-s1-user or sql-s1-rr or what?

Daniel

On Wed, Aug 8, 2012 at 7:46 AM, Russell Blau <[email protected]> wrote:
> (TL;DR? Skip down three paragraphs to the possible workaround....)  Last
> month, I reported on the progress of SHA-1 updates from the WMF servers,
> and noted that s1 replag was likely to continue to be a problem for a
> number of weeks.  As I said then, the WMF was using (at least) three
> processes to populate the SHA-1 field on three separate blocks of
> revision records.  All these changes were then being replicated to the
> Toolserver's copies of the databases, and this flood of updates was
> causing the replag.
>
> The three blocks were being populated at different rates (for reasons
> that are beyond my knowledge). On July 23 at about 15:00 UTC, rosemary
> (sql-s1-rr) completed updating the first of the three blocks. The other
> blocks continued to be populated (and at some point the WMF started
> another process to help finish off the slowest block), but the rate of
> updates was somewhat less, and rosemary actually caught up on its
> backlog and reached zero replag within about a day after this milestone.
>
> The situation on thyme (sql-s1-user) is less favorable, as we all know.
> The replag on that server got much higher to start with, and thyme
> didn't even reach the end of the first block until Sunday August 5 at
> about 12:00 UTC. Unlike the situation with rosemary, the reduced load
> after this event did not make any noticeable difference to the replag,
> which has continued to increase for the past three days at much the same
> rate as before.  The next milestone will be completion of the second
> major block, which looks like it will occur either late on Friday August
> 9 or early on Saturday August 10 UTC, barring any other major problems
> (like the WMF server outage on Monday which caused replication at the TS
> end to stop for several hours).  At that point, the load from SHA-1
> updates should be roughly 30% of what it had been during July. One
> would think that would allow the replag to drop, but since the events of
> this week, I can't be confident of that.
>
> There is a possible workaround.  The TS could treat this like a server
> outage; copy user databases from thyme to rosemary and then point
> sql-s1-user to rosemary, which currently has no replag. Rosemary would
> then have to handle twice the load, but thyme should start to recover
> very quickly with no user-generated queries hitting it. Once thyme has
> recovered, point sql-s1-rr to it.
>
> Downsides: (1) this would require several hours of downtime for
> sql-s1-user while the user databases are copied; all tools that require
> access to user databases would be offline entirely for this period. (2)
> it would have to wait until our volunteer TS admins have time to do it.
> (3) the added load on rosemary could cause replag to grow there,
> although I doubt it would come anywhere near the 14+ days replag we are
> dealing with now on thyme. (4) this could all be unnecessary since thyme
> might recover on its own once the SHA-1 update load is reduced, although
> I don't know any way of forecasting that and experience so far has not
> been encouraging.
>
> Question for those of you who operate and/or use tools that access s1
> (enwiki):  would you be willing to accept several hours of service
> outage and the other downsides in exchange for getting rid of the 14-day
> replag?
> --
>   Russell Blau
>   [email protected]
>
>
> _______________________________________________
> Toolserver-l mailing list ([email protected])
> https://lists.wikimedia.org/mailman/listinfo/toolserver-l
> Posting guidelines for this list: 
> https://wiki.toolserver.org/view/Mailing_list_etiquette

