[Bug 54406] Parsoid jobs are not dequeued fast enough during heavy bot activity

2013-09-23 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=54406

--- Comment #20 from Nemo federicol...@tiscali.it ---
I'm not sure how familiar you are with how those social rules are actually
written and enacted, but nothing here was done against them. The social
rules in the various variations of the global [[m:Bot policy]] are not
about performance; the speed limit is typically driven by users not wanting
their Special:RecentChanges and Special:Watchlist to be flooded, which in
this case clearly didn't happen (so let's not even start debating what
counts as an edit).

Besides, I still don't have an answer to my question, though we're getting
closer; it would probably be faster if we avoided off-topic discussions
about alleged abuse and the like.

(In reply to comment #19)
> (In reply to comment #15)
> > I still see no answer about what the spike on July 29 was: are you saying
> > it wasn't about Parsoid, but just a coincidence? Or that it was normal to
> > queue over a million jobs in a couple of days (initial caching of all
> > pages or something?) but the second million was too much?
>
> Just editing a handful of really popular templates (some are used in 7
> million articles) can enqueue a lot of jobs (10 titles per job, so ~700k
> jobs), as can editing content at high rates. Core happens to cap the number
> of titles to re-render at 200k, while Parsoid re-renders them all, albeit
> with a delay.

Thanks; so I guess the answer is the second one. Can you then explain why
the second million was too much, i.e. why reaching a 2M job queue is in
your opinion all fine and normal while 3M is something absolutely horrible
and criminal? Thanks.



[Bug 54406] Parsoid jobs are not dequeued fast enough during heavy bot activity

2013-09-22 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=54406

Gabriel Wicke gwi...@wikimedia.org changed:

           What    |Removed                 |Added
----------------------------------------------------------------------------
        Summary    |Parsoid is flooding the |Parsoid jobs are not
                   |global job queue        |dequeued fast enough during
                   |                        |heavy bot activity

--- Comment #13 from Gabriel Wicke gwi...@wikimedia.org ---
I've changed the subject to be more accurate. Here is some more
clarification for those less familiar with how Parsoid caching works:

* Timely Parsoid cache refreshes are not needed for correctness. Delayed
updates can result in a higher percentage of cache misses than usual, but
will not result in incorrect edits (see the sketch below).

* Parsoid uses its own queue, so a backlog does not affect any other job queue.

So in the bigger picture this is annoying, but not critical.
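
To make the correctness point concrete, here is a minimal sketch of the
behaviour described above; the cache layout and the parse() stand-in are
hypothetical, not Parsoid internals. A delayed refresh only means the next
read misses and re-parses on demand; because the (assumed) keys include the
revision id, stale data for the wrong revision is never served.

    # Hypothetical sketch, not Parsoid code: keys include the revision id,
    # so a stale cache can only cost a re-parse, never correctness.
    cache = {}  # (title, revision_id) -> rendered HTML

    def parse(title, revision_id):
        # Stand-in for a full Parsoid parse of that exact revision.
        return "<html><!-- %s@%s --></html>" % (title, revision_id)

    def get_html(title, revision_id):
        key = (title, revision_id)
        if key not in cache:
            # Refresh job hasn't run yet: just a cache miss, so re-parse now.
            cache[key] = parse(title, revision_id)
        return cache[key]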

Still, to avoid any misunderstandings: Null editing at rates way higher than
those allowed for bots is at best a very bad idea.



[Bug 54406] Parsoid jobs are not dequeued fast enough during heavy bot activity

2013-09-22 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=54406

--- Comment #14 from MZMcBride b...@mzmcbride.com ---
(In reply to comment #13)
> Still, to avoid any misunderstandings: Null editing at rates way higher
> than those allowed for bots is at best a very bad idea.

Bots are explicitly not rate-limited.



[Bug 54406] Parsoid jobs are not dequeued fast enough during heavy bot activity

2013-09-22 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=54406

--- Comment #15 from Nemo federicol...@tiscali.it ---
I still see no answer about what the spike on July 29 was: are you saying
it wasn't about Parsoid, but just a coincidence? Or that it was normal to
queue over a million jobs in a couple of days (initial caching of all
pages or something?) but the second million was too much?

The new summary seems incorrect in two ways:
1) "fast enough" may be inaccurate, since you say it doesn't matter if the
queue is slow;
2) "during heavy bot activity" is a possibly incorrect guess.
What's certain is that we no longer seem to have any meaningful job queue
data.



[Bug 54406] Parsoid jobs are not dequeued fast enough during heavy bot activity

2013-09-22 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=54406

--- Comment #16 from Andre Klapper aklap...@wikimedia.org ---
(In reply to comment #14)
> Bots are explicitly not rate-limited.

https://en.wikipedia.org/wiki/Wikipedia:Bot_policy#Bot_requirements states
that bots doing non-urgent tasks "may edit approximately once every ten
seconds". Am I missing something?



[Bug 54406] Parsoid jobs are not dequeued fast enough during heavy bot activity

2013-09-22 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=54406

--- Comment #17 from Kunal Mehta (Legoktm) legoktm.wikipe...@gmail.com ---
(In reply to comment #16)
> (In reply to comment #14)
> > Bots are explicitly not rate-limited.
>
> https://en.wikipedia.org/wiki/Wikipedia:Bot_policy#Bot_requirements states
> that bots doing non-urgent tasks "may edit approximately once every ten
> seconds". Am I missing something?

That requirement is imposed by the community; it is not necessarily a
technical limit. If you look at
https://en.wikipedia.org/wiki/Special:ListGroupRights, the 'bot' group is
deliberately exempted from rate limits via the 'noratelimit' user right.
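
A minimal sketch of that distinction (hypothetical code, not MediaWiki's;
only the group and right names mirror Special:ListGroupRights):

    # Hypothetical sketch, not MediaWiki code: the software only enforces
    # limits on groups that lack the 'noratelimit' right; the ten-second
    # policy rule lives outside the software entirely.
    GROUP_RIGHTS = {
        'bot':  {'noratelimit'},   # deliberately exempt
        'user': set(),             # ordinary accounts
    }

    def software_rate_limited(groups, actions_in_window, limit):
        if any('noratelimit' in GROUP_RIGHTS.get(g, set()) for g in groups):
            return False           # bypasses all rate limits
        return actions_in_window >= limit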



[Bug 54406] Parsoid jobs are not dequeued fast enough during heavy bot activity

2013-09-22 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=54406

--- Comment #18 from MZMcBride b...@mzmcbride.com ---
No account in the 'bot' user group was used for the null edits; just a
standard, unprivileged (albeit autoconfirmed) account.

On Wikimedia wikis, 'edit' is rate-limited only for 'ip' and 'newbie', not
for 'user', 'bot', or any other group, as far as I can tell.
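
Conceptually, that configuration looks like this (illustrative values only;
the real settings live in MediaWiki's $wgRateLimits, which is PHP):

    # Illustrative sketch of the rate-limit configuration described above;
    # the real values live in MediaWiki's $wgRateLimits (PHP), not here.
    # Each entry is (max_actions, window_seconds); no entry means no limit.
    RATE_LIMITS = {
        'edit': {
            'ip':     (8, 60),  # anonymous users (assumed typical values)
            'newbie': (8, 60),  # newly registered accounts
            # no 'user' or 'bot' entry: the software imposes no edit limit
        },
    }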



[Bug 54406] Parsoid jobs are not dequeued fast enough during heavy bot activity

2013-09-22 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=54406

--- Comment #19 from Gabriel Wicke gwi...@wikimedia.org ---
(In reply to comment #15)
> I still see no answer about what the spike on July 29 was: are you saying
> it wasn't about Parsoid, but just a coincidence? Or that it was normal to
> queue over a million jobs in a couple of days (initial caching of all
> pages or something?) but the second million was too much?

Just editing a handful of really popular templates (some are used in 7
million articles) can enqueue a lot of jobs (10 titles per job, so ~700k
jobs), as can editing content at high rates. Core happens to cap the number
of titles to re-render at 200k, while Parsoid re-renders them all, albeit
with a delay.
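
For a sense of scale, the arithmetic above as a back-of-the-envelope sketch
(the 10-titles-per-job batch size is the one cited in this comment):

    # Back-of-the-envelope arithmetic for the job counts mentioned above.
    from math import ceil

    TITLES_PER_JOB = 10  # batch size cited in this comment

    def jobs_for_template(transclusion_count):
        return ceil(transclusion_count / TITLES_PER_JOB)

    # One edit to a template used in 7 million articles: 700000 jobs.
    print(jobs_for_template(7_000_000))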

Ideally we'd prioritize direct edits over template updates, as only the
former have any performance impact: editing a page with a slightly
out-of-date template rendering will still yield correct results. Longer
term, our goal is to reduce API requests further so that we can run the
Parsoid cluster at capacity without taking out the API.
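
A minimal sketch of that prioritization idea (two plain FIFO queues drained
in order; an illustration, not Parsoid's actual job runner):

    # Illustrative two-tier queue: direct-edit re-renders always run before
    # template-update re-renders, which are stale-but-correct anyway.
    from collections import deque

    edit_jobs = deque()      # triggered by direct edits (user-visible latency)
    template_jobs = deque()  # triggered by template changes (can wait)

    def next_job():
        if edit_jobs:
            return edit_jobs.popleft()
        if template_jobs:
            return template_jobs.popleft()
        return None          # nothing queued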

And yes, I was referring to very straightforward social rules designed to
prevent users from overloading the site with expensive operations. We
currently provide powerful API access that can easily be abused. If the
attitude that anything not technically blocked must surely be OK becomes
more popular, we'll have to neuter those APIs significantly. Maybe it is
actually time to enforce the social rules technically, for example by
automatically blocking abusers.
