On Sun, Jun 2, 2019 at 7:43 PM Jesse Plamondon-Willard
<[email protected]> wrote:
>
> Hello,
>
> Some Toolforge SQL queries have much worse performance when updated for the 
> new comment table.
>
> For example, see one previous SQL query and its updated version which just 
> adds a join on the comment table. The change adds ten dependent subqueries 
> and increases Stalktoy load times from ≈15 seconds to ≈245 seconds, and the 
> same change for another query was enough to have my tools temporarily 
> rate-limited.
>
> I haven't found a way to update those queries efficiently. Is there an 
> optimisation I'm missing? Why does the comment view need subqueries on ten 
> other tables, and is there an alternate version without the subqueries for 
> cases where we just need to join by ID?

I don't have a good answer for the performance question, but I can
provide some information about where the large number of subqueries
come from. They are an artifact of the way that the comment table is
exposed by the Wiki Replicas view layer. Comments can be suppressed
and this suppression can be toggled back and forth over time. The flag
for suppression lives in the related entity table (ipblocks in this
case) rather than the comment table itself. Our current data
sanitization system deals with this by deciding at runtime if a given
comment should be visible or not to Wiki Replica users. This is done
via a view that is defined in our Puppet repository [0]. This view is
not aware of the context that it being used for in a given query, so
it tests all of the entity table flags that could indicate suppression
for each comment row.

One "fix" I could imagine for this would be for the Wiki Replicas to
also provide per-entity views of the comment table that only return
rows related to a single entity table. For example a
"comment_ipblocks" view could use a where clause specific to the
ipblocks entity table like "WHERE exists( select 1 from ipblocks where
ipb_reason_id = comment_id and ipb_deleted = 0)" and exclude the tests
for other entities (image, filearchive, revision, etc).

[0]: 
https://phabricator.wikimedia.org/source/operations-puppet/browse/production/modules/profile/templates/labs/db/views/maintain-views.yaml;baa494c8ae8fc4fd7cdcc43733392b7f0bc7c1e1$287-302

Bryan
-- 
Bryan Davis              Wikimedia Foundation    <[email protected]>
[[m:User:BDavis_(WMF)]] Manager, Technical Engagement    Boise, ID USA
irc: bd808                                        v:415.839.6885 x6855

_______________________________________________
Wikimedia Cloud Services mailing list
[email protected] (formerly [email protected])
https://lists.wikimedia.org/mailman/listinfo/cloud

Reply via email to