Re: [Labs-l] Query optimization

Huji Lee Sat, 12 Mar 2016 08:53:17 -0800

Thanks John. I avoided category table because it is not 100% consistent
with categorylinks, but they join you used takes care of that. This is an
excellent solution to my particular problem.


On Thu, Mar 10, 2016 at 6:48 AM, John <[email protected]> wrote:

> Take a look at:
>
>  select * from category left join page on page_title = cat_title where
> page_namespace = 14 and cat_pages > 500 and 'fa' NOT IN (SELECT ll_lang
> FROM langlinks where ll_from = page_id)
>
> It works very quick and gives what you need.
>
>
> On Thu, Mar 10, 2016 at 4:09 AM, Jaime Crespo <[email protected]>
> wrote:
>
>> On Thu, Mar 10, 2016 at 9:24 AM, Purodha Blissenbach
>> <[email protected]> wrote:
>> > I do not know how replication works, but I did and do assume, it
>> > replicates indexes. If so, the online manual starting at
>> > https://www.mediawiki.org/wiki/Manual:Database_layout
>> > may help you.
>>
>> Correct. There are some cases where the indexing doesn't work (it
>> doesn't make sense to use certain indexes that are partially filtered
>> on the view). In those cases, new views are created, such as
>> "revision_userindex", etc.
>>
>> Maybe we can add a "columns" table to information_schema_p, or the
>> create tables to the "tables" table, where the original table
>> description is explained. File a ticket if that would be useful.
>>
>> If there are several users that perform common slow analytics, file a
>> ticket and we could, if technically possible, do them in advance so
>> they do not have to be done every single time. Another option is to
>> replicate to a column-based engine, but I am not sure if people would
>> know how to take most advantage of those (where aggregations are
>> cheap).
>>
>> > Shouldn't a simple join on indexed fields + a simple aggregation on an
>> indexed field be relatively cheap
>>
>> No, aggregation on mysql, on standard engines, is never cheap. Indexes
>> do not help with that, they still need to count every single row (so
>> you are creating a filter, of a sort, of a full tablescan of one of
>> the largest tables per wiki). Why doing that when the count is already
>> aggregated?
>>
>> I recently delivered a session on query optimization for mysql and
>> mariadb, using wikimedia and osm as examples [warning, external site]
>> <https://github.com/jynus/query-optimization>, maybe that would be
>> helpful?
>>
>> --
>> Jaime Crespo
>> <http://wikimedia.org>
>>
>> _______________________________________________
>> Labs-l mailing list
>> [email protected]
>> https://lists.wikimedia.org/mailman/listinfo/labs-l
>>
>
>
> _______________________________________________
> Labs-l mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/labs-l
>
>

_______________________________________________
Labs-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/labs-l

Re: [Labs-l] Query optimization

Reply via email to