There is a logging_logindex table. I incorporated all suggested changes
into https://quarry.wmflabs.org/query/23829

Since I still think they query will take more than 20 minutes, I think
quarry is not the way to go. I saved the query on the grid and tried to run
it using this command:

jsub -once -N "custom_sql" -mem 1g -o /data/project/huji/err/custom_sql.out
-e /data/project/huji/err/custom_sql.err sql fawiki_p <
/data/project/huji/query.sql

But even though when I run this I get the response that my job was
submitted, running qstat immediately thereafter shows no results. Is it
okay, or am I running the job incorrectly? (Note that both the .out and the
.err files are empty).

On Sat, Dec 30, 2017 at 2:10 PM, Maximilian Doerr <
[email protected]> wrote:

> Isn’t there a logging_logindex table to use that should optimize this?
>
> Cyberpower678
> English Wikipedia Account Creation Team
> English Wikipedia Administrator
> Global User Renamer
>
> On Dec 30, 2017, at 14:07, Brad Jorsch (Anomie) <[email protected]>
> wrote:
>
> On Sat, Dec 30, 2017 at 1:07 PM, John <[email protected]> wrote:
>
>> Use the logging_userindex table instead of logging
>>
>
> That won't make much difference, since the select on the logging table
> isn't targeting any user columns.
>
> On Sat, Dec 30, 2017 at 1:09 PM, John <[email protected]> wrote:
>
>> I would also find the first log of 2017 and use that instead of the
>> timestamp
>>
>
> That would make it worse, since there's no index on (log_type, log_id).
> It'll either have to use the primary key and filter out all rows with a
> different log_type, or use one of the indexes that begins with log_type and
> filter out all the rows with an earlier log_id.
>
> On Sat, Dec 30, 2017 at 1:32 PM, Dennis Tobar <[email protected]>
> wrote:
>
>> Replace count(*) with count(1) in the subquery. It could help (?) to
>> improve the performance.
>>
>
> "count(*)" and "count(1)" should be treated equivalently. The "*" in
> "count(*)" does not cause the database to fetch all fields.
>
> If anything, "count(*)" might be ever so slightly faster since it's
> literally staying "count the number of rows" rather than "count the number
> of rows where the constant 1 is not null". But the DB probably optimizes
> counting of a constant to make them identical.
>
>
> --
> Brad Jorsch (Anomie)
> Senior Software Engineer
> Wikimedia Foundation
> _______________________________________________
> Wikimedia Cloud Services mailing list
> [email protected] (formerly [email protected])
> https://lists.wikimedia.org/mailman/listinfo/cloud
>
>
>
> _______________________________________________
> Wikimedia Cloud Services mailing list
> [email protected] (formerly [email protected])
> https://lists.wikimedia.org/mailman/listinfo/cloud
>
_______________________________________________
Wikimedia Cloud Services mailing list
[email protected] (formerly [email protected])
https://lists.wikimedia.org/mailman/listinfo/cloud

Reply via email to