There is a logging_logindex table. I incorporated all suggested changes into https://quarry.wmflabs.org/query/23829
Since I still think the query will take more than 20 minutes, I don't think Quarry is the way to go. I saved the query on the grid and tried to run it using this command:

    jsub -once -N "custom_sql" -mem 1g -o /data/project/huji/err/custom_sql.out -e /data/project/huji/err/custom_sql.err sql fawiki_p < /data/project/huji/query.sql

When I run this, I get the response that my job was submitted, but running qstat immediately afterwards shows no results. Is that okay, or am I running the job incorrectly? (Note that both the .out and the .err files are empty.)

On Sat, Dec 30, 2017 at 2:10 PM, Maximilian Doerr <[email protected]> wrote:

> Isn’t there a logging_logindex table to use that should optimize this?
>
> Cyberpower678
> English Wikipedia Account Creation Team
> English Wikipedia Administrator
> Global User Renamer
>
> On Dec 30, 2017, at 14:07, Brad Jorsch (Anomie) <[email protected]>
> wrote:
>
> On Sat, Dec 30, 2017 at 1:07 PM, John <[email protected]> wrote:
>
>> Use the logging_userindex table instead of logging
>>
>
> That won't make much difference, since the select on the logging table
> isn't targeting any user columns.
>
> On Sat, Dec 30, 2017 at 1:09 PM, John <[email protected]> wrote:
>
>> I would also find the first log of 2017 and use that instead of the
>> timestamp
>>
>
> That would make it worse, since there's no index on (log_type, log_id).
> It'll either have to use the primary key and filter out all rows with a
> different log_type, or use one of the indexes that begins with log_type and
> filter out all the rows with an earlier log_id.
>
> On Sat, Dec 30, 2017 at 1:32 PM, Dennis Tobar <[email protected]>
> wrote:
>
>> Replace count(*) with count(1) in the subquery. It could help (?) to
>> improve the performance.
>>
>
> "count(*)" and "count(1)" should be treated equivalently. The "*" in
> "count(*)" does not cause the database to fetch all fields.
>
> If anything, "count(*)" might be ever so slightly faster since it's
> literally saying "count the number of rows" rather than "count the number
> of rows where the constant 1 is not null". But the DB probably optimizes
> counting of a constant to make them identical.
>
>
> --
> Brad Jorsch (Anomie)
> Senior Software Engineer
> Wikimedia Foundation
_______________________________________________
Wikimedia Cloud Services mailing list
[email protected] (formerly [email protected])
https://lists.wikimedia.org/mailman/listinfo/cloud
