Sean,

   What Nuria said. It seems we've missed this one. Sorry for the trouble.

Leila

On Mon, Nov 10, 2014 at 8:01 AM, Nuria Ruiz <[email protected]> wrote:

> Cc'ing Leila, as we were experimenting with these in SF some weeks back. I
> think they can be killed without problems. I did not know they were still
> running; we ran a faster version of those queries and got the data we were
> interested in a while back.
>
> On Mon, Nov 10, 2014 at 1:55 AM, Sean Pringle <[email protected]>
> wrote:
>
>> Three identical queries from the 'research_prod' user have just passed
>> one month of execution time on s1-analytics-slave:
>>
>> select count(*)
>> from staging.ourvision r
>> where exists (
>>   select *
>>   from staging.ourvision r1
>>           inner join
>>        staging.ourvision r2
>>           on r2.sha1 = r1.sha1
>>     where r1.page_id = r.page_id
>>       and r2.page_id = r.page_id
>>       and DATE_ADD(r.timestamp, INTERVAL 1 HOUR)
>>       and r2.timestamp between r.timestamp and DATE_SUB(r.timestamp, INTERVAL 1 HOUR)
>>       and r1.sha1 != r.sha1
>> );
>>
>> I haven't checked whether the queries are just that amazingly slow or
>> whether they're part of a larger ongoing transaction. In any case, three
>> month-long transactions are pushing the resource limits of the slave and
>> will soon result in either mass replication lag or some other interesting
>> lockup that may in turn take days to roll back :-)
>>
>> Can we kill these? Can we optimize and/or redesign the jobs? Happy to
>> help...
>>
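>> For reference, a minimal sketch of how such queries might be inspected and
>> stopped, assuming a MySQL or MariaDB server with InnoDB and an account
>> holding the PROCESS privilege (plus SUPER or equivalent for KILL). The
>> thread id 12345 is a placeholder, not a value taken from this host.
>>
>> ```sql
>> -- Statements from research_prod that have run longer than ~30 days
>> -- (processlist.time is reported in seconds).
>> SELECT id, user, host, db, time, state, LEFT(info, 80) AS query_start
>> FROM information_schema.processlist
>> WHERE user = 'research_prod'
>>   AND command = 'Query'
>>   AND time > 30 * 24 * 3600
>> ORDER BY time DESC;
>>
>> -- Open InnoDB transactions, oldest first, to see whether a statement
>> -- belongs to a larger ongoing transaction.
>> SELECT trx_mysql_thread_id, trx_started, trx_state, trx_rows_modified
>> FROM information_schema.innodb_trx
>> ORDER BY trx_started;
>>
>> -- Abort only the running statement and keep the connection open.
>> -- 12345 is a placeholder for an id returned by the first query.
>> KILL QUERY 12345;
>> ```
>>
>> KILL QUERY stops the statement but leaves the session alive; a plain KILL
>> would also drop the connection and roll back any open transaction on it.
>>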
>> _______________________________________________
>> Analytics mailing list
>> [email protected]
>> https://lists.wikimedia.org/mailman/listinfo/analytics
>>
>>
>