Let's kill them (Leila is OoO today and tomorrow).

> On Nov 10, 2014, at 08:01, Nuria Ruiz <[email protected]> wrote:
> 
> cc-ing leila as we were experimenting with these some weeks back in SF, I 
> think they can be killed w/o problems. I did not know they were still 
> running, we run a faster version of those queries and got the data we were 
> interested in a while back.
> 
>> On Mon, Nov 10, 2014 at 1:55 AM, Sean Pringle <[email protected]> wrote:
>> Three identical queries from the 'research_prod' user have just passed one 
>> month execution time on s1-anlytics-slave:
>> 
>> select count(*) 
>> from staging.ourvision r
>> where exists (
>>   select *
>>   from staging.ourvision r1
>>           inner join
>>        staging.ourvision r2     
>>           on r2.sha1 = r1.sha1
>>     where r1.page_id = r.page_id
>>       and r2.page_id = r.page_id
>>       and DATE_ADD(r.timestamp, INTERVAL 1 HOUR)
>>       and r2.timestamp between r.timestamp and DATE_SUB(r.timestamp , 
>> INTERVAL 1 HOUR) 
>>       and  r1.sha1!= r.sha1
>> );
>> 
>> I havn't checked to see if the queries are just that amazingly slow, or if 
>> they're part of a larger ongoing transaction. In any case, three month-long 
>> transactions is pushing the resource limits of the slave and will soon 
>> result in either mass replication lag or some other interesting lockup that 
>> may in turn take days to rollback :-)
>> 
>> Can we kill these? Can we optimize and/or redesign the jobs? Happy to help...
>> 
>> _______________________________________________
>> Analytics mailing list
>> [email protected]
>> https://lists.wikimedia.org/mailman/listinfo/analytics
> 
> _______________________________________________
> Analytics mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/analytics
_______________________________________________
Analytics mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/analytics

Reply via email to