Awesome - thanks for the communication and quick turnaround. 

-Toby


> On Feb 26, 2015, at 14:08, Andrew Otto <[email protected]> wrote:
> 
> Hey everybody!  Things are better now!  Cluster is caught up.  We’ve also put 
> in place some fancier queuing to ensure that production jobs aren’t bogged 
> down.
> 
> ALSO:  The webrequest table now has some new fields!  client_ip, 
> geocoded_data and record_version.  WooT!  This data will only be filled in 
> for new partitions.  It should be present for everything beginning at 
> 2015-02-26T18:00.  Anything before that will not have these fields.  Also 
> note that you can no longer use SELECT * on data older than this.  This is a 
> technical consequence of the way we import the new data.
> 
> Thanks so much Christian and Joseph!
> 
> -Ao
> 
> 
>> On Feb 26, 2015, at 12:13, Toby Negrin <[email protected]> wrote:
>> 
>> Thank you Christian!
>> 
>> On Wed, Feb 25, 2015 at 5:18 PM, Christian Aistleitner 
>> <[email protected]> wrote:
>>> Hi,
>>> 
>>> just a quick heads-up that the Analytics cluster got stuck today, and
>>> jobs deadlocked themselves waiting for other jobs to free resources.
>>> 
>>> For the time being, to allow the cluster to catch up for the missed
>>> hours, I suspended the refining jobs.
>>> 
>>> This gives the cluster enough resources to catch up with importing the
>>> Kafka data that it missed during the day.
>>> 
>>> But this also means that the datasets:
>>>   pagecounts-all-sites,
>>>   pagecounts-raw,
>>>   legacy_tsvs
>>> will fall behind a bit, and the wmf.webrequest data will not see new
>>> data while the cluster is catching up.
>>> 
>>> Tomorrow, in the European morning when the cluster has caught up, I'll
>>> enable refining again, and the datasets should catch up again.
>>> 
>>> Sorry for the inconvenience,
>>> Christian
>>> 
>>> 
>>> P.S.: Suspending refining looks a bit drastic. But if we only killed
>>> the resource-hungry jobs without stopping refining, refining would
>>> start while Camus is still catching up and produce faulty datasets.
>>> Hence, we suspended refining for now. Tomorrow, we'll resume the
>>> suspended jobs and have the datasets catch up again.
>>> 
>>> P.P.S.: If you have resource-hungry jobs on the Analytics cluster,
>>> please wait until tomorrow to run them if possible.
>>> 
>>> --
>>> ---- quelltextlich e.U. ---- \\ ---- Christian Aistleitner ----
>>>                            Companies' registry: 360296y in Linz
>>> Christian Aistleitner
>>> Kefermarkterstrasze 6a/3     Email:  [email protected]
>>> 4293 Gutau, Austria          Phone:          +43 7946 / 20 5 81
>>>                              Fax:            +43 7946 / 20 5 81
>>>                              Homepage: http://quelltextlich.at/
>>> ---------------------------------------------------------------
>>> 
>>> _______________________________________________
>>> Analytics mailing list
>>> [email protected]
>>> https://lists.wikimedia.org/mailman/listinfo/analytics
>> 
> 