Phew, ok, things did go wrong!  We ran into a couple of recently introduced 
bugs in YARN and in Hive, and it took us a while to find workarounds.  Jobs 
are flowing through the cluster again.  However, they are lagging behind 
because they weren’t able to run all day.  They should eventually catch up.  
For now, the cluster is back open for business, but I’d appreciate it if no 
one ran any heavy jobs until tomorrow.

Also, it is still possible we may run into other issues we haven’t yet seen, so 
I can’t guarantee that I won’t have to restart things again.


Anyway, aside from those hiccups, CDH 5.4.0 is now installed, and Hive 1.1 and 
Spark 1.3.0 are now available, weeeeee!

-Ao


> On May 4, 2015, at 11:05, Andrew Otto <[email protected]> wrote:
> 
> Hi all, as a reminder, I will be doing this upgrade today.  Within the next 
> hour I will turn off the Hadoop cluster.  Please do not attempt to use it 
> until I notify you that it is back.
> 
> Thanks!
> -AO
> 
> 
> 
>> On Apr 29, 2015, at 14:57, Robert West <[email protected]> wrote:
>> 
>> All good!
>> 
>> On Wed, Apr 29, 2015 at 11:35 AM, Aaron Halfaker
>> <[email protected]> wrote:
>>> + the right research list  (Andrew, remove wmfresearch@ from your contact
>>> list :P )
>>> 
>>> All looks good to me.  Thanks. :)
>>> 
>>> On Wed, Apr 29, 2015 at 1:11 PM, Leila Zia <[email protected]> wrote:
>>>> 
>>>> FYI
>>>> 
>>>> Ashwin, Bob, Ellery, I don't anticipate this having a negative impact on
>>>> our workflow. If you see possible issues, please communicate with Andrew
>>>> (cc-ing me), or let me know and I will communicate. Thanks!
>>>> 
>>>> 
>>>> ---------- Forwarded message ----------
>>>> From: Andrew Otto <[email protected]>
>>>> Date: Wed, Apr 29, 2015 at 11:05 AM
>>>> Subject: [wmfresearch] Hadoop Cluster Downtime
>>>> To: Operations Engineers <[email protected]>, "A mailing list for
>>>> the Analytics Team at WMF and everybody who has an interest in Wikipedia and
>>>> analytics." <[email protected]>,
>>>> "[email protected] Research" <[email protected]>
>>>> 
>>>> 
>>>> Hi all!
>>>> 
>>>> CDH 5.4 is out[1] and we’d like to upgrade.  We are doing this now, rather
>>>> than later, because there is an important Parquet/Hive related bug that has
>>>> been fixed in this version[2].  This upgrade will include Spark 1.3, which
>>>> should make at least one researcher happy.
>>>> 
>>>> To do this upgrade, I need to schedule some downtime for Hadoop.  I’d like
>>>> to do this on Monday May 4th.  I expect the upgrade to take me no more than
>>>> an hour or two, but just to be safe I’d like to schedule the downtime for
>>>> the whole day.
>>>> 
>>>> If anyone has critical things that they absolutely have to run on Monday,
>>>> let me know now and I will find another day.
>>>> 
>>>> Thanks!
>>>> -Ao
>>>> 
>>>> [1]
>>>> http://blog.cloudera.com/blog/2015/04/cloudera-enterprise-5-4-is-released/
>>>> [2] https://issues.apache.org/jira/browse/HIVE-9482
>>>> 
>>>> 
>>>> 
>>>> _______________________________________________
>>>> wmfresearch mailing list
>>>> [email protected]
>>>> https://lists.wikimedia.org/mailman/listinfo/wmfresearch
>>>> 
>>>> 
>>>> 
>>>> _______________________________________________
>>>> Research-Internal mailing list
>>>> [email protected]
>>>> https://lists.wikimedia.org/mailman/listinfo/research-internal
>>>> 
>>> 
>>> 
>>> 
>> 
>> 
>> 
>> -- 
>> Up for a little language game? -- http://www.unfun.me
>> 
>> _______________________________________________
>> Ops mailing list
>> [email protected]
>> https://lists.wikimedia.org/mailman/listinfo/ops
> 


_______________________________________________
Analytics mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/analytics