Re: [Analytics] Resources stat1005

2017-08-14 Thread Luca Toscano
Hi Adrian,

You should open a Phabricator task like the following one:
https://phabricator.wikimedia.org/T158053 to get into the nda LDAP group
(if you really need it, as Nuria mentioned :).

Luca

___
Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics


Re: [Analytics] Resources stat1005

2017-08-14 Thread Nuria Ruiz
Adrian,

You already have access to use the cluster, which is where you should move
your processing; the link to YARN was just to show resource consumption.

Thanks,

Nuria

___
Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics


Re: [Analytics] Resources stat1005

2017-08-12 Thread Adrian Bielefeldt
Hi Andrew,

thanks for the advice. Quick follow-up question: Which kind of
access/account do I need for yarn.wikimedia.org? Neither my MediaWiki
account nor my Wikitech account works.

Greetings,

Adrian



___
Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics


Re: [Analytics] Resources stat1005

2017-08-12 Thread Andrew Otto
I have only 2 comments:

1. Please nice any heavy, long-running local processes, so that others can
continue to use the machine (see the example below).

2. For large data, consider using the Hadoop cluster!  I think you are
getting your data from the webrequest logs in Hadoop anyway, so you might
as well continue to do processing there, no?  If you do, you shouldn’t have
to worry (too much) about resource contention:
https://yarn.wikimedia.org/cluster/scheduler

:)

- Andrew Otto
  Systems Engineer, WMF




>
___
Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics


Re: [Analytics] Resources stat1005

2017-08-12 Thread Erik Zachte
I will soon start the two Wikistats jobs, which run for several weeks each
month.

They might use two cores each: one for unzip, one for perl.
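
For instance (file and script names here are made up, not the real
Wikistats inputs), a pipeline along these lines runs both stages
concurrently, so each gets roughly one core:

    # hypothetical names; unzip -p streams the archive to stdout for perl
    nice -n 10 unzip -p pagecounts.zip | nice -n 10 perl wikistats_report.pl > report.csv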

How many cores are there anyway?

 

Cheers,

Erik

 

From: Analytics [mailto:analytics-boun...@lists.wikimedia.org] On Behalf Of 
Adrian Bielefeldt
Sent: Saturday, August 12, 2017 19:44
To: analytics@lists.wikimedia.org
Subject: [Analytics] Resources stat1005

 

Hello everyone,

I wanted to ask about resource allocation on stat1005. We need quite a bit,
since we process every entry in wdqs_extract, and I was wondering how many
cores and how much memory we can use without conflicting with anyone else.

Greetings,

Adrian

___
Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics