Great job Luca. Thank you very much.

I have started to spread all WMDE Analytics jobs (mainly Wikidata-related
ones) across the stat100* machines.
While I still mainly use stat1007, two modules of the WDCM
<https://wikitech.wikimedia.org/wiki/Wikidata_Concepts_Monitor> system have
already been migrated to stat1004.

Best,
Goran

Goran S. Milovanović, PhD
Data Scientist, Software Department
Wikimedia Deutschland

------------------------------------------------
"It's not the size of the dog in the fight,
it's the size of the fight in the dog."
- Mark Twain
------------------------------------------------


On Wed, Feb 19, 2020 at 4:33 AM Neil Shah-Quinn <[email protected]>
wrote:

> Thank you very much, Luca!
>
> To make this nice documentation easier to discover, I moved it to
> Analytics/Systems/Clients
> <https://wikitech.wikimedia.org/wiki/Analytics/Systems/Clients> along
> with the other information on the clients from Analytics/Data access.
>
> On Tue, 18 Feb 2020 at 17:11, Isaac Johnson <[email protected]> wrote:
>
>> Thanks for pulling together these directions Luca! I did a little
>> clean-up and will try to remember to do so more routinely.
>>
>> Adding to what Diego said, I also started using stat1007 because it has
>> the most access to resources (dumps, Hadoop, MariaDB), and then my virtual
>> environments, config files, etc. are there and so I tend to do all of my
>> work on stat1007 even when the other stat machines might work for other
>> projects. Putting the GPU on stat1005 helped me diversify a little but I'm
>> very excited to hear that the stat machines will be more standardized so it
>> matters less which machine I choose. While I have no desire to be spread
>> out across the machines (a few projects on stat1004, a few on stat1005,
>> etc.) because then I'll certainly lose track of where different projects
>> are, I would be open to trying to choose another host as my "main"
>> workspace.
>>
>> Best,
>> Isaac
>>
>> On Tue, Feb 18, 2020 at 10:53 AM Andrew Otto <[email protected]> wrote:
>>
>>> I added a 'GPU?' column too. :)  THANKS LUCA!
>>>
>>> On Tue, Feb 18, 2020 at 11:51 AM Luca Toscano <[email protected]>
>>> wrote:
>>>
>>>> Hey Diego,
>>>>
>>>> I added a section at the end of the page with the requested info; let me
>>>> know if anything is missing :)
>>>>
>>>> Luca
>>>>
>>>> On Tue, Feb 18, 2020 at 5:37 PM Diego Saez-Trumper <
>>>> [email protected]> wrote:
>>>>
>>>>> Thanks for this Luca.
>>>>>
>>>>> I tend to use stat1007 because I know that machine has a lot of
>>>>> RAM/CPU and HDFS access. For the other stat machines, I'm not sure which
>>>>> resources each one has (I know at least one of them doesn't have HDFS
>>>>> access). Is there a table where I can look at a summary of resources per
>>>>> machine?
>>>>>
>>>>> Thanks again.
>>>>>
>>>>> On Tue, Feb 18, 2020 at 8:53 AM Luca Toscano <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Hi everybody!
>>>>>>
>>>>>> I created the following doc:
>>>>>> https://wikitech.wikimedia.org/wiki/Analytics/Tutorials/Analytics_Client_Nodes
>>>>>>
>>>>>> It contains two FAQs:
>>>>>> - How do I ensure that there is enough space on disk before storing
>>>>>> big datasets/files?
>>>>>> - How do I check the space used by my files/data on stat/notebook
>>>>>> hosts?
>>>>>>
>>>>>> Please read them and let me know if anything is not clear or missing.
>>>>>> We have plenty of space on stat100X hosts, but we tend to cluster on
>>>>>> single machines like stat1007 for some reason and end up fighting for
>>>>>> resources.
>>>>>>
>>>>>> On a related note, we are going to work on unifying stat/notebook
>>>>>> puppet configs in https://phabricator.wikimedia.org/T243934, so
>>>>>> eventually all Analytics clients will be exactly the same.
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>> Luca (on behalf of the Analytics team)
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> Research-Internal mailing list
>>>>>> [email protected]
>>>>>> https://lists.wikimedia.org/mailman/listinfo/research-internal
>>>>>>
>>>>> _______________________________________________
>>>>> Analytics mailing list
>>>>> [email protected]
>>>>> https://lists.wikimedia.org/mailman/listinfo/analytics
>>>>>
>>>>
>>>
>>
>>
>> --
>> Isaac Johnson (he/him/his) -- Research Scientist -- Wikimedia Foundation
>>
>
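
The two FAQ questions in Luca's original message (is there enough space on
disk before storing big datasets, and how much space do my files use) can be
sketched in Python as follows. This is only a minimal illustration using the
standard library, not the procedure from the wiki page, which is not
reproduced in this thread. The function names, the 10% reserve default, and
the example path are assumptions made up for this sketch.

```python
import os
import shutil


def free_bytes(path: str) -> int:
    """Return the bytes available on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free


def has_room_for(path: str, needed_bytes: int, reserve_fraction: float = 0.1) -> bool:
    """True if `needed_bytes` fits while keeping `reserve_fraction` of the disk free.

    The reserve is a hypothetical safety margin, not an Analytics team policy.
    """
    usage = shutil.disk_usage(path)
    reserve = int(usage.total * reserve_fraction)
    return usage.free - needed_bytes >= reserve


def dir_size_bytes(root: str) -> int:
    """Sum the apparent sizes of regular files under `root` (symlinks skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                if not os.path.islink(full):
                    total += os.path.getsize(full)
            except OSError:
                # The file vanished or is unreadable; leave it out of the total.
                pass
    return total


if __name__ == "__main__":
    home = os.path.expanduser("~")  # example path only
    print("free bytes on the home filesystem:", free_bytes(home))
    print("room for a 50 GB dataset:", has_room_for(home, 50 * 1024**3))
    print("bytes used under home:", dir_size_bytes(home))
```

Note that dir_size_bytes adds up apparent file sizes, so its result can differ
from what block-based tools report for sparse files or small files.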
