Thanks for pulling together these directions, Luca! I did a little clean-up
and will try to remember to do so more routinely.

Adding to what Diego said, I also started using stat1007 because it has the
most access to resources (dumps, Hadoop, MariaDB). My virtual environments,
config files, etc. are now there, so I tend to do all of my work on stat1007
even when the other stat machines would work for a given project. Putting the
GPU on stat1005 helped me diversify a little, but I'm very excited to hear
that the stat machines will be more standardized so that it matters less
which machine I choose. I have no desire to be spread out across the machines
(a few projects on stat1004, a few on stat1005, etc.) because then I'll
certainly lose track of where different projects are, but I would be open to
choosing another host as my "main" workspace.

Best,
Isaac

On Tue, Feb 18, 2020 at 10:53 AM Andrew Otto <[email protected]> wrote:

> I added a 'GPU?' column too. :)  THANKS LUCA!
>
> On Tue, Feb 18, 2020 at 11:51 AM Luca Toscano <[email protected]>
> wrote:
>
>> Hey Diego,
>>
>> I added a section at the end of the page with the info requested. Let me
>> know if anything is missing :)
>>
>> Luca
>>
>> On Tue, Feb 18, 2020 at 5:37 PM Diego Saez-Trumper <[email protected]>
>> wrote:
>>
>>> Thanks for this Luca.
>>>
>>> I tend to use stat1007 because I know that machine has a lot of RAM/CPU
>>> and HDFS access. For the other stat hosts, I'm not sure which of them have
>>> what resources (I know at least one of them doesn't have HDFS access). Is
>>> there a table where I can look at a summary of resources per machine?
>>>
>>> Thanks again.
>>>
>>> On Tue, Feb 18, 2020 at 8:53 AM Luca Toscano <[email protected]>
>>> wrote:
>>>
>>>> Hi everybody!
>>>>
>>>> I created the following doc:
>>>> https://wikitech.wikimedia.org/wiki/Analytics/Tutorials/Analytics_Client_Nodes
>>>>
>>>> It contains two FAQs:
>>>> - How do I ensure that there is enough space on disk before storing big
>>>> datasets/files?
>>>> - How do I check the space used by my files/data on stat/notebook hosts?
>>>>
>>>> Please read them and let me know if anything is not clear or missing.
>>>> We have plenty of space on stat100X hosts, but we tend to cluster on single
>>>> machines like stat1007 for some reason and end up fighting for
>>>> resources.
>>>>
>>>> On a related note, we are going to work on unifying stat/notebook
>>>> puppet configs in https://phabricator.wikimedia.org/T243934, so
>>>> eventually all Analytics clients will be exactly the same.
>>>>
>>>> Thanks!
>>>>
>>>> Luca (on behalf of the Analytics team)
>>>>
>>>>
>>>> _______________________________________________
>>>> Research-Internal mailing list
>>>> [email protected]
>>>> https://lists.wikimedia.org/mailman/listinfo/research-internal
>>>>
>>> _______________________________________________
>>> Analytics mailing list
>>> [email protected]
>>> https://lists.wikimedia.org/mailman/listinfo/analytics
>>>


-- 
Isaac Johnson (he/him/his) -- Research Scientist -- Wikimedia Foundation