Great job, Luca. Thank you very much. I have started to diversify all WMDE Analytics jobs (mainly Wikidata-related things) across the stat100* machines. While I still mainly use stat1007, two modules of the WDCM <https://wikitech.wikimedia.org/wiki/Wikidata_Concepts_Monitor> system have already been migrated to stat1004.

Best,
Goran

Goran S. Milovanović, PhD
Data Scientist, Software Department
Wikimedia Deutschland
------------------------------------------------
"It's not the size of the dog in the fight,
it's the size of the fight in the dog." - Mark Twain
------------------------------------------------

On Wed, Feb 19, 2020 at 4:33 AM Neil Shah-Quinn <[email protected]> wrote:

> Thank you very much, Luca!
>
> To make this nice documentation easier to discover, I moved it to
> Analytics/Systems/Clients
> <https://wikitech.wikimedia.org/wiki/Analytics/Systems/Clients> along
> with the other information on the clients from Analytics/Data access.
>
> On Tue, 18 Feb 2020 at 17:11, Isaac Johnson <[email protected]> wrote:
>
>> Thanks for pulling together these directions Luca! I did a little
>> clean-up and will try to remember to do so more routinely.
>>
>> Adding to what Diego said, I also started using stat1007 because it has
>> the most access to resources (dumps, Hadoop, MariaDB), and then my virtual
>> environments, config files, etc. are there and so I tend to do all of my
>> work on stat1007 even when the other stat machines might work for other
>> projects. Putting the GPU on stat1005 helped me diversify a little but I'm
>> very excited to hear that the stat machines will be more standardized so it
>> matters less which machine I choose. While I have no desire to be spread
>> out across the machines (a few projects on stat1004, a few on stat1005,
>> etc.) because then I'll certainly lose track of where different projects
>> are, I would be open to trying to choose another host as my "main"
>> workspace.
>>
>> Best,
>> Isaac
>>
>> On Tue, Feb 18, 2020 at 10:53 AM Andrew Otto <[email protected]> wrote:
>>
>>> I added a 'GPU?' column too. :) THANKS LUCA!
>>>
>>> On Tue, Feb 18, 2020 at 11:51 AM Luca Toscano <[email protected]>
>>> wrote:
>>>
>>>> Hey Diego,
>>>>
>>>> added a section at the end of the page with the info requested, let me
>>>> know if anything is missing :)
>>>>
>>>> Luca
>>>>
>>>> On Tue, 18 Feb 2020 at 17:37, Diego Saez-Trumper <
>>>> [email protected]> wrote:
>>>>
>>>>> Thanks for this Luca.
>>>>>
>>>>> I tend to use stat1007 because I know that machine has a lot of
>>>>> ram/cpu and HDFS access. From other statsX I'm not sure which of them have
>>>>> what resources (I know at least one of them doesn't have HDFS access).
>>>>> There is a table where I can look at a summary of resources per machine?
>>>>>
>>>>> Thanks again.
>>>>>
>>>>> On Tue, Feb 18, 2020 at 8:53 AM Luca Toscano <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Hi everybody!
>>>>>>
>>>>>> I created the following doc:
>>>>>> https://wikitech.wikimedia.org/wiki/Analytics/Tutorials/Analytics_Client_Nodes
>>>>>>
>>>>>> It contains two FAQ:
>>>>>> - How do I ensure that there is enough space on disk before storing
>>>>>> big datasets/files ?
>>>>>> - How do I check the space used by my files/data on stat/notebook
>>>>>> hosts ?
>>>>>>
>>>>>> Please read them and let me know if anything is not clear or missing.
>>>>>> We have plenty of space on stat100X hosts, but we tend to cluster on
>>>>>> single machines like stat1007 for some reason, ending up in fighting for
>>>>>> resources.
>>>>>>
>>>>>> On a related note, we are going to work on unifying stat/notebook
>>>>>> puppet configs in https://phabricator.wikimedia.org/T243934, so
>>>>>> eventually all Analytics clients will be exactly the same.
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>> Luca (on behalf of the Analytics team)
>>>>>>
>>>>>> _______________________________________________
>>>>>> Research-Internal mailing list
>>>>>> [email protected]
>>>>>> https://lists.wikimedia.org/mailman/listinfo/research-internal
>>>>>>
>>>>> _______________________________________________
>>>>> Analytics mailing list
>>>>> [email protected]
>>>>> https://lists.wikimedia.org/mailman/listinfo/analytics
>>>>>
>>
>> --
>> Isaac Johnson (he/him/his) -- Research Scientist -- Wikimedia Foundation
>>
