This is awesome! Thank you team!

On Tue, Feb 25, 2020 at 7:35 AM Goran Milovanovic <goran.milovanovic_...@wikimedia.de> wrote:

> Great job Luca. Thank you very much.
>
> I have started to diversify all WMDE Analytics jobs (mainly Wikidata
> related things) across the stat100* machines.
> While I still mainly use stat1007, two modules of the WDCM
> <https://wikitech.wikimedia.org/wiki/Wikidata_Concepts_Monitor> system
> are already migrated to stat1004.
>
> Best,
> Goran
>
> Goran S. Milovanović, PhD
> Data Scientist, Software Department
> Wikimedia Deutschland
>
> ------------------------------------------------
> "It's not the size of the dog in the fight,
> it's the size of the fight in the dog."
> - Mark Twain
> ------------------------------------------------
>
> On Wed, Feb 19, 2020 at 4:33 AM Neil Shah-Quinn <nshahqu...@wikimedia.org> wrote:
>
>> Thank you very much, Luca!
>>
>> To make this nice documentation easier to discover, I moved it to
>> Analytics/Systems/Clients
>> <https://wikitech.wikimedia.org/wiki/Analytics/Systems/Clients> along
>> with the other information on the clients from Analytics/Data access.
>>
>> On Tue, 18 Feb 2020 at 17:11, Isaac Johnson <is...@wikimedia.org> wrote:
>>
>>> Thanks for pulling together these directions Luca! I did a little
>>> clean-up and will try to remember to do so more routinely.
>>>
>>> Adding to what Diego said, I also started using stat1007 because it has
>>> the most access to resources (dumps, Hadoop, MariaDB), and then my virtual
>>> environments, config files, etc. are there, so I tend to do all of my
>>> work on stat1007 even when the other stat machines might work for other
>>> projects. Putting the GPU on stat1005 helped me diversify a little, but I'm
>>> very excited to hear that the stat machines will be more standardized so it
>>> matters less which machine I choose. While I have no desire to be spread
>>> out across the machines (a few projects on stat1004, a few on stat1005,
>>> etc.) because then I'll certainly lose track of where different projects
>>> are, I would be open to trying to choose another host as my "main"
>>> workspace.
>>>
>>> Best,
>>> Isaac
>>>
>>> On Tue, Feb 18, 2020 at 10:53 AM Andrew Otto <o...@wikimedia.org> wrote:
>>>
>>>> I added a 'GPU?' column too. :) THANKS LUCA!
>>>>
>>>> On Tue, Feb 18, 2020 at 11:51 AM Luca Toscano <ltosc...@wikimedia.org> wrote:
>>>>
>>>>> Hey Diego,
>>>>>
>>>>> I added a section at the end of the page with the info requested; let me
>>>>> know if anything is missing :)
>>>>>
>>>>> Luca
>>>>>
>>>>> On Tue, Feb 18, 2020 at 17:37, Diego Saez-Trumper <di...@wikimedia.org> wrote:
>>>>>
>>>>>> Thanks for this Luca.
>>>>>>
>>>>>> I tend to use stat1007 because I know that machine has a lot of
>>>>>> RAM/CPU and HDFS access. For the other statX machines, I'm not sure
>>>>>> which of them have what resources (I know at least one of them
>>>>>> doesn't have HDFS access). Is there a table where I can look at a
>>>>>> summary of resources per machine?
>>>>>>
>>>>>> Thanks again.
>>>>>>
>>>>>> On Tue, Feb 18, 2020 at 8:53 AM Luca Toscano <ltosc...@wikimedia.org> wrote:
>>>>>>
>>>>>>> Hi everybody!
>>>>>>>
>>>>>>> I created the following doc:
>>>>>>> https://wikitech.wikimedia.org/wiki/Analytics/Tutorials/Analytics_Client_Nodes
>>>>>>>
>>>>>>> It contains two FAQs:
>>>>>>> - How do I ensure that there is enough space on disk before storing
>>>>>>> big datasets/files?
>>>>>>> - How do I check the space used by my files/data on stat/notebook
>>>>>>> hosts?
>>>>>>>
>>>>>>> Please read them and let me know if anything is not clear or
>>>>>>> missing. We have plenty of space on stat100X hosts, but we tend to
>>>>>>> cluster on single machines like stat1007 for some reason, and end up
>>>>>>> fighting for resources.
>>>>>>>
>>>>>>> On a related note, we are going to work on unifying stat/notebook
>>>>>>> puppet configs in https://phabricator.wikimedia.org/T243934, so
>>>>>>> eventually all Analytics clients will be exactly the same.
>>>>>>>
>>>>>>> Thanks!
>>>>>>>
>>>>>>> Luca (on behalf of the Analytics team)
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Research-Internal mailing list
>>>>>>> research-inter...@lists.wikimedia.org
>>>>>>> https://lists.wikimedia.org/mailman/listinfo/research-internal
>>>>>>>
>>>>>> _______________________________________________
>>>>>> Analytics mailing list
>>>>>> Analytics@lists.wikimedia.org
>>>>>> https://lists.wikimedia.org/mailman/listinfo/analytics
>>>>>>
>>>
>>> --
>>> Isaac Johnson (he/him/his) -- Research Scientist -- Wikimedia Foundation
>>>

--
*CherRaye Glenn (she/her)*
Audience Insights Analyst
Wikimedia Foundation <https://wikimediafoundation.org/>