Thanks for this, Luca. I tend to use stat1007 because I know that machine has a lot of RAM/CPU and HDFS access. For the other stat100X hosts, I'm not sure which of them have which resources (I know at least one of them doesn't have HDFS access). Is there a table where I can look at a summary of resources per machine?
Thanks again.

On Tue, Feb 18, 2020 at 8:53 AM Luca Toscano <[email protected]> wrote:

> Hi everybody!
>
> I created the following doc:
> https://wikitech.wikimedia.org/wiki/Analytics/Tutorials/Analytics_Client_Nodes
>
> It contains two FAQ:
> - How do I ensure that there is enough space on disk before storing big
>   datasets/files ?
> - How do I check the space used by my files/data on stat/notebook hosts ?
>
> Please read them and let me know if anything is not clear or missing. We
> have plenty of space on stat100X hosts, but we tend to cluster on single
> machines like stat1007 for some reason, ending up in fighting for resources.
>
> On a related note, we are going to work on unifying stat/notebook puppet
> configs in https://phabricator.wikimedia.org/T243934, so eventually all
> Analytics clients will be exactly the same.
>
> Thanks!
>
> Luca (on behalf of the Analytics team)
>
>
> _______________________________________________
> Research-Internal mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/research-internal
>
_______________________________________________
Analytics mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/analytics
