Thanks! Always appreciate the work to improve the analytics infrastructure.
I did a quick, unscientific speed comparison by running some comparable queries in parallel on stat1004 and stat1002 (using beeline), and didn't observe a clear difference. But maybe those were times when the load on stat1002 was low anyway. I guess that in that case, the execution time will mostly be determined by the database server that both are connecting to (analytics1003.eqiad.wmnet).

On Mon, May 2, 2016 at 11:50 AM, Andrew Otto <[email protected]> wrote:
> Hi all!
>
> For years now, y’all have been accessing the Analytics Hadoop Cluster using
> stat1002. This works just fine, but others use stat1002 for number
> crunching outside of Hadoop as well. At times stat1002 can get pretty
> overloaded, which can make accessing Hadoop via this one box a little
> annoying.
>
> But fret no longer! stat1004 is here! stat1004 can now be accessed by
> anyone in the analytics-privatedata-users and analytics-users groups. If
> you previously had access to stat1002 AND used it to talk to Hive and
> Hadoop, you may now also do this from stat1004. You don’t have to do
> anything new to get access to stat1004 if you already had Hadoop accounts.
>
> stat1002 will remain useable as is. If you are looking for a more dedicated
> place from which to interact with Hadoop services, use stat1004 instead.
>
> You don’t have to do anything to get access.
>
> I’ve updated the wikitech documentation accordingly. Let us know if you
> have any questions!
>
> -Andrew
>
> _______________________________________________
> Analytics mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/analytics
>

--
Tilman Bayer
Senior Analyst
Wikimedia Foundation
IRC (Freenode): HaeB
