Hi Randy,

How many cores do you have on your machines, and how many did you allocate to YARN?
Daniel

On Saturday, 23 January 2016, Randy Fox <[email protected]> wrote:
> Hi,
>
> We just upgraded to using YARN on Hadoop 2.6.0 – CDH5.4.5.
> We are running a large job – 200K mappers, 100K reducers – and we can’t get
> through the shuffle phase. The node managers are at 800% CPU with high GC.
> The reducers get socket timeouts after 1.5 hours of running, having fetched only
> a few percent of the data from the mappers. This job took about 30 hours
> total, 12 of them in the mappers, on MRv1 with no issues.
>
> I have looked for configs that might help, for filed issues, and for anyone who
> has seen this, and I have come up with nothing.
> Does anyone have ideas on things to try, or can anyone explain why the node managers are in
> GC hell and why the data is just not flowing from mappers to reducers?
>
> Thanks in advance,
>
> Randy
