On Apr 8, 2010, at 9:37 AM, stephen mulcahy wrote:

> When I run this on the Debian 2.6.32 kernel - over the course of the run, 1
> or 2 datanodes of the cluster enter a state whereby they are no longer
> responsive to network traffic.

How much free memory do you have? How many tasks per node do you have? What are the service times, etc., on your IO system?

> Has anyone run into similar problems with their environments? I noticed that
> the when the nodes become unresponsive, it often happens when the TeraSort is
> at

I've always seen Linux nodes go unresponsive when they get memory-starved to the point that the OOM killer can't function because it can't allocate enough memory.
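
For the free-memory question, something like the following could be run on a datanode while the job is going. It is only a rough sketch: it assumes a Linux /proc/meminfo, the helper names (parse_meminfo, reclaimable_kb) and the SAMPLE text are made up for illustration, and MemFree + Buffers + Cached is only a crude estimate of what the kernel can still hand out.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values.

    Most fields are in kB; a few (e.g. HugePages_Total) are plain counts.
    """
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Key: value"
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def reclaimable_kb(info):
    """Rough estimate: free memory plus buffers and page cache, in kB."""
    return info.get("MemFree", 0) + info.get("Buffers", 0) + info.get("Cached", 0)


# Illustrative sample only, not taken from any real node.
SAMPLE = """MemTotal: 8000000 kB
MemFree: 120000 kB
Buffers: 30000 kB
Cached: 450000 kB
SwapTotal: 2000000 kB
SwapFree: 1500000 kB
"""

if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            text = f.read()
    except OSError:
        text = SAMPLE  # fall back to the sample off Linux
    info = parse_meminfo(text)
    print("MemFree kB:", info.get("MemFree"))
    print("Free + buffers + cache kB:", reclaimable_kb(info))
    print("SwapFree kB:", info.get("SwapFree"))
```

If the second number keeps dropping toward zero and swap fills up shortly before a node stops answering, that would point at memory starvation rather than the network.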