Hi,

Our JobTracker and TaskTrackers have been running for some time now. The number of directories in <logs-dir>/userlogs regularly reaches 31999, and from then on jobs fail. After removing some old folders under userlogs, jobs run fine again. We also find a lot of java <defunct> processes on the TaskTracker machines, so I have to restart the JobTracker and TaskTrackers regularly.

Has anyone encountered these issues before? Any help would be appreciated: we are running this cluster in production and would like to solve these issues as soon as possible.
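For reference, the manual cleanup of old folders could be sketched roughly as below. This is only an illustration in Python, not something that ships with Hadoop: the function names `stale_subdirs` and `prune`, the `keep` threshold, and the use of modification time to decide which folders are "old" are all assumptions made for the example.

```python
import os
import shutil


def stale_subdirs(root, keep):
    """Return subdirectories of `root`, oldest first, excluding the `keep` newest.

    "Oldest" is judged by modification time, which is an assumption of this
    sketch; symlinks are skipped so nothing outside `root` is touched.
    """
    dirs = [e.path for e in os.scandir(root) if e.is_dir(follow_symlinks=False)]
    dirs.sort(key=os.path.getmtime)  # oldest modification time first
    return dirs[:max(len(dirs) - keep, 0)]


def prune(root, keep, dry_run=True):
    """Delete stale subdirectories of `root`; with dry_run=True only report them.

    Caution: this does not check whether a folder belongs to a task that is
    still running, so `keep` should be chosen generously.
    """
    stale = stale_subdirs(root, keep)
    if not dry_run:
        for path in stale:
            shutil.rmtree(path)
    return stale
```

Calling `prune(userlogs_path, keep=20000)` first with the default `dry_run=True` lists what would be removed without deleting anything; here `userlogs_path` stands for the userlogs directory and 20000 is an arbitrary example value.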
Many thanks and regards,
Sandhya