We didn't change anything on the cluster end. The company that hosts our cluster did a BGP update (whatever that means) and a full reset (I think just a reboot of the switches).
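A rough sketch, assuming the jobtracker log format quoted below, of how one could measure the gaps between consecutive "has split on node" lines for a single job. The function names (parse_ts, split_gaps) and the SAMPLE constant are made up for illustration; the three sample lines are copied from the quoted log for job 201107141056_0005.

```python
from datetime import datetime

# Three lines for job 201107141056_0005, copied from the quoted jobtracker log.
SAMPLE = [
    "2011-07-14 14:42:51,434 INFO org.apache.hadoop.mapred.JobInProgress: "
    "tip:task_201107141056_0005_m_002488 has split on node:/default-rack/x.com",
    "2011-07-14 14:42:56,465 INFO org.apache.hadoop.mapred.JobInProgress: "
    "tip:task_201107141056_0005_m_002489 has split on node:/default-rack/x.com",
    "2011-07-14 14:43:01,490 INFO org.apache.hadoop.mapred.JobInProgress: "
    "tip:task_201107141056_0005_m_002489 has split on node:/default-rack/x.com",
]


def parse_ts(line):
    """Parse the leading 'YYYY-MM-DD HH:MM:SS,mmm' timestamp of a log line."""
    return datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S,%f")


def split_gaps(lines, job_id):
    """Return seconds between consecutive split-info lines for one job."""
    stamps = [
        parse_ts(line)
        for line in lines
        if "has split on node" in line and "task_%s_" % job_id in line
    ]
    return [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]


if __name__ == "__main__":
    for gap in split_gaps(SAMPLE, "201107141056_0005"):
        print("gap: %.3f s" % gap)
```

Running the same function over the full log, once per job id, would show whether the delay between split lines is steady or concentrated in a few long pauses.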
On Thu, Jul 14, 2011 at 1:27 PM, Robert Evans <ev...@yahoo-inc.com> wrote:
> Felix,
>
> So did you change anything except the network configuration? What did you
> do to fix the “networking issues”?
>
> --Bobby Evans
>
>
> On 7/14/11 2:46 PM, "felix gao" <gre1...@gmail.com> wrote:
>
> Recently we had some network issues with our cluster. This job used to
> take only a few minutes to complete and now it is taking over half an hour.
>
> When looking at the jobtracker's log I see it slowly getting all the split
> information (the list is not exhaustive):
> 2011-07-14 14:42:51,434 INFO org.apache.hadoop.mapred.JobInProgress: tip:task_201107141056_0005_m_002488 has split on node:/default-rack/x.com
> 2011-07-14 14:42:56,465 INFO org.apache.hadoop.mapred.JobInProgress: tip:task_201107141056_0005_m_002489 has split on node:/default-rack/x.com
> 2011-07-14 14:43:01,446 INFO org.apache.hadoop.mapred.JobInProgress: tip:task_201107141056_0019_m_000218 has split on node:/default-rack/x.com
> 2011-07-14 14:43:01,466 INFO org.apache.hadoop.mapred.JobInProgress: tip:task_201107141056_0010_m_001703 has split on node:/default-rack/x.com
> 2011-07-14 14:43:01,490 INFO org.apache.hadoop.mapred.JobInProgress: tip:task_201107141056_0005_m_002489 has split on node:/default-rack/x.com
> 2011-07-14 14:43:06,469 INFO org.apache.hadoop.mapred.JobInProgress: tip:task_201107141056_0010_m_001703 has split on node:/default-rack/x.com
> 2011-07-14 14:43:06,473 INFO org.apache.hadoop.mapred.JobInProgress: tip:task_201107141056_0019_m_000218 has split on node:/default-rack/x.com
> 2011-07-14 14:43:06,473 INFO org.apache.hadoop.mapred.JobInProgress: tip:task_201107141056_0019_m_000219 has split on node:/default-rack/x.com
> 2011-07-14 14:43:06,473 INFO org.apache.hadoop.mapred.JobInProgress: tip:task_201107141056_0019_m_000219 has split on node:/default-rack/x.com
> 2011-07-14 14:43:11,500 INFO org.apache.hadoop.mapred.JobInProgress: tip:task_201107141056_0019_m_000220 has split on node:/default-rack/x.com
> 2011-07-14 14:43:11,542 INFO org.apache.hadoop.mapred.JobInProgress: tip:task_201107141056_0005_m_002491 has split on node:/default-rack/x.com
> 2011-07-14 14:43:16,526 INFO org.apache.hadoop.mapred.JobInProgress: tip:task_201107141056_0019_m_000224 has split on node:/default-rack/x.com
> 2011-07-14 14:43:16,526 INFO org.apache.hadoop.mapred.JobInProgress: tip:task_201107141056_0019_m_000225 has split on node:/default-rack/x.com
> 2011-07-14 14:43:16,567 INFO org.apache.hadoop.mapred.JobInProgress: tip:task_201107141056_0005_m_002491 has split on node:/default-rack/x.com
> 2011-07-14 14:45:26,791 INFO org.apache.hadoop.mapred.JobInProgress: tip:task_201107141056_0025_m_000001 has split on node:/default-rack/x.com
> 2011-07-14 14:45:28,696 INFO org.apache.hadoop.mapred.JobInProgress: tip:task_201107141056_0005_m_002509 has split on node:/default-rack/x.com
> 2011-07-14 14:45:31,770 INFO org.apache.hadoop.mapred.JobInProgress: tip:task_201107141056_0010_m_001722 has split on node:/default-rack/x.com
> 2011-07-14 14:45:31,815 INFO org.apache.hadoop.mapred.JobInProgress: tip:task_201107141056_0025_m_000002 has split on node:/default-rack/x.com
>
>
> 250 mappers took about 25 min to run, with 10 min spent on generating the
> tasks. The question is: what could have caused this slowdown?
>
> Thanks,
>
> Felix