No, maybe the Apache mailing lists don't let you send attachments. It's here: http://cl.ly/2z0N04290t143S463s29
Also, I see this repeated a bunch in the logs, not sure if it helps, looks like hadoop is exiting with a non-zero exit code?

I0127 23:34:57.590661 5089 master.cpp:1453] Launching task 316 on slave 201201272320-0-2
I0127 23:34:57.987401 5089 master.cpp:1001] Executor default of framework 201201272320-0-0000 on slave 201201272320-0-2 (ip-10-98-58-126.ec2.internal) exited with status 256
I0127 23:34:57.988973 5089 master.cpp:1033] Removing task 316 of framework 201201272320-0-0000 because of lost executor
I0127 23:34:57.989153 5089 master.cpp:1184] Sending 1 offers to framework 201201272320-0-0000
I0127 23:34:57.990584 5089 master.cpp:679] Received reply for offer 201201272320-0-926
I0127 23:34:57.990675 5089 master.cpp:1453] Launching task 317 on slave 201201272320-0-2
I0127 23:34:58.587996 5089 master.cpp:1184] Sending 4 offers to framework 201201272320-0-0000
I0127 23:34:58.590417 5089 master.cpp:679] Received reply for offer 201201272320-0-927
I0127 23:34:58.590739 5089 master.cpp:1403] Filtered slave 201201272320-0-1 for framework 201201272320-0-0000 for 5 seconds

--
Matthew Rathbone
Foursquare | Software Engineer | Server Engineering Team
[email protected] | @rathboma (http://twitter.com/rathboma) | 4sq (http://foursquare.com/rathboma)

On Friday, January 27, 2012 at 5:36 PM, Andy Konwinski wrote:

> Did you forget to attach the output?
>
> On Fri, Jan 27, 2012 at 3:31 PM, Matthew Rathbone <[email protected]> wrote:
>
> > Here's my output from that (attached, it's long).
> >
> > The regular web-uri :8080 works fine until I submit a job, it can see the
> > hadoop jobtracker and everything, but when I submit a job it goes haywire.
> > I can't see anything obvious in the logs either.
> >
> > This is all I did:
> > start a cluster
> > start a job tracker
> > hadoop fs -put hadoop-examples.jar
> > <mkdirs>
> > hadoop jar hadoop-examples.jar wordcount wordcount/input wordcount/output
> >
> > I figured it might be something to do with MESOS_HOME not being set in
> > hadoop-env.sh, so I set that too (on all machines), but it didn't seem to
> > help.
> >
> > If it helps, the jobtracker is still up, and it received the job, but
> > doesn't see any nodes.
> >
> > --
> > Matthew Rathbone
> > Foursquare | Software Engineer | Server Engineering Team
> > [email protected] | @rathboma <http://twitter.com/rathboma> |
> > 4sq <http://foursquare.com/rathboma>
> >
> > On Friday, January 27, 2012 at 5:20 PM, Andy Konwinski wrote:
> >
> > It looks like a JSON parsing error in the webui python code (i.e. the error
> > output shows line 11 of webui/master/index.tpl which is the json code
> > "state = json.loads(data)").
> >
> > What happens if you go to
> >
> > http://ec2-107-21-195-96.compute-1.amazonaws.com:5050/master/state.json
> >
> > inside the firewall (or open up port 5050 in the EC2 firewall for your
> > machine)?
> >
> > When I do this on my machine locally (before running any frameworks or
> > starting any slaves), I see:
> >
> > {"build_date":"2012-01-25 11:19:19","build_user":"andyk","completed_frameworks":[],"frameworks":[],"id":"201201271511-0","pid":"[email protected]:5050","slaves":[],"start_time":1327705891}
> >
> > Andy
> >
> > On Fri, Jan 27, 2012 at 3:02 PM, Matthew Rathbone <[email protected]> wrote:
> >
> > So I spun up a mesos cluster using the ec2 scripts. So far so good.
> >
> > Then I spun up a jobtracker, that worked (after some fiddling)
> >
> > Then I tried to submit an example job (wordcount).
> >
> > First of all, the job tracker receives the job, but then I get these
> > errors in the terminal:
> > 12/01/27 22:57:12 INFO input.FileInputFormat: Total input paths to process : 0
> > 12/01/27 22:57:13 INFO mapred.JobClient: Running job: job_201201272245_0002
> > 12/01/27 22:57:14 INFO mapred.JobClient: map 0% reduce 0%
> > channel 6: open failed: connect failed: Connection refused
> > channel 7: open failed: connect failed: Connection refused
> > channel 6: open failed: connect failed: Connection refused
> >
> >
> > So I check on the mesos dashboard (port 8080) and I see this:
> > http://cl.ly/221D193v0l012k0h3W0S
> >
> > It doesn't look good, anyone have any pointers? (Sorry for spamming the
> > list so much over the last couple of days)
> >
> > --
> > Matthew Rathbone
> > Foursquare | Software Engineer | Server Engineering Team
> > [email protected] | @rathboma (http://twitter.com/rathboma) | 4sq (http://foursquare.com/rathboma)
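
A note on the "exited with status 256" lines in the master log at the top of this message: if that number is a raw wait(2)-style status word (an assumption, nothing in the thread confirms how the master reports it), then 256 would decode to a normal exit with exit code 1, which fits the guess that the executor is exiting non-zero. A minimal Python sketch of that decoding, using the conventional Linux bit layout; decode_wait_status is a made-up helper name, not part of Mesos or Hadoop:

```python
def decode_wait_status(status):
    """Split a raw wait(2)-style status word into (exit_code, signal_number).

    Assumes the conventional Linux encoding: the low 7 bits hold the
    terminating signal (0 when the process exited on its own) and
    bits 8-15 hold the exit code. Stopped/continued states and the
    core-dump flag are ignored in this sketch.
    """
    signal_number = status & 0x7F
    exit_code = (status >> 8) & 0xFF
    if signal_number:
        # Killed by a signal: there is no meaningful exit code.
        return None, signal_number
    return exit_code, None


# The status value from the master log above.
print(decode_wait_status(256))
```

If the assumption holds, the next place to look would be the executor's own stdout/stderr on the slave, since the master log only records that it exited.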
