I think you could export HADOOP_LOG_DIR=/tmp as a temporary workaround and try again.

On Fri, May 8, 2015 at 3:43 PM, Brian Topping <[email protected]> wrote:
> Mesos runs as root, Hadoop runs as a separate user.
>
> On May 8, 2015, at 2:41 PM, haosdent <[email protected]> wrote:
>
> Do you run everything as root?
>
> On Fri, May 8, 2015 at 3:38 PM, haosdent <[email protected]> wrote:
>
>> It seems you don't have permission for this directory:
>>
>> java.io.IOException: Could not create job user log directory:
>> file:/usr/lib/hadoop/logs/userlogs/job_201505080220_0001
>>
>> at javax.security.auth.Subject.doAs(Subject.java:415)
>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>>
>> On Fri, May 8, 2015 at 3:32 PM, Brian Topping <[email protected]> wrote:
>>
>>> Thanks Haosdent, I've updated
>>> https://gist.github.com/briantopping/311960f8e5454dbe9aab with the
>>> output logs of what I am currently seeing. I've edited them for size; the
>>> message "INFO org.apache.hadoop.mapred.MesosScheduler: Unknown/exited
>>> TaskTracker: http://10.211.55.16:50060" appeared a few thousand times
>>> in the logs. The configuration I have is probably still broken; 50060 is a
>>> Jetty port that returns a Cloudera string when telnetting to it.
>>>
>>> The errors I saw below were apparently the result of building against the
>>> older version of CDH. When I updated the hadoop-mesos POM to match my
>>> deployment version, the incorrectly calculated "slots" problem in my
>>> previous message was resolved.
>>>
>>> My current problem is a Hadoop logging problem and nothing to do with
>>> Mesos, so I didn't post. I changed hadoop.log.dir=/var/log/hadoop in
>>> /etc/hadoop/conf.pseudo.mr1/log4j.properties, but it didn't make any
>>> difference. Just getting back into it now.
>>>
>>> On May 8, 2015, at 1:56 PM, haosdent <[email protected]> wrote:
>>>
>>> Could you post the logs from the executors that run the jobtracker and
>>> tasktrackers? It would be helpful to find the cause of this problem.
>>>
>>> On Fri, May 8, 2015 at 3:05 AM, Brian Topping <[email protected]> wrote:
>>>
>>>> I think there's something weird here:
>>>>
>>>> cpus: offered 2.0 needed at least 1.0
>>>> mem : offered 1724.0 needed at least 1024.0
>>>> disk: offered 44124.0 needed at least 1024.0
>>>> ports: at least 2 (sufficient)
>>>>
>>>> Am I misreading this? All of the requirements seem to be met.
>>>>
>>>> Presumably it's this code from o.a.h.mapred.ResourcePolicyVariable:
>>>>
>>>> int slots = mapSlotsMax + reduceSlotsMax;
>>>> slots = (int) Math.min(slots, (cpus - containerCpus) / slotCpus);
>>>> slots = (int) Math.min(slots, (mem - containerMem) / slotMem);
>>>> slots = (int) Math.min(slots, (disk - containerDisk) / slotDisk);
>>>>
>>>> // Is this offer too small for even the minimum slots?
>>>> if (slots < 1) {
>>>>   return false;
>>>> }
>>>>
>>>> Not exactly sure what this is doing.
>>>>
>>>> Sorry for the noise.
>>>>
>>>> On May 7, 2015, at 6:32 PM, Brian Topping <[email protected]> wrote:
>>>>
>>>> Presumably https://gist.github.com/briantopping/311960f8e5454dbe9aab has
>>>> some more information necessary at this point... sorry for the omission.
>>>>
>>>> On May 7, 2015, at 6:05 PM, Tom Arnfeld <[email protected]> wrote:
>>>>
>>>> Hi Brian,
>>>>
>>>> At this point you should see the TT attempting to be launched via
>>>> Mesos. The "launched but not heartbeat yet" count tells us that the
>>>> framework has accepted resources for 4 slots but the TT hasn't actually
>>>> come up yet.
>>>>
>>>> Do you see the task in your Mesos cluster UI, and is there anything
>>>> interesting in the task logs?
>>>>
>>>> --
>>>>
>>>> Tom Arnfeld
>>>> Developer // DueDil
>>>>
>>>> (+44) 7525940046
>>>> 25 Christopher Street, London, EC2A 2BS
>>>>
>>>> On Thu, May 7, 2015 at 12:01 PM, Brian Topping <[email protected]> wrote:
>>>>
>>>>> Thanks guys, this was helpful.
>>>>> I started the job tracker as a service,
>>>>> but apparently I never started the task tracker (or it failed to start and
>>>>> I didn't notice). I started it after Haosdent's message, but wasn't able to
>>>>> see any difference and I kept poking around.
>>>>>
>>>>> After making some changes and the VM wouldn't boot, my OCD got the
>>>>> better of me and I reinstalled everything from scratch. There are just too
>>>>> many moving parts to hassle you guys with an imperfect install on my end.
>>>>>
>>>>> This time through, I felt a lot more confident to use the Mesosphere
>>>>> RPMs, but I couldn't find the best way to get things launched.
>>>>> https://docs.mesosphere.com/reference/packages/ has a Last-Modified
>>>>> of Fri, 01 May 2015 18:46:10 GMT (one week ago), but the RHEL 6 RPMs don't
>>>>> have any init.d service descriptions as the packages page would indicate.
>>>>> For now, I just launched them manually, but would like to get the machine
>>>>> to completely load on boot as services.
>>>>>
>>>>> At this point, I have tested Mesos with:
>>>>>
>>>>> mesos-execute --master="localhost:5050" --name="test-exec" --command="sleep 10"
>>>>>
>>>>> The only problem there is it seems that "localhost" isn't good enough
>>>>> for my install, it needs to be the FQDN, but it works and the job flows
>>>>> through the UI.
>>>>>
>>>>> Now, back to a hadoop job. When I try the job now, the logs show the
>>>>> following stream of repeated messages:
>>>>>
>>>>> 2015-05-07 17:52:53,124 INFO org.apache.hadoop.mapred.ResourcePolicy: Satisfied map and reduce slots needed.
>>>>> 2015-05-07 17:52:53,340 INFO org.apache.hadoop.mapred.MesosScheduler: Unknown/exited TaskTracker: http://10.211.55.16:50060.
>>>>> [Repeated a few times a second for five seconds]
>>>>>
>>>>> 2015-05-07 17:49:08,914 INFO org.apache.hadoop.mapred.ResourcePolicy: JobTracker Status
>>>>>
>>>>> Pending Map Tasks: 4
>>>>> Pending Reduce Tasks: 1
>>>>> Running Map Tasks: 0
>>>>> Running Reduce Tasks: 0
>>>>> Idle Map Slots: 0
>>>>> Idle Reduce Slots: 0
>>>>> Inactive Map Slots: 4 (launched but no hearbeat yet)
>>>>> Inactive Reduce Slots: 1 (launched but no hearbeat yet)
>>>>> Needed Map Slots: 0
>>>>> Needed Reduce Slots: 0
>>>>> Unhealthy Trackers: 0
>>>>>
>>>>> This looks close.
>>>>>
>>>>> What's the best way to get a JDWP port set up to break in this code
>>>>> (i.e. learning to fish...)?
>>>>>
>>>>> best, Brian
>>>>>
>>>>> On May 7, 2015, at 12:11 PM, Adam Bordelon <[email protected]> wrote:
>>>>>
>>>>> From the mesos-master log and the JT log, it doesn't look like the
>>>>> MesosScheduler ever registered with Mesos, which should mean that it
>>>>> wouldn't start any TTs or map/reduce tasks. However, your `ps` output does
>>>>> seem to show a tasktracker running. Did you start that yourself (or
>>>>> automatically as a system service)?
>>>>>
>>>>> On Wed, May 6, 2015 at 9:32 AM, haosdent <[email protected]> wrote:
>>>>>
>>>>>> Did you start the tasktracker successfully?
>>>>>>
>>>>>> On Wed, May 6, 2015 at 11:32 PM, Brian Topping <[email protected]> wrote:
>>>>>>
>>>>>>> Hi all, I'm happy to report that I'm very close to
>>>>>>> getting 2.6.0-cdh5.4.0 integrated against Mesos 0.22.1 with the
>>>>>>> hadoop-mesos 0.10 code on GitHub. Hoping someone might have a few minutes
>>>>>>> to parse what I've got here and suggest something to try.
>>>>>>>
>>>>>>> https://gist.github.com/briantopping/0dfd0777ff4ce5a81219 hopefully
>>>>>>> has all the data necessary between the console output of the client run,
>>>>>>> the mesos master and slave console, the XML configuration of the JT and the
>>>>>>> output that was generated by it.
>>>>>>> Please let me know if I've left something out.
>>>>>>>
>>>>>>> I iterated a few times getting all the errors from missing paths or
>>>>>>> libraries sorted out, but the example client ultimately just sits waiting
>>>>>>> forever at "map 0% reduce 0%".
>>>>>>>
>>>>>>> Any input kindly appreciated!
>>>>>>>
>>>>>>> Brian
>>>>>>>
>>>>>>
>>>>>> --
>>>>>> Best Regards,
>>>>>> Haosdent Huang
>>>>>>
>>>
>>> --
>>> Best Regards,
>>> Haosdent Huang
>>>
>>
>> --
>> Best Regards,
>> Haosdent Huang
>>
>
> --
> Best Regards,
> Haosdent Huang
>

--
Best Regards,
Haosdent Huang
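
For reference, a minimal sketch of the HADOOP_LOG_DIR workaround suggested at the top of this message. It assumes the TaskTracker is started from an environment that inherits this variable, and that Hadoop creates its userlogs directory under the log directory (as the path in the exception above suggests). The writability check is only an illustration and is not part of Hadoop; run it as the user that runs the TaskTracker.

```shell
# Point Hadoop's log directory at /tmp for this session only (temporary workaround).
export HADOOP_LOG_DIR=/tmp

# Illustrative check (not a Hadoop command): can the current user create and
# write to the userlogs subdirectory that the failing job needed?
if mkdir -p "$HADOOP_LOG_DIR/userlogs" 2>/dev/null && [ -w "$HADOOP_LOG_DIR/userlogs" ]; then
  echo "writable: $HADOOP_LOG_DIR/userlogs"
else
  echo "not writable: $HADOOP_LOG_DIR/userlogs" >&2
fi
```

If the check reports "not writable" for the real log directory, the "Could not create job user log directory" error is most likely a plain filesystem permission problem rather than anything specific to Mesos.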

