I think you could temporarily export HADOOP_LOG_DIR=/tmp and try again.
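
Before rerunning the job, it may help to confirm that the Hadoop user can actually create a subdirectory under the log directory, since that is the step that fails in the exception quoted below. Here is a minimal sketch. The class name LogDirCheck and the fallback to /tmp are only for illustration, and it must be run as the same user the TaskTracker runs as:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Illustrative helper, not part of Hadoop or hadoop-mesos.
final class LogDirCheck {

    /** Returns true if a subdirectory can be created (and removed) under dir. */
    static boolean canCreateSubdir(Path dir) {
        if (!Files.isDirectory(dir)) {
            return false;
        }
        try {
            // Create a throwaway directory, then clean it up again.
            Path probe = Files.createTempDirectory(dir, "userlogs-probe");
            Files.delete(probe);
            return true;
        } catch (IOException | SecurityException e) {
            // Typically an AccessDeniedException when permissions are missing.
            return false;
        }
    }

    public static void main(String[] args) {
        // Directory to test: first argument, else HADOOP_LOG_DIR, else /tmp.
        String fromEnv = System.getenv("HADOOP_LOG_DIR");
        String chosen = args.length > 0 ? args[0] : (fromEnv != null ? fromEnv : "/tmp");
        Path dir = Paths.get(chosen);
        System.out.println(dir + " usable by " + System.getProperty("user.name")
                + ": " + canCreateSubdir(dir));
    }
}
```

If this reports false for /usr/lib/hadoop/logs but true for /tmp, the problem is directory ownership or permissions rather than anything in Mesos.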

On Fri, May 8, 2015 at 3:43 PM, Brian Topping <[email protected]>
wrote:

> Mesos runs as root; Hadoop runs as a separate user.
>
> On May 8, 2015, at 2:41 PM, haosdent <[email protected]> wrote:
>
> Do you run everything as root?
>
> On Fri, May 8, 2015 at 3:38 PM, haosdent <[email protected]> wrote:
>
>> It seems you don't have permission for this directory:
>>
>> java.io.IOException: Could not create job user log directory: 
>> file:/usr/lib/hadoop/logs/userlogs/job_201505080220_0001
>>
>>      at javax.security.auth.Subject.doAs(Subject.java:415)
>>      at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>>
>> On Fri, May 8, 2015 at 3:32 PM, Brian Topping <[email protected]>
>> wrote:
>>
>>> Thanks Haosdent, I've updated
>>> https://gist.github.com/briantopping/311960f8e5454dbe9aab with the
>>> output logs of what I am currently seeing. I've edited them for size; the
>>> message "INFO org.apache.hadoop.mapred.MesosScheduler: Unknown/exited
>>> TaskTracker: http://10.211.55.16:50060" appeared a few thousand times
>>> in the logs. The configuration I have is probably still broken; 50060 is a
>>> Jetty port that returns a Cloudera string when I telnet to it.
>>>
>>> The errors I saw below were apparently the result of building against the
>>> older version of CDH. When I updated the hadoop-mesos POM to match my
>>> deployment version, the incorrectly calculated "slots" problem in my
>>> previous message was resolved.
>>>
>>> My current problem is a Hadoop logging problem and has nothing to do with
>>> Mesos, so I didn't post. I set hadoop.log.dir=/var/log/hadoop in
>>> /etc/hadoop/conf.pseudo.mr1/log4j.properties, but it didn't make any
>>> difference. I'm just getting back into it now.
>>>
>>> On May 8, 2015, at 1:56 PM, haosdent <[email protected]> wrote:
>>>
>>> Could you post the logs from the executors that run the JobTracker and
>>> TaskTrackers? They would help to find the cause of this problem.
>>>
>>> On Fri, May 8, 2015 at 3:05 AM, Brian Topping <[email protected]>
>>> wrote:
>>>
>>>> I think there's something weird here:
>>>>
>>>>   cpus: offered 2.0 needed at least 1.0
>>>>   mem : offered 1724.0 needed at least 1024.0
>>>>   disk: offered 44124.0 needed at least 1024.0
>>>>   ports:  at least 2 (sufficient)
>>>>
>>>>
>>>> Am I misreading this? All of the requirements seem to be met.
>>>>
>>>> Presumably it's this code from o.a.h.mapred.ResourcePolicyVariable:
>>>>
>>>> int slots = mapSlotsMax + reduceSlotsMax;
>>>> slots = (int) Math.min(slots, (cpus - containerCpus) / slotCpus);
>>>> slots = (int) Math.min(slots, (mem - containerMem) / slotMem);
>>>> slots = (int) Math.min(slots, (disk - containerDisk) / slotDisk);
>>>>
>>>>
>>>> // Is this offer too small for even the minimum slots?
>>>> if (slots < 1) {
>>>>   return false;
>>>> }
>>>>
>>>>
>>>> Not exactly sure what this is doing.
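
One possible reading of the arithmetic quoted above: the TaskTracker container's own overhead is subtracted from the offer first, and only the remainder is divided into slots. The sketch below plugs the offered values into that formula. The overhead and per-slot sizes (1 CPU, 1024 MB of memory, 1024 MB of disk each) and the ceiling of 5 slots are assumptions for illustration, not values read from this cluster's configuration:

```java
// Worked example of the slot arithmetic quoted above.
// All "container*" and "slot*" values passed in main() are assumed,
// not taken from a real hadoop-mesos configuration.
final class SlotEstimate {

    /** Mirrors the quoted arithmetic: the scarcest resource caps the slot count. */
    static int slots(int maxSlots,
                     double cpus, double mem, double disk,
                     double containerCpus, double containerMem, double containerDisk,
                     double slotCpus, double slotMem, double slotDisk) {
        int slots = maxSlots;
        slots = (int) Math.min(slots, (cpus - containerCpus) / slotCpus);
        slots = (int) Math.min(slots, (mem - containerMem) / slotMem);
        slots = (int) Math.min(slots, (disk - containerDisk) / slotDisk);
        return slots;
    }

    public static void main(String[] args) {
        // Offer from the log: cpus 2.0, mem 1724.0, disk 44124.0.
        int n = slots(5,
                2.0, 1724.0, 44124.0,   // offered resources
                1.0, 1024.0, 1024.0,    // assumed container overhead
                1.0, 1024.0, 1024.0);   // assumed per-slot requirement
        System.out.println("slots = " + n); // prints "slots = 0"
    }
}
```

Under these assumed numbers the memory term is (1724 - 1024) / 1024, which is about 0.68 and truncates to 0, so the offer would be rejected even though each raw figure is above the "needed at least" value. Whether that matches the real settings depends on the configured overhead and slot sizes.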
>>>>
>>>> Sorry for the noise.
>>>>
>>>>
>>>> On May 7, 2015, at 6:32 PM, Brian Topping <[email protected]>
>>>> wrote:
>>>>
>>>> Presumably https://gist.github.com/briantopping/311960f8e5454dbe9aab has
>>>> some more information necessary at this point... sorry for the omission..
>>>>
>>>> On May 7, 2015, at 6:05 PM, Tom Arnfeld <[email protected]> wrote:
>>>>
>>>> Hi Brian,
>>>>
>>>> At this point you should see the TT attempting to be launched via
>>>> Mesos. The "launched but not heartbeat yet" count tells us that the
>>>> framework has accepted resources for 4 slots but the TT hasn't actually
>>>> come up yet.
>>>>
>>>> Do you see the task in your Mesos cluster UI, and is there anything
>>>> interesting in the task logs?
>>>>
>>>> --
>>>>
>>>> Tom Arnfeld
>>>> Developer // DueDil
>>>>
>>>> (+44) 7525940046
>>>> 25 Christopher Street, London, EC2A 2BS
>>>>
>>>>
>>>> On Thu, May 7, 2015 at 12:01 PM, Brian Topping <[email protected]>
>>>> wrote:
>>>>
>>>>> Thanks guys, this was helpful. I started the job tracker as a service,
>>>>> but apparently I never started the task tracker (or it failed to start and
>>>>> I didn't notice). I started it after Haosdent's message, but wasn't able
>>>>> to see any difference and I kept poking around.
>>>>>
>>>>> After I made some changes, the VM wouldn't boot, so my OCD got the
>>>>> better of me and I reinstalled everything from scratch. There are just too
>>>>> many moving parts to hassle you guys with an imperfect install on my end.
>>>>>
>>>>> This time through, I felt a lot more confident to use the Mesosphere
>>>>> RPMs, but I couldn't find the best way to get things launched.
>>>>> https://docs.mesosphere.com/reference/packages/ has a Last-Modified
>>>>> of Fri, 01 May 2015 18:46:10 GMT (one week ago), but the RHEL 6 RPMs don't
>>>>> have any init.d service descriptions as the packages page would indicate.
>>>>> For now, I just launched them manually, but would like to get the machine
>>>>> to completely load on boot as services.
>>>>>
>>>>> At this point, I have tested Mesos with:
>>>>>
>>>>>  mesos-execute --master="localhost:5050" --name="test-exec" --command="sleep 10"
>>>>>
>>>>> The only problem is that "localhost" doesn't seem to be good enough
>>>>> for my install; it needs to be the FQDN. With that, it works and the job
>>>>> flows through the UI.
>>>>>
>>>>> Now, back to a hadoop job. When I try the job now, the logs show the
>>>>> following stream of repeated messages:
>>>>>
>>>>> 2015-05-07 17:52:53,124 INFO org.apache.hadoop.mapred.ResourcePolicy: Satisfied map and reduce slots needed.
>>>>> 2015-05-07 17:52:53,340 INFO org.apache.hadoop.mapred.MesosScheduler: Unknown/exited TaskTracker: http://10.211.55.16:50060.
>>>>> [Repeated a few times a second for five seconds]
>>>>>
>>>>> 2015-05-07 17:49:08,914 INFO org.apache.hadoop.mapred.ResourcePolicy: JobTracker Status
>>>>>
>>>>>       Pending Map Tasks: 4
>>>>>    Pending Reduce Tasks: 1
>>>>>       Running Map Tasks: 0
>>>>>    Running Reduce Tasks: 0
>>>>>          Idle Map Slots: 0
>>>>>       Idle Reduce Slots: 0
>>>>>      Inactive Map Slots: 4 (launched but no hearbeat yet)
>>>>>   Inactive Reduce Slots: 1 (launched but no hearbeat yet)
>>>>>        Needed Map Slots: 0
>>>>>     Needed Reduce Slots: 0
>>>>>      Unhealthy Trackers: 0
>>>>>
>>>>>
>>>>> This looks close.
>>>>>
>>>>> What's the best way to get a JDWP port set up to break in this code
>>>>> (i.e. learning to fish...)?
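
On the JDWP question: the standard JVM mechanism is the `-agentlib:jdwp` option, which would have to be added to the options of the JVM you want to break into (for the JobTracker, possibly through HADOOP_OPTS in hadoop-env.sh, which is an assumption about this install). The sketch below builds such an option and checks whether the current JVM was started with one. The class name JdwpCheck and port 5005 are arbitrary choices for illustration:

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Illustrative helper, not part of Hadoop or hadoop-mesos.
final class JdwpCheck {

    /** Builds a JDWP agent option that listens on the given port without pausing startup. */
    static String agentOption(int port) {
        return "-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=" + port;
    }

    /** True if any of the given JVM arguments enables the JDWP agent. */
    static boolean hasJdwp(List<String> jvmArgs) {
        for (String arg : jvmArgs) {
            // -Xrunjdwp is the older spelling of the same agent.
            if (arg.startsWith("-agentlib:jdwp") || arg.startsWith("-Xrunjdwp")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("suggested option: " + agentOption(5005));
        System.out.println("JDWP enabled in this JVM: " + hasJdwp(jvmArgs));
    }
}
```

With `suspend=n` the process starts normally and a debugger can attach to the port later; `suspend=y` would make the JVM wait for the debugger, which is useful for breaking in startup code.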
>>>>>
>>>>> best, Brian
>>>>>
>>>>>
>>>>>  On May 7, 2015, at 12:11 PM, Adam Bordelon <[email protected]>
>>>>> wrote:
>>>>>
>>>>> From the mesos-master log and the JT log, it doesn't look like the
>>>>> MesosScheduler ever registered with Mesos, which should mean that it
>>>>> wouldn't start any TTs or map/reduce tasks. However, your `ps` output does
>>>>> seem to show a tasktracker running. Did you start that yourself (or
>>>>> automatically as a system service)?
>>>>>
>>>>> On Wed, May 6, 2015 at 9:32 AM, haosdent <[email protected]> wrote:
>>>>>
>>>>>> Did you start the TaskTracker successfully?
>>>>>>
>>>>>> On Wed, May 6, 2015 at 11:32 PM, Brian Topping <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>> Hi all, I'm happy to report that I'm very close to
>>>>>>> getting 2.6.0-cdh5.4.0 integrated against Mesos 0.22.1 with the
>>>>>>> hadoop-mesos 0.10 code on Github. Hoping someone might have a few
>>>>>>> minutes to parse what I've got here and suggest something to try.
>>>>>>>
>>>>>>> https://gist.github.com/briantopping/0dfd0777ff4ce5a81219 hopefully
>>>>>>> has all the data necessary between the console output of the client run,
>>>>>>> the mesos master and slave console, the XML configuration of the JT and
>>>>>>> the output that was generated by it. Please let me know if I've left
>>>>>>> something out.
>>>>>>>
>>>>>>> I iterated a few times getting all the errors from missing paths or
>>>>>>> libraries sorted out, but the example client ultimately just sits
>>>>>>> waiting forever at "map 0% reduce 0%".
>>>>>>>
>>>>>>> Any input kindly appreciated!
>>>>>>>
>>>>>>> Brian
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>  --
>>>>>> Best Regards,
>>>>>> Haosdent Huang
>>>>>>
>>>>>
>>>>>


-- 
Best Regards,
Haosdent Huang
