I think there's something weird here:
>   cpus: offered 2.0 needed at least 1.0
>   mem : offered 1724.0 needed at least 1024.0
>   disk: offered 44124.0 needed at least 1024.0
>   ports:  at least 2 (sufficient)

Am I misreading this? All of the requirements seem to be met.

Presumably it's this code from o.a.h.mapred.ResourcePolicyVariable:

> int slots = mapSlotsMax + reduceSlotsMax;
> slots = (int) Math.min(slots, (cpus - containerCpus) / slotCpus);
> slots = (int) Math.min(slots, (mem - containerMem) / slotMem);
> slots = (int) Math.min(slots, (disk - containerDisk) / slotDisk);
> 
> // Is this offer too small for even the minimum slots?
> if (slots < 1) {
>   return false;
> }

Not exactly sure what this is doing.
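
Here is a minimal sketch of how that arithmetic could reject an offer that looks sufficient. The offer values come from the log above; everything else is an assumption for illustration, not read from my configuration: a cap of 5 slots, a TaskTracker container overhead of 1 CPU / 1024 MB memory / 1024 MB disk, and a per-slot cost of the same size. The class and method names are hypothetical too. Under those assumptions, once the container overhead is subtracted only 700 MB of memory remains, which is less than one slot, so the cast to int truncates to 0:

```java
public class SlotEstimate {
    // Mirrors the arithmetic of the quoted snippet. All values passed in
    // from main() other than the offer itself are assumptions.
    public static int slotsFor(int maxSlots,
                               double cpus, double mem, double disk,
                               double containerCpus, double containerMem, double containerDisk,
                               double slotCpus, double slotMem, double slotDisk) {
        int slots = maxSlots;
        // Each step subtracts the container overhead, divides by the per-slot
        // cost, and truncates toward zero when cast back to int.
        slots = (int) Math.min(slots, (cpus - containerCpus) / slotCpus);
        slots = (int) Math.min(slots, (mem - containerMem) / slotMem);
        slots = (int) Math.min(slots, (disk - containerDisk) / slotDisk);
        return slots;
    }

    public static void main(String[] args) {
        // Offer values taken from the log output above.
        double cpus = 2.0, mem = 1724.0, disk = 44124.0;

        // Assumed (hypothetical) slot cap, container overhead and per-slot cost.
        int slots = slotsFor(5, cpus, mem, disk,
                             1.0, 1024.0, 1024.0,
                             1.0, 1024.0, 1024.0);

        // (1724 - 1024) / 1024 is about 0.68, which truncates to 0.
        System.out.println("slots = " + slots);
    }
}
```

If the real settings resemble these, the offer would need at least 2048 MB of memory (container plus one slot) before `slots` reaches 1.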

Sorry for the noise.

> 
> On May 7, 2015, at 6:32 PM, Brian Topping wrote:
> 
> Presumably https://gist.github.com/briantopping/311960f8e5454dbe9aab has some more 
> information necessary at this point... sorry for the omission.
> 
>> On May 7, 2015, at 6:05 PM, Tom Arnfeld wrote:
>> 
>> Hi Brian,
>> 
>> At this point you should see the TT attempting to be launched via Mesos. The 
>> "launched but not heartbeat yet" count tells us that the framework has 
>> accepted resources for 4 slots but the TT hasn't actually come up yet.
>> 
>> Do you see the task in your Mesos cluster UI, and is there anything 
>> interesting in the task logs?
>> 
>> --
>> 
>> Tom Arnfeld
>> Developer // DueDil
>> 
>> (+44) 7525940046
>> 25 Christopher Street, London, EC2A 2BS
>> 
>> 
>> On Thu, May 7, 2015 at 12:01 PM, Brian Topping wrote:
>> 
>> Thanks guys, this was helpful. I started the job tracker as a service, but 
>> apparently I never started the task tracker (or it failed to start and I 
>> didn't notice). I started it after Haosdent's message, but wasn't able to 
>> see any difference and I kept poking around.
>> 
>> After I made some changes, the VM wouldn't boot, so my OCD got the better of 
>> me and I reinstalled everything from scratch. There are just too many moving 
>> parts to hassle you guys with an imperfect install on my end.
>> 
>> This time through, I felt a lot more confident using the Mesosphere RPMs, 
>> but I couldn't find the best way to get things launched. 
>> https://docs.mesosphere.com/reference/packages/ has a Last-Modified of 
>> Fri, 01 May 2015 18:46:10 GMT (one week ago), but the RHEL 6 RPMs don't have 
>> any init.d service descriptions as the packages page would indicate. For 
>> now, I just launched them manually, but I would like to get the machine to 
>> completely load on boot as services.
>> 
>> At this point, I have tested Mesos with:
>> 
>>      mesos-execute --master="localhost:5050" --name="test-exec" --command="sleep 10"
>> 
>> The only problem there is that "localhost" doesn't seem to be good enough for 
>> my install; the master needs to be given as the FQDN. With that, it works and 
>> the job flows through the UI.
>> 
>> Now, back to a hadoop job. When I try the job now, the logs show the 
>> following stream of repeated messages:
>> 
>>> 2015-05-07 17:52:53,124 INFO org.apache.hadoop.mapred.ResourcePolicy: Satisfied map and reduce slots needed.
>>> 2015-05-07 17:52:53,340 INFO org.apache.hadoop.mapred.MesosScheduler: Unknown/exited TaskTracker: http://10.211.55.16:50060.
>>> [Repeated a few times a second for five seconds]
>>> 2015-05-07 17:49:08,914 INFO org.apache.hadoop.mapred.ResourcePolicy: JobTracker Status
>>>       Pending Map Tasks: 4
>>>    Pending Reduce Tasks: 1
>>>       Running Map Tasks: 0
>>>    Running Reduce Tasks: 0
>>>          Idle Map Slots: 0
>>>       Idle Reduce Slots: 0
>>>      Inactive Map Slots: 4 (launched but no hearbeat yet)
>>>   Inactive Reduce Slots: 1 (launched but no hearbeat yet)
>>>        Needed Map Slots: 0
>>>     Needed Reduce Slots: 0
>>>      Unhealthy Trackers: 0
>> 
>> This looks close.
>> 
>> What's the best way to get a JDWP port set up to break in this code (i.e. 
>> learning to fish...)?
>> 
>> best, Brian
>> 
>> 
>>> On May 7, 2015, at 12:11 PM, Adam Bordelon wrote:
>>> 
>>> From the mesos-master log and the JT log, it doesn't look like the 
>>> MesosScheduler ever registered with Mesos, which should mean that it 
>>> wouldn't start any TTs or map/reduce tasks. However, your `ps` output does 
>>> seem to show a tasktracker running. Did you start that yourself (or 
>>> automatically as a system service)?
>>> 
>>> On Wed, May 6, 2015 at 9:32 AM, haosdent wrote:
>>> Did you start the tasktracker successfully?
>>> 
>>> On Wed, May 6, 2015 at 11:32 PM, Brian Topping wrote:
>>> Hi all, I'm happy to report that I'm very close to getting 2.6.0-cdh5.4.0 
>>> integrated against Mesos 0.22.1 with the hadoop-mesos 0.10 code on Github. 
>>> Hoping someone might have a few minutes to parse what I've got here and 
>>> suggest something to try.
>>> 
>>> https://gist.github.com/briantopping/0dfd0777ff4ce5a81219 hopefully has 
>>> all the data necessary between the console output of the client run, the 
>>> mesos master and slave console, the XML configuration of the JT and the 
>>> output that was generated by it. Please let me know if I've left something 
>>> out.
>>> 
>>> I iterated a few times getting all the errors from missing paths or 
>>> libraries sorted out, but the example client ultimately just sits waiting 
>>> forever at "map 0% reduce 0%".
>>> 
>>> Any input kindly appreciated!
>>> 
>>> Brian
>>> 
>>> 
>>> 
>>> --
>>> Best Regards,
>>> Haosdent Huang
>>> 
>> 
>> 
> 
