Re: Container recovery on working on CDH with yarn.component.placement.policy=1

Gour Saha Wed, 20 May 2015 13:19:21 -0700

Thomas,
resources.json looks OK. Can you send me the following logs so that I can
look further into it:


- Slider AM log
- Slider agent log (for the container that was killed)
- RM log
- NM log from the node where Slider agent (that was killed) was running

-Gour

On 5/19/15, 8:43 AM, "Thomas Weise" <[email protected]> wrote:

>All resources are freed up. The AM requests the replacement container and
>nothing happens after that. Please see:
>
>https://www.dropbox.com/sh/8ub0jedh60cgys4/AACPftofPcdhD5Sb2XADRMTga?dl=0
>
>resources.json
>
>{
>  "schema" : "http://example.org/specification/v2.0.0",
>  "metadata" : {
>  },
>  "global" : {
>    "yarn.container.failure.threshold":"10",
>    "yarn.container.failure.window.hours":"1"
>  },
>  "components" : {
>    "broker" : {
>      "yarn.role.priority" : "1",
>      "yarn.component.instances" : "3",
>      "yarn.memory" : "768",
>      "yarn.vcores" : "1",
>      "yarn.component.placement.policy":"1"
>    },
>    "slider-appmaster" : {
>    }
>  }
>}
>
>
>On Wed, May 13, 2015 at 5:03 PM, Gour Saha <[email protected]> wrote:
>
>> Can you check the resources (memory, cpu) available in the host, after
>> killing the container? Is it freed? Can you hit the RM UI and share what
>> you see in the "Cluster Metrics" table for that node?
>>
>> Also, if possible please share your resources.json.
>>
>> -Gour
>>
>> On 5/12/15, 9:34 AM, "Thomas Weise" <[email protected]> wrote:
>>
>> >We are testing KOYA on CDH 5.4. We see that after killing the container
>> >Slider as expected will ask for the same host. The request is never filled
>> >and the container cannot be redeployed. We see this behavior on CDH with
>> >DataTorrent also, it looks like a CDH bug.
>> >
>> >Is anyone else trying to run Slider on CDH and seeing the same behavior?
>> >Any insight on whether that is a CDH configuration issue or a fair
>> >scheduler bug?
>> >
>> >Thanks,
>> >Thomas
>>
>>
