Thanks. I updated the YARN bug with the OS info.

I saw that RM HA is disabled. By the way, a patch has been submitted by
YARN for the RM HA issue:
https://issues.apache.org/jira/browse/SLIDER-846

The patch was submitted as part of the YARN bug
https://issues.apache.org/jira/browse/YARN-2605.

If you want, I can provide a patch for you to test, provided you are okay
with getting a jar from us.

-Gour

On 4/29/15, 11:18 AM, "Chackravarthy Esakkimuthu" <[email protected]>
wrote:

>The OS installed is Debian 7.
>And as I was facing an issue (components were not starting) with RM HA
>enabled, I am testing with RM HA disabled only. And yes, NN HA is still
>enabled in the cluster.
>
>On Wed, Apr 29, 2015 at 11:37 PM, Gour Saha <[email protected]> wrote:
>
>> Unfortunately we haven't reproduced this issue in the envs we usually
>> test on. We might have to create an exact replica of your cluster (with
>> RM HA, NN HA, OS version, # of nodes, etc.) to be able to reproduce it.
>> The YARN team is looking into this issue.
>>
>> By the way, what is the OS and version of the nodes in your cluster?
>>
>> -Gour
>>
>> On 4/29/15, 10:49 AM, "Chackravarthy Esakkimuthu"
>><[email protected]>
>> wrote:
>>
>> >Sure Gour, thanks for helping out.
>> >Do you also see these kinds of issues? Is it reproducible for you as well?
>> >
>> >On Wed, Apr 29, 2015 at 8:58 PM, Gour Saha <[email protected]>
>>wrote:
>> >
>> >> Thanks Chackra for providing the Slider and NM logs and configs of the
>> >> cluster. From the logs it seems like a YARN bug, so I went ahead and
>> >> filed one. I will follow up with the YARN team to see what is causing
>> >> this:
>> >>
>> >> https://issues.apache.org/jira/browse/YARN-3561
>> >>
>> >>
>> >> -Gour
>> >>
>> >> On 4/28/15, 7:48 AM, "Gour Saha" <[email protected]> wrote:
>> >>
>> >> >Can you send us the complete-config dump?
>> >> >
>> >> >-Gour
>> >> >
>> >> >On 4/28/15, 2:45 AM, "Chackravarthy Esakkimuthu"
>> >><[email protected]>
>> >> >wrote:
>> >> >
>> >> >>yes this is the config taken by slider also.
>> >> >>
>> >> >>
>> >> >>http://host2:8088/proxy/application_1428575950531_0016/ws/v1/slider/publisher/slider/complete-config
>> >> >>
>> >> >>yarn.nodemanager.sleep-delay-before-sigkill.ms: "250"
>> >> >>
>> >> >>It is the default value coming from yarn-default.
>> >> >>We have not configured it in yarn-site.
>> >> >>
>> >> >>On Tue, Apr 28, 2015 at 3:03 PM, Chackravarthy Esakkimuthu <
>> >> >>[email protected]> wrote:
>> >> >>
>> >> >>> Following is the config which I get from RM UI,
>> >> >>>
>> >> >>> http://host2:8088/conf
>> >> >>>
>> >> >>> <property>
>> >> >>> <name>yarn.nodemanager.sleep-delay-before-sigkill.ms</name>
>> >> >>> <value>250</value>
>> >> >>> <source>yarn-default.xml</source>
>> >> >>> </property>
>> >> >>>
>> >> >>> On Tue, Apr 28, 2015 at 2:50 PM, Steve Loughran
>> >> >>><[email protected]>
>> >> >>> wrote:
>> >> >>>
>> >> >>>>
>> >> >>>> > On 28 Apr 2015, at 10:07, Chackravarthy Esakkimuthu <
>> >> >>>> [email protected]> wrote:
>> >> >>>> >
>> >> >>>> > sure, will send you the logs.
>> >> >>>> >
>> >> >>>> > And the same pattern follows for the HBase installation also.
>> >> >>>> > The 'stop' command stops only the SliderAM.
>> >> >>>> > The 'destroy' command stops only HMaster and RegionServer.
>> >> >>>> > HBASE_REST and THRIFT_2 are still running after the destroy
>> >> >>>> > command, and slider agents are running on all 4 hosts where
>> >> >>>> > a container was launched.
>> >> >>>> >
>> >> >>>>
>> >> >>>>
>> >> >>>>
>> >> >>>> Do you have YARN set up to actually kill processes when the
>> >> >>>> containers are released?
>> >> >>>>
>> >> >>>> For example:
>> >> >>>>
>> >> >>>> <!--time before the process gets a -9 -->
>> >> >>>> <property>
>> >> >>>>   <name>yarn.nodemanager.sleep-delay-before-sigkill.ms</name>
>> >> >>>>   <value>30000</value>
>> >> >>>> </property>
>> >> >>>>
>> >> >>>
>> >> >>>
>> >> >
>> >>
>> >>
>>
>>
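
The thread checks where the effective value of
yarn.nodemanager.sleep-delay-before-sigkill.ms comes from by reading the
RM's /conf output. A minimal sketch of doing that check in Python is below.
It assumes the /conf output is XML with <property> entries shaped like the
one quoted in the thread; the <configuration> wrapper element, the
SAMPLE_CONF string, and the find_property helper are illustrative
assumptions, not part of Slider or YARN.

```python
import xml.etree.ElementTree as ET

# Sample shaped like the /conf output quoted in the thread.
# The <configuration> wrapper element is an assumption.
SAMPLE_CONF = """<configuration>
<property>
<name>yarn.nodemanager.sleep-delay-before-sigkill.ms</name>
<value>250</value>
<source>yarn-default.xml</source>
</property>
</configuration>"""


def find_property(conf_xml, name):
    """Return (value, source) for the named property, or None if absent.

    Hypothetical helper: it only parses an XML string and does not
    contact the ResourceManager.
    """
    root = ET.fromstring(conf_xml)
    for prop in root.iter("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value"), prop.findtext("source")
    return None


if __name__ == "__main__":
    key = "yarn.nodemanager.sleep-delay-before-sigkill.ms"
    print(find_property(SAMPLE_CONF, key))
```

A source of yarn-default.xml, as in the thread, would indicate that the
value has not been overridden in yarn-site.xml.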
