Sure, Gour. I would like to test with RM HA enabled.

On Thu, Apr 30, 2015 at 12:12 AM, Gour Saha <[email protected]> wrote:
> Thanks. I updated the YARN bug with the OS info.
>
> I saw that RM HA is disabled. By the way there is a patch submitted by
> YARN for the RM HA issue -
> https://issues.apache.org/jira/browse/SLIDER-846
>
> As part of the YARN bug https://issues.apache.org/jira/browse/YARN-2605.
>
> If you want I can provide you a patch to test, if you are okay to get a
> jar from us.
>
> -Gour
>
> On 4/29/15, 11:18 AM, "Chackravarthy Esakkimuthu" <[email protected]> wrote:
>
> >OS installed is debian 7.
> >And as I was facing issue (components were not starting) with RM HA
> >enabled, I am testing it with RM HA disabled only. And yes, still NN HA is
> >still enabled in the cluster.
> >
> >On Wed, Apr 29, 2015 at 11:37 PM, Gour Saha <[email protected]> wrote:
> >
> >> Unfortunately we haven't reproduced this issue in the envs we usually test
> >> on. We might have to create an exact replica of your cluster (with RM HA,
> >> NN HA, OS version, # of nodes, etc.) to be able to reproduce it. The YARN
> >> team is looking into this issue.
> >>
> >> By the way, what is the OS and version of the nodes in your cluster?
> >>
> >> -Gour
> >>
> >> On 4/29/15, 10:49 AM, "Chackravarthy Esakkimuthu" <[email protected]> wrote:
> >>
> >> >sure Gour, Thanks for helping out.
> >> >Do you also see these kind of issues? Is it reproducible for you as well?
> >> >
> >> >On Wed, Apr 29, 2015 at 8:58 PM, Gour Saha <[email protected]> wrote:
> >> >
> >> >> Thanks Chackra for providing the Slider and NM logs and configs of the
> >> >> cluster. From the logs it seems like a YARN bug, so I went ahead and filed
> >> >> one. I will follow up with the YARN team to see what is causing this -
> >> >>
> >> >> https://issues.apache.org/jira/browse/YARN-3561
> >> >>
> >> >> -Gour
> >> >>
> >> >> On 4/28/15, 7:48 AM, "Gour Saha" <[email protected]> wrote:
> >> >>
> >> >> >Can you send us the complete-config dump?
> >> >> >
> >> >> >-Gour
> >> >> >
> >> >> >On 4/28/15, 2:45 AM, "Chackravarthy Esakkimuthu" <[email protected]> wrote:
> >> >> >
> >> >> >>yes this is the config taken by slider also.
> >> >> >>
> >> >> >>http://host2:8088/proxy/application_1428575950531_0016/ws/v1/slider/publisher/slider/complete-config
> >> >> >>
> >> >> >>yarn.nodemanager.sleep-delay-before-sigkill.ms: "250"
> >> >> >>
> >> >> >>its default value coming from yarn-default.
> >> >> >>We have not configured it in yarn-site.
> >> >> >>
> >> >> >>On Tue, Apr 28, 2015 at 3:03 PM, Chackravarthy Esakkimuthu <
> >> >> >>[email protected]> wrote:
> >> >> >>
> >> >> >>> Following is the config which I get from RM UI,
> >> >> >>>
> >> >> >>> http://host2:8088/conf
> >> >> >>>
> >> >> >>> <property>
> >> >> >>> <name>yarn.nodemanager.sleep-delay-before-sigkill.ms</name>
> >> >> >>> <value>250</value>
> >> >> >>> <source>yarn-default.xml</source>
> >> >> >>> </property>
> >> >> >>>
> >> >> >>> On Tue, Apr 28, 2015 at 2:50 PM, Steve Loughran <[email protected]> wrote:
> >> >> >>>
> >> >> >>>>
> >> >> >>>> > On 28 Apr 2015, at 10:07, Chackravarthy Esakkimuthu <[email protected]> wrote:
> >> >> >>>> >
> >> >> >>>> > sure, will send you the logs.
> >> >> >>>> >
> >> >> >>>> > And the same pattern follows for hbase installation also.
> >> >> >>>> > 'stop' command stops only SliderAM.
> >> >> >>>> > 'destroy' command stops HMaster and RegionServer only.. HBASE_REST and
> >> >> >>>> > THRIFT_2 still running after destroy command, And slider agents running in
> >> >> >>>> > all 4 hosts where container was launched.
> >> >> >>>> >
> >> >> >>>>
> >> >> >>>> do you have YARN set up to actually kill processes when the containers
> >> >> >>>> are released.?
> >> >> >>>>
> >> >> >>>> For example:
> >> >> >>>>
> >> >> >>>> <!--time before the process gets a -9 -->
> >> >> >>>> <property>
> >> >> >>>> <name>yarn.nodemanager.sleep-delay-before-sigkill.ms</name>
> >> >> >>>> <value>30000</value>
> >> >> >>>> </property>
> >> >> >>>>
