Re: Slider upgrade command error : org.apache.slider.core.persist.LockAcquireFailedException: Failed to acquire lock

2018-01-17 Thread Gour Saha
You are probably right that this file is not deleted on a stop. I don't have a cluster with Slider to quickly test this. YARN Service keeps us busy you know :) But if you can replicate this you should file a jira. -Gour On 1/17/18, 5:02 PM, "Manoj Samel" wrote:

Re: Slider upgrade command error : org.apache.slider.core.persist.LockAcquireFailedException: Failed to acquire lock

2018-01-17 Thread Manoj Samel
Gour, Thanks for the prompt reply. 1. A temporary hiccup in HDFS as a possible cause has been on my mind as well; I wanted to reach out to the Slider community to check whether other issues could cause this symptom. 2. I remember I had stopped and started the Slider app after this timestamp.

Re: Slider upgrade command error : org.apache.slider.core.persist.LockAcquireFailedException: Failed to acquire lock

2018-01-17 Thread Gour Saha
Manoj, By any chance is it possible to find out (maybe from logs or sar files) if there was HDFS unavailability (say NN node connection issue) around the time of 2018-01-06 00:33 (based on the readlock file timestamp)? -rw-r--r-- 3 xxx xxx 23 2018-01-06 00:33
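One way to narrow that search is to turn the readlock file's listing line into a time window and then look at the NameNode logs or sar data inside it. The following is only a minimal sketch: the function name `lock_time_window` and the 15-minute margin are assumptions made for illustration, and the field positions assume the usual `hdfs dfs -ls` column order (permissions, replication, owner, group, size, date, time).

```python
from datetime import datetime, timedelta


def lock_time_window(ls_line, minutes=15):
    """Return (start, end) datetimes around the modification time
    found in an `hdfs dfs -ls` style listing line.

    `minutes` is an arbitrary margin chosen for illustration.
    """
    fields = ls_line.split()
    # Columns 5 and 6 (0-based) hold the date and the time of the entry.
    stamp = datetime.strptime(fields[5] + " " + fields[6], "%Y-%m-%d %H:%M")
    margin = timedelta(minutes=minutes)
    return stamp - margin, stamp + margin


# Listing line as quoted in the thread (the path was not included there).
start, end = lock_time_window("-rw-r--r-- 3 xxx xxx 23 2018-01-06 00:33")
print("Check HDFS logs between", start, "and", end)
```

The resulting window can then be used to filter log lines by timestamp; how wide it should be depends on how long an outage is suspected.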

Slider upgrade command error : org.apache.slider.core.persist.LockAcquireFailedException: Failed to acquire lock

2018-01-17 Thread Manoj Samel
Hello, I am running Slider version 0.80 on a CDH 5.5.1 cluster with Kerberos. The command Slider upgrade --template /xxx/appConfig.json --resources /xxx/resources.json --queue tenant --force failed with the following trace: 2018-01-17 20:31:23,030 [main] INFO tools.SliderUtils - JVM initialized into secure mode with kerberos