Manoj,
Is it possible to find out (perhaps from logs or sar files) whether HDFS
was unavailable (say, a NameNode connection issue) around 2018-01-06 00:33,
the timestamp on the readlock file?

-rw-r--r--   3 xxx xxx         23 2018-01-06 00:33 hdfs://xxx/user/xxx/.slider/cluster/spas/readlock
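
In case it is useful, here is a rough sketch for pulling log lines from
around that time. It assumes the NameNode log uses the same
"yyyy-MM-dd HH:mm:ss,SSS" timestamp prefix as the Slider client output
quoted below. log_window is a made-up helper name, and the log path in the
usage comment is only a placeholder for wherever your NameNode logs live.

```shell
# log_window START END
# Reads log lines on stdin and prints those whose leading timestamp
# ("yyyy-MM-dd HH:mm:ss,SSS") falls within [START, END], inclusive.
# START and END are given as "yyyy-MM-dd HH:mm:ss".
# Hypothetical helper, not part of Hadoop or Slider.
log_window() {
  awk -v start="$1" -v end="$2" '
    {
      # Field 1 is the date, field 2 the time; drop the ",SSS" part.
      ts = $1 " " substr($2, 1, 8)
    }
    # Fixed-width timestamps sort correctly as plain strings.
    ts >= start && ts <= end
  '
}

# Usage (placeholder path, adjust to your cluster):
# log_window "2018-01-06 00:20:00" "2018-01-06 00:45:00" \
#   < /path/to/namenode.log | grep -E 'WARN|ERROR'
```

Lines that do not start with a timestamp (stack trace continuation lines,
for example) are dropped by this filter, so treat the output as a pointer
to where to look in the full log. For the sar files, sar itself can limit
the time range with -s and -e when reading a saved file with -f.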


-Gour

On 1/17/18, 1:05 PM, "Manoj Samel" <manojsamelt...@gmail.com> wrote:

>Hello,
>
>I am running Slider 0.80 on a CDH 5.5.1 cluster with Kerberos.
>
>The command
>slider upgrade <App> --template /xxx/appConfig.json --resources /xxx/resources.json --queue tenant --force
>failed with the following trace:
>
>2018-01-17 20:31:23,030 [main] INFO  tools.SliderUtils - JVM initialized into secure mode with kerberos realm BIGDATA
>2018-01-17 20:31:23,869 [main] INFO  client.ConfiguredRMFailoverProxyProvider - Failing over to rm2
>2018-01-17 20:31:24,325 [main] WARN  client.SliderClient - Failed to get a Lock on Builder working with spas at hdfs://xxx/user/xxx/.slider/cluster/spas : org.apache.slider.core.persist.LockAcquireFailedException: Failed to acquire lock hdfs://xxx/user/xxx/.slider/cluster/spas/readlock
>org.apache.slider.core.persist.LockAcquireFailedException: Failed to acquire lock hdfs://xxx/user/xxx/.slider/cluster/spas/readlock
>    at org.apache.slider.core.persist.ConfPersister.acquireWritelock(ConfPersister.java:141)
>    at org.apache.slider.core.persist.ConfPersister.save(ConfPersister.java:253)
>    at org.apache.slider.core.build.InstanceBuilder.persist(InstanceBuilder.java:270)
>    at org.apache.slider.client.SliderClient.persistInstanceDefinition(SliderClient.java:1836)
>    at org.apache.slider.client.SliderClient.buildInstanceDefinition(SliderClient.java:1734)
>    at org.apache.slider.client.SliderClient.actionUpgrade(SliderClient.java:802)
>    at org.apache.slider.client.SliderClient.exec(SliderClient.java:542)
>    at org.apache.slider.client.SliderClient.runService(SliderClient.java:424)
>    at org.apache.slider.core.main.ServiceLauncher.launchService(ServiceLauncher.java:188)
>    at org.apache.slider.core.main.ServiceLauncher.launchServiceRobustly(ServiceLauncher.java:475)
>    at org.apache.slider.core.main.ServiceLauncher.launchServiceAndExit(ServiceLauncher.java:403)
>    at org.apache.slider.core.main.ServiceLauncher.serviceMain(ServiceLauncher.java:630)
>    at org.apache.slider.Slider.main(Slider.java:49)
>2018-01-17 20:31:24,327 [main] ERROR main.ServiceLauncher - Failed to save spas: org.apache.slider.core.persist.LockAcquireFailedException: Failed to acquire lock hdfs://xxx/user/xxx/.slider/cluster/spas/readlock
>2018-01-17 20:31:24,328 [main] INFO  util.ExitUtil - Exiting with status 70
>
>An HDFS listing showed that a readlock file had been created a few days earlier:
>
>hdfs dfs -ls hdfs://xxx/user/xxx/.slider/cluster/spas
>...
>-rw-r--r--   3 xxx xxx         23 2018-01-06 00:33 hdfs://xxx/user/xxx/.slider/cluster/spas/readlock
>...
>
>After I deleted this file manually, the upgrade command worked.
>
>Any idea when this file is created and why it was not removed?
>
>Thanks in advance,
>
>Manoj
