[ 
https://issues.apache.org/jira/browse/SLIDER-599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14193556#comment-14193556
 ] 

Sumit Mohanty commented on SLIDER-599:
--------------------------------------

Another user reported the same issue while using Slider App View, logged in as 
the hdfs user.

*Could this be the reason:*

When you create an application (HBase) as user hdfs, the agents are not able to 
detect that the HBase components are running. 
_The pid file name the agent expects differs from the one hbase-daemon.sh 
uses_.  That is a separate JIRA.

So if the HBase master is running while the app is being destroyed, is it 
possible that the "data" directory will not get deleted? I do not know for sure 
why the second call would succeed. One possibility is that the HBase daemons 
stop once they detect that the ZK node is no longer present, and the first 
destroy deletes the ZK node.

The reason I am bringing this up is that, while testing as hdfs, I ran into a 
situation where "slider list" failed to list an application. While "slider list" 
detected no app, "slider create" failed, saying the application was already 
present. "slider list" did not detect an app because internal.json was not 
present and the only sub-dir under the cluster folder was "data". I did not make 
the connection at that point, but it seems relevant in light of this bug.
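
To make that state concrete, here is a rough sketch of how the two commands 
could disagree. It is a hypothetical model only, not Slider's actual code: the 
function names are made up, and it assumes "slider list" keys off internal.json 
while "slider create" rejects any existing instance directory.

```python
# Hypothetical model of the state described above; NOT Slider's implementation.
INTERNAL_JSON = "internal.json"  # file whose absence hid the app from "slider list"


def classify_instance_dir(children):
    """Classify an app instance dir from the names of its direct children.

    `children` is None when the instance directory does not exist at all.
    """
    if children is None:
        return "absent"    # nothing to list, create can proceed
    if INTERNAL_JSON in children:
        return "listed"    # list sees the app, create refuses
    return "orphaned"      # list sees nothing, yet create still refuses


def create_would_fail(children):
    # Assumption: create rejects any pre-existing instance directory.
    return children is not None


# The case from this comment: only "data" is left under the instance dir.
leftover = ["data"]
print(classify_instance_dir(leftover), create_would_fail(leftover))
```

In this model, a directory holding only "data" is invisible to list but still 
blocks create, which matches what I saw.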

> When application is created as user hdfs need to call destroy twice to delete 
> the hdfs folder for the app
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: SLIDER-599
>                 URL: https://issues.apache.org/jira/browse/SLIDER-599
>             Project: Slider
>          Issue Type: Bug
>          Components: client
>    Affects Versions: Slider 0.50
>            Reporter: Sumit Mohanty
>            Assignee: Steve Loughran
>             Fix For: Slider 0.60
>
>
> It was also reported by another user. This is not a critical issue as it is 
> not expected that application be created as user "hdfs".
> Assigning to check if there is any other issue hiding behind this symptom.
> {noformat}
> [hdfs@c6403 bin]$ ./slider destroy cl1
> 2014-11-01 04:34:06,112 [main] INFO  impl.TimelineClientImpl - Timeline 
> service address: http://c6403.ambari.apache.org:8188/ws/v1/timeline/
> 2014-11-01 04:34:07,161 [main] WARN  shortcircuit.DomainSocketFactory - The 
> short-circuit local reads feature cannot be used because libhadoop cannot be 
> loaded.
> 2014-11-01 04:34:07,172 [main] INFO  client.RMProxy - Connecting to 
> ResourceManager at c6403.ambari.apache.org/192.168.64.103:8050
> 2014-11-01 04:34:07,516 [main] INFO  zk.BlockingZKWatcher - waiting for ZK 
> event
> 2014-11-01 04:34:07,568 [main-EventThread] INFO  zk.BlockingZKWatcher - ZK 
> binding callback received
> 2014-11-01 04:34:07,572 [main] INFO  client.SliderClient - Deleting zookeeper 
> path /services/slider/users/hdfs/cl1
> 2014-11-01 04:34:07,852 [main] INFO  imps.CuratorFrameworkImpl - Starting
> 2014-11-01 04:34:07,942 [main-EventThread] INFO  state.ConnectionStateManager 
> - State change: CONNECTED
> 2014-11-01 04:34:07,943 [ConnectionStateManager-0] WARN  
> state.ConnectionStateManager - There are no ConnectionStateListeners 
> registered.
> 2014-11-01 04:34:08,969 [main] INFO  client.SliderClient - Destroyed cluster 
> cl1
> 2014-11-01 04:34:08,977 [main] INFO  util.ExitUtil - Exiting with status 0
> {noformat}
> {noformat}
> [hdfs@c6403 bin]$ ./slider create cl1 --template 
> /usr/work/hbase/appConfig.json --resources /usr/work/hbase/resources.json
> 2014-11-01 04:35:12,816 [main] INFO  impl.TimelineClientImpl - Timeline 
> service address: http://c6403.ambari.apache.org:8188/ws/v1/timeline/
> 2014-11-01 04:35:13,561 [main] WARN  shortcircuit.DomainSocketFactory - The 
> short-circuit local reads feature cannot be used because libhadoop cannot be 
> loaded.
> 2014-11-01 04:35:13,568 [main] INFO  client.RMProxy - Connecting to 
> ResourceManager at c6403.ambari.apache.org/192.168.64.103:8050
> 2014-11-01 04:35:14,028 [main] INFO  zk.BlockingZKWatcher - waiting for ZK 
> event
> 2014-11-01 04:35:14,052 [main-EventThread] INFO  zk.BlockingZKWatcher - ZK 
> binding callback received
> 2014-11-01 04:35:14,063 [main] INFO  agent.AgentClientProvider - Validating 
> app definition 
> .slider/package/HBASE/slider-hbase-app-package-0.98.4.2.2.0.0-1623-hadoop2.zip
> 2014-11-01 04:35:14,064 [main] INFO  agent.AgentUtils - Reading metainfo at 
> .slider/package/HBASE/slider-hbase-app-package-0.98.4.2.2.0.0-1623-hadoop2.zip
> 2014-11-01 04:35:14,299 [main] INFO  tools.SliderUtils - Reading metainfo.xml 
> of size 6909
> 2014-11-01 04:35:14,447 [main] ERROR tools.CoreFileSystem - Dir 
> hdfs://c6403.ambari.apache.org:8020/user/hdfs/.slider/cluster/cl1 exists: 
> hdfs://c6403.ambari.apache.org:8020/user/hdfs/.slider/cluster/cl1/database  0
> 2014-11-01 04:35:14,448 [main] ERROR main.ServiceLauncher - Application 
> Instance dir already exists: 
> hdfs://c6403.ambari.apache.org:8020/user/hdfs/.slider/cluster/cl1
> 2014-11-01 04:35:14,450 [main] INFO  util.ExitUtil - Exiting with status 75
> {noformat}
> {noformat}
> [hdfs@c6403 bin]$ hdfs dfs -ls /user/hdfs/.slider/cluster
> Found 1 items
> drwxr-xr-x   - hdfs hdfs          0 2014-11-01 04:34 
> /user/hdfs/.slider/cluster/cl1
> {noformat}
> {noformat}
> [hdfs@c6403 bin]$ ./slider destroy cl1
> 2014-11-01 04:37:25,003 [main] INFO  impl.TimelineClientImpl - Timeline 
> service address: http://c6403.ambari.apache.org:8188/ws/v1/timeline/
> 2014-11-01 04:37:25,682 [main] WARN  shortcircuit.DomainSocketFactory - The 
> short-circuit local reads feature cannot be used because libhadoop cannot be 
> loaded.
> 2014-11-01 04:37:25,692 [main] INFO  client.RMProxy - Connecting to 
> ResourceManager at c6403.ambari.apache.org/192.168.64.103:8050
> 2014-11-01 04:37:25,965 [main] INFO  zk.BlockingZKWatcher - waiting for ZK 
> event
> 2014-11-01 04:37:25,989 [main-EventThread] INFO  zk.BlockingZKWatcher - ZK 
> binding callback received
> 2014-11-01 04:37:25,993 [main] INFO  client.SliderClient - Deleting zookeeper 
> path /services/slider/users/hdfs/cl1
> 2014-11-01 04:37:26,037 [main] INFO  imps.CuratorFrameworkImpl - Starting
> 2014-11-01 04:37:26,099 [main-EventThread] INFO  state.ConnectionStateManager 
> - State change: CONNECTED
> 2014-11-01 04:37:26,100 [ConnectionStateManager-0] WARN  
> state.ConnectionStateManager - There are no ConnectionStateListeners 
> registered.
> 2014-11-01 04:37:27,107 [main] INFO  client.SliderClient - Destroyed cluster 
> cl1
> 2014-11-01 04:37:27,109 [main] INFO  util.ExitUtil - Exiting with status 0
> {noformat}
> {noformat}
> [hdfs@c6403 bin]$ hdfs dfs -ls /user/hdfs/.slider/cluster
> [hdfs@c6403 bin]$
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
