[
https://issues.apache.org/jira/browse/SLIDER-599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14193556#comment-14193556
]
Sumit Mohanty commented on SLIDER-599:
--------------------------------------
Another user reported the same issue while using Slider App View, logged in as
the hdfs user.
*Could this be the reason:*
When you create an application (HBase) as user hdfs, the agents are not able to
detect that the HBase components are running.
_The pid file name the agent expects differs from the one hbase-daemon.sh
uses_. That is tracked in a different JIRA.
So if the HBase master is still running while the app is being destroyed, is it
possible that the "data" directory does not get deleted? I do not know for sure
why the second call succeeds. One possibility is that the HBase daemons stop
once they detect that the ZK node is gone, and the first destroy deletes the ZK
node.
The reason I am bringing this up is that I ran into a situation where "slider
list" failed to list an application while I was testing as hdfs. "slider list"
detected no app, yet "slider create" failed, saying the application was already
present. "slider list" did not detect the app because internal.json was not
present and the only sub-dir under the cluster folder was "data". I did not make
the connection at the time, but it seems relevant in light of this bug.
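To make the mismatch concrete, here is a minimal sketch of the behaviour I am
describing. It is not Slider code: list_instances and can_create are
hypothetical helpers, and it uses a local temp directory in place of HDFS. It
only assumes that "list" needs internal.json while "create" rejects any
existing instance directory.

```python
import tempfile
from pathlib import Path


def list_instances(cluster_root):
    """Hypothetical 'list': an instance counts only if internal.json exists."""
    return sorted(
        p.name for p in cluster_root.iterdir()
        if (p / "internal.json").is_file()
    )


def can_create(cluster_root, name):
    """Hypothetical 'create' precheck: refuse if the instance dir exists at all."""
    return not (cluster_root / name).exists()


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Leftover state after a partial destroy: only a "data" sub-dir remains.
    (root / "cl1" / "data").mkdir(parents=True)

    listed = list_instances(root)       # no internal.json, so nothing is listed
    creatable = can_create(root, "cl1")  # dir still exists, so create is refused
    print(listed, creatable)
```

Under those assumptions the leftover directory is invisible to "list" but
still blocks "create", which matches what I saw.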
> When application is created as user hdfs need to call destroy twice to delete
> the hdfs folder for the app
> ---------------------------------------------------------------------------------------------------------
>
> Key: SLIDER-599
> URL: https://issues.apache.org/jira/browse/SLIDER-599
> Project: Slider
> Issue Type: Bug
> Components: client
> Affects Versions: Slider 0.50
> Reporter: Sumit Mohanty
> Assignee: Steve Loughran
> Fix For: Slider 0.60
>
>
> It was also reported by another user. This is not a critical issue, as
> applications are not expected to be created as user "hdfs".
> Assigning to check whether any other issue is hiding behind this symptom.
> {noformat}
> [hdfs@c6403 bin]$ ./slider destroy cl1
> 2014-11-01 04:34:06,112 [main] INFO impl.TimelineClientImpl - Timeline
> service address: http://c6403.ambari.apache.org:8188/ws/v1/timeline/
> 2014-11-01 04:34:07,161 [main] WARN shortcircuit.DomainSocketFactory - The
> short-circuit local reads feature cannot be used because libhadoop cannot be
> loaded.
> 2014-11-01 04:34:07,172 [main] INFO client.RMProxy - Connecting to
> ResourceManager at c6403.ambari.apache.org/192.168.64.103:8050
> 2014-11-01 04:34:07,516 [main] INFO zk.BlockingZKWatcher - waiting for ZK
> event
> 2014-11-01 04:34:07,568 [main-EventThread] INFO zk.BlockingZKWatcher - ZK
> binding callback received
> 2014-11-01 04:34:07,572 [main] INFO client.SliderClient - Deleting zookeeper
> path /services/slider/users/hdfs/cl1
> 2014-11-01 04:34:07,852 [main] INFO imps.CuratorFrameworkImpl - Starting
> 2014-11-01 04:34:07,942 [main-EventThread] INFO state.ConnectionStateManager
> - State change: CONNECTED
> 2014-11-01 04:34:07,943 [ConnectionStateManager-0] WARN
> state.ConnectionStateManager - There are no ConnectionStateListeners
> registered.
> 2014-11-01 04:34:08,969 [main] INFO client.SliderClient - Destroyed cluster
> cl1
> 2014-11-01 04:34:08,977 [main] INFO util.ExitUtil - Exiting with status 0
> {noformat}
> {noformat}
> [hdfs@c6403 bin]$ ./slider create cl1 --template
> /usr/work/hbase/appConfig.json --resources /usr/work/hbase/resources.json
> 2014-11-01 04:35:12,816 [main] INFO impl.TimelineClientImpl - Timeline
> service address: http://c6403.ambari.apache.org:8188/ws/v1/timeline/
> 2014-11-01 04:35:13,561 [main] WARN shortcircuit.DomainSocketFactory - The
> short-circuit local reads feature cannot be used because libhadoop cannot be
> loaded.
> 2014-11-01 04:35:13,568 [main] INFO client.RMProxy - Connecting to
> ResourceManager at c6403.ambari.apache.org/192.168.64.103:8050
> 2014-11-01 04:35:14,028 [main] INFO zk.BlockingZKWatcher - waiting for ZK
> event
> 2014-11-01 04:35:14,052 [main-EventThread] INFO zk.BlockingZKWatcher - ZK
> binding callback received
> 2014-11-01 04:35:14,063 [main] INFO agent.AgentClientProvider - Validating
> app definition
> .slider/package/HBASE/slider-hbase-app-package-0.98.4.2.2.0.0-1623-hadoop2.zip
> 2014-11-01 04:35:14,064 [main] INFO agent.AgentUtils - Reading metainfo at
> .slider/package/HBASE/slider-hbase-app-package-0.98.4.2.2.0.0-1623-hadoop2.zip
> 2014-11-01 04:35:14,299 [main] INFO tools.SliderUtils - Reading metainfo.xml
> of size 6909
> 2014-11-01 04:35:14,447 [main] ERROR tools.CoreFileSystem - Dir
> hdfs://c6403.ambari.apache.org:8020/user/hdfs/.slider/cluster/cl1 exists:
> hdfs://c6403.ambari.apache.org:8020/user/hdfs/.slider/cluster/cl1/database 0
> 2014-11-01 04:35:14,448 [main] ERROR main.ServiceLauncher - Application
> Instance dir already exists:
> hdfs://c6403.ambari.apache.org:8020/user/hdfs/.slider/cluster/cl1
> 2014-11-01 04:35:14,450 [main] INFO util.ExitUtil - Exiting with status 75
> {noformat}
> {noformat}
> [hdfs@c6403 bin]$ hdfs dfs -ls /user/hdfs/.slider/cluster
> Found 1 items
> drwxr-xr-x - hdfs hdfs 0 2014-11-01 04:34
> /user/hdfs/.slider/cluster/cl1
> {noformat}
> {noformat}
> [hdfs@c6403 bin]$ ./slider destroy cl1
> 2014-11-01 04:37:25,003 [main] INFO impl.TimelineClientImpl - Timeline
> service address: http://c6403.ambari.apache.org:8188/ws/v1/timeline/
> 2014-11-01 04:37:25,682 [main] WARN shortcircuit.DomainSocketFactory - The
> short-circuit local reads feature cannot be used because libhadoop cannot be
> loaded.
> 2014-11-01 04:37:25,692 [main] INFO client.RMProxy - Connecting to
> ResourceManager at c6403.ambari.apache.org/192.168.64.103:8050
> 2014-11-01 04:37:25,965 [main] INFO zk.BlockingZKWatcher - waiting for ZK
> event
> 2014-11-01 04:37:25,989 [main-EventThread] INFO zk.BlockingZKWatcher - ZK
> binding callback received
> 2014-11-01 04:37:25,993 [main] INFO client.SliderClient - Deleting zookeeper
> path /services/slider/users/hdfs/cl1
> 2014-11-01 04:37:26,037 [main] INFO imps.CuratorFrameworkImpl - Starting
> 2014-11-01 04:37:26,099 [main-EventThread] INFO state.ConnectionStateManager
> - State change: CONNECTED
> 2014-11-01 04:37:26,100 [ConnectionStateManager-0] WARN
> state.ConnectionStateManager - There are no ConnectionStateListeners
> registered.
> 2014-11-01 04:37:27,107 [main] INFO client.SliderClient - Destroyed cluster
> cl1
> 2014-11-01 04:37:27,109 [main] INFO util.ExitUtil - Exiting with status 0
> {noformat}
> {noformat}
> [hdfs@c6403 bin]$ hdfs dfs -ls /user/hdfs/.slider/cluster
> [hdfs@c6403 bin]$
> {noformat}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)