Nathan Falk created AMBARI-12974:
------------------------------------
Summary: fast-hdfs-resource fails when sticky bit is used for chmod
Key: AMBARI-12974
URL: https://issues.apache.org/jira/browse/AMBARI-12974
Project: Ambari
Issue Type: Bug
Components: contrib
Affects Versions: 2.1.0
Environment: x86 or power, any OS
IBM Open Platform
Reporter: Nathan Falk
IBM Open Platform version 4.1 uses the permission 01777 for Spark's event log
directory:
{code}
[root@compute000 ~]# grep spark_eventlog_dir_mode /var/lib/ambari-server/resources/stacks/BigInsights/4.1/services/SPARK/package/scripts/params.py
spark_eventlog_dir_mode = 01777
{code}
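For context, the leading 0 in 01777 marks the value as an octal literal, and the extra leading 1 digit is the sticky bit on top of the usual rwxrwxrwx permissions. Java reads octal literals the same way; a minimal sketch (illustrative only, not Ambari code):

```java
public class StickyBitDemo {
    // Octal 1777 = sticky bit (01000) plus rwxrwxrwx (0777)
    static final int MODE = 01777;   // leading 0 => octal literal
    static final int STICKY = 01000; // sticky-bit mask

    public static void main(String[] args) {
        System.out.println(MODE);                 // 1023 in decimal
        System.out.println((MODE & STICKY) != 0); // true: sticky bit is set
        System.out.println(MODE & 0777);          // 511: the plain permission part
    }
}
```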
In our case, the error is that the Spark History Server fails to start with an
IllegalArgumentException:
{code}
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/stacks/BigInsights/4.1/services/SPARK/package/scripts/job_history_server.py", line 167, in <module>
    JobHistoryServer().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 218, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/stacks/BigInsights/4.1/services/SPARK/package/scripts/job_history_server.py", line 73, in start
    self.create_historyServer_directory()
  File "/var/lib/ambari-agent/cache/stacks/BigInsights/4.1/services/SPARK/package/scripts/job_history_server.py", line 120, in create_historyServer_directory
    params.HdfsResource(None, action="execute")
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 157, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 152, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 118, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 396, in action_execute
    self.get_hdfs_resource_executor().action_execute(self)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 117, in action_execute
    logoutput=logoutput,
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 157, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 152, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 118, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 258, in action_run
    tries=self.resource.tries, try_sleep=self.resource.try_sleep)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 70, in inner
    result = function(command, **kwargs)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 92, in checked_call
    tries=tries, try_sleep=try_sleep)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 140, in _call_wrapper
    result = _call(command, **kwargs_copy)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 291, in _call
    raise Fail(err_msg)
resource_management.core.exceptions.Fail: Execution of 'hadoop --config /usr/iop/current/hadoop-client/conf jar /var/lib/ambari-agent/lib/fast-hdfs-resource.jar /var/lib/ambari-agent/data/hdfs_resources.json' returned 1. WARNING: Use "yarn jar" to launch YARN applications.
Using filesystem uri: hdfs://localhost:8020
Creating: Resource [source=null, target=/user/spark, type=directory, action=create, owner=spark, group=hadoop, mode=755, recursiveChown=false, recursiveChmod=false, changePermissionforParents=false]
Creating: Resource [source=null, target=hdfs://localhost:8020/iop/apps/4.1.0.0/spark/logs/history-server, type=directory, action=create, owner=spark, group=hadoop, mode=1777, recursiveChown=false, recursiveChmod=false, changePermissionforParents=false]
Exception in thread "main" java.lang.IllegalArgumentException: 1777
	at org.apache.hadoop.fs.permission.PermissionParser.<init>(PermissionParser.java:60)
	at org.apache.hadoop.fs.permission.UmaskParser.<init>(UmaskParser.java:42)
	at org.apache.hadoop.fs.permission.FsPermission.<init>(FsPermission.java:106)
	at org.apache.ambari.fast_hdfs_resource.Resource.setMode(Resource.java:217)
	at org.apache.ambari.fast_hdfs_resource.Runner.main(Runner.java:78)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
{code}
The problem is in fast-hdfs-resource; when WebHDFS is enabled, a different code path is taken and the error does not occur.
Because some of our users run IBM Spectrum Scale instead of HDFS, WebHDFS cannot be enabled, and fast-hdfs-resource is therefore used for all Hadoop file operations.
A JIRA had previously been opened for this problem, and a patch was provided (AMBARI-11351). That JIRA was closed because the problem was thought to have gone away. In reality, the problem remained; it was merely masked by the use of WebHDFS.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)