[
https://issues.apache.org/jira/browse/AMBARI-22918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16354873#comment-16354873
]
Raghavender Rao Guruvannagari edited comment on AMBARI-22918 at 2/7/18 2:24 AM:
--------------------------------------------------------------------------------
The issue can be reproduced in a lab, and the problem appears to be the
"-Djava.security.auth.login.config" option. Because it is passed as the first
argument to the script, the hbase.distro script treats this option as the
command. Running the same command manually with shell debugging enabled shows
how the argument is passed (see below), and the run fails with the error
"Could not find or load main class org.jruby.Main":
{code:java}
+ exec /usr/jdk64/jdk1.8.0_112/bin/java
-Dproc_-Djava.security.auth.login.config=/usr/hdp/current/hbase-master/conf/hbase_master_jaas.conf
'-XX:OnOutOfMemoryError=kill -9 %p' -XX:+UseConcMarkSweepGC
-XX:ErrorFile=/var/log/hbase/hs_err_pid%p.log
-Djava.security.auth.login.config=/usr/hdp/current/hbase-regionserver/conf/hbase_client_jaas.conf
-Djava.io.tmpdir=/tmp -Dhbase.log.dir=/var/log/hbase
-Dhbase.log.file=hbase.log -Dhbase.home.dir=/usr/hdp/2.6.3.0-235/hbase/bin/..
-Dhbase.id.str= -Dhbase.root.logger=INFO,console
-Djava.library.path=:/usr/hdp/2.6.3.0-235/hadoop/lib/native/Linux-amd64-64:/usr/lib/hadoop/lib/native/Linux-amd64-64:/usr/hdp/current/hadoop-client/lib/native/Linux-amd64-64:/usr/hdp/2.6.3.0-235/hadoop/lib/native
-Dhbase.security.logger=INFO,NullAppender
-Djava.security.auth.login.config=/usr/hdp/current/hbase-master/conf/hbase_master_jaas.conf
org.jruby.Main /usr/hdp/current/hbase-master/bin/draining_servers.rb add worker
Error: Could not find or load main class org.jruby.Main{code}
[https://github.com/apache/hbase/blob/master/bin/hbase]
{code:java}
# get arguments
COMMAND=$1
shift

[...]
if [ "${HBASE_NOEXEC}" != "" ]; then
  "$JAVA" -Dproc_$COMMAND -XX:OnOutOfMemoryError="kill -9 %p" $HEAP_SETTINGS $HBASE_OPTS $CLASS "$@"
else
  exec "$JAVA" -Dproc_$COMMAND -XX:OnOutOfMemoryError="kill -9 %p" $HEAP_SETTINGS $HBASE_OPTS $CLASS "$@"
fi
{code}
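To make the effect of "COMMAND=$1" easier to see, here is a rough Python model of the argument handling. It is only a sketch, not the real script: the function name build_java_args is made up, and it assumes that "--config <dir>" is consumed before the command is read and that an unrecognized command is reused as the class name (which is what the debug trace above suggests).

```python
def build_java_args(script_args, hbase_opts=()):
    """Simplified, hypothetical model of how the wrapper assembles java's arguments."""
    args = list(script_args)
    # Assumption: "--config <dir>" is stripped before the command is read.
    if args[:1] == ["--config"]:
        args = args[2:]
    # Mirrors "COMMAND=$1; shift" from the script excerpt.
    command, rest = args[0], args[1:]
    # Assumption: a command the script does not recognize is used as the class.
    java_class = command
    return ["-Dproc_" + command, *hbase_opts, java_class, *rest]


jaas = ("-Djava.security.auth.login.config="
        "/usr/hdp/current/hbase-master/conf/hbase_master_jaas.conf")
conf = "/usr/hdp/current/hbase-master/conf"
script = "/usr/hdp/current/hbase-master/bin/draining_servers.rb"

# With the extra option, the -D flag lands in the command slot.
broken = build_java_args(["--config", conf, jaas, "org.jruby.Main", script, "add", "worker"])
# Without it, org.jruby.Main is the command, as intended.
fixed = build_java_args(["--config", conf, "org.jruby.Main", script, "add", "worker"])

print(broken[0])
print(fixed[0])
```

In this model the first call produces the same "-Dproc_-Djava.security.auth.login.config=..." argument that shows up in the trace, while the second produces "-Dproc_org.jruby.Main".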
This option was added to the Ambari script
"/var/lib/ambari-agent/cache/common-services/HBASE/0.96.0.2.0/package/scripts/hbase_decommission.py"
as part of AMBARI-15295. It is causing this issue and no longer seems to be
required.
Removing "{master_security_config}" from the following three lines in
hbase_decommission.py fixes the issue:
{code:java}
66 "{kinit_cmd} {hbase_cmd} --config {hbase_conf_dir} {master_security_config} org.jruby.Main {region_drainer} remove {host}")
78 "{kinit_cmd} {hbase_cmd} --config {hbase_conf_dir} {master_security_config} org.jruby.Main {region_drainer} add {host}")
80 "{kinit_cmd} {hbase_cmd} --config {hbase_conf_dir} {master_security_config} org.jruby.Main {region_mover} unload {host}")
{code}
After the change, the lines read as follows:
{code:java}
66 "{kinit_cmd} {hbase_cmd} --config {hbase_conf_dir} org.jruby.Main {region_drainer} remove {host}")
78 "{kinit_cmd} {hbase_cmd} --config {hbase_conf_dir} org.jruby.Main {region_drainer} add {host}")
80 "{kinit_cmd} {hbase_cmd} --config {hbase_conf_dir} org.jruby.Main {region_mover} unload {host}")
{code}
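For illustration, the difference between the old and new command templates can be checked with plain str.format. This is a sketch only: the Ambari script uses its own format() helper, the values are copied from the log output in the issue description, the principal is a placeholder, and first_script_arg is a made-up helper.

```python
# Templates modelled on the "add" line before and after the change.
OLD_TEMPLATE = ("{kinit_cmd} {hbase_cmd} --config {hbase_conf_dir} "
                "{master_security_config} org.jruby.Main {region_drainer} add {host}")
NEW_TEMPLATE = ("{kinit_cmd} {hbase_cmd} --config {hbase_conf_dir} "
                "org.jruby.Main {region_drainer} add {host}")

params = {
    # The principal below is a placeholder, not the one from the cluster.
    "kinit_cmd": "/usr/bin/kinit -kt /etc/security/keytabs/hbase.service.keytab "
                 "hbase/host@EXAMPLE.COM;",
    "hbase_cmd": "/usr/hdp/current/hbase-master/bin/hbase",
    "hbase_conf_dir": "/usr/hdp/current/hbase-master/conf",
    "master_security_config": "-Djava.security.auth.login.config="
                              "/usr/hdp/current/hbase-master/conf/hbase_master_jaas.conf",
    "region_drainer": "/usr/hdp/current/hbase-master/bin/draining_servers.rb",
    "host": "worker1",
}


def first_script_arg(command_line):
    """Return the token that follows '--config <dir>' (hypothetical helper)."""
    tokens = command_line.split()
    return tokens[tokens.index("--config") + 2]


# str.format ignores unused keys, so the same params work for both templates.
old_first = first_script_arg(OLD_TEMPLATE.format(**params))
new_first = first_script_arg(NEW_TEMPLATE.format(**params))

print(old_first)
print(new_first)
```

With the old template the token right after the config directory is the -D option; with the new one it is org.jruby.Main, which is what the hbase script expects in the command position.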
was (Author: rguruvannagari):
In recent HDP versions, "-Djava.security.auth.login.config" does not need to be
passed explicitly, because this option is already set in HBASE_OPTS via
hbase-env.sh:
{code:java}
{% if security_enabled %}
export HBASE_OPTS="$HBASE_OPTS -XX:+UseConcMarkSweepGC
-XX:ErrorFile={{log_dir}}/hs_err_pid%p.log
-Djava.security.auth.login.config={{client_jaas_config_file}}
-Djava.io.tmpdir={{java_io_tmpdir}}"
{code}
> Decommission RegionServer fails when kerberos is enabled
> --------------------------------------------------------
>
> Key: AMBARI-22918
> URL: https://issues.apache.org/jira/browse/AMBARI-22918
> Project: Ambari
> Issue Type: Bug
> Components: ambari-server
> Reporter: Toshihiro Suzuki
> Priority: Major
> Labels: pull-request-available
> Time Spent: 50m
> Remaining Estimate: 0h
>
> When kerberos is enabled, Decommission RegionServer fails with the following
> errors:
> stderr:
> {code:java}
> Traceback (most recent call last):
> File
> "/var/lib/ambari-agent/cache/common-services/HBASE/0.96.0.2.0/package/scripts/hbase_master.py",
> line 114, in <module>
> HbaseMaster().execute()
> File
> "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py",
> line 329, in execute
> method(env)
> File
> "/var/lib/ambari-agent/cache/common-services/HBASE/0.96.0.2.0/package/scripts/hbase_master.py",
> line 55, in decommission
> hbase_decommission(env)
> File "/usr/lib/python2.6/site-packages/ambari_commons/os_family_impl.py",
> line 89, in thunk
> return fn(*args, **kwargs)
> File
> "/var/lib/ambari-agent/cache/common-services/HBASE/0.96.0.2.0/package/scripts/hbase_decommission.py",
> line 84, in hbase_decommission
> logoutput=True
> File "/usr/lib/python2.6/site-packages/resource_management/core/base.py",
> line 166, in __init__
> self.env.run()
> File
> "/usr/lib/python2.6/site-packages/resource_management/core/environment.py",
> line 160, in run
> self.run_action(resource, action)
> File
> "/usr/lib/python2.6/site-packages/resource_management/core/environment.py",
> line 124, in run_action
> provider_action()
> File
> "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py",
> line 262, in action_run
> tries=self.resource.tries, try_sleep=self.resource.try_sleep)
> File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py",
> line 72, in inner
> result = function(command, **kwargs)
> File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py",
> line 102, in checked_call
> tries=tries, try_sleep=try_sleep,
> timeout_kill_strategy=timeout_kill_strategy)
> File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py",
> line 150, in _call_wrapper
> result = _call(command, **kwargs_copy)
> File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py",
> line 303, in _call
> raise ExecutionFailed(err_msg, code, out, err)
> resource_management.core.exceptions.ExecutionFailed: Execution of
> '/usr/bin/kinit -kt /etc/security/keytabs/hbase.service.keytab
> hbase/[email protected]; /usr/hdp/current/hbase-master/bin/hbase --config
> /usr/hdp/current/hbase-master/conf
> -Djava.security.auth.login.config=/usr/hdp/current/hbase-master/conf/hbase_master_jaas.conf
> org.jruby.Main /usr/hdp/current/hbase-master/bin/draining_servers.rb add
> worker1' returned 1. Error: Could not find or load main class
> org.jruby.Main{code}
> stdout:
> {code:java}
> 2018-02-06 07:25:03,453 - Stack Feature Version Info: Cluster Stack=2.6,
> Cluster Current Version=2.6.2.0-205, Command Stack=None, Command
> Version=2.6.2.0-205 -> 2.6.2.0-205
> 2018-02-06 07:25:03,476 - Using hadoop conf dir:
> /usr/hdp/current/hadoop-client/conf
> 2018-02-06 07:25:03,484 - checked_call['hostid'] {}
> 2018-02-06 07:25:03,490 - checked_call returned (0, '1aacc56c')
> 2018-02-06 07:25:03,502 -
> File['/usr/hdp/current/hbase-master/bin/draining_servers.rb'] {'content':
> StaticFile('draining_servers.rb'), 'mode': 0755}
> 2018-02-06 07:25:03,504 - Execute['/usr/bin/kinit -kt
> /etc/security/keytabs/hbase.service.keytab hbase/[email protected];
> /usr/hdp/current/hbase-master/bin/hbase --config
> /usr/hdp/current/hbase-master/conf
> -Djava.security.auth.login.config=/usr/hdp/current/hbase-master/conf/hbase_master_jaas.conf
> org.jruby.Main /usr/hdp/current/hbase-master/bin/draining_servers.rb add
> worker1'] {'logoutput': True, 'user': 'hbase'}
> Error: Could not find or load main class org.jruby.Main
> Command failed after 1 tries{code}
>
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)