[ https://issues.apache.org/jira/browse/AMBARI-14213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dmitry Lysnichenko updated AMBARI-14213:
----------------------------------------
    Affects Version/s: 2.2.0

> Decommission RegionServer fails
> -------------------------------
>
>                 Key: AMBARI-14213
>                 URL: https://issues.apache.org/jira/browse/AMBARI-14213
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-server
>    Affects Versions: 2.2.0
>            Reporter: Dmitry Lysnichenko
>            Assignee: Dmitry Lysnichenko
>             Fix For: 2.2.0
>
>         Attachments: AMBARI-14213.patch
>
>
> Decommissioning a RegionServer after deployment fails with the error
> below; please take a look:
> {code}
> ##############################################################################
> {
>   "href" : "http://host:8080/api/v1/clusters/cl1/requests/85",
>   "Requests" : {
>     "aborted_task_count" : 0,
>     "cluster_name" : "cl1",
>     "completed_task_count" : 1,
>     "create_time" : 1448862258061,
>     "end_time" : 1448862275882,
>     "exclusive" : true,
>     "failed_task_count" : 1,
>     "id" : 85,
>     "inputs" : "{\"excluded_hosts\":\"host2.novalocal\",\"slave_type\":\"HBASE_REGIONSERVER\"}",
>     "operation_level" : "HOST_COMPONENT",
>     "progress_percent" : 100.0,
>     "queued_task_count" : 0,
>     "request_context" : "Decommission RegionServer - Turn drain mode on",
>     "request_status" : "FAILED",
>     "resource_filters" : [
>       {
>         "service_name" : "HBASE",
>         "component_name" : "HBASE_MASTER"
>       }
>     ],
>     "start_time" : 1448862258080,
>     "task_count" : 1,
>     "timed_out_task_count" : 0,
>     "type" : "COMMAND",
>     "request_schedule" : {
>       "href" : "http://host:8080/api/v1/clusters/cl1/request_schedules/8",
>       "schedule_id" : 8
>     }
>   },
>   "stages" : [
>     {
>       "href" : "http://host:8080/api/v1/clusters/cl1/requests/85/stages/0",
>       "Stage" : {
>         "cluster_name" : "cl1",
>         "request_id" : 85,
>         "stage_id" : 0
>       }
>     }
>   ],
>   "tasks" : [
>     {
>       "href" : "http://host:8080/api/v1/clusters/cl1/requests/85/tasks/948",
>       "Tasks" : {
>         "cluster_name" : "cl1",
>         "id" : 948,
>         "request_id" : 85,
>         "stage_id" : 0
>       }
>     }
>   ]
> }
> --------------------------------------------------------------------------------
> {
>   "href" : "http://host:8080/api/v1/clusters/cl1/requests/85/tasks/948",
>   "Tasks" : {
>     "attempt_cnt" : 1,
>     "cluster_name" : "cl1",
>     "command" : "CUSTOM_COMMAND",
>     "command_detail" : "DECOMMISSION, Excluded: host2.novalocal",
>     "custom_command_name" : "DECOMMISSION",
>     "end_time" : 1448862275871,
>     "error_log" : "/var/lib/ambari-agent/data/errors-948.txt",
>     "exit_code" : 1,
>     "host_name" : "os-r6-hirdwu-ambari-hv-db-6-4.novalocal",
>     "id" : 948,
>     "output_log" : "/var/lib/ambari-agent/data/output-948.txt",
>     "request_id" : 85,
>     "role" : "HBASE_MASTER",
>     "stage_id" : 0,
>     "start_time" : 1448862258084,
>     "status" : "FAILED",
>     "stderr" : "Traceback (most recent call last):\n
>       File \"/var/lib/ambari-agent/cache/common-services/HBASE/0.96.0.2.0/package/scripts/hbase_master.py\", line 149, in <module>\n
>         HbaseMaster().execute()\n
>       File \"/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py\", line 217, in execute\n
>         method(env)\n
>       File \"/var/lib/ambari-agent/cache/common-services/HBASE/0.96.0.2.0/package/scripts/hbase_master.py\", line 48, in decommission\n
>         hbase_decommission(env)\n
>       File \"/usr/lib/python2.6/site-packages/ambari_commons/os_family_impl.py\", line 89, in thunk\n
>         return fn(*args, **kwargs)\n
>       File \"/var/lib/ambari-agent/cache/common-services/HBASE/0.96.0.2.0/package/scripts/hbase_decommission.py\", line 84, in hbase_decommission\n
>         logoutput=True\n
>       File \"/usr/lib/python2.6/site-packages/resource_management/core/base.py\", line 154, in __init__\n
>         self.env.run()\n
>       File \"/usr/lib/python2.6/site-packages/resource_management/core/environment.py\", line 158, in run\n
>         self.run_action(resource, action)\n
>       File \"/usr/lib/python2.6/site-packages/resource_management/core/environment.py\", line 121, in run_action\n
>         provider_action()\n
>       File \"/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py\", line 238, in action_run\n
>         tries=self.resource.tries, try_sleep=self.resource.try_sleep)\n
>       File \"/usr/lib/python2.6/site-packages/resource_management/core/shell.py\", line 70, in inner\n
>         result = function(command, **kwargs)\n
>       File \"/usr/lib/python2.6/site-packages/resource_management/core/shell.py\", line 92, in checked_call\n
>         tries=tries, try_sleep=try_sleep)\n
>       File \"/usr/lib/python2.6/site-packages/resource_management/core/shell.py\", line 140, in _call_wrapper\n
>         result = _call(command, **kwargs_copy)\n
>       File \"/usr/lib/python2.6/site-packages/resource_management/core/shell.py\", line 291, in _call\n
>         raise Fail(err_msg)\n
>     resource_management.core.exceptions.Fail: Execution of '/usr/bin/kinit -kt /etc/security/keytabs/hbase.headless.keytab hbaserndwebei4ypdy0wtttcdnb...@hwqe.hortonworks.com; /usr/hdp/current/hbase-master/bin/hbase --config /usr/hdp/current/hbase-master/conf org.jruby.Main /usr/hdp/current/hbase-master/bin/draining_servers.rb add host2.novalocal' returned 1.
>     ######## Hortonworks #############\n
>     This is MOTD message, added for testing in qe infra\n
>     SLF4J: Class path contains multiple SLF4J bindings.\n
>     SLF4J: Found binding in [jar:file:/grid/0/hdp/2.3.4.0-3360/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]\n
>     SLF4J: Found binding in [jar:file:/grid/0/hdp/2.3.4.0-3360/zookeeper/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]\n
>     SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.\n
>     SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]\n
>     NativeException: org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = NoAuth for /hbase-secure/draining/host2.novalocal,16020,1448856025519\n
>       __for__ at /usr/hdp/current/hbase-master/bin/draining_servers.rb:94\n
>       each at org/jruby/RubyArray.java:1620\n
>       __ensure__ at /usr/hdp/current/hbase-master/bin/draining_servers.rb:92\n
>       addServers at /usr/hdp/current/hbase-master/bin/draining_servers.rb:91\n
>       (root) at /usr/hdp/current/hbase-master/bin/draining_servers.rb:152",
>     "stdout" : "2015-11-30 05:44:19,254 - File['/usr/hdp/current/hbase-master/bin/draining_servers.rb'] {'content': StaticFile('draining_servers.rb'), 'mode': 0755}\n
>     2015-11-30 05:44:19,386 - Execute['/usr/bin/kinit -kt /etc/security/keytabs/hbase.headless.keytab hbaserndwebei4ypdy0wtttcdnb...@hwqe.hortonworks.com; /usr/hdp/current/hbase-master/bin/hbase --config /usr/hdp/current/hbase-master/conf org.jruby.Main /usr/hdp/current/hbase-master/bin/draining_servers.rb add host2.novalocal'] {'logoutput': True, 'user': 'cstm-hbase'}\n
>     ######## Hortonworks #############\n
>     This is MOTD message, added for testing in qe infra\n
>     SLF4J: Class path contains multiple SLF4J bindings.\n
>     SLF4J: Found binding in [jar:file:/grid/0/hdp/2.3.4.0-3360/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]\n
>     SLF4J: Found binding in [jar:file:/grid/0/hdp/2.3.4.0-3360/zookeeper/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]\n
>     SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.\n
>     SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]\n
>     NativeException: org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = NoAuth for /hbase-secure/draining/host2.novalocal,16020,1448856025519\n
>       __for__ at /usr/hdp/current/hbase-master/bin/draining_servers.rb:94\n
>       each at org/jruby/RubyArray.java:1620\n
>       __ensure__ at /usr/hdp/current/hbase-master/bin/draining_servers.rb:92\n
>       addServers at /usr/hdp/current/hbase-master/bin/draining_servers.rb:91\n
>       (root) at /usr/hdp/current/hbase-master/bin/draining_servers.rb:152",
>     "structured_out" : { }
>   }
> }
> {code}

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
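For context, the failed operation above corresponds to a POST against the cluster's requests endpoint (`http://host:8080/api/v1/clusters/cl1/requests`). The sketch below reconstructs the request body from the `inputs`, `operation_level`, and `resource_filters` echoed back in the response; the exact field layout (`RequestInfo`, `Requests/resource_filters`) follows Ambari's custom-command API convention and is illustrative, not copied from this issue.

```python
import json

# Reconstructed body of the DECOMMISSION request whose failed response is
# shown above. Values are taken from the response dump; the field layout is
# an assumption based on Ambari's custom-command request format.
payload = {
    "RequestInfo": {
        "context": "Decommission RegionServer - Turn drain mode on",
        "command": "DECOMMISSION",
        # Mirrors the "inputs" string echoed in the response.
        "parameters": {
            "slave_type": "HBASE_REGIONSERVER",
            "excluded_hosts": "host2.novalocal",
        },
        # Mirrors "operation_level" : "HOST_COMPONENT" for cluster cl1.
        "operation_level": {
            "level": "HOST_COMPONENT",
            "cluster_name": "cl1",
        },
    },
    # The command is routed to the HBase Master, which then runs
    # draining_servers.rb against the excluded RegionServer host.
    "Requests/resource_filters": [
        {"service_name": "HBASE", "component_name": "HBASE_MASTER"}
    ],
}

body = json.dumps(payload, indent=2)
print(body)
```

The master-side execution of this request is what fails: `draining_servers.rb` creates a znode under `/hbase-secure/draining`, and the ZooKeeper server rejects it with `NoAuth` because the session is not authenticated as a principal allowed by that znode's ACL.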