Yep, I'd go to the database and start deleting those records. Check hostcomponentstate, hostdesiredstate, servicedesiredstate and, I believe, servicecomponentdesiredstate (the table named in your postgres log). You can take a backup of the database first if you're concerned.
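In case it helps, here is a rough sketch of the delete order. It is only an illustration, not a tested cleanup script: the parent/child relationships are read off the foreign key errors in the postgres log quoted below, the service_name and cluster_id columns are the ones shown in those errors, and the function names and the %s placeholder style (as used by psycopg2) are my own assumptions. Other tables may reference the service too.

```python
# Foreign keys read off the postgres errors quoted in this thread,
# as (child_table, parent_table) pairs. Other tables may also reference
# the service; this list is not a complete map of the Ambari schema.
FOREIGN_KEYS = [
    ("hostcomponentstate", "servicecomponentdesiredstate"),
    ("servicecomponentdesiredstate", "clusterservices"),
]


def delete_order(foreign_keys):
    """Order tables so that every child table comes before its parent."""
    tables = sorted({t for pair in foreign_keys for t in pair})
    ordered = []
    while len(ordered) < len(tables):
        progressed = False
        for table in tables:
            if table in ordered:
                continue
            children = [c for c, p in foreign_keys if p == table]
            # A table can be cleared once all tables referencing it are done.
            if all(c in ordered for c in children):
                ordered.append(table)
                progressed = True
        if not progressed:
            raise ValueError("circular foreign keys, cannot order deletes")
    return ordered


def delete_statements(service_name, cluster_id, foreign_keys=FOREIGN_KEYS):
    """Return (sql, params) pairs; params are meant to be bound by the driver."""
    # Hypothetical helper: %s placeholders assume a psycopg2-style driver.
    sql = "DELETE FROM {} WHERE service_name = %s AND cluster_id = %s"
    return [(sql.format(t), (service_name, cluster_id))
            for t in delete_order(foreign_keys)]


if __name__ == "__main__":
    for sql, params in delete_statements("KAFKA", 2):
        print(sql, params)
```

Deleting child rows first is what avoids the "is still referenced from table" errors. I would run the statements inside a single transaction, after the backup, with ambari-server stopped.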
On Fri, Oct 30, 2015 at 7:43 PM, Ken Barclay <[email protected]> wrote:
> Hi Artem/Adam,
>
> Thanks very much for your input on this issue: we tried a few other things
> today.
>
> We were finally able to install Spark through the Ambari UI: it was listed
> in a failed state in the UI, so after doing the upgrade to 2.1.2, I just
> tried the install again, and this time the install was successful and I was
> able to start the service.
>
> Next we tried to install Kafka.
> When we install Kafka through the Ambari GUI for the first time, we get stuck
> in that weird state I mentioned last time, where it won’t proceed beyond
> “recommended configurations”. Ambari shows it has a Kafka service, but the
> Broker doesn’t get installed, and there’s no configuration on the file
> system.
>
> If we delete Kafka through the API and re-install through the API, we
> can install the Kafka service and component, then install the component to a
> node, and it installs successfully. There were configuration files created
> on the file system in /etc/kafka also, but the configurations were
> blank in the Ambari UI. (I tried restarting ambari-server but there was no
> change.) Kafka would not start, however, probably because configurations
> were missing, and Ambari would not allow us to add or set up the
> configuration through the UI: it’s just blank.
>
> We can see there’s some duplicate key issue in tables (pasted below) when
> we try to perform these INSERT and DELETE operations. We’re tailing the
> postgres log.
>
> At this point we’ve deleted the service components from the cluster and
> we’re trying to track down the entries in the tables so we can delete
> entries associated with the kafka service.
> We’ll attempt to re-install if we find records that prove to be in the way.
>
> We noticed in the logs that when we install Kafka we get the message “kafka-env
> not found in dictionary” (below), which seems to show there’s a disconnect
> between service configuration templates and actual service configurations.
> When we had trouble installing Spark a while ago we saw this same message,
> except it was “spark-env” that was not found.
>
> raise Fail("Configuration parameter '" + self.name + "' was not found in configurations dictionary!")
> resource_management.core.exceptions.Fail: Configuration parameter 'kafka-env' was not found in configurations dictionary!
>
>
> ERROR: update or delete on table "servicecomponentdesiredstate" violates foreign key constraint "hstcomponentstatecomponentname" on table "hostcomponentstate"
> DETAIL: Key (component_name,cluster_id,service_name)=(KAFKA_BROKER,2,KAFKA) is still referenced from table "hostcomponentstate".
> STATEMENT: DELETE FROM servicecomponentdesiredstate WHERE (((component_name = $1) AND (cluster_id = $2)) AND (service_name = $3))
> ERROR: current transaction is aborted, commands ignored until end of transaction block
> STATEMENT: SELECT 1
> ERROR: duplicate key value violates unique constraint "servicecomponentdesiredstate_pkey"
> STATEMENT: INSERT INTO servicecomponentdesiredstate (component_name, desired_state, cluster_id, service_name, desired_stack_id) VALUES ($1, $2, $3, $4, $5)
> ERROR: current transaction is aborted, commands ignored until end of transaction block
> STATEMENT: SELECT 1
> ERROR: update or delete on table "clusterservices" violates foreign key constraint "srvccmponentdesiredstatesrvcnm" on table "servicecomponentdesiredstate"
> DETAIL: Key (service_name,cluster_id)=(KAFKA,2) is still referenced from table "servicecomponentdesiredstate".
> STATEMENT: DELETE FROM clusterservices WHERE ((cluster_id = $1) AND (service_name = $2))
> ERROR: current transaction is aborted, commands ignored until end of transaction block
> STATEMENT: SELECT 1
> ERROR: duplicate key value violates unique constraint "clusterservices_pkey"
> STATEMENT: INSERT INTO clusterservices (service_name, service_enabled, cluster_id) VALUES ($1, $2, $3)
> ERROR: current transaction is aborted, commands ignored until end of transaction block
> STATEMENT: SELECT 1
> ERROR: duplicate key value violates unique constraint "servicecomponentdesiredstate_pkey"
> STATEMENT: INSERT INTO servicecomponentdesiredstate (component_name, desired_state, cluster_id, service_name, desired_stack_id) VALUES ($1, $2, $3, $4, $5)
> ERROR: current transaction is aborted, commands ignored until end of transaction block
> STATEMENT: SELECT 1
> LOG: received SIGHUP, reloading configuration files
>
> We’ll let you know if we make more progress.
>
> Cheers
> Ken
>
> From: <[email protected]> on behalf of Artem Ervits <[email protected]>
> Reply-To: "[email protected]" <[email protected]>
> Date: Friday, October 30, 2015 at 9:32 AM
> To: "[email protected]" <[email protected]>
> Subject: Re: Any way to reset Ambari Install Wizard?
>
> I am guessing his issues are with the Ambari database; he's concerned about
> making any kind of changes in the database directly. I'm trying to nail down
> the issue so we can delete just that bad row. In that sense, upgrading Ambari
> is not a big deal. Resetting the database, creating a new cluster and
> importing data is a big deal. What I would do is take account of all the
> services he has running. Once he knows what should be in Ambari and what
> shouldn't, go through every table in the Ambari database and see if that
> service or any reference to it exists. Purge that row and see where that
> takes you. I personally had issues similar to that in Ambari as well with
> earlier releases; 2.1.2 addressed many issues in the UI and in configuration.
>
> On Fri, Oct 30, 2015 at 11:51 AM, Adam Gover <[email protected]>
> wrote:
>
>> Hi Artem,
>>
>> Valid point. I was surprised you suggested he update to 2.1.2 in the midst
>> of this, however. Doesn’t that increase the risk of further problems?
>>
>> Thanks
>>
>> Adam
>>
>>
>> From: <[email protected]> on behalf of Artem Ervits <[email protected]>
>> Reply-To: "[email protected]" <[email protected]>
>> Date: Friday, October 30, 2015 at 11:25 AM
>> To: "[email protected]" <[email protected]>
>> Subject: Re: Any way to reset Ambari Install Wizard?
>>
>> Note that if a bad config is included in your json, which may happen if
>> you gather the configs, then once you reset and reapply, it may come back and
>> all these steps will be useless. We need to figure out what the issue is. I
>> want him to avoid going the reset route until we exhaust every other option.
>>
>> On Fri, Oct 30, 2015 at 10:47 AM, Adam Gover <[email protected]>
>> wrote:
>>
>>>
>>>
>>> Hi There Ken,
>>>
>>> Let’s try this again… now actually complete
>>>
>>> So I’ve been following along on this thread hoping someone would come
>>> back with a better solution than the one I have. Since I haven’t seen any,
>>> I’ll provide the details of my solution.
>>>
>>> Prereqs/comments:
>>>
>>> - Tested only on Ambari 2.1.2, but should work on Ambari 2+ (also
>>>   will work with some tweaks on 1.6, but won’t work on 1.7)
>>> - Tested using an external postgres database but should also work with
>>>   mysql
>>> - Test this on your own as it tends to have issues under some
>>>   circumstances
>>>
>>>
>>> I can’t provide the code I use to accomplish all this, but I’ll provide
>>> an outline which should allow you to do the same thing.
>>>
>>> General info:
>>> Base path for access to the rest api is:
>>> http://<ambari host>:8080/api/v1
>>>
>>> This can be accessed using a standard curl call similar to:
>>> curl -u admin:admin -H 'X-Requested-By: ambari' http://<ambari host>:8080/api/v1
>>>
>>> Where a step needs a path, I’ll just say “Goto rest” and provide the
>>> additional path info (any options needed must be inserted before
>>> the url). Also note that in some cases I’m copying parts of the scripts I’m
>>> using, so the values of the variables need to be populated.
>>>
>>>
>>> 1. Backup all external databases (hive/oozie/ambari)
>>> 2. Backup the filesystem after forcing a checkpoint
>>> 3. Before downing ambari collect a complete set of configs:
>>>    1. Get the list of all configs available
>>>       Goto rest: clusters/${cluster_name}/?fields=Clusters/desired_configs
>>>    2. Using the list, retrieve ALL the json config files for the cluster
>>>       Goto rest:
>>>       http://${ambari_host}:8080/api/v1/clusters/${cluster_name}/configurations?(type=${config_type}&tag=${tag})
>>>
>>>       So cluster_name=your defined cluster name,
>>>       config_type=config_filename, tag=the most recent version of this
>>>       config file (this is provided by the first rest call)
>>>
>>>       Note that the output here is NOT usable directly: you will need
>>>       to slightly reformat these files prior to reimporting them
>>>
>>>
> 1. Next shutdown ambari
> 2. On the command line as root execute “ambari-server reset”
> 3. Setup the base cluster name:
>    Goto rest: OPTION -d '{"Clusters":{"version":"HDP-2.2"}}' /clusters/${cluster_name}
> 4. For each host on the cluster, add it to the cluster
>    Goto rest: OPTION -X POST /clusters/${cluster_name}/hosts/${hostname}
> 5. Push ALL the configs captured in step 3 of the first part to the cluster.
>    You may want to use this:
>    /var/lib/ambari-server/resources/scripts/configs.sh
>
>    NOTE I do this using perl; it’s basically a raw read that pushes using
>    (PUT) into
>    Goto rest: OPTION -X PUT /clusters/${cluster_name}
> 6. Next add each service & its associated components
>
>    To add service:
>    Goto rest: OPTION -X POST /clusters/${cluster_name}/services/${service_name}
>
>    To add component:
>    Goto rest: OPTION -X POST /clusters/${cluster_name}/services/${service_name}/components/${component}
> 7. Next, for each host, apply the required components using the following 2
>    rest calls
>    Goto rest: OPTION -X POST /clusters/${cluster_name}/hosts/${hostname}/host_components/${component}
>    Goto rest: OPTION -X PUT OPTION -d '{"HostRoles":{"state":"INSTALLED"}}' /clusters/${cluster_name}/hosts/${hostname}/host_components/${component}
> 8. Next set cluster status
>    Goto rest: OPTION -X POST OPTION -d '{"CLUSTER_CURRENT_STATUS": "{\"clusterState\":\"CLUSTER_STARTED_5\"}"}' /persist
> 9. Now set each service to an installed state
>    Goto rest: OPTION -X PUT OPTION -d '{"ServiceInfo":{"state":"INSTALLED"}}' /clusters/${cluster_name}/services/${service_name}
> 10. Finally set the cluster itself to INSTALLED. This (as far as I
>    know) is best done using SQL; I’m sure there is a rest call
>    but I haven’t found it yet
>    UPDATE clusters SET provisioning_state='INSTALLED', security_type='NONE' WHERE cluster_name='${cluster_name}'
>
>
> ALL of this can be automated and unless you have a 2 node cluster I would.
>
> NOTES: this process will work with HA & kerberized clusters but will need
> additional steps (especially for kerberos)
>
> Anyways I hope this helps. It’s complicated but doable and will save you
> copying/rebuilding, which in larger clusters is really not doable.
>
> Cheers
> Adam
>
>
> From: Ken Barclay <[email protected]>
> Reply-To: "[email protected]" <[email protected]>
> Date: Friday, October 30, 2015 at 2:34 AM
> To: "[email protected]" <[email protected]>
> Subject: Re: Any way to reset Ambari Install Wizard?
>
> Hi Artem,
>
> I upgraded all Ambari components to 2.1.2, restarted everything, and after
> logging in, restarted all components where it was indicated.
>
> I tried the Add Service wizard for Kafka, and got to the page that allows
> me to assign masters and such, but clicking Next after that takes me to
> Customize Services, which gets stuck because the Next button on that page
> is never enabled. It just freezes there, saying it has recommended
> configurations, with the update icon spinning in the middle. All I can do
> is click Back at that point.
>
> Anything else I can try?
>
> Thanks
> Ken
>
> From: Artem Ervits <[email protected]>
> Reply-To: "[email protected]" <[email protected]>
> Date: Thursday, October 29, 2015 at 1:55 PM
> To: "[email protected]" <[email protected]>
> Subject: Re: Any way to reset Ambari Install Wizard?
>
> Please upgrade to the latest 2.1.2 and restart all agents and the Ambari server.
> Press Ctrl-Shift-R in the browser after you navigate to the Ambari URL. Log in
> and let me know if it still shows the same problem.
> On Oct 29, 2015 10:19 AM, "Ken Barclay" <[email protected]> wrote:
>
>> Hi Artem,
>>
>> I started with 2.0.1, and upgraded it to 2.1 back in August.
>>
>> From: Artem Ervits <[email protected]>
>> Reply-To: "[email protected]" <[email protected]>
>> Date: Thursday, October 29, 2015 at 2:09 AM
>> To: "[email protected]" <[email protected]>
>> Subject: Re: Any way to reset Ambari Install Wizard?
>>
>> What version of Ambari are you running?
>> On Oct 27, 2015 6:51 PM, "Ken Barclay" <[email protected]> wrote:
>>
>>> Hello,
>>>
>>> I’m returning to an issue we’ve left hanging since July: we now have to
>>> fix Ambari on this cluster or take the whole cluster down and reinstall
>>> from scratch.
>>>
>>> Our situation is that although our HDP 2.2 cluster is running well,
>>> Ambari cannot be used to install anything because the wizard is broken.
>>>
>>> I did a restart of Ambari server and agents per Artem, but without
>>> knowing exactly what changes to make to the postgres tables I’m reluctant
>>> to try that part. We also tried to add a new component (Spark) using the
>>> Ambari API instead of the wizard, but that also failed, as did trying to
>>> remove the Spark (again via the API) that had failed to install.
>>>
>>> We have 1.5T of monitoring data on this 4-node cluster that we want to
>>> preserve. The cluster is dedicated to storing metrics in HBase via OpenTSDB
>>> and that is all it is used for.
>>>
>>> I just want to confirm with the group that since Ambari can only be used
>>> to manage a cluster that it installed itself, our best option in this
>>> scenario would be to:
>>>
>>> 1. Shut down monitoring
>>> 2. Copy all the data to another cluster
>>> 3. Completely remove Ambari and HDP per
>>>    https://cwiki.apache.org/confluence/display/AMBARI/Host+Cleanup+for+Ambari+and+Stack
>>> 4. Do a fresh install of HDP 2.2 using the latest Ambari, and
>>> 5. Copy the data back to the new cluster.
>>>
>>> Please let us know if this is a valid approach
>>> Thanks
>>>
>>> Ken
>>>
>>>
>>>
>>> From: <[email protected]> on behalf of Artem Ervits <[email protected]>
>>> Reply-To: "[email protected]" <[email protected]>
>>> Date: Tuesday, July 28, 2015 at 12:48 PM
>>> To: "[email protected]" <[email protected]>
>>> Subject: Re: Any way to reset Ambari Install Wizard?
>>>
>>> Try to restart the Ambari server and agents, then stop and start services;
>>> sometimes services need to announce themselves to Ambari that they're
>>> installed. Always refer to the ambari-server log. Worst case scenario,
>>> delete the Ambari_metrics service with the API and clean up the postgres DB
>>> manually; tables to concentrate on are hostservicedesiredstate,
>>> servicedesiredstate etc. This should be a last resort.
>>>
>>> On Tue, Jul 28, 2015 at 3:11 PM, Benoit Perroud <[email protected]>
>>> wrote:
>>>
>>>> Some manual update in the DB is most likely needed.
>>>>
>>>> *WARNING* use this at your own risk
>>>>
>>>> The table that needs to be updated is cluster_version.
>>>>
>>>> As far as I tested 2.1, it required less manual intervention than
>>>> 2.0.1. Upgrade has a retry button for most of the steps, and this is really
>>>> cool.
>>>>
>>>> Hope this helps.
>>>>
>>>> Benoit
>>>>
>>>>
>>>>
>>>> 2015-07-28 20:01 GMT+02:00 Ken Barclay <[email protected]>:
>>>>
>>>>> Hello,
>>>>>
>>>>> I upgraded a small test cluster from HDP 2.1 to HDP 2.2 and Ambari
>>>>> 2.0.1. In following the steps to replace Nagios + Ganglia with the Ambari
>>>>> Metrics System using the Ambari Wizard, an install failure occurred on one
>>>>> node due to an outdated glibc library. I updated glibc and verified the
>>>>> Metrics packages could be installed, but couldn’t go back and finish the
>>>>> installation through the wizard. The problem is: it flags some of the
>>>>> default settings, saying they need to be changed, but it skips very quickly
>>>>> past the screen that enables those settings to be changed, without
>>>>> allowing anything to be entered. So the button that allows you to proceed
>>>>> with the installation never becomes enabled.
>>>>>
>>>>> I subsequently manually finished the Metrics installation using the
>>>>> Ambari API and have it running in Distributed mode. But Ambari’s wizard
>>>>> cannot be used for anything now: the same problem described above
>>>>> occurs for every service I try to install.
>>>>>
>>>>> Can Ambari be reset somehow in this situation, or do I need to
>>>>> reinstall it?
>>>>> Or do you recommend installing 2.1?
>>>>>
>>>>> Thanks
>>>>> Ken
