So, I've been able to diagnose this further. The only updated package that looks related is Python, which went from 2.7.8 to 2.7.9. Apparently that release included a backport of some SSL fixes from Python 3, and with those fixes the cert that Ambari generates is no longer considered valid. I guess I'll open a JIRA on this.
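
For anyone following along, here is a minimal sketch of the behaviour change I suspect. It assumes the cause is the default certificate verification that Python 2.7.9 enabled for the stdlib HTTPS clients (PEP 476); that is an assumption, not something confirmed against the agent code. The describe() helper is made up for illustration and is not part of Ambari. The snippet only compares the two kinds of SSL context and makes no network connection.

```python
import ssl


def describe(ctx):
    """Hypothetical helper: return the settings that decide whether a
    server certificate is validated during the TLS handshake."""
    return {
        "verify_mode": ctx.verify_mode,
        "check_hostname": ctx.check_hostname,
    }


# Context with verification on. With PEP 476 (Python 2.7.9+), stdlib HTTPS
# clients verify by default, so a self-signed CA that is not in the trust
# store is rejected with CERTIFICATE_VERIFY_FAILED.
verifying = ssl.create_default_context()

# Context with verification off, roughly the pre-2.7.9 default behaviour.
# The leading underscore marks this as a private helper; use it only to
# confirm a diagnosis, never as a permanent fix.
unverified = ssl._create_unverified_context()

print(describe(verifying))
print(describe(unverified))
```

If that is really what is happening, the cleaner fix would be to give the agent the server's CA cert to verify against rather than turning verification off.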

Greg

On 1/7/15 3:55 PM, "Greg Hill" <[email protected]> wrote:

>More info from the server log:
>
>21:20:41,833 INFO [main] Configuration:411 - Reading password from existing file
>21:20:41,866 INFO [main] Configuration:422 - API SSL Authentication is turned on.
>21:20:41,866 ERROR [main] Configuration:437 - There is no keystore for https UI connection.
>21:20:41,866 ERROR [main] Configuration:438 - Run "ambari-server setup-https" or set api.ssl = false.
>21:20:41,877 ERROR [main] ViewRegistry:249 - Caught exception extracting view archive /var/lib/ambari-server/resources/views/slider-0.0.1-SNAPSHOT.jar.
>com.google.inject.ProvisionException: Guice provision errors:
>
>1) Error injecting constructor, java.lang.RuntimeException: Error reading certificate password from file /var/lib/ambari-server/keys/https.pass.txt
>  at org.apache.ambari.server.configuration.Configuration.<init>(Configuration.java:330)
>  at org.apache.ambari.server.configuration.Configuration.class(Configuration.java:321)
>  while locating org.apache.ambari.server.configuration.Configuration
>
>1 error
>  at com.google.inject.internal.InjectorImpl$4.get(InjectorImpl.java:987)
>  at com.google.inject.internal.InjectorImpl.getInstance(InjectorImpl.java:1013)
>  at org.apache.ambari.server.view.ViewRegistry.main(ViewRegistry.java:240)
>Caused by: java.lang.RuntimeException: Error reading certificate password from file /var/lib/ambari-server/keys/https.pass.txt
>  at org.apache.ambari.server.configuration.Configuration.<init>(Configuration.java:439)
>  at org.apache.ambari.server.configuration.Configuration.<init>(Configuration.java:330)
>  at org.apache.ambari.server.configuration.Configuration$$FastClassByGuice$$3b588b69.newInstance(<generated>)
>  at com.google.inject.internal.cglib.reflect.$FastConstructor.newInstance(FastConstructor.java:40)
>  at com.google.inject.internal.DefaultConstructionProxyFactory$1.newInstance(DefaultConstructionProxyFactory.java:60)
>  at com.google.inject.internal.ConstructorInjector.construct(ConstructorInjector.java:85)
>  at com.google.inject.internal.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:254)
>  at com.google.inject.internal.ProviderToInternalFactoryAdapter$1.call(ProviderToInternalFactoryAdapter.java:46)
>  at com.google.inject.internal.InjectorImpl.callInContext(InjectorImpl.java:1031)
>  at com.google.inject.internal.ProviderToInternalFactoryAdapter.get(ProviderToInternalFactoryAdapter.java:40)
>  at com.google.inject.Scopes$1$1.get(Scopes.java:65)
>  at com.google.inject.internal.InternalFactoryToProviderAdapter.get(InternalFactoryToProviderAdapter.java:40)
>  at com.google.inject.internal.InjectorImpl$4$1.call(InjectorImpl.java:978)
>  at com.google.inject.internal.InjectorImpl.callInContext(InjectorImpl.java:1024)
>  at com.google.inject.internal.InjectorImpl$4.get(InjectorImpl.java:974)
>  ... 2 more
>21:20:44,055 INFO [main] Configuration:411 - Reading password from existing file
>21:20:44,107 INFO [main] Configuration:422 - API SSL Authentication is turned on.
>21:20:44,107 ERROR [main] Configuration:437 - There is no keystore for https UI connection.
>21:20:44,107 ERROR [main] Configuration:438 - Run "ambari-server setup-https" or set api.ssl = false.
>
>ambari-server setup-https is not a valid command. Also, it appears to
>eventually recover, as seen below:
>
>21:20:54,895 INFO [main] Configuration:411 - Reading password from existing file
>21:20:54,915 INFO [main] Configuration:422 - API SSL Authentication is turned on.
>21:20:54,915 ERROR [main] Configuration:437 - There is no keystore for https UI connection.
>21:20:54,915 ERROR [main] Configuration:438 - Run "ambari-server setup-https" or set api.ssl = false.
>21:21:23,930 INFO [main] Configuration:411 - Reading password from existing file
>21:21:23,950 INFO [main] Configuration:422 - API SSL Authentication is turned on.
>21:21:23,950 INFO [main] Configuration:427 - Reading password from existing file
>...
>
>21:21:35,586 INFO [main] CertificateManager:69 - Initialization of root certificate
>21:21:35,587 INFO [main] CertificateManager:71 - Certificate exists:false
>21:21:35,588 INFO [main] CertificateManager:138 - Generation of server certificate
>21:21:36,628 INFO [main] ShellCommandUtil:44 - Command openssl genrsa -des3 -passout pass:**** -out /var/lib/ambari-server/keys/ca.key 4096 was finished with exit code: 0 - the operation was completely successfully.
>21:21:36,653 INFO [main] ShellCommandUtil:44 - Command openssl req -passin pass:**** -new -key /var/lib/ambari-server/keys/ca.key -out /var/lib/ambari-server/keys/ca.csr -batch was finished with exit code: 0 - the operation was completely successfully.
>21:21:36,706 INFO [main] ShellCommandUtil:44 - Command openssl ca -create_serial -out /var/lib/ambari-server/keys/ca.crt -days 365 -keyfile /var/lib/ambari-server/keys/ca.key -key **** -selfsign -extensions jdk7_ca -config /var/lib/ambari-server/keys/ca.config -batch -infiles /var/lib/ambari-server/keys/ca.csr was finished with exit code: 0 - the operation was completely successfully.
>21:21:36,728 INFO [main] ShellCommandUtil:44 - Command openssl pkcs12 -export -in /var/lib/ambari-server/keys/ca.crt -inkey /var/lib/ambari-server/keys/ca.key -certfile /var/lib/ambari-server/keys/ca.crt -out /var/lib/ambari-server/keys/keystore.p12 -password pass:**** -passin pass:**** was finished with exit code: 0 - the operation was completely successfully.
>21:21:37,048 INFO [main] Configuration:487 - Credential provider creation failed. Reason: Master key initialization failed.
>
>
>So, it manages to create all the key/cert/ca stuff, but then fails.
>
>Any pointers are appreciated, but I'll keep digging tomorrow.
>
>Greg
>
>
>On 1/7/15 3:01 PM, "Greg Hill" <[email protected]> wrote:
>
>>During agent registration. They all fail to register because the ssl cert
>>validation fails and it can't connect to the ambari server.
>>
>>I should note that we *are not* using bootstrapping. We preinstall the
>>agents manually. Nothing has changed since it was working other than
>>updating to the latest CentOS and Ambari updates (still Ambari 1.7.0,
>>though, we're not using trunk or anything).
>>
>>Greg
>>
>>On 1/7/15 2:54 PM, "Erin Boyd" <[email protected]> wrote:
>>
>>>When do you get this error? During registration or some other time?
>>>
>>>Erin
>>>
>>>----- Original Message -----
>>>From: "Greg Hill" <[email protected]>
>>>To: "Erin Boyd" <[email protected]>, [email protected]
>>>Sent: Wednesday, January 7, 2015 1:52:03 PM
>>>Subject: Re: ssl changes recently?
>>>
>>>[root@ambari ~]# rpm -qa | grep openssl
>>>openssl-1.0.1e-30.el6_6.4.x86_64
>>>
>>>
>>>We apparently have an even newer version. Perhaps they broke something
>>>else more recently? We just spun up this image yesterday with the latest
>>>CentOS 6.5 stuff.
>>>
>>>Greg
>>>
>>>On 1/7/15 2:48 PM, "Erin Boyd" <[email protected]> wrote:
>>>
>>>>Hey Greg,
>>>>On RHEL 6.5 we got a similar error during agent registration.
>>>>Here is the workaround:
>>>>http://hortonworks.com/community/forums/topic/ambari-agent-registration-failure-on-rhel-6-5-due-to-openssl-2/
>>>>
>>>>Hope that helps,
>>>>Erin
>>>>
>>>>
>>>>----- Original Message -----
>>>>From: "Greg Hill" <[email protected]>
>>>>To: [email protected]
>>>>Sent: Wednesday, January 7, 2015 1:44:40 PM
>>>>Subject: ssl changes recently?
>>>>
>>>>I sent this to the wrong list earlier.
>>>>
>>>>I recently updated our Ambari 1.7.0 image and am now getting SSL errors
>>>>from the agents:
>>>>
>>>>INFO 2015-01-07 16:59:02,116 NetUtil.py:48 - Connecting to https://ambari.local:8440/ca
>>>>ERROR 2015-01-07 16:59:02,645 NetUtil.py:66 - [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:581)
>>>>ERROR 2015-01-07 16:59:02,646 NetUtil.py:67 - SSLError: Failed to connect. Please check openssl library versions.
>>>>Refer to: https://bugzilla.redhat.com/show_bug.cgi?id=1022468 for more details.
>>>>WARNING 2015-01-07 16:59:02,651 NetUtil.py:92 - Server at https://ambari.local:8440 is not reachable, sleeping for 10 seconds...
>>>>
>>>>We're just using the default SSL certs that Ambari creates for agent
>>>>communication. This worked up until we made this new image, which pull
>>>>in upstream CentOS system updates.
>>>>
>>>>Is it possible that some change in upstream has broken this for Ambari?
>>>>Is there a workaround?
>>>>
>>>>I have noticed that the "server_crt" (/var/lib/ambari-agent/keys/ca.crt)
>>>>does not exist on the hosts. Is this something I'm supposed to inject?
>>>>We weren't before, but it was working just fine without it.
>>>>
>>>>Greg
>>>>
>>>
>>
>
