----- Original Message ----- > From: "Clayton Coleman" <[email protected]> > To: "Alejandro Nieto Boza" <[email protected]>, [email protected] > Cc: "users" <[email protected]> > Sent: Wednesday, February 10, 2016 11:56:14 AM > Subject: Re: Enabling Cluster Metrics > > I don't know what unconfigured table means (beyond maybe your tables need > to be recreated because you have an old version) but I bet Matt does. > > On Feb 10, 2016, at 10:50 AM, Alejandro Nieto Boza <[email protected]> > wrote: > > Thanks, the Openshift DNS wasn't running correctly. Now the error doesn't > appear but... > > Now I've an error (this error have already appears to me in other > scenarios). > > This is the state of my metrics pods: > > # oc get pods > NAME READY STATUS RESTARTS AGE > hawkular-cassandra-1-j09f6 1/1 Running 0 10m > hawkular-metrics-xpa33 0/1 Error 1 10m > heapster-42vyz 0/1 Error 2 10m > metrics-deployer-e5e3v 0/1 Completed 0 12m > > > # oc get pods > NAME READY STATUS RESTARTS AGE > hawkular-cassandra-1-j09f6 1/1 Running 0 12m > hawkular-metrics-xpa33 0/1 Completed 2 12m > heapster-42vyz 0/1 CrashLoopBackOff 4 12m > metrics-deployer-e5e3v 0/1 Completed 0 15m > > > The pod hawkular-metrics change its state between completed and error (?) 
> > > These are some logs of hawkular-metrics pod: > > # oc logs hawkular-metrics-xpa33 > 15:22:08,104 ERROR [org.jboss.msc.service.fail] (MSC service thread 1-1) > MSC000001: Failed to start service > jboss.deployment.unit."hawkular-metrics-api-jaxrs.war": > org.jboss.msc.service.StartException in service > jboss.deployment.unit."hawkular-metrics-api-jaxrs.war": Failed to start > service > at > org.jboss.msc.service.ServiceControllerImpl$StartTask.run(ServiceControllerImpl.java:1904) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.IllegalStateException: Container is down > ............... > > 15:22:08,211 ERROR [org.jboss.msc.service.fail] (MSC service thread 1-1) > MSC000001: Failed to start service > jboss.serverManagement.controller.management.http: > org.jboss.msc.service.StartException in service > jboss.serverManagement.controller.management.http: Failed to start service > at > org.jboss.msc.service.ServiceControllerImpl$StartTask.run(ServiceControllerImpl.java:1904) > ............... > > > 15:29:35,416 FATAL [org.hawkular.metrics.api.jaxrs.MetricsServiceLifecycle] > (metricsservice-lifecycle-thread) HAWKMETRICS200006: An error occurred > trying to connect to the Cassandra cluster: > com.datastax.driver.core.exceptions.InvalidQueryException: unconfigured > table retentions_idx > at > com.datastax.driver.core.exceptions.InvalidQueryException.copy(InvalidQueryException.java:35) > .................
I have not seen this exact issue before, but it resembles a problem I have seen when switching from origin-metrics to the OSE metrics images: the version of Hawkular Metrics in origin-metrics uses a different schema than the one in the OSE images. Are you running this without persistent storage? If you are using persistent storage, was it previously used with a different version of Hawkular Metrics?

> > > And obviously heapster cannot connect to hawkular-metrics: > > # oc logs heapster-42vyz > Could not connect to https://hawkular-metrics:443/hawkular/metrics/status. > Curl exit code: 7. Status Code 000 > 'https://hawkular-metrics:443/hawkular/metrics/status' is not accessible > [HTTP status code: 000. Curl exit code 7]. Retrying. > > > hawkular-cassandra logs don't show errors. > > > > 2016-02-10 14:50 GMT+01:00 Clayton Coleman <[email protected]>: > > > Can you try from one of your nodes to reach the nameserver directly and > > via the proxy? > > > > dig @<your master ip> kubernetes.default.svc.cluster.local > > dig @172.30.0.1 kubernetes.default.svc.cluster.local > > > > > > > > On Feb 10, 2016, at 8:40 AM, Alejandro Nieto Boza <[email protected]> > > wrote: > > > > It's like you said. 
> > > > Test logs: > > # oc logs test > > % Total % Received % Xferd Average Speed Time Time Time > > Current > > Dload Upload Total Spent Left > > Speed > > 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- > > 0curl: (6) Could not resolve host: kubernetes; Unknown error > > > > > > > > > > Test2 logs: > > # oc logs test2 > > nameserver "172.30.0.1" > > nameserver "another-ip" > > > > > > > > > > # oc get svc/kubernetes -n default > > NAME CLUSTER_IP EXTERNAL_IP PORT(S) SELECTOR > > AGE > > kubernetes "172.30.0.1" <none> 443/TCP,53/UDP,53/TCP <none> > > 92d > > search test.svc.cluster.local svc.cluster.local cluster.local test.es > > options ndots:5 > > > > > > > > > > > > > > 2016-02-10 14:01 GMT+01:00 Clayton Coleman <[email protected]>: > > > >> That seems to indicate that inside the deployment container DNS is not > >> working. Can you do the following to check: > >> > >> oc run --image centos:7 test --generator=run-pod/v1 --restart=Never > >> -- curl https://kubernetes > >> oc logs test > >> > >> And then > >> > >> oc run --image centos:7 test2 --generator=run-pod/v1 --restart=Never > >> -- cat /etc/resolv.conf > >> oc logs test2 > >> > >> The latter should have a nameserver pointing to the master by its service > >> IP - the command: > >> > >> oc get svc/kubernetes -n default > >> > >> Should show that same IP > >> > >> On Feb 10, 2016, at 7:39 AM, Alejandro Nieto Boza <[email protected]> > >> wrote: > >> > >> Hi, > >> > >> I've been following the following steps to deploy metrics: > >> > >> https://docs.openshift.org/latest/install_config/cluster_metrics.html > >> > >> When I run the following command: > >> > >> > >> oc process -f metrics.yaml -v \ > >> HAWKULAR_METRICS_HOSTNAME=hawkular-metrics.example.com,USE_PERSISTENT_STORAGE=false > >> \ > >> | oc create -f - > >> > >> > >> I get the following error: > >> > >> Creating the Cassandra Certificate Secrets configuration json file > >> +++ base64 > >> ++++ echo hawkular-cassandra > >> +++ base64 -w 0 
/etc/deploy/_output/hawkular-cassandra.truststore > >> +++ base64 > >> ++++ echo RjR--747mUzmTS- > >> +++ base64 -w 0 /etc/deploy/_output/hawkular-cassandra.pem > >> ++ echo > >> ++ echo 'Creating the Cassandra Certificate Secrets configuration json > >> file' > >> ++ cat > >> +++ base64 -w 0 /etc/deploy/_output/hawkular-cassandra.cert > >> +++ base64 -w 0 /etc/deploy/_output/hawkular-cassandra-ca.cert > >> Creating Hawkular Metrics & Cassandra Secrets > >> ++ echo 'Creating Hawkular Metrics & Cassandra Secrets' > >> ++ oc create -f /etc/deploy/_output/hawkular-metrics-secrets.json > >> unable to connect to a server to handle "secrets": Get > >> https://kubernetes.default.svc:443/api: dial tcp: lookup > >> kubernetes.default.svc: no such host > >> > >> > >> > >> > >> # oc get pods > >> NAME READY STATUS RESTARTS AGE > >> metrics-deployer-7gcpd 0/1 Error 0 39m > >> > >> > >> How can I know if my kubernetes master URL is > >> https://kubernetes.default.svc:443 or is another URL? > >> > >> My Openshift installation isn't an update. > >> > >> _______________________________________________ > >> users mailing list > >> [email protected] > >> http://lists.openshift.redhat.com/openshiftmm/listinfo/users > >> > >> > > >
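
The DNS check suggested earlier in the thread (the pod's /etc/resolv.conf should list a nameserver equal to the cluster IP shown by `oc get svc/kubernetes -n default`) could be scripted roughly as below. This is only a sketch: the helper names `first_nameserver` and `nameserver_matches` are hypothetical and not part of `oc`, and stripping double quotes around the address is an assumption based on how the IP appears in the pasted output above.

```shell
#!/usr/bin/env bash
# Sketch only: these helpers are illustrative, not part of oc or OpenShift.

# Print the first nameserver found in resolv.conf-style text read from stdin.
# Double quotes around the address are stripped (assumption: they are paste
# artifacts, as in the test2 output quoted in this thread).
first_nameserver() {
  awk '$1 == "nameserver" { gsub(/"/, "", $2); print $2; exit }'
}

# Succeed when the first nameserver on stdin equals the expected service IP.
#   $1    - cluster IP of svc/kubernetes (e.g. taken from `oc get svc/kubernetes -n default`)
#   stdin - contents of the pod's /etc/resolv.conf
nameserver_matches() {
  local service_ip="$1"
  [ "$(first_nameserver)" = "$service_ip" ]
}

# Possible usage against the test pod from the thread:
#   oc logs test2 | nameserver_matches 172.30.0.1 && echo "pod DNS points at the master service IP"
```

A match only shows that the pod is configured to use the master as its nameserver; whether the nameserver actually answers still has to be checked separately, for example with the `dig` commands quoted above.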
