Origin metrics ansible install failure

2018-02-28 Thread Feld, Michael (IMS)
Let me begin by saying there seems to be confusion regarding which branch of the 
openshift-ansible GitHub project you should use when deploying components of 
OpenShift Origin. The official documentation for installing the cluster says to 
use the master branch; however, I've found that the master branch never works, 
and you must use the specific release branch for the install to succeed.

With that said, it looks like the Ansible scripts to deploy metrics are no 
longer in the release branch (I'm trying release-3.6 at the moment) and exist 
only in the master branch. When I run the playbook from the master branch 
against a brand new 3.6.0 cluster, I get the following output from Ansible:

TASK [openshift_master : Ensure that Heapster has nodes to run on] 
**
fatal: [openshift-master]: FAILED! => {"failed": true, "msg": "Unexpected 
templating type error occurred on ({{  
__schedulable_nodes_matching_selector['results']['results'][0]['items'] | 
default([]) | length != 0 }}): object of type 'builtin_function_or_method' has 
no len()"}
to retry, use: --limit 
@/openshift-ansible/playbooks/openshift-metrics/config.retry
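
If I had to guess at the mechanism: when the query result in results[0] has no 
'items' key, Jinja2's subscript lookup falls back to the dict's built-in .items 
method, and piping that to length is exactly what produces the 
'builtin_function_or_method' error. A minimal playbook sketch that reproduces 
it (the variable name and values below are made up for illustration):

# reproduce_items_lookup.yml -- illustration only; 'query_result' is a stand-in for the real fact
- hosts: localhost
  gather_facts: false
  vars:
    # Simulates an oc query result that came back without an 'items' key
    query_result:
      kind: Status
      status: Failure
  tasks:
    - name: Failing pattern -- ['items'] falls back to the dict's .items() method
      debug:
        msg: "{{ query_result['items'] | default([]) | length != 0 }}"
      ignore_errors: true

    - name: Guarded pattern -- .get('items', []) always yields a list
      debug:
        msg: "{{ query_result.get('items', []) | length != 0 }}"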

Any ideas?





RE: How to use DNS hostname of OpenShift on AWS

2018-02-21 Thread Feld, Michael (IMS)
When deploying with https://github.com/openshift/openshift-ansible, you can 
define the hostnames in your inventory file. There is a sample inventory file at 
https://docs.openshift.org/latest/install_config/install/advanced_install.html 
that shows how to define the masters/etcd/nodes, and those names should be used 
as the hostnames in the cluster.
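
For example, an inventory fragment along these lines (hostnames below are 
placeholders; openshift_hostname and openshift_public_hostname are the host 
variables I would expect to set, per the advanced install docs):

# Inventory fragment -- hostnames are placeholders
[masters]
master1.example.com openshift_hostname=master1.example.com openshift_public_hostname=master1.example.com

[etcd]
master1.example.com

[nodes]
master1.example.com openshift_hostname=master1.example.com openshift_schedulable=false
node1.example.com openshift_hostname=node1.example.com openshift_public_hostname=node1.example.com
node2.example.com openshift_hostname=node2.example.com openshift_public_hostname=node2.example.com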

From: users-boun...@lists.openshift.redhat.com 
[mailto:users-boun...@lists.openshift.redhat.com] On Behalf Of Joel Pearson
Sent: Wednesday, February 21, 2018 7:14 AM
To: users 
Subject: How to use DNS hostname of OpenShift on AWS

Hi,

I'm trying to figure out how to use a DNS hostname when deploying OpenShift on 
AWS using 
https://github.com/openshift/openshift-ansible-contrib/tree/master/reference-architecture/aws-ansible
Currently it uses the private DNS name, e.g. 
ip-10-2-7-121.ap-southeast-2.compute.internal, but that isn't a very useful name 
for me. I've managed to set the hostname on the EC2 instance properly by 
disabling the relevant cloud-init setting, but it still picks up the private DNS 
name somehow.

I tried setting "openshift_hostname" to the same value as "name" on this line: 
https://github.com/openshift/openshift-ansible-contrib/blob/master/reference-architecture/aws-ansible/playbooks/roles/instance-groups/tasks/main.yaml#L11

That did set the hostname in node-config.yaml, but "oc get nodes" still 
returned the private DNS name, and the installation failed while waiting for 
the nodes to start properly; I guess there is a mismatch between node names 
somewhere.

I found an old GitHub issue, but it refers to files in openshift-ansible that 
no longer exist:
https://github.com/openshift/openshift-ansible/issues/1170

Even on OpenShift Online Starter they're using the default EC2 names, e.g. 
ip-172-31-28-11.ca-central-1.compute.internal, which I guess isn't a good sign.

Has anyone successfully used a DNS name for OpenShift on AWS?

Thanks,

Joel





RE: ansible failing etcd v2 > v3 migration

2018-02-13 Thread Feld, Michael (IMS)
Yes, /var/lib/etcd is the correct etcd data directory. This file exists: 
/var/lib/etcd/member/snap/db
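
One thing I'm going to try next, purely a guess based on etcd's snapshot 
behaviour rather than anything in the playbook docs: lowering etcd's snapshot 
threshold so the member writes a v2 *.snap file sooner (it still needs some 
write activity to cross the threshold), then rerunning the migration. 
ETCD_SNAPSHOT_COUNT and /etc/etcd/etcd.conf are assumptions about the 3.6 
layout:

# lower_snapshot_count.yml -- sketch of a possible workaround; config path, variable, and group name are assumptions
- hosts: etcd
  become: true
  tasks:
    - name: Lower the snapshot threshold so a *.snap file is written after fewer writes
      lineinfile:
        path: /etc/etcd/etcd.conf
        regexp: '^#?ETCD_SNAPSHOT_COUNT='
        line: 'ETCD_SNAPSHOT_COUNT=100'

    - name: Restart etcd to pick up the new setting
      systemd:
        name: etcd
        state: restarted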

From: Clayton Coleman [mailto:ccole...@redhat.com]
Sent: Tuesday, February 13, 2018 5:17 PM
To: Feld, Michael (IMS) <fe...@imsweb.com>
Cc: users@lists.openshift.redhat.com
Subject: Re: ansible failing etcd v2 > v3 migration

Is /var/lib/etcd your etcd data directory? I.e., is there anything in that folder?

On Feb 13, 2018, at 4:50 PM, Feld, Michael (IMS) 
<fe...@imsweb.com<mailto:fe...@imsweb.com>> wrote:
Hi all,

I am trying to use the Ansible playbook to migrate etcd from v2 to v3 on a 
3.6.0 Origin cluster, and it keeps failing with the following:

fatal: [openshift-master]: FAILED! => {"changed": false, "failed": true, "msg": 
"Before the migration can proceed the etcd member must write down at least one 
snapshot under /var/lib/etcd//member/snap directory."}

I have tried the migrate.yml playbook from both the release-3.6 branch of 
openshift-ansible and the master branch; both give the same result. From the 
Ansible output, it doesn't even look like a snapshot is being taken. What am I 
doing wrong?

Mike





ansible failing etcd v2 > v3 migration

2018-02-13 Thread Feld, Michael (IMS)
Hi all,

I am trying to use the Ansible playbook to migrate etcd from v2 to v3 on a 
3.6.0 Origin cluster, and it keeps failing with the following:

fatal: [openshift-master]: FAILED! => {"changed": false, "failed": true, "msg": 
"Before the migration can proceed the etcd member must write down at least one 
snapshot under /var/lib/etcd//member/snap directory."}

I have tried the migrate.yml playbook from both the release-3.6 branch of 
openshift-ansible and the master branch; both give the same result. From the 
Ansible output, it doesn't even look like a snapshot is being taken. What am I 
doing wrong?

Mike
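
P.S. To see whether any v2 snapshot files exist at all (the pre-check message 
points at the member/snap directory), a quick check across the etcd hosts; the 
paths are taken from the error message and the group name is an assumption:

# check_etcd_snap.yml -- quick look for v2 *.snap files; 'etcd' group name is an assumption
- hosts: etcd
  become: true
  tasks:
    - name: Look for v2 snapshot files under the etcd data dir
      find:
        paths: /var/lib/etcd/member/snap
        patterns: '*.snap'
      register: snap_files

    - name: Report what was found
      debug:
        msg: "Found {{ snap_files.files | length }} *.snap file(s) on {{ inventory_hostname }}"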





Pod persistence without replication controller

2018-01-09 Thread Feld, Michael (IMS)
Does anyone know why a standalone pod (one without a replication controller) 
sometimes persists through a host/node reboot but not every time (without 
evacuating the node first)? We have a database pod that we cannot risk scaling, 
and we want to ensure that it is always running.
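
For reference, this is roughly the shape of what we would wrap it in if a 
controller pinned to a single replica turns out to be the right answer (the 
name, image, and claim below are placeholders):

# db-rc.yml -- sketch of a single-replica ReplicationController; names and image are placeholders
apiVersion: v1
kind: ReplicationController
metadata:
  name: database
spec:
  replicas: 1              # exactly one pod; the RC recreates it if it is ever deleted or evicted
  selector:
    app: database
  template:
    metadata:
      labels:
        app: database
    spec:
      containers:
      - name: database
        image: registry.example.com/database:latest
        volumeMounts:
        - name: data
          mountPath: /var/lib/database
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: database-data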





router certificate question

2017-12-06 Thread Feld, Michael (IMS)
Hey all,

I have a cluster where we use an external HAProxy to terminate SSL and send 
traffic to the routers in the OpenShift cluster, so the routes within the 
cluster do not use TLS. It looks like when this cluster was set up, default 
certificates were given to the routers, and those certificates are expiring 
soon (I get this when running the Ansible easy-mode.yaml playbook):

"router": [
{
  "cert_cn": "OU=Domain Control Validated:, CN=*..com:, 
DNS:*. .com, DNS: .com",
  "days_remaining": 11,
  "expiry": "2017-12-17 20:13:24",
  "health": "warning",
  "path": "/api/v1/namespaces/default/secrets/router-certs",
  "serial": ,
  "serial_hex": ""
}
  ]

My question is: is it OK to let this certificate expire without taking any 
action? And how can I safely remove the default certificates so the warnings 
go away in the future?
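
If replacing them rather than just removing them turns out to be the safer 
route, I assume the inventory variable to set is something like the following 
(file paths are placeholders), followed by a router certificate redeploy:

# Inventory fragment -- supply our own router wildcard certificate; file paths are placeholders
[OSEv3:vars]
openshift_hosted_router_certificate={"certfile": "/path/to/router-wildcard.crt", "keyfile": "/path/to/router-wildcard.key", "cafile": "/path/to/ca.crt"}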

Thanks
Mike


