Re: [openstack-dev] [magnum] supported OS images and magnum spawn failures for Swarm and Kubernetes

2018-08-23 Thread Tobias Urdin

Now with Fedora 26 etcd is available, but it fails to start:

[root@swarm-u2rnie4d4ik6-master-0 ~]# /usr/bin/etcd --name="${ETCD_NAME}" \
    --data-dir="${ETCD_DATA_DIR}" \
    --listen-client-urls="${ETCD_LISTEN_CLIENT_URLS}" --debug
2018-08-23 14:34:15.596516 E | etcdmain: error verifying flags, --advertise-client-urls is required when --listen-client-urls is set explicitly. See 'etcd --help'.
2018-08-23 14:34:15.596611 E | etcdmain: When listening on specific address(es), this etcd process must advertise accessible url(s) to each connected client.


The issue is that --advertise-client-urls and the TLS options --cert-file 
and --key-file are not passed in the systemd unit file. Changing the 
command to:

/usr/bin/etcd --name="${ETCD_NAME}" --data-dir="${ETCD_DATA_DIR}" \
    --listen-client-urls="${ETCD_LISTEN_CLIENT_URLS}" \
    --advertise-client-urls="${ETCD_ADVERTISE_CLIENT_URLS}" \
    --cert-file="${ETCD_PEER_CERT_FILE}" --key-file="${ETCD_PEER_KEY_FILE}"

makes it work. Any thoughts?
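
One way to try this change without editing the packaged unit file is a systemd drop-in. The following is only a sketch under assumptions: the helper name write_etcd_dropin and the file name 10-client-urls.conf are made up for illustration, the unit is assumed to be called etcd.service, and the ETCD_* variables are assumed to be supplied already by the unit's EnvironmentFile. The flags are the ones from the command above; the inner double quotes are dropped because systemd expands ${VAR} inside a word on its own.

```shell
#!/bin/sh
# Sketch: write a systemd drop-in that overrides ExecStart for etcd.service.
# write_etcd_dropin and 10-client-urls.conf are hypothetical names.
write_etcd_dropin() {
    dropin_dir="${1:-/etc/systemd/system/etcd.service.d}"
    mkdir -p "$dropin_dir" || return 1
    # Quoted heredoc delimiter: ${...} is written literally and left
    # for systemd to expand from the unit's environment.
    cat > "$dropin_dir/10-client-urls.conf" <<'EOF'
[Service]
# An empty ExecStart= clears the packaged command before the new one is set.
ExecStart=
ExecStart=/usr/bin/etcd --name=${ETCD_NAME} \
    --data-dir=${ETCD_DATA_DIR} \
    --listen-client-urls=${ETCD_LISTEN_CLIENT_URLS} \
    --advertise-client-urls=${ETCD_ADVERTISE_CLIENT_URLS} \
    --cert-file=${ETCD_PEER_CERT_FILE} \
    --key-file=${ETCD_PEER_KEY_FILE}
EOF
}
```

Calling write_etcd_dropin with no argument writes below /etc/systemd/system/etcd.service.d; systemd only picks the file up after "systemctl daemon-reload" and a restart of etcd.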

Best regards
Tobias

Re: [openstack-dev] [magnum] supported OS images and magnum spawn failures for Swarm and Kubernetes

2018-08-23 Thread Tobias Urdin
Found the issue. I assume I have to use Fedora Atomic 26 until Rocky, 
at which point I can start using Fedora Atomic 27.

Will Fedora Atomic 28 be supported for Rocky?

https://bugs.launchpad.net/magnum/+bug/1735381 (Run etcd and flanneld in 
system containers, In Fedora Atomic 27 etcd and flanneld are removed 
from the base image.)
https://review.openstack.org/#/c/524116/ (Run etcd and flanneld in a 
system container)


I'm still wondering about the "The Parameter (nodes_affinity_policy) was 
not provided" error when using Mesos + Ubuntu.


Best regards
Tobias

Re: [openstack-dev] [magnum] supported OS images and magnum spawn failures for Swarm and Kubernetes

2018-08-23 Thread Tobias Urdin

Thanks for all of your help, everyone.

I've been busy with other things but was able to pick up where I left 
off with Magnum.
After fixing some issues I have been able to provision a working 
Kubernetes cluster.

I'm still having issues getting Docker Swarm working. I've tried both 
Docker and flannel as the networking layer, but neither works. After 
investigating, the issue seems to be that etcd.service is not installed 
(the unit file doesn't exist), so the master doesn't work. The Swarm 
minion node is provisioned but cannot join the cluster because there is 
no etcd.

Has anybody seen this issue before? I've been digging through all the 
cloud-init logs and cannot see anything that would cause this.

I also have a separate issue: when provisioning using the magnum-ui in 
Horizon and selecting Ubuntu with Mesos, I get the error 
"The Parameter (nodes_affinity_policy) was not provided". 
nodes_affinity_policy does have a default value in magnum.conf, so I'm 
starting to think this might be an issue with the magnum-ui dashboard?

Best regards
Tobias

Re: [openstack-dev] [magnum] supported OS images and magnum spawn failures for Swarm and Kubernetes

2018-08-04 Thread Joe Topjian
We recently deployed Magnum and I've been making my way through getting
both Swarm and Kubernetes running. I also ran into some initial issues.
These notes may or may not help, but thought I'd share them in case:

* We're using Barbican for SSL. I have not tried with the internal
x509keypair.

* I was only able to get things running with Fedora Atomic 27, specifically
the version used in the Magnum docs:
https://docs.openstack.org/magnum/latest/install/launch-instance.html

Anything beyond that wouldn't even boot in my cloud. I haven't dug into
this.

* Kubernetes requires a Cluster Template to have a label of
cert_manager_api=true set in order for the cluster to fully come up (at
least, it didn't work for me until I set this).
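
For reference, setting that label when creating a Cluster Template might look like the sketch below. Only the label itself comes from the note above; the template name, image name and external network are placeholders, and the hypothetical helper template_cmd merely prints the command rather than running it.

```shell
#!/bin/sh
# Sketch: print (not run) a cluster template command that sets the label.
# Everything except cert_manager_api=true is a placeholder value.
template_cmd() {
    name="${1:-k8s-template}"
    printf 'openstack coe cluster template create %s --coe kubernetes --image fedora-atomic-27 --external-network public --labels cert_manager_api=true\n' "$name"
}

# Show the command for the default placeholder name.
template_cmd
```

Printing first makes it easy to review the arguments against the local cloud before actually creating anything.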

As far as troubleshooting methods go, check the cloud-init logs on the
individual instances to see if any of the "parts" have failed to run.
Manually re-run the parts on the command-line to get a better idea of why
they failed. Review the actual script, figure out the variable
interpolation and how it relates to the Cluster Template being used.
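
A first pass over the cloud-init output can be scripted. This is a rough sketch, not a definitive recipe: scan_cloud_init_log is a made-up helper name, the default path assumes the usual /var/log/cloud-init-output.log location, and the list of patterns is a guess at what a failed part tends to print.

```shell
#!/bin/sh
# Sketch: list lines in a cloud-init log that look like failures.
# scan_cloud_init_log is a hypothetical helper; the pattern list is a guess.
scan_cloud_init_log() {
    log="${1:-/var/log/cloud-init-output.log}"
    [ -r "$log" ] || { echo "cannot read $log" >&2; return 2; }
    # -n prints line numbers, -i ignores case, -E enables alternation.
    grep -niE 'fail|error|traceback|no such file' "$log"
}
```

Once a failing part is identified, the script itself can typically be found under /var/lib/cloud/instance/scripts/ and re-run by hand with "sh -x" to see where it stops.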

Eventually I was able to get clusters running with the stock
driver/templates, but wanted to tune them in order to better fit in our
cloud, so I've "forked" them. This is in no way a slight against the
existing drivers/templates nor do I recommend doing this until you reach a
point where the stock drivers won't meet your needs. But I mention it
because it's possible to do and it's not terribly hard. This is still a
work-in-progress and a bit hacky:

https://github.com/cybera/magnum-templates

Hope that helps,
Joe

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [magnum] supported OS images and magnum spawn failures for Swarm and Kubernetes

2018-08-03 Thread Bogdan Katynski

> On 3 Aug 2018, at 13:46, Tobias Urdin  wrote:
> 
> Kubernetes:
> * Master etcd does not start because /run/etcd does not exist

This could be an issue with the etcd rpm. With systemd, /run is an in-memory 
tmpfs and is wiped on reboot.

We’ve come across a similar issue in the mariadb rpm on CentOS 7: 
https://bugzilla.redhat.com/show_bug.cgi?id=1538066

If the etcd rpm only creates /run/etcd during installation, that directory will 
not survive a reboot. The rpm should also drop a file in 
/usr/lib/tmpfiles.d/etcd.conf with contents similar to

d /run/etcd 0755 etcd etcd - -
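
Until the package ships such a file, the same entry can be added locally. A minimal sketch, assuming local entries go in /etc/tmpfiles.d (the usual place for administrator-provided configuration) and that the etcd user and group exist; write_etcd_tmpfiles is a made-up helper name:

```shell
#!/bin/sh
# Sketch: write a tmpfiles.d entry so /run/etcd is recreated at every boot.
# write_etcd_tmpfiles is a hypothetical helper name.
write_etcd_tmpfiles() {
    conf_dir="${1:-/etc/tmpfiles.d}"
    mkdir -p "$conf_dir" || return 1
    # Fields: type, path, mode, user, group, age, argument.
    printf 'd /run/etcd 0755 etcd etcd - -\n' > "$conf_dir/etcd.conf"
}
```

Running "systemd-tmpfiles --create /etc/tmpfiles.d/etcd.conf" afterwards creates the directory immediately, without waiting for a reboot.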


--
Bogdan Katyński
freenode: bodgix









[openstack-dev] [magnum] supported OS images and magnum spawn failures for Swarm and Kubernetes

2018-08-03 Thread Tobias Urdin

Hello,

I'm testing around with Magnum and have so far only had issues.
I've tried deploying Docker Swarm (on Fedora Atomic 27, Fedora Atomic 
28) and Kubernetes (on Fedora Atomic 27) and haven't been able to get it 
working.


I'm running Queens. Is there any information about supported images? Is 
Magnum still maintained to support Fedora Atomic?
What is in charge of populating the certificates inside the instances? 
This seems to be the root of all the issues. I'm not using Barbican but 
the x509keypair driver; is that the reason?

Perhaps I missed some documentation saying that x509keypair does not 
support what I'm trying to do?


I've seen the following issues:

Docker:
* Master does not start and listen on TCP because of certificate issues
dockerd-current[1909]: Could not load X509 key pair (cert: 
"/etc/docker/server.crt", key: "/etc/docker/server.key")


* Node does not start with:
Dependency failed for Docker Application Container Engine.
docker.service: Job docker.service/start failed with result 'dependency'.

Kubernetes:
* Master etcd does not start because /run/etcd does not exist
** When that is created it fails to start because of certificate
2018-08-03 12:41:16.554257 C | etcdmain: open 
/etc/etcd/certs/server.crt: no such file or directory


* Master kube-apiserver does not start because of certificate
unable to load server certificate: open 
/etc/kubernetes/certs/server.crt: no such file or directory


* Master heat script just sleeps forever waiting for port 8080 to become 
available (kube-apiserver) so it can never kubectl apply the final steps.


* Node does not even start and times out when Heat deploys it, probably 
because master never finishes
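
Since most of the failures above come down to a certificate file not being there, a quick check on an instance can narrow things down. This is only a sketch: missing_certs is a hypothetical helper, the paths are the ones from the error messages above, and not every path applies to every node type (the Docker paths belong to Swarm nodes, the etcd and Kubernetes paths to a Kubernetes master).

```shell
#!/bin/sh
# Sketch: report which certificate files named in the errors are missing.
# missing_certs is a hypothetical helper; the optional first argument is a
# path prefix, e.g. for inspecting a mounted image instead of the live system.
missing_certs() {
    root="${1:-}"
    status=0
    for f in /etc/docker/server.crt /etc/docker/server.key \
             /etc/etcd/certs/server.crt /etc/kubernetes/certs/server.crt; do
        # -s is true only for files that exist and are not empty.
        if [ ! -s "$root$f" ]; then
            echo "missing or empty: $f"
            status=1
        fi
    done
    return "$status"
}
```

The function returns non-zero when anything is missing, so it can also be used as a guard in a larger debugging script.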


Any help is appreciated; perhaps I've missed something crucial. I've not 
tested Kubernetes on CoreOS yet.


Best regards
Tobias
