OpenShift 3.11 Upgrade pre-checks fail

2019-11-12 Thread Shane Ripley
Greetings, I'm attempting to perform an in-place upgrade of a 9-node OKD 3.11 
cluster and have run into the Docker version issue below: 

$ ansible-playbook -i inventory.ini \
    openshift-ansible/playbooks/byo/openshift-cluster/upgrades/v3_11/upgrade.yml 


TASK [container_runtime : Required docker version not available (non-atomic)] 
***
 
fatal: [lmaster1]: FAILED! => {"changed": false, "msg": "This playbook requires 
access to Docker 1.13 or later"} 
fatal: [lmaster2]: FAILED! => {"changed": false, "msg": "This playbook requires 
access to Docker 1.13 or later"} 
fatal: [master3]: FAILED! => {"changed": false, "msg": "This playbook requires 
access to Docker 1.13 or later"} 
to retry, use: --limit 
@/home/origin/openshift-ansible/playbooks/byo/openshift-cluster/upgrades/v3_11/upgrade.retry
 

PLAY RECAP 
**
 
master1 : ok=158 changed=10 unreachable=0 failed=1 
master2 : ok=132 changed=10 unreachable=0 failed=1 
master3 : ok=132 changed=10 unreachable=0 failed=1 
localhost : ok=28 changed=0 unreachable=0 failed=0 


INSTALLER STATUS 

 
Initialization : Complete (0:02:34) 

Failure summary: 

1. Hosts: master1,master2,master3 
Play: Verify docker upgrade targets 
Task: Required docker version not available (non-atomic) 
Message: This playbook requires access to Docker 1.13 or later 

$ oc version 
oc v3.11.0+62803d0-1 
kubernetes v1.11.0+d4cacc0 
features: Basic-Auth GSSAPI Kerberos SPNEGO 

Server https://console-ext.okd.local.domain:443 
openshift v3.11.0+d0c29df-98 
kubernetes v1.11.0+d4cacc0 

I'm using the same inventory file I used to build the cluster initially and 
docker 1.13 is installed. 

$ rpm -qa | grep docker | sort 
cockpit-docker-176-2.el7.centos.x86_64 
docker-1.13.1-75.git8633870.el7.centos.x86_64 
docker-client-1.13.1-75.git8633870.el7.centos.x86_64 
docker-common-1.13.1-75.git8633870.el7.centos.x86_64 
origin-docker-excluder-3.11.0-1.el7.git.0.62803d0.noarch 
python-docker-py-1.10.6-4.el7.noarch 
python-docker-pycreds-1.10.6-4.el7.noarch 

I'm not really sure what to check for troubleshooting since docker 1.13 is 
already installed on all of the nodes. 
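
One way to narrow this down is to compare each version string a check might see (the installed RPM, and whatever the enabled yum repositories offer) against the 1.13 minimum. The sketch below is only an illustration in Python: the helper names are hypothetical, and the idea that the pre-check may look at the repository version rather than the installed one is an assumption, not something confirmed in this thread.

```python
import re

# Minimum version named in the playbook's failure message.
REQUIRED = (1, 13)


def parse_version(text):
    """Return the leading dotted-number part of a version string as a tuple of ints.

    Unparsable or empty input yields an empty tuple.
    """
    match = re.match(r"\d+(?:\.\d+)*", text.strip())
    if match is None:
        return ()
    return tuple(int(part) for part in match.group(0).split("."))


def meets_requirement(version_text, required=REQUIRED):
    """True only when a version was found and it is at least `required`."""
    parsed = parse_version(version_text)
    return bool(parsed) and parsed >= required


if __name__ == "__main__":
    # Sample inputs: the installed RPM version from the listing above, an older
    # release, and an empty string standing in for "no version reported".
    for candidate in ("1.13.1-75.git8633870.el7.centos", "1.12.6", ""):
        print(repr(candidate), meets_requirement(candidate))
```

An empty result is treated as a failure in this sketch, so a host where no Docker package is visible to the query would be rejected even though 1.13 is installed.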




___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


aggregationRule failed on OC 3.11 but succeeded on OC 4.1

2019-11-12 Thread Weiqiang Zhuang
I am trying to create a ClusterRole using the following definition:
 
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kubeflow-istio-admin
  labels:
    rbac.authorization.kubeflow.org/aggregate-to-kubeflow-admin: "true"
aggregationRule:
  clusterRoleSelectors:
  - matchLabels:
      rbac.authorization.kubeflow.org/aggregate-to-kubeflow-istio-admin: "true"
rules: []
 
This failed on an OC 3.11 cluster with:
 

Error from server (Forbidden): error when creating "role.yaml": clusterroles.rbac.authorization.k8s.io "kubeflow-istio-admin" is forbidden: must have cluster-admin privileges to use the aggregationRule

 
But the same definition succeeded on OC 4.1.
 
I searched for an explanation of the different behavior, but in vain. Does anyone here know what the reason could be?
 
Thanks.
 
Weiqiang Zhuang
IBM CODAIT
IBM Silicon Valley Lab
Tel: 408-463-5992
T/L: 25435992
Email: wzhu...@us.ibm.com
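
For readers unfamiliar with aggregationRule, the following Python sketch models what the definition above asks for: the rules of every ClusterRole whose labels match one of the clusterRoleSelectors are unioned into the aggregated role. It is a simplified illustration, not the Kubernetes controller code. It handles only matchLabels, the `kubeflow-istio-view` role in the example is hypothetical, and it does not explain the difference between 3.11 and 4.1.

```python
def selector_matches(selector, labels):
    """True when every matchLabels pair is present in `labels`."""
    wanted = selector.get("matchLabels", {})
    return all(labels.get(key) == value for key, value in wanted.items())


def aggregate_rules(aggregated_role, all_roles):
    """Union the rules of every role selected by the aggregationRule."""
    selectors = aggregated_role.get("aggregationRule", {}).get("clusterRoleSelectors", [])
    own_name = aggregated_role["metadata"]["name"]
    rules = []
    for role in all_roles:
        if role["metadata"]["name"] == own_name:
            continue  # the aggregated role does not feed itself
        labels = role["metadata"].get("labels", {})
        if any(selector_matches(selector, labels) for selector in selectors):
            for rule in role.get("rules", []):
                if rule not in rules:
                    rules.append(rule)
    return rules


if __name__ == "__main__":
    label = "rbac.authorization.kubeflow.org/aggregate-to-kubeflow-istio-admin"
    admin = {
        "metadata": {"name": "kubeflow-istio-admin"},
        "aggregationRule": {"clusterRoleSelectors": [{"matchLabels": {label: "true"}}]},
        "rules": [],
    }
    # Hypothetical role carrying the label that the selector looks for.
    view = {
        "metadata": {"name": "kubeflow-istio-view", "labels": {label: "true"}},
        "rules": [{"apiGroups": ["networking.istio.io"], "resources": ["*"],
                   "verbs": ["get", "list", "watch"]}],
    }
    print(aggregate_rules(admin, [admin, view]))
```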



Re: How to use extra trusted CA certs when pulling images for a builder

2019-11-12 Thread Ben Parees
On Mon, Nov 11, 2019 at 11:27 PM Ben Parees  wrote:

>
>
> On Mon, Nov 11, 2019 at 10:47 PM Joel Pearson <
> japear...@agiledigital.com.au> wrote:
>
>>
>>
>> On Tue, 12 Nov 2019 at 06:56, Ben Parees  wrote:
>>
>>>
>>>

>>>> Can I use the “trustedCA” part of the proxy configuration without
>>>> actually specifying an explicit proxy?

>>>
>>> you should be able to.  Daneyon can you confirm?  (if you can't i'd
>>> consider it a bug).
>>>
>> It does work! Thanks for that. user-ca-bundle already existed and had my
>> certificate in there, I just needed to reference user-ca-bundle in the
>> proxy config.
>>
>
> cool, given that you supplied the CAs during install, and the
> user-ca-bundle CM was created, i'm a little surprised the install didn't
> automatically setup the reference in the proxyconfig resource for you.  I'm
> guessing it did not because there was no actual proxy hostname configured.
> I think that's a gap we should close..would you mind filing a bug?  (
> bugzilla.redhat.com).  You can submit it against the install component.
>

fyi I've filed a bug for this aspect of the issues you ran into:
https://bugzilla.redhat.com/show_bug.cgi?id=1771564

we still need to chase down the issues you hit with respect to the various
CAs (the cluster proxy CA config and the image CA config) seemingly not
being used during image import, there are no tracker bugs for those yet but
Oleg is investigating.



>
>
>
>>
>> apiVersion: config.openshift.io/v1
>> kind: Proxy
>> metadata:
>>   name: cluster
>> spec:
>>   trustedCA:
>>     name: user-ca-bundle
>>
>
>
> --
> Ben Parees | OpenShift
>
>

-- 
Ben Parees | OpenShift


Re: How to use extra trusted CA certs when pulling images for a builder

2019-11-12 Thread Gabe Montero
On Mon, Nov 11, 2019 at 11:27 PM Joel Pearson 
wrote:

> I've now discovered that the cluster-samples-operator doesn't seem to honour
> the proxy settings, and I see lots of errors in the
> cluster-samples-operator- pod logs
>
> time="2019-11-12T04:15:49Z" level=warning msg="Image import for
> imagestream dotnet tag 2.1 generation 2 failed with detailed message
> Internal error occurred: Get https://registry.redhat.io/v2/: x509:
> certificate signed by unknown authority"
>
> Is there a way to get that operator to use the same user-ca-bundle?
>

Samples operator just reports the status of the sample imagestreams.  It
does not actually execute the imagestream import, and thus is not the
controller that consumes the user-ca-bundle.

Imagestream import is a function of the imagestream controller in the
openshift-controller-manager and the internal image registry.

That said, my understanding was those items should consume global CA /
cluster image configuration as well.

The folks on here already, plus Oleg, who I have now included, can
elaborate.  My quick scan of the docs did not find where that was explained.


> On Tue, 12 Nov 2019 at 14:46, Joel Pearson 
> wrote:
>
>>
>>
>> On Tue, 12 Nov 2019 at 06:56, Ben Parees  wrote:
>>
>>>
>>>

>>>> Can I use the “trustedCA” part of the proxy configuration without
>>>> actually specifying an explicit proxy?

>>>
>>> you should be able to.  Daneyon can you confirm?  (if you can't i'd
>>> consider it a bug).
>>>
>> It does work! Thanks for that. user-ca-bundle already existed and had my
>> certificate in there, I just needed to reference user-ca-bundle in the
>> proxy config.
>>
>> apiVersion: config.openshift.io/v1
>> kind: Proxy
>> metadata:
>>   name: cluster
>> spec:
>>   trustedCA:
>>     name: user-ca-bundle
>>
>
>


Re: Prometheus (OKD/3.11) NFS?

2019-11-12 Thread Alan Christie
Thanks, it turned out to be relatively simple: I just needed 2 PVs 
(prometheus-1 and 2) to satisfy Prometheus and 3 (alertmanager-1 to 3) to 
satisfy the Alertmanager. It wasn’t immediately obvious how to solve this in a 
‘static’ system, but it is documented and now working. I just set the two 
`storage_enabled` variables and left the `storageclass` ones alone.
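
The static setup described above amounts to pre-creating five PersistentVolumes that the monitoring claims can bind to. The Python sketch below generates such manifests as a JSON List, which `oc create -f` accepts. The NFS server name, export paths, and sizes are placeholders rather than values from this thread, and each volume must be at least as large as the claim it is meant to satisfy.

```python
import json


def nfs_pv(name, server, path, size):
    """Build a minimal PersistentVolume manifest backed by an NFS export."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": name},
        "spec": {
            "capacity": {"storage": size},
            "accessModes": ["ReadWriteOnce"],
            "persistentVolumeReclaimPolicy": "Retain",
            "nfs": {"server": server, "path": path},
        },
    }


def monitoring_pvs(server, export_root, prometheus_size, alertmanager_size):
    """Two volumes for Prometheus and three for Alertmanager, as in the message above."""
    volumes = []
    for index in (1, 2):
        name = f"prometheus-{index}"
        volumes.append(nfs_pv(name, server, f"{export_root}/{name}", prometheus_size))
    for index in (1, 2, 3):
        name = f"alertmanager-{index}"
        volumes.append(nfs_pv(name, server, f"{export_root}/{name}", alertmanager_size))
    return volumes


if __name__ == "__main__":
    # Placeholder values: replace with the real NFS server, export root, and sizes.
    items = monitoring_pvs("nfs.example.com", "/exports", "50Gi", "2Gi")
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": items}, indent=2))
```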

I understand the side-effects of NFS but my options (at the moment) are 
extremely limited.

Thanks for your help.

Alan Christie
achris...@informaticsmatters.com



> On 12 Nov 2019, at 1:28 pm, Simon Pasquier  wrote:
> 
> On Wed, Nov 6, 2019 at 6:11 PM Alan Christie
>  wrote:
>> 
>> Hi,
>> 
>> My cluster doesn’t have dynamic volumes at the moment. For experimentation 
>> is it possible to use NFS volumes in 3.11 for prometheus and the alert 
>> manager?
> 
> In general Prometheus doesn't play well with NFS because most
> implementations don't fully support POSIX.
> It is called out in the Prometheus documentation:
> https://prometheus.io/docs/prometheus/latest/storage/#operational-aspects
> 
>> 
>> I notice that the ansible playbook variables for prometheus in OKD 3.11 are 
>> very different to those in 3.7 but...
> 
> Yes the big difference starting with 3.11 is that Prometheus is
> deployed using the Prometheus operator via the Cluster Monitoring
> Operator.
> 
>> 
>> - Can I use pre-provisioned NFS volumes?
>> - …or…is there an equivalent of 3.7’s 
>> “openshift_prometheus_alertmanager_storage_kind=nfs”?
> 
> See 
> https://docs.okd.io/3.11/install_config/prometheus_cluster_monitoring.html#persistent-storage
> 
> 
>> 
>> Any advice would be greatly appreciated.
>> 
>> Alan Christie
>> achris...@informaticsmatters.com
>> 
>> 
>> 
>> 


Re: How to use extra trusted CA certs when pulling images for a builder

2019-11-12 Thread Ben Parees
On Tue, Nov 12, 2019 at 3:45 AM Joel Pearson 
wrote:

>
>
> On Tue, 12 Nov 2019 at 15:37, Ben Parees  wrote:
>
>>
>>
>> On Mon, Nov 11, 2019 at 11:26 PM Joel Pearson <
>> japear...@agiledigital.com.au> wrote:
>>
>>> I've now discovered that the cluster-samples-operator doesn't seem to
>>> honour the proxy settings, and I see lots of errors in the
>>> cluster-samples-operator- pod logs
>>>
>>> time="2019-11-12T04:15:49Z" level=warning msg="Image import for
>>> imagestream dotnet tag 2.1 generation 2 failed with detailed message
>>> Internal error occurred: Get https://registry.redhat.io/v2/: x509:
>>> certificate signed by unknown authority"
>>>
>>> Is there a way to get that operator to use the same user-ca-bundle?
>>>
>>
>> image import should be using those CAs (it's really about the
>> openshift-apiserver, not the samples operator) automatically (sounds like
>> another potential bug, but i'll let Oleg weigh in on this one).
>>
>> However barring that, you can use the mechanism described here to
>> setup additional CAs for importing from registries:
>>
>> https://docs.openshift.com/container-platform/4.2/openshift_images/image-configuration.html#images-configuration-file_image-configuration
>>
>> you can follow the more detailed instructions here:
>>
>> https://docs.openshift.com/container-platform/4.2/builds/setting-up-trusted-ca.html#configmap-adding-ca_setting-up-trusted-ca
>>
>
> I tried this approach but it didn't work for me.
>
> I ran this command:
>
> oc create configmap registry-cas -n openshift-config \
> --from-file=registry.redhat.io..5000=/path/to/ca.crt \
> --from-file=registry.redhat.io..443=/path/to/ca.crt \
> --from-file=registry.redhat.io=/path/to/ca.crt
>
> and:
>
> oc patch image.config.openshift.io/cluster --patch \
> '{"spec":{"additionalTrustedCA":{"name":"registry-cas"}}}' --type=merge
>
> And that still didn't work. First I deleted the
> cluster-samples-operator- pod, then I tried forcing the masters to
> restart by touching some machine config (I don't know a better way).
> But it still didn't work.  Maybe the samples operator doesn't let you
> easily override the trusted CA certs?
>

Because no good bug report should go unrewarded, here is some educational
background:

The samples operator is only responsible for creating the imagestream, it
isn't actually doing the import (i.e. reaching out to the registry and
pulling down the metadata and putting it in the imagestream).  That task is
performed by the openshift-apiserver.  What should be happening when you
update the image config resource with the name of the CA configmap is that
the openshift-apiserver operator should observe the configuration change
and provide the new CAs to the openshift-apiserver pods (which necessitates
a restart of the openshift-apiserver pods).

Once the openshift-apiserver pods are restarted with the new CAs, you
should be able to run "oc import-image" to retry the import.  (The samples
operator is supposed to retry the failed imports periodically, but there is
a different bug that is being fixed related to that, so until then you'll
have to retry the import manually once you've corrected whatever caused the
failure).

So again, there may be a bug here in terms of the openshift-apiserver
picking up the CAs and we need to investigate it (as well as a separate bug
if it is not picking up the proxy CAs), but I wanted you to understand the
relevant components so your own debugging process can be more productive.
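
The configuration side of this flow can be sketched in a few lines of Python. The additionalTrustedCA ConfigMap needs one key per registry hostname, with `:` written as `..` when a port is included (as in the `registry.redhat.io..5000` key quoted above), and the image config is then patched to point at that ConfigMap. The helper names are hypothetical, and the sketch only builds the key names and the patch document; it does not talk to a cluster.

```python
import json


def configmap_key(registry):
    """Turn "host" or "host:port" into an additionalTrustedCA ConfigMap key.

    Only plain hostnames are handled; IPv6 literals are out of scope here.
    """
    host, separator, port = registry.partition(":")
    return f"{host}..{port}" if separator else host


def from_file_args(registries, ca_path):
    """Build the --from-file arguments for `oc create configmap`."""
    return [f"--from-file={configmap_key(registry)}={ca_path}" for registry in registries]


def image_config_patch(configmap_name):
    """Merge patch for image.config.openshift.io/cluster naming the CA ConfigMap."""
    patch = {"spec": {"additionalTrustedCA": {"name": configmap_name}}}
    return json.dumps(patch, separators=(",", ":"))


if __name__ == "__main__":
    registries = ["registry.redhat.io:5000", "registry.redhat.io:443", "registry.redhat.io"]
    print(" ".join(from_file_args(registries, "/path/to/ca.crt")))
    print(image_config_patch("registry-cas"))
```

Once the openshift-apiserver pods have restarted with the new CAs, the import still has to be retried with "oc import-image", as described above.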




>
>
>>
>>
>> (Brandi/Adam, we should really include the example from that second link,
>> in the general "image resource configuration" page from the first link).
>>
>> Unfortunately it does not allow you to reuse the user-ca-bundle CM since
>> the format of the CM is a bit different (needs an entry per registry
>> hostname).
>>
>>

-- 
Ben Parees | OpenShift


Re: How to use extra trusted CA certs when pulling images for a builder

2019-11-12 Thread Adam Kaplan
Slightly related - there is an existing bugzilla where `oc import-image`
and `oc tag` will fail if the "origin" tag references the internal registry
with a similar x509 error [1].
Echoing Clayton, please file a bug and if warranted we'll link the two
together.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1716835

On Tue, Nov 12, 2019 at 8:42 AM Clayton Coleman  wrote:

>
>
> On Nov 12, 2019, at 3:44 AM, Joel Pearson 
> wrote:
>
>
>
> On Tue, 12 Nov 2019 at 15:37, Ben Parees  wrote:
>
>>
>>
>> On Mon, Nov 11, 2019 at 11:26 PM Joel Pearson <
>> japear...@agiledigital.com.au> wrote:
>>
>>> I've now discovered that the cluster-samples-operator doesn't seem to
>>> honour the proxy settings, and I see lots of errors in the
>>> cluster-samples-operator- pod logs
>>>
>>> time="2019-11-12T04:15:49Z" level=warning msg="Image import for
>>> imagestream dotnet tag 2.1 generation 2 failed with detailed message
>>> Internal error occurred: Get https://registry.redhat.io/v2/: x509:
>>> certificate signed by unknown authority"
>>>
>>> Is there a way to get that operator to use the same user-ca-bundle?
>>>
>>
>> image import should be using those CAs (it's really about the
>> openshift-apiserver, not the samples operator) automatically (sounds like
>> another potential bug, but i'll let Oleg weigh in on this one).
>>
>> However barring that, you can use the mechanism described here to
>> setup additional CAs for importing from registries:
>>
>> https://docs.openshift.com/container-platform/4.2/openshift_images/image-configuration.html#images-configuration-file_image-configuration
>>
>> you can follow the more detailed instructions here:
>>
>> https://docs.openshift.com/container-platform/4.2/builds/setting-up-trusted-ca.html#configmap-adding-ca_setting-up-trusted-ca
>>
>
> I tried this approach but it didn't work for me.
>
> I ran this command:
>
> oc create configmap registry-cas -n openshift-config \
> --from-file=registry.redhat.io..5000=/path/to/ca.crt \
> --from-file=registry.redhat.io..443=/path/to/ca.crt \
> --from-file=registry.redhat.io=/path/to/ca.crt
>
> and:
>
> oc patch image.config.openshift.io/cluster --patch \
> '{"spec":{"additionalTrustedCA":{"name":"registry-cas"}}}' --type=merge
>
> And that still didn't work. First I deleted the
> cluster-samples-operator- pod, then I tried forcing the masters to
> restart by touching some machine config (I don't know a better way).
> But it still didn't work.  Maybe the samples operator doesn't let you
> easily override the trusted CA certs?
>
>
> No, as Ben said this should be working.  Please file a bug.
>
>
>
>>
>>
>> (Brandi/Adam, we should really include the example from that second link,
>> in the general "image resource configuration" page from the first link).
>>
>> Unfortunately it does not allow you to reuse the user-ca-bundle CM since
>> the format of the CM is a bit different (needs an entry per registry
>> hostname).
>>
>
>

-- 

Adam Kaplan

He/Him

Senior Software Engineer - OpenShift

Red Hat 

100 E. Davie St. Raleigh, NC 27601 USA

adam.kap...@redhat.com
T: +1-919-754-4843 IM: adambkaplan



Re: How to use extra trusted CA certs when pulling images for a builder

2019-11-12 Thread Clayton Coleman
On Nov 12, 2019, at 3:44 AM, Joel Pearson 
wrote:



> On Tue, 12 Nov 2019 at 15:37, Ben Parees wrote:
>
>>
>>
>> On Mon, Nov 11, 2019 at 11:26 PM Joel Pearson <
>> japear...@agiledigital.com.au> wrote:
>>
>>> I've now discovered that the cluster-samples-operator doesn't seem to
>>> honour the proxy settings, and I see lots of errors in the
>>> cluster-samples-operator- pod logs
>>>
>>> time="2019-11-12T04:15:49Z" level=warning msg="Image import for
>>> imagestream dotnet tag 2.1 generation 2 failed with detailed message
>>> Internal error occurred: Get https://registry.redhat.io/v2/: x509:
>>> certificate signed by unknown authority"
>>>
>>> Is there a way to get that operator to use the same user-ca-bundle?
>>>
>>
>> image import should be using those CAs (it's really about the
>> openshift-apiserver, not the samples operator) automatically (sounds like
>> another potential bug, but i'll let Oleg weigh in on this one).
>>
>> However barring that, you can use the mechanism described here to
>> setup additional CAs for importing from registries:
>>
>> https://docs.openshift.com/container-platform/4.2/openshift_images/image-configuration.html#images-configuration-file_image-configuration
>>
>> you can follow the more detailed instructions here:
>>
>> https://docs.openshift.com/container-platform/4.2/builds/setting-up-trusted-ca.html#configmap-adding-ca_setting-up-trusted-ca
>>
>
> I tried this approach but it didn't work for me.
>
> I ran this command:
>
> oc create configmap registry-cas -n openshift-config \
> --from-file=registry.redhat.io..5000=/path/to/ca.crt \
> --from-file=registry.redhat.io..443=/path/to/ca.crt \
> --from-file=registry.redhat.io=/path/to/ca.crt
>
> and:
>
> oc patch image.config.openshift.io/cluster --patch \
> '{"spec":{"additionalTrustedCA":{"name":"registry-cas"}}}' --type=merge
>
> And that still didn't work. First I deleted the
> cluster-samples-operator- pod, then I tried forcing the masters to
> restart by touching some machine config (I don't know a better way).
> But it still didn't work.  Maybe the samples operator doesn't let you
> easily override the trusted CA certs?


No, as Ben said this should be working.  Please file a bug.



>>
>>
>> (Brandi/Adam, we should really include the example from that second link,
>> in the general "image resource configuration" page from the first link).
>>
>> Unfortunately it does not allow you to reuse the user-ca-bundle CM since
>> the format of the CM is a bit different (needs an entry per registry
>> hostname).
>>


Re: Prometheus (OKD/3.11) NFS?

2019-11-12 Thread Simon Pasquier
On Wed, Nov 6, 2019 at 6:11 PM Alan Christie
 wrote:
>
> Hi,
>
> My cluster doesn’t have dynamic volumes at the moment. For experimentation is 
> it possible to use NFS volumes in 3.11 for prometheus and the alert manager?

In general Prometheus doesn't play well with NFS because most
implementations don't fully support POSIX.
It is called out in the Prometheus documentation:
https://prometheus.io/docs/prometheus/latest/storage/#operational-aspects

>
> I notice that the ansible playbook variables for prometheus in OKD 3.11 are 
> very different to those in 3.7 but...

Yes the big difference starting with 3.11 is that Prometheus is
deployed using the Prometheus operator via the Cluster Monitoring
Operator.

>
> - Can I use pre-provisioned NFS volumes?
> - …or…is there an equivalent of 3.7’s 
> “openshift_prometheus_alertmanager_storage_kind=nfs”?

See 
https://docs.okd.io/3.11/install_config/prometheus_cluster_monitoring.html#persistent-storage


>
> Any advice would be greatly appreciated.
>
> Alan Christie
> achris...@informaticsmatters.com
>
>
>
>


Re: How to use extra trusted CA certs when pulling images for a builder

2019-11-12 Thread Joel Pearson
On Tue, 12 Nov 2019 at 15:37, Ben Parees  wrote:

>
>
> On Mon, Nov 11, 2019 at 11:26 PM Joel Pearson <
> japear...@agiledigital.com.au> wrote:
>
>> I've now discovered that the cluster-samples-operator doesn't seem to
>> honour the proxy settings, and I see lots of errors in the
>> cluster-samples-operator- pod logs
>>
>> time="2019-11-12T04:15:49Z" level=warning msg="Image import for
>> imagestream dotnet tag 2.1 generation 2 failed with detailed message
>> Internal error occurred: Get https://registry.redhat.io/v2/: x509:
>> certificate signed by unknown authority"
>>
>> Is there a way to get that operator to use the same user-ca-bundle?
>>
>
> image import should be using those CAs (it's really about the
> openshift-apiserver, not the samples operator) automatically (sounds like
> another potential bug, but i'll let Oleg weigh in on this one).
>
> However barring that, you can use the mechanism described here to
> setup additional CAs for importing from registries:
>
> https://docs.openshift.com/container-platform/4.2/openshift_images/image-configuration.html#images-configuration-file_image-configuration
>
> you can follow the more detailed instructions here:
>
> https://docs.openshift.com/container-platform/4.2/builds/setting-up-trusted-ca.html#configmap-adding-ca_setting-up-trusted-ca
>

I tried this approach but it didn't work for me.

I ran this command:

oc create configmap registry-cas -n openshift-config \
--from-file=registry.redhat.io..5000=/path/to/ca.crt \
--from-file=registry.redhat.io..443=/path/to/ca.crt \
--from-file=registry.redhat.io=/path/to/ca.crt

and:

oc patch image.config.openshift.io/cluster --patch \
'{"spec":{"additionalTrustedCA":{"name":"registry-cas"}}}' --type=merge

And that still didn't work. First I deleted the
cluster-samples-operator- pod, then I tried forcing the masters to
restart by touching some machine config (I don't know a better way).
But it still didn't work.  Maybe the samples operator doesn't let you
easily override the trusted CA certs?


>
>
> (Brandi/Adam, we should really include the example from that second link,
> in the general "image resource configuration" page from the first link).
>
> Unfortunately it does not allow you to reuse the user-ca-bundle CM since
> the format of the CM is a bit different (needs an entry per registry
> hostname).
>
>