It seems the problem was caused by k8s 1.19.
When we deployed the Flink operator on vanilla k8s 1.19, we got the same error
that we have on OKD 4.6.0. We are planning to update OKD to a newer version
that uses a more up-to-date k8s.
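
When comparing clusters, it can help to list which fields the API server rejected as null. A minimal sketch, assuming the unwrapped error text from the operator log is available as a string; the function name `null_rejected_fields` is made up for illustration:

```python
import re
from typing import List

# Matches a field path followed by the API server's null-rejection text,
# e.g. 'spec.mode: Invalid value: "null"'.
_NULL_FIELD = re.compile(r'([A-Za-z][\w.]*): Invalid value: "null"')


def null_rejected_fields(message: str) -> List[str]:
    """Return the distinct field paths reported as invalid nulls, in order."""
    fields: List[str] = []
    for field in _NULL_FIELD.findall(message):
        if field not in fields:
            fields.append(field)
    return fields


# Shortened excerpt of the error message quoted later in this thread.
sample = (
    'FlinkDeployment.flink.apache.org "test" is invalid: '
    '[spec.ingress: Invalid value: "null": spec.ingress in body must be of '
    'type object: "null", spec.mode: Invalid value: "null": spec.mode in body '
    'must be of type string: "null"]'
)
print(null_rejected_fields(sample))
```

The pattern expects the message on a single line; text that has been re-wrapped by a mail client would need its line breaks removed first.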

What is the minimum k8s version required for/supported by the Flink operator?
I haven't found it in the operator docs. Is it not there, or have I missed it?
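
For reference, the server version of a cluster can be read from the `serverVersion.gitVersion` field of `kubectl version -o json`. A rough sketch for comparing such a string against a minimum; the helper names are hypothetical, and the threshold in the usage line is only a placeholder because the actual minimum is not stated in this thread:

```python
import re
from typing import Tuple


def parse_k8s_version(git_version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a string such as 'v1.19.0+d59ce34'."""
    match = re.match(r"v?(\d+)\.(\d+)", git_version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {git_version!r}")
    return int(match.group(1)), int(match.group(2))


def is_at_least(git_version: str, minimum: Tuple[int, int]) -> bool:
    """True if the cluster version is greater than or equal to the minimum."""
    return parse_k8s_version(git_version) >= minimum


# Placeholder threshold for illustration only, not a documented requirement.
PLACEHOLDER_MINIMUM = (1, 20)
print(is_at_least("v1.19.0+d59ce34", PLACEHOLDER_MINIMUM))
```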

Thanks.
Krzysztof Chmielewski

On Wed, 20 Sep 2023 at 22:32, Krzysztof Chmielewski <
krzysiek.chmielew...@gmail.com> wrote:

> Thank you Zach,
> our flink-operator and Flink deployments are in the same namespace, called
> "flink". We executed what is described in [1] before my initial
> message.
> We are using OKD 4.6.0, which according to the docs uses k8s 1.19. The
> very same config works fine on "vanilla" k8s, but for some reason it
> fails on the system where we have OKD installed.
>
> I believe we do have proper roles/sa assigned, for example:
>
> oc get sa
> NAME             SECRETS   AGE
> builder          2         6d22h
> default          2         6d22h
> deployer         2         6d22h
> flink            2         6d19h
> flink-operator   2         17h
> ################
> oc get role
> NAME    CREATED AT
> flink   2023-09-13T11:53:42Z
> oc get rolebinding
> NAME                 ROLE         AGE
> flink-role-binding   Role/flink   6d19h
> ################
>
> [1]
> https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-release-1.6/docs/operations/rbac/
>
>
> Thanks, Krzysztof Chmielewski
>
> On Wed, 20 Sep 2023 at 05:40, Zach Lorimer <zlori...@gmail.com> wrote:
>
>> I haven’t used OKD but it sounds like OLM. If that’s the case, I’m
>> assuming the operator was deployed to the “operators” namespace. In that
>> case, you’ll need to create the RBACs and such in the Flink namespace for
>> that deployment to work.
>>
>> For example this needs to be in each namespace that you want to have
>> Flink deployments in.
>>
>> kubectl apply -f - <<EOF
>> apiVersion: v1
>> kind: ServiceAccount
>> metadata:
>>   labels:
>>     app.kubernetes.io/name: flink-kubernetes-operator
>>     app.kubernetes.io/version: 1.5.0
>>   name: flink
>> ---
>> apiVersion: rbac.authorization.k8s.io/v1
>> kind: Role
>> metadata:
>>   labels:
>>     app.kubernetes.io/name: flink-kubernetes-operator
>>     app.kubernetes.io/version: 1.5.0
>>   name: flink
>> rules:
>> - apiGroups:
>>   - ""
>>   resources:
>>   - pods
>>   - configmaps
>>   verbs:
>>   - '*'
>> - apiGroups:
>>   - apps
>>   resources:
>>   - deployments
>>   - deployments/finalizers
>>   verbs:
>>   - '*'
>> ---
>> apiVersion: rbac.authorization.k8s.io/v1
>> kind: RoleBinding
>> metadata:
>>   labels:
>>     app.kubernetes.io/name: flink-kubernetes-operator
>>     app.kubernetes.io/version: 1.5.0
>>   name: flink-role-binding
>> roleRef:
>>   apiGroup: rbac.authorization.k8s.io
>>   kind: Role
>>   name: flink
>> subjects:
>> - kind: ServiceAccount
>>   name: flink
>> EOF
>>
>> Hopefully that helps.
>>
>>
>> On Tue, Sep 19, 2023 at 5:40 PM Krzysztof Chmielewski <
>> krzysiek.chmielew...@gmail.com> wrote:
>>
>>> Hi community,
>>> I was wondering if anyone has tried to deploy Flink using the Flink k8s
>>> operator on a machine where OKD [1] is installed?
>>>
>>> We tried to install Flink k8s operator version 1.6, which seems to
>>> succeed; however, when we try to deploy a simple Flink deployment we
>>> get an error.
>>>
>>> 2023-09-19 10:11:36,440 i.j.o.p.e.ReconciliationDispatcher
>>> [ERROR][flink/test] Error during event processing ExecutionScope{ resource
>>> id: ResourceID{name='test', namespace='flink'}, version: 684949788} failed.
>>>
>>> io.fabric8.kubernetes.client.KubernetesClientException: Failure
>>> executing: PUT at:
>>> https://172.30.0.1:443/apis/flink.apache.org/v1beta1/namespaces/flink/flinkdeployments/test.
>>> Message: FlinkDeployment.flink.apache.org "test" is invalid:
>>> [spec.ingress: Invalid value: "null": spec.ingress in body must be of type
>>> object: "null", spec.mode: Invalid value: "null": spec.mode in body must be
>>> of type string: "null", spec.mode: Unsupported value: "null": supported
>>> values: "native", "standalone", spec.logConfiguration: Invalid value:
>>> "null": spec.logConfiguration in body must be of type object: "null",
>>> spec.imagePullPolicy: Invalid value: "null": spec.imagePullPolicy in body
>>> must be of type string: "null", spec.jobManager.podTemplate: Invalid value:
>>> "null": spec.jobManager.podTemplate in body must be of type object: "null",
>>> spec.jobManager.resource.ephemeralStorage: Invalid value: "null":
>>> spec.jobManager.resource.ephemeralStorage in body must be of type string:
>>> "null", spec.podTemplate: Invalid value: "null": spec.podTemplate in body
>>> must be of type object: "null", spec.restartNonce: Invalid value: "null":
>>> spec.restartNonce in body must be of type integer: "null",
>>> spec.taskManager.replicas: Invalid value: "null": spec.taskManager.replicas
>>> in body must be of type integer: "null",
>>> spec.taskManager.resource.ephemeralStorage: Invalid value: "null":
>>> spec.taskManager.resource.ephemeralStorage in body must be of type string:
>>> "null", spec.taskManager.podTemplate: Invalid value: "null":
>>> spec.taskManager.podTemplate in body must be of type object: "null",
>>> spec.job: Invalid value: "null": spec.job in body must be of type object:
>>> "null", .spec.taskManager.replicas: Invalid value: 0:
>>> .spec.taskManager.replicas accessor error: <nil> is of the type <nil>,
>>> expected int64]. Received status: Status(apiVersion=v1, code=422,
>>> details=StatusDetails(causes=[StatusCause(field=spec.ingress,
>>> message=Invalid value: "null": spec.ingress in body must be of type object:
>>> "null", reason=FieldValueInvalid, additionalProperties={}),
>>> StatusCause(field=spec.mode, message=Invalid value: "null": spec.mode in
>>> body must be of type string: "null", reason=FieldValueInvalid,
>>> additionalProperties={}), StatusCause(field=spec.mode, message=Unsupported
>>> value: "null": supported values: "native", "standalone",
>>> reason=FieldValueNotSupported, additionalProperties={}),
>>> StatusCause(field=spec.logConfiguration, message=Invalid value: "null":
>>> spec.logConfiguration in body must be of type object: "null",
>>> reason=FieldValueInvalid, additionalProperties={}),
>>> StatusCause(field=spec.imagePullPolicy, message=Invalid value: "null":
>>> spec.imagePullPolicy in body must be of type string: "null",
>>> reason=FieldValueInvalid, additionalProperties={}),
>>> StatusCause(field=spec.jobManager.podTemplate, message=Invalid value:
>>> "null": spec.jobManager.podTemplate in body must be of type object: "null",
>>> reason=FieldValueInvalid, additionalProperties={}),
>>> StatusCause(field=spec.jobManager.resource.ephemeralStorage,
>>> message=Invalid value: "null": spec.jobManager.resource.ephemeralStorage in
>>> body must be of type string: "null", reason=FieldValueInvalid,
>>> additionalProperties={}), StatusCause(field=spec.podTemplate,
>>> message=Invalid value: "null": spec.podTemplate in body must be of type
>>> object: "null", reason=FieldValueInvalid, additionalProperties={}),
>>> StatusCause(field=spec.restartNonce, message=Invalid value: "null":
>>> spec.restartNonce in body must be of type integer: "null",
>>> reason=FieldValueInvalid, additionalProperties={}),
>>> StatusCause(field=spec.taskManager.replicas, message=Invalid value: "null":
>>> spec.taskManager.replicas in body must be of type integer: "null",
>>> reason=FieldValueInvalid, additionalProperties={}),
>>> StatusCause(field=spec.taskManager.resource.ephemeralStorage,
>>> message=Invalid value: "null": spec.taskManager.resource.ephemeralStorage
>>> in body must be of type string: "null", reason=FieldValueInvalid,
>>> additionalProperties={}), StatusCause(field=spec.taskManager.podTemplate,
>>> message=Invalid value: "null": spec.taskManager.podTemplate in body must be
>>> of type object: "null", reason=FieldValueInvalid, additionalProperties={}),
>>> StatusCause(field=spec.job, message=Invalid value: "null": spec.job in body
>>> must be of type object: "null", reason=FieldValueInvalid,
>>> additionalProperties={}), StatusCause(field=.spec.taskManager.replicas,
>>> message=Invalid value: 0: .spec.taskManager.replicas accessor error: <nil>
>>> is of the type <nil>, expected int64, reason=FieldValueInvalid,
>>> additionalProperties={})], group=flink.apache.org,
>>> kind=FlinkDeployment, name=test, retryAfterSeconds=null, uid=null,
>>> additionalProperties={}), kind=Status, message=
>>> FlinkDeployment.flink.apache.org "test" is invalid: [spec.ingress:
>>> Invalid value: "null": spec.ingress in body must be of type object: "null",
>>> spec.mode: Invalid value: "null": spec.mode in body must be of type string:
>>> "null", spec.mode: Unsupported value: "null": supported values: "native",
>>> "standalone", spec.logConfiguration: Invalid value: "null":
>>> spec.logConfiguration in body must be of type object: "null",
>>> spec.imagePullPolicy: Invalid value: "null": spec.imagePullPolicy in body
>>> must be of type string: "null", spec.jobManager.podTemplate: Invalid value:
>>> "null": spec.jobManager.podTemplate in body must be of type object: "null",
>>> spec.jobManager.resource.ephemeralStorage: Invalid value: "null":
>>> spec.jobManager.resource.ephemeralStorage in body must be of type string:
>>> "null", spec.podTemplate: Invalid value: "null": spec.podTemplate in body
>>> must be of type object: "null", spec.restartNonce: Invalid value: "null":
>>> spec.restartNonce in body must be of type integer: "null",
>>> spec.taskManager.replicas: Invalid value: "null": spec.taskManager.replicas
>>> in body must be of type integer: "null",
>>> spec.taskManager.resource.ephemeralStorage: Invalid value: "null":
>>> spec.taskManager.resource.ephemeralStorage in body must be of type string:
>>> "null", spec.taskManager.podTemplate: Invalid value: "null":
>>> spec.taskManager.podTemplate in body must be of type object: "null",
>>> spec.job: Invalid value: "null": spec.job in body must be of type object:
>>> "null", .spec.taskManager.replicas: Invalid value: 0:
>>> .spec.taskManager.replicas accessor error: <nil> is of the type <nil>,
>>> expected int64], metadata=ListMeta(_continue=null, remainingItemCount=null,
>>> resourceVersion=null, selfLink=null, additionalProperties={}),
>>> reason=Invalid, status=Failure, additionalProperties={}).
>>>
>>>     at io.fabric8.kubernetes.client.KubernetesClientException.copyAsCause(KubernetesClientException.java:238)
>>>     at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.waitForResult(OperationSupport.java:518)
>>>     at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.handleResponse(OperationSupport.java:535)
>>>     at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.handleUpdate(OperationSupport.java:358)
>>>     at io.fabric8.kubernetes.client.dsl.internal.BaseOperation.handleUpdate(BaseOperation.java:708)
>>>     at io.fabric8.kubernetes.client.dsl.internal.HasMetadataOperation.lambda$handleReplace$0(HasMetadataOperation.java:185)
>>>     at io.fabric8.kubernetes.client.dsl.internal.HasMetadataOperation.handleReplace(HasMetadataOperation.java:190)
>>>     at io.fabric8.kubernetes.client.dsl.internal.HasMetadataOperation.replace(HasMetadataOperation.java:101)
>>>     at io.fabric8.kubernetes.client.dsl.internal.HasMetadataOperation.replace(HasMetadataOperation.java:45)
>>>     at io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher$CustomResourceFacade.updateResource(ReconciliationDispatcher.java:387)
>>>     at io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.conflictRetryingUpdate(ReconciliationDispatcher.java:343)
>>>     at io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.updateCustomResourceWithFinalizer(ReconciliationDispatcher.java:316)
>>>     at io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.handleReconcile(ReconciliationDispatcher.java:115)
>>>     at io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.handleDispatch(ReconciliationDispatcher.java:89)
>>>     at io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.handleExecution(ReconciliationDispatcher.java:62)
>>>     at io.javaoperatorsdk.operator.processing.event.EventProcessor$ReconcilerExecutor.run(EventProcessor.java:414)
>>>     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
>>>     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>>>     at java.base/java.lang.Thread.run(Unknown Source)
>>>
>>> The deployment we are trying to run is this:
>>>
>>> apiVersion: flink.apache.org/v1beta1
>>> kind: FlinkDeployment
>>> metadata:
>>>   namespace: flink
>>>   name: test
>>> spec:
>>>   mode: native
>>>   image: flink:1.17
>>>   flinkVersion: v1_17
>>>   flinkConfiguration:
>>>     taskmanager.numberOfTaskSlots: "2"
>>>   serviceAccount: flink
>>>   jobManager:
>>>     resource:
>>>       memory: "2048m"
>>>       cpu: 1
>>>   taskManager:
>>>     resource:
>>>       memory: "2048m"
>>>       cpu: 1
>>>
>>> Regards,
>>> Krzysztof Chmielewski
>>>
>>>
>>> [1] https://lists.apache.org/thread/07d46txb6vttw7c8oyr6z4n676vgqh28
>>>
>>
