Hey!
We have not observed any issues so far; could you please share some error
information / logs?
Opening a jira ticket would be best
Thanks
Gyula
On Thu, 9 May 2024 at 21:18, Prasad, Neil
wrote:
> I am writing to report an issue with the Flink Kubernetes Operator version
> 1.8.0. The CRD is
Hey!
Let me first answer your questions then provide some actual solution
hopefully :)
1. The adaptive scheduler would not reduce the vertex desired parallelism
in this case but it should allow the job to start depending on the
lower/upper bound resource config. There have been some changes in
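For reference, a minimal adaptive-scheduler config sketch (the exact keys
are best verified against your Flink version's docs):

```yaml
# flink-conf.yaml sketch
jobmanager.scheduler: adaptive
# how long the job waits for the desired resources before starting anyway
jobmanager.adaptive-scheduler.resource-wait-timeout: 5 min
# how long resources must be stable before the job (re)scales
jobmanager.adaptive-scheduler.resource-stabilization-timeout: 10 s
```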
Hi Chetas,
The operator logic itself would normally call the rescale api during the
upgrade process, not the autoscaler module. The autoscaler module sets the
correct config with the parallelism overrides, and then the operator
performs the regular upgrade cycle (as when you yourself change
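The overrides themselves are just Flink configuration, e.g. (the vertex ids
below are placeholders):

```yaml
# written by the autoscaler into the job config; the operator then runs
# its normal upgrade cycle to apply it
pipeline.jobvertex-parallelism-overrides: "<vertexId1>:4,<vertexId2>:2"
```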
"last-state" upgrade mode. When you were saying "robust way", does it mean
> "sticky job id" in application mode?
>
>
> On Mon, Apr 29, 2024 at 10:28 PM Gyula Fóra wrote:
>
>> Hi Alan!
>>
>> I think it should be possible to address this gap for m
Hi Alan!
I think it should be possible to address this gap for most cases. We don't
have the same robust way of getting the last-state information for session
jobs as we do for applications, so it will be slightly less reliable
overall.
For session jobs the last checkpoint info has to be queried
failing job (autoscaler scales the job to
> zero for “gracefully” stopping it and then never starts it) or
> b) some jobs that keep restarting can be fixed by disabling HA for that job
>
> And "Cannot rescale the given pointwise partitioner." is also still a
> mystery.
>
> Thank
Hi Maxim!
Regarding the status update error, it could be related to a problem that we
have discovered recently with the Flink Operator HA, where during a
namespace change both the leader and follower instances would start
processing. It has been fixed in the current master by updating the JOSDK
e code modifications on
> the mailbox executor.
>
>
> Best,
> Zakelly
>
> On Thu, Mar 14, 2024 at 9:15 PM Gyula Fóra wrote:
>
>> Thank you for the detailed analysis Zakelly.
>>
>> I think we should consider whether yield should process checkpoint
24,
> TimeUnit.HOURS,
> 1)
> .print();
> ```
> The checkpoint 1 can be normally finished after the "Complete one" log
> print.
>
> I guess the users have no means to solve this problem, we might optimize
> this later.
>
Hey all!
I encountered a strange and unexpected behaviour when trying to use
unaligned checkpoints with AsyncIO.
If the async operation queue is full and backpressures the pipeline
completely, then unaligned checkpoints cannot be completed. To me this
sounds counterintuitive because one of the
Hi Everyone!
I have discussed this with Sébastien Chevalley, he is going to prepare and
drive the FLIP while I will assist him along the way.
Thanks
Gyula
On Tue, Mar 5, 2024 at 9:57 AM wrote:
> I do agree with Ron Liu.
> This would definitely need a FLIP as it would impact SQL and extend it
It should be compatible. There is no compatibility matrix, but it is
compatible with most versions that are in use (at the different
companies/users etc.)
Gyula
On Thu, Feb 29, 2024 at 6:21 AM 吴圣运 wrote:
> Hi,
>
> I'm using flink-operator-1.5.0 and I need to deploy it to Kubernetes 1.20.
> I
Posting this to dev as well as it potentially has some implications on
development effort.
What seems to be the problem here is that we cannot control/override
Timestamps/Watermarks/Primary key on VIEWs. It's understandable that you
cannot create a PRIMARY KEY on the view but I think the temporal
Hi Niklas!
The best way to report the issue would be to open a JIRA ticket with the
same detailed information.
Otherwise I think your observations are correct and this is indeed a
frequent problem that comes up; it would be good to improve on it. In
addition to improving logging we could also
Could this be related to the issue reported here?
https://issues.apache.org/jira/browse/FLINK-34063
Gyula
On Wed, Jan 10, 2024 at 4:04 PM Yang LI wrote:
> Just to give more context, my setup uses Apache Flink 1.18 with the
> adaptive scheduler enabled, issues happen randomly particularly
>
>
>
> --
> Best!
> Xuyang
>
>
> On 2024-01-11 16:10:47, "Giannis Polyzos" wrote:
>
> Hi Gyula,
> to the best of my knowledge, this is not feasible and you will have to do
> something like *CAST(NULL AS STRING)* to insert null values manually.
>
> Best,
>
Hi All!
Is it possible to insert into a table without specifying all columns of the
target table?
In other words can we use the default / NULL values of the table when not
specified somehow?
For example:
Query schema: [a: STRING]
Sink schema: [a: STRING, b: STRING]
I would like to be able to
Please upgrade the operator to the latest release, and if the issue still
exists please open a Jira ticket with the details.
Gyula
On Fri, 22 Dec 2023 at 21:17, Ruibin Xing wrote:
> I wanted to talk about an issue we've hit recently with Flink Kubernetes
> Operator 1.6.1 and Flink 1.17.1.
>
>
k! With your permission, I plan to integrate
>>> the implementation into the flink-kubernetes-operator-autoscaler module to
>>> test it on my env. Subsequently, maybe contribute these changes back to the
>>> community by submitting a pull request to the GitHub repository in
Hi!
We recommend using the community supported Flink Kubernetes Operator:
https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-release-1.7/docs/try-flink-kubernetes-operator/quick-start/
Cheers,
Gyula
On Thu, Dec 7, 2023 at 6:33 PM Tauseef Janvekar
wrote:
> Hi Al,
>
> I am using
Hi!
I already answered your question on slack :
“The main reason is that this allows us to completely separate release
resources etc. much easier for the release process
But this could be improved in the future if there is a good proposal for
the process”
Please do not cross post questions
The Apache Flink community is very happy to announce the release of Apache
Flink Kubernetes Operator 1.7.0.
The Flink Kubernetes Operator allows users to manage their Apache Flink
applications and their lifecycle through native k8s tooling like kubectl.
Release highlights:
- Standalone
> applying a kubectl patch to the FlinkDeployment CRD.
>
> By doing this we could achieve something similar to what we can do with a
> plugin system, Of course in this case I'll disable scaling of the flink
> operator, Do you think it could work?
>
> Best,
> Yang
>
Hey!
Bit of a tricky problem, as it's not really possible to know that the job
will be able to start with lower parallelism in some cases. Custom plugins
may work but that would be an extremely complex solution at this point.
The Kubernetes operator has a built-in rollback mechanism that can
0378a4b4d1 (allowing non restored state)
>> ...
>> 2023-10-21 10:25:47,703 INFO
>> org.apache.flink.runtime.dispatcher.StandaloneDispatcher [] - Job
>> ee4f7c678794ee16506f9b41425c244e reached terminal state FAILED.
>> org.apache.flink.runtime.client.JobInitializ
om/apache/flink-kubernetes-operator/pull/673
> This solved the problem
>
>
> --
> *From:* Tony Chen
> *Sent:* October 19, 2023, 4:18:36
> *To:* Evgeniy Lyutikov
> *Cc:* user@flink.apache.org; Gyula Fóra
> *Subject:* Re: Flink kuberne
Hi!
Not sure if it’s the same but could you try picking up the fix from the
release branch and confirming that it solves the problem?
If it does we may consider a quick bug fix release.
Cheers
Gyula
On Wed, 18 Oct 2023 at 18:09, Tony Chen wrote:
> Hi Flink Community,
>
> Most of the Flink
very much for the update about the release schedule and for
> pointing me to the snapshot images. This is indeed very helpful and we will
> consider our options now.
>
> Regards,
> Niklas
>
> On 16. Oct 2023, at 17:56, Gyula Fóra wrote:
>
> Hi Niklas!
>
> We weren't pl
Hi Niklas!
We weren't planning a 1.6.1 release and instead we were focusing on
wrapping up changes for the 1.7.0 release coming in a month or so.
However if there is enough interest and we have some committers/PMC willing
to help with the release we can always do 1.6.1 but I personally don't
Hey,
We don’t have a minimal supported version in the docs as we haven’t
experienced any issues specific to Kubernetes versions so far.
We don’t really rely on any newer features.
Cheers
Gyula
On Fri, 6 Oct 2023 at 06:02, Krzysztof Chmielewski <
krzysiek.chmielew...@gmail.com> wrote:
> It seems
Hi Tony!
There are still a few corner cases when the operator cannot upgrade /
rollback deployments due to the loss of HA metadata (and with that
checkpoint information).
Most of these issues are not related to the operator logic directly but to
how Flink handles certain failures and are related
Hi
Operator savepoint retention and savepoint upgrades have nothing to do with
each other I think. Retention is only for periodic savepoints triggered by
the operator itself.
I would upgrade to the latest 1.6.0 operator version before investigating
further.
Cheers
Gyula
On Sat, 23 Sep 2023 at
Hi!
The cluster-id for each FlinkDeployment is simply the name of the
deployment. So they are all different in a given namespace. (In other words
they are not fixed as your question suggests but set automatically)
So there should be no problem sharing the ZK cluster.
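For example (the quorum address and storage path are placeholders):

```yaml
# flinkConfiguration sketch -- safe to share across FlinkDeployments
high-availability: zookeeper
high-availability.zookeeper.quorum: zk-1:2181,zk-2:2181   # placeholder
high-availability.storageDir: s3://my-bucket/flink-ha     # placeholder
# no high-availability.cluster-id needed: the operator derives it from
# the FlinkDeployment name
```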
Cheers
Gyula
On Thu, 21
No, I think what he means is to trigger the checkpoint at slightly
different times at the different sources so the different parts of the
pipeline would not checkpoint at the same time.
Gyula
On Wed, Sep 13, 2023 at 10:32 AM Hangxiang Yu wrote:
> Hi, Matyas.
> Do you mean something like
On Mon, Sep 11, 2023 at 7:47 PM Gyula Fóra wrote:
> You don’t need it but you can really mess up clusters by rolling back CRD
> changes…
>
> On Mon, 11 Sep 2023 at 19:42, Evgeniy Lyutikov
> wrote:
>
>> Why we need to use latest CRD version wit
You don’t need it but you can really mess up clusters by rolling back CRD
changes…
On Mon, 11 Sep 2023 at 19:42, Evgeniy Lyutikov wrote:
> Why we need to use latest CRD version with older operator version?
> --
> *From:* Gyula Fóra
> *Sent:* Sep 12
September 11, 2023, 23:50:26
> *To:* Gyula Fóra
>
> *Cc:* user@flink.apache.org
> *Subject:* Re: Flink kubernetes operator delete HA metadata after resuming
> from suspend
>
>
> Hi!
> No, no one could restart jobmanager,
> I monitored the pods in real time, the
Hi!
I could not reproduce your issue, last-state suspend/restore seems to work
as before.
However these 2 logs seem very suspicious:
2023-09-11 06:02:07,481 o.a.f.k.o.o.d.ApplicationObserver [INFO
][rec-job/rec-job] Observing JobManager deployment. Previous status: MISSING
2023-09-11
link.apache.org/v1beta1
","metadata":{"generation":2},"firstDeployment":true}}'
It's a bit hidden but it should do the trick :)
We could discuss moving this to a more standardized status field if you
think that's worth the effort.
Gyula
On Sat, Sep 9, 2023 at 7:04 AM G
Hi!
The lastReconciledSpec field serves a similar purpose. We also use the
generation in parts of the logic, but not generically as an observed
generation.
Could you give an example where this would be useful in addition to what we
already have?
Thanks
Gyula
On Sat, 9 Sep 2023 at 02:17, Tony Chen
task granularity and allows us to identify
>>bottleneck tasks.
>>3. Autoscaler feature currently only works for K8s operator + native
>>K8s mode.
>>
>>
>> Best,
>> Zhanghao Chen
>> --
> *From:* Dennis Jung
ted when scaling. But job
> parallelism is the same after the number of TM has been changed.
>
> *Autoscaler + 'reactive' mode*:
> It can control numbers of TM by metric, and increase/decrease job
> parallelism by changing TM.
>
> Regards,
> Jung
>
> On Fri, Sep 1, 2023 at 8 PM
*TaskManagers can be added or removed from the Flink cluster.*
>> => Why is this only possible in 'reactive' mode? Seems this is more
>> related to 'autoscaler'. Are there some specific features/API which can
>> control TaskManager/Parallelism only in 'reactive' mode?
>>
>
>
>
>
> On Fri, Aug 18, 2023 at 7:51 PM, Gyula Fóra wrote:
>
>> Hi!
>>
>> I think what you need is probably not the reactive mode but a proper
>> autoscaler. The reactive mode as you say doesn't do anything in itself, you
>> need to build a lot of logic around i
Deployments: " + item);
> System.out.println("Number of TM replicas: " +
> item.getSpec().getTaskManager().getReplicas());
> }
> }
>
>
> Thanks,
> Krzysztof
>
> On Thu, Aug 31, 2023 at 10:44 Gyula Fóra wrote:
>
I guess your question is in the context of the standalone integration
because native session deployments automatically add TMs on the fly as more
are necessary.
For standalone mode you should be able to configure
`spec.taskManager.replicas` and if I understand correctly that will not
shut down
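A minimal FlinkDeployment sketch for the standalone integration (name,
image and resource sizes are placeholders):

```yaml
apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
metadata:
  name: standalone-example        # placeholder
spec:
  image: flink:1.17               # placeholder
  flinkVersion: v1_17
  mode: standalone
  jobManager:
    resource:
      memory: 2048m
      cpu: 1
  taskManager:
    replicas: 4                   # explicit TM count in standalone mode
    resource:
      memory: 2048m
      cpu: 1
```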
will have something to share with you.
>
> Nicolas
>
> On Wed, Aug 30, 2023 at 4:28 PM Gyula Fóra wrote:
>
>> Hey!
>>
>> I don't know if anyone has implemented this or not but one way to
>> approach this problem (and this may not be the right way, just an
will also need to go through the documentation more on memory
>> configuration:
>> https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/ops/state/state_backends/
>>
>> On Wed, Aug 30, 2023 at 2:17 PM Gyula Fóra wrote:
>>
>>> Hi!
>>>
>>>
Hi!
RocksDB is supported, and every other state backend as well.
You can simply set this in your config like before :)
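For example, in the FlinkDeployment spec (the checkpoint path is a
placeholder):

```yaml
flinkConfiguration:
  state.backend: rocksdb
  state.backend.incremental: "true"
  state.checkpoints.dir: s3://my-bucket/checkpoints   # placeholder
```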
Cheers
Gyula
On Wed, 30 Aug 2023 at 19:22, Tony Chen wrote:
> Hi Flink Community,
>
> Does the flink-kubernetes-operator support RocksDB as the state backend
> for
Hey!
I don't know if anyone has implemented this or not but one way to approach
this problem (and this may not be the right way, just an idea :) ) is to
add a new Custom Resource type that sits on top of the FlinkDeployment /
FlinkSessionJob resources and add a small controller for this.
This
Hi!
I think what you need is probably not the reactive mode but a proper
autoscaler. The reactive mode as you say doesn't do anything in itself, you
need to build a lot of logic around it.
Check this instead:
The Apache Flink community is very happy to announce the release of Apache
Flink Kubernetes Operator 1.6.0.
The Flink Kubernetes Operator allows users to manage their Apache Flink
applications and their lifecycle through native k8s tooling like kubectl.
Release highlights:
- Improved rollback
The autoscaler only works for FlinkDeployments in Native mode. You should
turn off the reactive scheduler mode as well because that's something
completely different.
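As a sketch (the autoscaler option prefix has changed between operator
releases, so verify the keys against your version's docs):

```yaml
flinkConfiguration:
  # make sure scheduler-mode: reactive is NOT set
  kubernetes.operator.job.autoscaler.enabled: "true"
  kubernetes.operator.job.autoscaler.stabilization.interval: "1m"
```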
After that you can check the autoscaler logs for more info.
Gyula
On Tue, Aug 1, 2023 at 10:33 AM Raihan Sunny via user
wrote:
>
If one pod is added to the session cluster,
> the job running on it will be rebalanced to the new one, is it correct?
>
> Thank you very much.
> Xiao Ma
>
> On Wed, Feb 1, 2023 at 10:56 AM Gyula Fóra wrote:
>
>> As I mentioned in the previous email, standalone mode is not on the
? I was wondering if there's a field in
> the kubernetes field where I can specify which checkpoint to start from.
> For some of our applications, we complete checkpoints more often
> than savepoints, and we would like these Flink applications to always start
> from the latest checkpoint.
>
Hi!
We don't have imagePullSecrets as part of the FlinkDeploymentSpec at the
moment, however you can simply use the following built-in Flink
configuration:
configuration:
https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/deployment/config/#kubernetes-container-image-pull-secrets
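For example (the secret name is a placeholder):

```yaml
flinkConfiguration:
  kubernetes.container.image.pull-secrets: my-registry-secret
```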
community be okay with us adding this feature to the GitHub
> repo eventually? I was going through this guide
> <https://flink.apache.org/how-to-contribute/contribute-code/>, and it
> looks like I need to get consensus first.
>
> Thanks,
> Tony
>
> On Wed, Jul 19, 202
yment gets deleted.
>
> Thanks,
> Tony
>
> On Wed, Jul 19, 2023 at 3:46 PM Gyula Fóra wrote:
>
>> Hey Tony,
>>
>> Please see:
>> https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/custom-resource/job-management/#stateful-and-
Hey Tony,
Please see:
https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/custom-resource/job-management/#stateful-and-stateless-application-upgrades
The operator is made especially to handle stateful application upgrades
robustly. In general any spec change that you make
Maybe you have inconsistent operator / CRD versions? In any case I highly
recommend upgrading to the latest operator version to get all the bug /
security fixes and improvements.
Gyula
On Wed, 12 Jul 2023 at 10:58, Paul Lam wrote:
> Hi,
>
> I’m using K8s operator 1.3.1 with Flink 1.15.2 on 2
The namespace and cluster id are automatically set based on the namespace
and name of the FlinkDeployment resource.
This is an important design choice that allows efficient management of the
applications.
Gyula
On Wed, 14 Jun 2023 at 19:31, Nathan Moderwell <
nathan.moderw...@robinhood.com>
process
> memory and the pod memory, which helped stability. It looks like it cannot
> be done with the k8s operator though and I wonder why the choice of
> removing this granularity in the settings
>
> Robin
>
> Le mer. 14 juin 2023 à 12:20, Gyula Fóra a écrit :
>
Basically what happens is that whatever you set to the
spec.taskManager.resource.memory will be set in the config as process
memory.
In Flink kubernetes the process is the pod so pod memory is always equal to
process memory.
So basically the spec is a config shorthand, there is no reason to
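In other words, these two are equivalent (the size is just an example):

```yaml
# FlinkDeployment shorthand ...
taskManager:
  resource:
    memory: 4096m
# ... becomes this Flink config, sized to the whole pod/process:
# taskmanager.memory.process.size: 4096m
```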
Hi!
I think you forgot to upgrade the operator CRD (which contains the updated
enum values).
Please see:
https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/operations/upgrade/#1-upgrading-the-crd
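A sketch of the manual step (run from a checkout of the operator release
you are upgrading to):

```shell
# helm upgrade does not touch CRDs, so replace them explicitly first
kubectl replace -f helm/flink-kubernetes-operator/crds/flinkdeployments.flink.apache.org-v1.yml
kubectl replace -f helm/flink-kubernetes-operator/crds/flinksessionjobs.flink.apache.org-v1.yml
```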
Cheers
Gyula
On Mon, 12 Jun 2023 at 13:38, Liting Liu (litiliu)
wrote:
also saw core dump while using list state after triggering state
> migration and ttl compaction filter. Have you triggered the schema
> evolution ?
> It seems a bug of the rocksdb list state together with ttl compaction
> filter.
>
> On Wed, May 17, 2023 at 7:05 PM Gyula Fóra w
Hi Andrew!
I think you are completely right, this is a bug. The per namespace metrics
do not seem to filter per namespace and show the aggregated global count
for each namespace:
I opened a ticket:
https://issues.apache.org/jira/browse/FLINK-32164
Thanks for reporting this!
Gyula
On Mon, May
The Apache Flink community is very happy to announce the release of Apache
Flink Kubernetes Operator 1.5.0.
The Flink Kubernetes Operator allows users to manage their Apache Flink
applications and their lifecycle through native k8s tooling like kubectl.
Release highlights:
- Autoscaler
Hi All!
We are encountering an error on a larger stateful job (around 1 TB + state)
on restore from a rocksdb checkpoint. The taskmanagers keep crashing with a
segfault coming from the rocksdb native logic and seem to be related to the
FlinkCompactionFilter mechanism.
The gist with the full
There is no such feature currently, Kubernetes resources usually do not
delete themselves :)
The problem I see here is that by deleting the resource you lose all
information about what happened; you won't know if it failed or completed,
etc.
What is the use-case you are thinking about?
If this is
Hey!
Sounds like a bug :) Could you please open a jira / PR (in case you fixed
this already)?
Thanks
Gyula
On Mon, 8 May 2023 at 22:20, Andrew Otto wrote:
> Hi,
>
> I'm trying to enable HA for flink-kubernetes-operator
>
There is only one kind of autoscaler in the Flink Kubernetes Operator. And
the docs can be found here:
https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/custom-resource/autoscaler/
We usually refer to it as the Job Autoscaler (as it scales individual jobs)
but the
tion for this problem in the future.
>
>
> On Wed, Apr 26, 2023 at 7:20 AM Gyula Fóra wrote:
>
>> I think the behaviour is going to get a little weird because this would
>> actually defeat the purpose of the standby TM.
>> MAX - some offset will decrease once y
Hi!
It’s currently not possible to run the operator in parallel by simply
adding more replicas. However there are different things you can do to
scale both vertically and horizontally.
First of all you can run multiple operators, each watching a different set
of namespaces to partition the load.
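For example, with the Helm chart's watchNamespaces value (namespace names
are placeholders):

```yaml
# values.yaml for one operator instance; a second instance would list
# a disjoint set of namespaces
watchNamespaces:
  - team-a
  - team-b
```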
I think the behaviour is going to get a little weird because this would
actually defeat the purpose of the standby TM.
MAX - some offset will decrease once you lose a TM so in this case we would
scale down to again have a spare (which we never actually use.)
Gyula
On Wed, Apr 26, 2023 at 4:02 PM
Hi!
Please open a JIRA ticket with the details of your logs, config and operator
version and we will take a look!
Thanks
Gyula
On Mon, Apr 24, 2023 at 2:04 PM Sriram Ganesh wrote:
> Hi,
>
> I am trying the autoscale provided by the operator. I found that Autoscale
> keeps happening even after
Hi Alexis,
We have recently added support for canary deployments which allows the
liveness probe to detect general operator problems.
https://issues.apache.org/jira/browse/FLINK-31219
It's not completely automatic and you have to deploy the canaries yourself
but I think it will be helpful :)
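A canary is just a dummy resource carrying a special label, something like
(the name is a placeholder):

```yaml
apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
metadata:
  name: canary            # placeholder
  labels:
    "flink.apache.org/canary": "true"
spec: {}
```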
Never seen this before but also you should not set the cluster-id in your
config as that should be controlled by the operator itself.
Gyula
On Fri, Mar 31, 2023 at 2:39 PM Pierre Bedoucha
wrote:
> Hi,
>
>
>
> We are trying to use Flink Kubernetes Operator 1.4.0 with Flink 1.16.
>
>
>
>
I think you forgot to upgrade the CRD during the upgrade process on your
cluster.
As you can see here:
https://github.com/apache/flink-kubernetes-operator/blob/release-1.4/helm/flink-kubernetes-operator/crds/flinkdeployments.flink.apache.org-v1.yml#L38-L44
The newer version already contains
Hey!
You are right, these fields could have been of the PodTemplate /
PodTemplateSpec type (probably PodTemplateSpec is actually better).
I think the reason why we used it is twofold:
- Simple oversight :)
- Flink itself "expects" the podtemplate in this form for the native
integration as you
The Apache Flink community is very happy to announce the release of Apache
Flink Kubernetes Operator 1.4.0.
The Flink Kubernetes Operator allows users to manage their Apache Flink
applications and their lifecycle through native k8s tooling like kubectl.
Release highlights:
- Flink Job
If you are interested in helping to review this, here is the relevant
ticket and the PR I just opened:
https://issues.apache.org/jira/browse/FLINK-30786
https://github.com/apache/flink-kubernetes-operator/pull/535
Cheers,
Gyula
On Thu, Feb 23, 2023 at 2:10 PM Gyula Fóra wrote:
>
Hi!
The current array merging strategy in the operator is basically an
overwrite by position, yes.
I actually have a pending improvement to make this configurable and allow
merging arrays by "name" attribute. This is generally more practical for
such cases.
Cheers,
Gyula
On Thu, Feb 23, 2023 at
Gyula.
> Is there a roadmap to support standalone session clusters to scale based
> on the jobs added/deleted and change in parallelism ?
>
> Regards,
> Swathi C
>
> ------
> *From:* Gyula Fóra
> *Sent:* Wednesday, February 1, 2023 8:54 PM
> *To:* S
The autoscaler currently only works with Native App clusters.
Native session clusters may be supported in the future but standalone is
not on our roadmap due to a different resource/scheduling model used.
Gyula
On Wed, Feb 1, 2023 at 4:22 PM Swathi Chandrashekar
wrote:
> Hi,
>
> I was
Ippolitov <
anton.ippoli...@datadoghq.com> wrote:
> I am using the Standalone Mode indeed, should've mentioned it right away.
> This fix looks exactly like what I need, thank you!!
>
> On Tue, Jan 31, 2023 at 9:16 AM Gyula Fóra wrote:
>
>> There is also a pending fix for t
iner_utils.go#L215>,
>> I thought
>> this would be a common issue but since you've never seen this error before,
>> not sure what to do
>>
>> On Fri, Jan 27, 2023 at 10:59 PM Gyula Fóra wrote:
>>
>>> We never encountered this problem be
We never encountered this problem before but also we don't configure those
settings.
Can you simply try:
high-availability: kubernetes
And remove the other configs? I think that can only cause problems and
should not achieve anything :)
Gyula
On Fri, Jan 27, 2023 at 6:44 PM Anton Ippolitov via
Did you check the Python example?
https://github.com/apache/flink-kubernetes-operator/tree/main/examples/flink-python-example
Gyula
On Wed, Jan 25, 2023 at 2:54 PM Evgeniy Lyutikov
wrote:
> Hello
>
> Is there a way to run PyFlink jobs in k8s with flink kubernetes operator?
> And if not, is it
Hi Devs!
We noticed a very strange failure scenario a few times recently with the
Native Kubernetes integration.
The issue is triggered by a heartbeat timeout (a temporary network
problem). We observe the following behaviour:
===
3 pods (1 JM, 2 TMs), Flink 1.15
But of course the actual memory requirement will largely depend on the type
of job, state backend, number of task slots, etc.
Production TM/JMs usually have much more resources allocated than 2gb/1cpu
as you never want to run out of it :)
Gyula
On Sat, 21 Jan 2023 at 11:17, Gyula Fóra wrote
Hi!
I think the examples allocate too many resources by default and we should
reduce them in the yamls.
1gb memory and 0.5 cpu should be more than enough, we could probably get
away with even less for example purposes.
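i.e. something like this in the example yamls:

```yaml
jobManager:
  resource:
    memory: 1024m
    cpu: 0.5
taskManager:
  resource:
    memory: 1024m
    cpu: 0.5
```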
Would you have time to try this out and maybe contribute this
improvement?
Hi Javier,
I will try to look into this as I have not personally seen this problem
while using the operator.
It would be great if you could reach out to me on slack or email directly
so we can discuss the issue and get to the bottom of it.
Cheers,
Gyula
On Fri, 20 Jan 2023 at 23:53, Javier
.java#L43
>
> On Thu, Jan 19, 2023 at 1:59 PM Őrhidi Mátyás
> wrote:
>
>> On a side note, we should probably use a qualified label name instead of
>> the pretty common app here. WDYT Gyula?
>>
>> On Thu, Jan 19, 2023 at 1:48 PM Gyula Fóra wrote:
>>
>
Hi!
The app label itself is used by Flink internally for a different purpose so
it’s overridden. This is completely expected.
I think it would be better to use some other label :)
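For example via the pod template (the label name here is made up, use your
own qualified name):

```yaml
podTemplate:
  metadata:
    labels:
      myorg.example/app: my-flink-job   # hypothetical qualified label
```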
Cheers,
Gyula
On Thu, 19 Jan 2023 at 19:02, Andrew Otto wrote:
> Hello!
>
> I'm seeing an unexpected label value
Please see the release announcements:
https://flink.apache.org/news/2022/10/07/release-kubernetes-operator-1.2.0.html
https://flink.apache.org/news/2022/12/14/release-kubernetes-operator-1.3.0.html
https://flink.apache.org/news/2023/01/10/release-kubernetes-operator-1.3.1.html