Operator/Autoscaler/Autotuner tuning behavior question

2024-05-08 Thread Maxim Senin via user
Hello. I have some questions about memory autotuning in the Operator. 1. Does the autotuner try to upgrade the job with more memory allocated if it intercepts OutOfMemoryError? Say I initially provided too little memory for TM `resource` - will the job fail and stop on initializing or will the

Re: [External] Regarding java.lang.IllegalStateException

2024-04-26 Thread Maxim Senin via user
My guess is that it’s a major known issue. Need a workaround. https://issues.apache.org/jira/browse/FLINK-32212 /Maxim From: prashant parbhane Date: Tuesday, April 23, 2024 at 11:09 PM To: user@flink.apache.org Subject: [External] Regarding java.lang.IllegalStateException Hello, We have been facing

Re: [External] Exception during autoscaling operation - Flink 1.18/Operator 1.8.0

2024-04-26 Thread Maxim Senin via user
oyment [INFO ][flink/f-d7681d0f-c093-5d8a-b5f5-2b66b4547bf6] Deleting Kubernetes HA metadata Any ideas? Thanks, Maxim From: Gyula Fóra Date: Friday, April 26, 2024 at 1:10 AM To: Maxim Senin Cc: Maxim Senin via user Subject: Re: [External] Exception during autoscaling operation - Flink 1.18/

Re: Regarding java.lang.IllegalStateException

2024-04-26 Thread Maxim Senin via user
We are also seeing something similar: 2024-04-26 16:30:44,401 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Source: Power Consumption:power_consumption -> Ingest Power Consumption -> PopSysFields -> WindowingWatermarkPreCheck (1/1)

Re: [External] Exception during autoscaling operation - Flink 1.18/Operator 1.8.0

2024-04-26 Thread Maxim Senin via user
still a mystery. Thanks, Maxim From: Gyula Fóra Date: Friday, April 26, 2024 at 1:10 AM To: Maxim Senin Cc: Maxim Senin via user Subject: Re: [External] Exception during autoscaling operation - Flink 1.18/Operator 1.8.0 Hi Maxim! Regarding the status update error, it could be related

Re: [External] Exception during autoscaling operation - Flink 1.18/Operator 1.8.0

2024-04-25 Thread Maxim Senin via user
: Maxim Senin via user Date: Thursday, April 25, 2024 at 12:01 PM To: Maxim Senin via user Subject: [External] Exception during autoscaling operation - Flink 1.18/Operator 1.8.0 Hi. I already asked before but never got an answer. My observation is that the operator, after collecting some stats

Exception during autoscaling operation - Flink 1.18/Operator 1.8.0

2024-04-25 Thread Maxim Senin via user
Hi. I already asked before but never got an answer. My observation is that the operator, after collecting some stats, is trying to restart one of the deployments. This includes taking a savepoint (`takeSavepointOnUpgrade: true`, `upgradeMode: savepoint`) and “gracefully” shutting down the

Job goes into FINISHED state after rescaling - Flink operator

2024-04-22 Thread Maxim Senin via user
Hi. My Flink Deployment is set to use savepoints for upgrades and to take a savepoint before stopping. When rescaling happens, for some reason it scales the JobManager to zero (“Scaling JobManager Deployment to zero with 300 seconds timeout”) and the job goes into FINISHED state. It doesn’t

Parallelism for auto-scaling, memory for auto-tuning - Flink operator

2024-04-17 Thread Maxim Senin via user
Hi. Does it make sense to specify `parallelism` for task managers or the `job`, and, similarly, to specify the memory amount for the task managers, or is it better to leave it to the autoscaler and autotuner to pick the best values? How many times would the autoscaler need to restart task managers