Hi,

I have a workload that depends on the GPU, and I have only one GPU card. As
per the documentation, I have added the necessary configuration and can run
the GPU workload in standalone REACTIVE mode with as many TaskManager
instances as required.

I have set the number of task slots to 1 so that an increase in parallelism
causes a new pod to be created. I can scale up the job just fine in this
mode. However, when I add autoscaling configuration to the FlinkDeployment
manifest, scaling up doesn't work. This seems to be because, with the
autoscaling manifest, GPU resource requests and limits are automatically
set on the pods. This is not the case in standalone mode, which I guess is
why scaling up doesn't cause any issues there.

So, what can I do to get the autoscaler working? I'm using Flink version
1.17.1 with PyFlink and Flink Kubernetes Operator version 1.5.0.


Regards,
Sunny
