ddelange opened a new pull request #6506: [AIRFLOW-XXX] Fix unpacking of 
resources
URL: https://github.com/apache/airflow/pull/6506
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [ ] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
     - https://issues.apache.org/jira/browse/AIRFLOW-XXX
     - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
     - In case you are proposing a fundamental code change, you need to create 
an Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)).
     - In case you are adding a dependency, check if the license complies with 
the [ASF 3rd Party License 
Policy](https://www.apache.org/legal/resolved.html#category-x).
   
   ### Description
   
   When trying to initialize a KubernetesPodOperator with RAM and CPU 
limits, I followed the [Kubernetes 
docs](https://kubernetes.io/docs/tasks/administer-cluster/manage-resources/cpu-default-namespace/#what-if-you-specify-a-container-s-limit-but-not-its-request)
and added a dict of a form similar to 
https://github.com/apache/airflow/blob/5096fa3e48b4f79a29e4ad430e426c8bc9462449/tests/kubernetes/test_pod_generator.py#L92-L101:
   ```
   resources = {
       "limits": {"ram": "2000Mi", "cpus": "1.0"},
       "requests": {"ram": "2000Mi", "cpus": "1.0"},
   }
   ```
   However, this structure does not seem to propagate properly, so I tried a 
simplified `resources` dict compatible with the BaseOperator that the 
KubernetesPodOperator inherits from, and found this bug in the source code:
   ```
   > /usr/local/lib/python3.7/site-packages/airflow/models/baseoperator.py(383)__init__()
       382         import ipdb;ipdb.set_trace()
   --> 383         self.resources = Resources(*resources) if resources is not None else None
       384         self.run_as_user = run_as_user
   
   ipdb> resources
   {'cpus': 1.0, 'ram': 2000}
   ipdb> Resources(*resources)
   *** TypeError: '<' not supported between instances of 'str' and 'int'
   ipdb> Resources(**resources)
   {'cpus': {'_name': 'CPU', '_units_str': 'core(s)', '_qty': 1.0}, 'ram': {'_name': 'RAM', '_units_str': 'MB', '_qty': 2000}, 'disk': {'_name': 'Disk', '_units_str': 'MB', '_qty': 512}, 'gpus': {'_name': 'GPU', '_units_str': 'gpu(s)', '_qty': 0}}
   ipdb>
   ```
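
The difference between the two calls in the trace can be reproduced outside Airflow. The sketch below uses simplified stand-in `Resource` and `Resources` classes (hypothetical, not Airflow's actual implementation; the `cpus` default is an assumption, and the check on `qty` is assumed to be a plain `< 0` comparison). Unpacking a dict with a single `*` passes its keys as positional arguments, so the string `'cpus'` ends up where a number is expected:

```python
class Resource:
    """Simplified stand-in: one resource with a non-negative quantity."""

    def __init__(self, name, units_str, qty):
        # Assumed validation step. Comparing a str to 0 raises TypeError.
        if qty < 0:
            raise ValueError(f"{name} quantity must be non-negative, got {qty}")
        self.name = name
        self.units_str = units_str
        self.qty = qty


class Resources:
    """Simplified stand-in for the container of all resource settings."""

    def __init__(self, cpus=1, ram=512, disk=512, gpus=0):
        self.cpus = Resource("CPU", "core(s)", cpus)
        self.ram = Resource("RAM", "MB", ram)
        self.disk = Resource("Disk", "MB", disk)
        self.gpus = Resource("GPU", "gpu(s)", gpus)


resources = {"cpus": 1.0, "ram": 2000}

# Single star: equivalent to Resources("cpus", "ram"), i.e. the dict KEYS
# are passed positionally, so the first quantity is the string "cpus".
try:
    Resources(*resources)
    positional_error = None
except TypeError as exc:
    positional_error = str(exc)

# Double star: equivalent to Resources(cpus=1.0, ram=2000).
unpacked = Resources(**resources)

print(positional_error)
print(unpacked.cpus.qty, unpacked.ram.qty, unpacked.disk.qty, unpacked.gpus.qty)
```

Under these assumptions, the single-star call fails with the same `TypeError` as in the trace, while the double-star call fills `cpus` and `ram` from the dict and leaves `disk` and `gpus` at their defaults.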
   
   I suspect `resources` is the wrong kwarg for limiting the resources of the 
Pod that the KubernetesPodOperator spawns, since this code is not used at all: 
https://github.com/apache/airflow/blob/5096fa3e48b4f79a29e4ad430e426c8bc9462449/airflow/kubernetes/pod.py#L28
   Instead, it simply uses 
https://github.com/apache/airflow/blob/5096fa3e48b4f79a29e4ad430e426c8bc9462449/airflow/models/baseoperator.py#L397
   
   If someone can tell me how to properly set these resource limits for a 
KubernetesPod, please let me know.

