fjetter commented on a change in pull request #16829:
URL: https://github.com/apache/airflow/pull/16829#discussion_r673227899
##########
File path: airflow/executors/dask_executor.py
##########
@@ -81,7 +81,22 @@ def airflow_run():
if not self.client:
raise AirflowException(NOT_STARTED_MESSAGE)
- future = self.client.submit(airflow_run, pure=False)
+ resources = None
+ if queue is not None:
+ avail_resources = self.client.run_on_scheduler(
+ lambda dask_scheduler: dask_scheduler.resources
+ )
Review comment:
This technically works, and `run_on_scheduler` is here to stay, but we
usually use it only for debugging purposes.
The scheduler can also be configured to disable this method entirely,
since it otherwise allows arbitrary (and potentially malicious) code to be
executed on the scheduler.
I see two possibilities:
1. Contribute a `get_resources` method to `distributed` itself. I think
this is a fair feature request.
2. Use a scheduler plugin (see
https://distributed.dask.org/en/latest/plugins.html#distributed.diagnostics.plugin.SchedulerPlugin).

The scheduler plugin and the contribution would use the same code; it's just
a question of where the code lives. It should be as simple as
```python
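from distributed.diagnostics.plugin import SchedulerPlugin


class GetResourcesPlugin(SchedulerPlugin):
    def __init__(self, scheduler):
        self.scheduler = scheduler
        # Key uses an underscore so it matches the client-side attribute name.
        self.scheduler.handlers["get_resources"] = self.get_resources

    def get_resources(self, comm=None):
        # `comm` is supplied by the scheduler when it invokes a handler.
        return self.scheduler.resources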
```
On the client side it's then simply
```python
avail_resources = client.sync(client.scheduler.get_resources)
```
It's a bit of boilerplate, but this ensures that it will always work, even
in locked-down clusters.
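
For the executor side, here is a minimal sketch of how the returned mapping
could be turned into the `resources` argument. The helper name
`resources_for_queue`, the use of a plain `ValueError`, and requesting one
unit of a resource named after the queue are my assumptions for
illustration, not code from this PR. I'm also assuming the handler returns a
mapping keyed by resource name, like `scheduler.resources`.

```python
from typing import Mapping, Optional


def resources_for_queue(
    queue: Optional[str], avail_resources: Mapping[str, object]
) -> Optional[dict]:
    """Hypothetical helper: map an Airflow queue name to a Dask resources dict."""
    if queue is None:
        # No queue requested, so place no resource constraint on the task.
        return None
    if queue not in avail_resources:
        # Fail early rather than submit a task that no worker could pick up.
        raise ValueError(f"No Dask worker advertises resource {queue!r}")
    # Assumption: request one unit of the resource named after the queue.
    return {queue: 1}


# Stand-in for what the scheduler handler would return.
avail = {"gpu": {"tcp://10.0.0.1:40000": 2}}
print(resources_for_queue("gpu", avail))
print(resources_for_queue(None, avail))
```

The result could then be passed along as
`self.client.submit(airflow_run, pure=False, resources=resources)`.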