allebacco commented on issue #21380:
URL: https://github.com/apache/airflow/issues/21380#issuecomment-1034734902


   I think this is an interesting feature. Since there is no direct 
connection between Databricks and Airflow, the job id can change when the job 
is re-created, whereas the name should be more stable than the id.
   
   A possible solution could be to update `execute` in this way:
   ```python
   # ... in the DatabricksRunNowOperator
   def execute(self, context):
       hook = self._get_hook()

       if 'job_name' in self.json:
           # If a job name has been provided, resolve it to its job id
           self.json['job_id'] = hook.find_job_id_by_name(self.json.pop('job_name'))

       self.run_id = hook.run_now(self.json)
       _handle_databricks_operator_execution(self, hook, self.log, context)

   # ... in the DatabricksHook
   def find_job_id_by_name(self, name: str) -> int:
       # List all jobs to find the one with the specified name;
       # if no job or more than one job matches, raise an exception
   ```
   
   See the dbx tool (by Databricks Labs) for an example implementation of 
`find_job_id_by_name`: 
https://github.com/databrickslabs/dbx/blob/main/dbx/utils/job_listing.py
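
To make the intent concrete, here is a minimal, self-contained sketch of the matching logic (an illustration, not the dbx or hook code). It assumes `jobs` is shaped like the `"jobs"` array of the Databricks Jobs API list response; fetching that list from the API is omitted.

```python
def find_job_id_by_name(jobs: list, name: str) -> int:
    """Return the job_id of the single job whose settings.name matches `name`.

    Raises ValueError if no job, or more than one job, has that name.
    """
    matches = [job["job_id"] for job in jobs
               if job.get("settings", {}).get("name") == name]
    if not matches:
        raise ValueError(f"No job named {name!r} was found")
    if len(matches) > 1:
        raise ValueError(f"More than one job is named {name!r}: ids {matches}")
    return matches[0]

# Hypothetical sample data mimicking a Jobs API list response
jobs = [
    {"job_id": 11, "settings": {"name": "etl"}},
    {"job_id": 22, "settings": {"name": "reporting"}},
]
print(find_job_id_by_name(jobs, "reporting"))  # 22
```

Raising on duplicate names (rather than picking the first match) keeps the operator from silently triggering the wrong job when two jobs share a name.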


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]