potiuk commented on code in PR #38531:
URL: https://github.com/apache/airflow/pull/38531#discussion_r1556118149


##########
airflow/models/taskinstance.py:
##########
@@ -1287,7 +1287,7 @@ class TaskInstance(Base, LoggingMixin):
     queued_dttm = Column(UtcDateTime)
     queued_by_job_id = Column(Integer)
     pid = Column(Integer)
-    executor_config = Column(ExecutorConfigType(pickler=dill))
+    executor_config = Column(ExecutorConfigType(pickler=cloudpickle))

Review Comment:
   I believe - we cannot use pickle. There are certain executors (K8S executor 
is the main culprit) that can use configuration that is not picklable because 
it can use V1Pod and similar classes for custom pod template. 
   
   See the example here where you provide custom Pod defintion via 
executor-config passed to the task. Such config is serialized when the DAG is 
parsed and stored in the DB, and deserialized by the executor when it attempts 
to execute it:
   
   
https://airflow.apache.org/docs/apache-airflow-providers-cncf-kubernetes/stable/kubernetes_executor.html#pod-override
   
   It's likely however that we could swap the default for `cloudpickle` (or 
maybe even choose whichever pickling class is available and importable - we 
could detect it, but then there is an issue for conversion for past task in 
case of changeover - as @hussein-awala mentioned and implemented for encrypted 
kwargs: https://github.com/apache/airflow/pull/38358
   
   I still think it's possibe to have the libraries as optional for core (and 
pre-install both in airflow reference image), but It needs a bit of 
deliberation and right errors/messaging when there are already task in the DB 
which have been serialized using one of them, but when the library is missing - 
basically handling the case where someone had tasks created in the DB using 
`dill` and starts airflow without `dill` - we should raise an error and tell 
the user to install `dill` to make the on-the-flight conversion of old values). 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to