dstandish commented on issue #6210: [AIRFLOW-5567] BaseReschedulePokeOperator
URL: https://github.com/apache/airflow/pull/6210#issuecomment-557931240
 
 
   So there were concerns about idempotency with @Fokko's XCom change.
   
   I think it makes sense to create a second table, very similar to XCom, but 
designed specifically to support stateful tasks.  The table could perhaps be 
called TaskState.  
   This task state should not be pegged to a specific execution date, because 
execution date only really makes sense for non-stateful tasks.  And execution 
date can be out of sequence with actual run time.
   I think it might make sense to make the table append-only: when state 
changes, we insert a new record with the current state instead of updating the 
old one.  The primary key would be dag id / task id / timestamp.  To get the 
current state, we read the latest record for the dag / task.  We could allow 
state to be namespaced under task id with a `key` column, as is done with XCom, 
but I don't think it's necessary. 
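   
   To make the idea concrete, here is a minimal sketch of an append-only state 
table.  It uses sqlite3 only so that it runs standalone; the table name 
`task_state`, the `value` column, and the helpers `make_store`, `set_state` and 
`get_state` are all hypothetical names for illustration, not an existing 
Airflow API.

```python
import sqlite3
from datetime import datetime, timezone


def make_store():
    """Create an in-memory database holding the hypothetical task_state table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE task_state (
            dag_id    TEXT NOT NULL,
            task_id   TEXT NOT NULL,
            timestamp TEXT NOT NULL,  -- ISO 8601 UTC, so string order is time order
            value     TEXT,
            PRIMARY KEY (dag_id, task_id, timestamp)
        )
        """
    )
    return conn


def set_state(conn, dag_id, task_id, value, timestamp=None):
    """Record a state change by inserting a new row; existing rows are never updated."""
    if timestamp is None:
        timestamp = datetime.now(timezone.utc).isoformat()
    conn.execute(
        "INSERT INTO task_state (dag_id, task_id, timestamp, value) VALUES (?, ?, ?, ?)",
        (dag_id, task_id, timestamp, value),
    )
    conn.commit()


def get_state(conn, dag_id, task_id):
    """Return the most recently written state for the dag / task, or None."""
    row = conn.execute(
        "SELECT value FROM task_state "
        "WHERE dag_id = ? AND task_id = ? "
        "ORDER BY timestamp DESC LIMIT 1",
        (dag_id, task_id),
    ).fetchone()
    return row[0] if row else None
```

   Note that nothing here refers to execution date: the current state is simply 
whatever was written last.  Two writes for the same dag / task with an 
identical timestamp would violate the primary key, so a real implementation 
would need to decide how to handle that case.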
   
   I previously shared the concern: why create another table that is almost 
identical to XCom?  But the reality is XCom is problematic for stateful tasks 
in a number of ways.  Obviously there is the clearing / idempotency issue.  But 
additionally, if you use trigger dag, with XCom your next scheduled run won't 
get the current state, because the XCom lookup sorts by execution_date.
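   
   The trigger dag problem can be illustrated with a small pure-Python model.  
This is a simplification, not Airflow code: it assumes an XCom-style lookup 
only sees records whose execution_date is at or before the current run's 
execution_date, and that a manually triggered run gets an execution_date equal 
to its trigger time.  The record layout and both function names are made up 
for the example.

```python
from datetime import datetime

# Each record: (execution_date, written_at, value)
records = [
    # Daily scheduled run for 2019-11-23, which actually runs just after midnight on the 24th.
    (datetime(2019, 11, 23), datetime(2019, 11, 24, 0, 5), "state from scheduled run"),
    # Manual trigger at 10:00 on the 24th; its execution_date is the trigger time.
    (datetime(2019, 11, 24, 10), datetime(2019, 11, 24, 10, 1), "state from manual trigger"),
]


def xcom_style_lookup(records, current_execution_date):
    """Pick the latest record by execution_date, ignoring any dated after the current run."""
    visible = [r for r in records if r[0] <= current_execution_date]
    if not visible:
        return None
    return max(visible, key=lambda r: (r[0], r[1]))[2]


def task_state_lookup(records):
    """Pick the record that was written last, regardless of execution_date."""
    if not records:
        return None
    return max(records, key=lambda r: r[1])[2]


# The next scheduled run has execution_date 2019-11-24 00:00, which is earlier
# than the manual trigger's execution_date even though it runs later in wall time.
next_scheduled = datetime(2019, 11, 24)
stale = xcom_style_lookup(records, next_scheduled)
current = task_state_lookup(records)
```

   Under these assumptions `stale` is the value from the earlier scheduled run, 
while `current` is the value written by the manual trigger, which is the one a 
stateful task would actually want.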
   
   WDYT?
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
