potiuk commented on PR #24085:
URL: https://github.com/apache/airflow/pull/24085#issuecomment-1145982469

   Hey @uranusjr - actually I needed to do quite a number of other changes to 
make the tests pass. I am not sure however if some of those changes are how 
they should - be some of them were based rather on intuition than facts as the 
current hierarchy of operators and how it interacts with Serialization is 
well,, convoluted to say the least (most of it obviously due to history of 
adding MappedOperator on top of existing BaseOperator).
   
   * I believe we had qute a bit of inconsistency in how we treated hash/eq of 
operators for BasedOperator and MappedOperator. I  think I fixed it - though 
when I tried to move it level up (to AbstractOperator) I had some strange 
"MapperOperator" is not hashable even if it derived from AbstractOperator - but 
likely some magic of serializing/deserializing which I don't fully understand 
caused it. 
   
   * operators_extra_links in baseOperator were )_ rather than [] and it cause 
comparision with serialized version wrong (this is the one I have most 
reservations about - likely it shoudl be changed elsewhere
   
   * XComArg() is recreated when serializing and since is passed as value in a 
dictionary, the field comparision flagged identical instances as different. I 
implemented proper hash there to make it hashable and usable in set 
comparision, and had to make a bit more complex comparision for dicts - to 
compare values by equality rather than identity.
   
   * when serializing /deserializing we loose "DecoratedMappedOperator" - 
likely for a good reason, that's why I had to relax the check in __eq for type 
otherwise we would have DecoratedMappedOperator != MappedOperator. I think it 
might not matter that we are loosing Decorated (because I think they are 
equivalent after deserializing) but maybe it's worth looking at. 
   
   * I excluded some fields from comparision, because they were wildly 
different after serialize/deserialize. I guess it does not matter as those are 
only used to before mapping  (serializing happens already for fully mapped task 
I think) but it woudl be great to double check if that is the right approach 
and whether we have not lost anything 
   
   For dags:
         * 'partial_kwargs',
         * 'operator_class',
   For tasks:
         * 'operator_class',
         * 'partial_kwargs',
         * 'mapped_op_kwargs',
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to