potiuk commented on PR #24085:
URL: https://github.com/apache/airflow/pull/24085#issuecomment-1145982469
Hey @uranusjr - actually I needed to do quite a number of other changes to
make the tests pass. I am not sure however if some of those changes are how
they should - be some of them were based rather on intuition than facts as the
current hierarchy of operators and how it interacts with Serialization is
well,, convoluted to say the least (most of it obviously due to history of
adding MappedOperator on top of existing BaseOperator).
* I believe we had qute a bit of inconsistency in how we treated hash/eq of
operators for BasedOperator and MappedOperator. I think I fixed it - though
when I tried to move it level up (to AbstractOperator) I had some strange
"MapperOperator" is not hashable even if it derived from AbstractOperator - but
likely some magic of serializing/deserializing which I don't fully understand
caused it.
* operators_extra_links in baseOperator were )_ rather than [] and it cause
comparision with serialized version wrong (this is the one I have most
reservations about - likely it shoudl be changed elsewhere
* XComArg() is recreated when serializing and since is passed as value in a
dictionary, the field comparision flagged identical instances as different. I
implemented proper hash there to make it hashable and usable in set
comparision, and had to make a bit more complex comparision for dicts - to
compare values by equality rather than identity.
* when serializing /deserializing we loose "DecoratedMappedOperator" -
likely for a good reason, that's why I had to relax the check in __eq for type
otherwise we would have DecoratedMappedOperator != MappedOperator. I think it
might not matter that we are loosing Decorated (because I think they are
equivalent after deserializing) but maybe it's worth looking at.
* I excluded some fields from comparision, because they were wildly
different after serialize/deserialize. I guess it does not matter as those are
only used to before mapping (serializing happens already for fully mapped task
I think) but it woudl be great to double check if that is the right approach
and whether we have not lost anything
For dags:
* 'partial_kwargs',
* 'operator_class',
For tasks:
* 'operator_class',
* 'partial_kwargs',
* 'mapped_op_kwargs',
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]