sjyangkevin commented on PR #53690:
URL: https://github.com/apache/airflow/pull/53690#issuecomment-3157230656

   Hi @potiuk , @bolkedebruin , @uranusjr , and @ashb , thank you very much for 
your time and patient in reviewing this PR. I've pushed some small updates to 
see if it could resolve the concerns in our discussion.
   
   ### Updates to `numpy.py`
   1. The `import_string` is removed from `numpy.py` and the redundant check in 
the `deserializer` method is removed.
   2. `hasattr(np, "bool") and isinstance(o, np.bool) or isinstance(o, 
np.bool_)` is used as a version compatible check for both `numpy.bool_` (which 
is used as the NumPy scalar in NumPy version 1.x) and `numpy.bool` (which is 
used as the NumPy scalar in NumPy version 2.x, and `numpy.bool_` becomes an 
alias of `numpy.bool`)
   3. `np.float32` is added to the list of serializers (i.e., `serializers`), 
as mentioned in 
https://github.com/apache/airflow/pull/53690#discussion_r2230037816 and 
https://github.com/apache/airflow/pull/53690#issuecomment-3134483966.
   
   More on the third bullet point, if the `numpy.float32` is not in the list of 
serializers, we will get the following error. However, since it is part of the 
check `if isinstance(o, (np.float16, np.float32, np.float64, np.complex64, 
np.complex128)):`, probably make sense to have it in the list.
   <img width="1037" height="398" alt="Screenshot from 2025-08-05 20-35-18" 
src="https://github.com/user-attachments/assets/f2a1a2a0-5c85-46af-b766-e54af3da7811";
 />
   
   ### Test Cases
   
   @uranusjr , I would like to elaborate more on the modifications to 
`test_serializers_importable_and_str` to see if it make sense to generate those 
test cases dynamically.
   
   
https://github.com/apache/airflow/blob/9b1c16c2c847a17a40e404856524d596db4b81d2/airflow-core/tests/unit/serialization/test_serde.py#L389-L411
   
   The above code is the current implementation in main branch. It is a 
**single test case** with some "special handles" for either compatibility 
considerations or optional dependencies. Basically, I think the test attempts 
to simulate the process of registering serializers and check if each serializer 
is importable.
   
   I modified the test case by making it "parameterized" and the parameters are 
generated by the function `generate_serializers_importable_tests`. Therefore, 
for each serializer, a test case will be created, and we can have a control on 
each individual test from the test generation process, such as skipping some 
test case due to compatibility issue.
   
   I can then run the test in breeze environment with different version of 
NumPy installed.
   
   with NumPy 1.x installed. The `numpy.bool` test will be skipped because it 
is deprecated in NumPy 1.x. 
   <img width="1232" height="517" alt="Screenshot from 2025-08-05 21-14-17" 
src="https://github.com/user-attachments/assets/d6cface9-deaf-4473-93a2-df4e48129d68";
 />
   
   with NumPy 2.x installed. Both `numpy.bool` and `numpy.bool_` test cases 
will be executed because both are valid.
   <img width="1050" height="245" alt="Screenshot from 2025-08-05 21-12-42" 
src="https://github.com/user-attachments/assets/ec293dcf-f6c9-48da-838f-38fb5f8b5a5c";
 />
   
   When a new serializer is added, this function can also capture those 
automatically to test for import, so it is less likely to miss any serializer. 
I also try to improve the documentation to see if it could help to provide more 
clarity. But, I am open for further feedback you have regarding this 
implementation.
   
   ### Issue related to https://github.com/apache/airflow/issues/54097
   
   Looks like the issue is resolved through 
https://github.com/apache/airflow/pull/54139. @potiuk please let me know if 
there is anything related to the issue that is still needed to be resolved here.
   
   Thanks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to