sjyangkevin commented on PR #53690: URL: https://github.com/apache/airflow/pull/53690#issuecomment-3157230656
Hi @potiuk , @bolkedebruin , @uranusjr , and @ashb , thank you very much for your time and patient in reviewing this PR. I've pushed some small updates to see if it could resolve the concerns in our discussion. ### Updates to `numpy.py` 1. The `import_string` is removed from `numpy.py` and the redundant check in the `deserializer` method is removed. 2. `hasattr(np, "bool") and isinstance(o, np.bool) or isinstance(o, np.bool_)` is used as a version compatible check for both `numpy.bool_` (which is used as the NumPy scalar in NumPy version 1.x) and `numpy.bool` (which is used as the NumPy scalar in NumPy version 2.x, and `numpy.bool_` becomes an alias of `numpy.bool`) 3. `np.float32` is added to the list of serializers (i.e., `serializers`), as mentioned in https://github.com/apache/airflow/pull/53690#discussion_r2230037816 and https://github.com/apache/airflow/pull/53690#issuecomment-3134483966. More on the third bullet point, if the `numpy.float32` is not in the list of serializers, we will get the following error. However, since it is part of the check `if isinstance(o, (np.float16, np.float32, np.float64, np.complex64, np.complex128)):`, probably make sense to have it in the list. <img width="1037" height="398" alt="Screenshot from 2025-08-05 20-35-18" src="https://github.com/user-attachments/assets/f2a1a2a0-5c85-46af-b766-e54af3da7811" /> ### Test Cases @uranusjr , I would like to elaborate more on the modifications to `test_serializers_importable_and_str` to see if it make sense to generate those test cases dynamically. https://github.com/apache/airflow/blob/9b1c16c2c847a17a40e404856524d596db4b81d2/airflow-core/tests/unit/serialization/test_serde.py#L389-L411 The above code is the current implementation in main branch. It is a **single test case** with some "special handles" for either compatibility considerations or optional dependencies. Basically, I think the test attempts to simulate the process of registering serializers and check if each serializer is importable. I modified the test case by making it "parameterized" and the parameters are generated by the function `generate_serializers_importable_tests`. Therefore, for each serializer, a test case will be created, and we can have a control on each individual test from the test generation process, such as skipping some test case due to compatibility issue. I can then run the test in breeze environment with different version of NumPy installed. with NumPy 1.x installed. The `numpy.bool` test will be skipped because it is deprecated in NumPy 1.x. <img width="1232" height="517" alt="Screenshot from 2025-08-05 21-14-17" src="https://github.com/user-attachments/assets/d6cface9-deaf-4473-93a2-df4e48129d68" /> with NumPy 2.x installed. Both `numpy.bool` and `numpy.bool_` test cases will be executed because both are valid. <img width="1050" height="245" alt="Screenshot from 2025-08-05 21-12-42" src="https://github.com/user-attachments/assets/ec293dcf-f6c9-48da-838f-38fb5f8b5a5c" /> When a new serializer is added, this function can also capture those automatically to test for import, so it is less likely to miss any serializer. I also try to improve the documentation to see if it could help to provide more clarity. But, I am open for further feedback you have regarding this implementation. ### Issue related to https://github.com/apache/airflow/issues/54097 Looks like the issue is resolved through https://github.com/apache/airflow/pull/54139. @potiuk please let me know if there is anything related to the issue that is still needed to be resolved here. Thanks. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
