[PR] feat(python): guide cython to optimize code generation [fury]

via GitHub Fri, 25 Oct 2024 00:08:44 -0700


penguin-wwy opened a new pull request, #1905:
URL: https://github.com/apache/fury/pull/1905


   
   ## What does this PR do?
   
   Optimize the C++ code generated by Cython by modifying the function 
implementations of the Buffer class.
   
   Using `get_int8` as an example, Cython generates the following C++ code 
before this change.
   
   ```c++
   static PyObject *__pyx_pf_6pyfury_5_util_6Buffer_26get_int8(struct __pyx_obj_6pyfury_5_util_Buffer *__pyx_v_self, uint32_t __pyx_v_offset) {
     ...
     __pyx_t_1 = __pyx_f_6pyfury_5_util_6Buffer_get_int8(__pyx_v_self, __pyx_v_offset, 1);
     if (unlikely(PyErr_Occurred())) __PYX_ERR(0, 119, __pyx_L1_error)
     __pyx_t_2 = __Pyx_PyInt_From_int8_t(__pyx_t_1);
     if (unlikely(!__pyx_t_2)) __PYX_ERR(0, 119, __pyx_L1_error)
     ...
   ```
   
   Because `get_int8` returns a C-level type, Cython has to generate the 
conversion to a Python integer (ultimately a `PyLong_FromLong` call) together 
with guard code such as a `PyErr_Occurred` check.
   However, a fixed-width integer such as `int8_t` can always be converted to a 
`PyLongObject` with `PyLong_FromLong`. So when we call `PyLong_FromLong` 
manually and return the resulting object, Cython only needs to check whether 
the return value is NULL, as shown in the following code.
   
   ```c++
     __pyx_t_1 = __pyx_f_6pyfury_5_util_6Buffer_get_int8(__pyx_v_self, __pyx_v_offset, 1);
     if (unlikely(!__pyx_t_1)) __PYX_ERR(0, 130, __pyx_L1_error)
   ```
   
   
   ## Related issues
   
   - #1887 
   
   ## Does this PR introduce any user-facing change?
   
   
   - [x] Does this PR introduce any public API change?
   - [ ] Does this PR introduce any binary protocol compatibility change?
   
   ## Benchmark
   
   Microbenchmark case:
   ```python
   def benchmark_get_bool(buffer):
       buffer.get_bool(0)
       buffer.get_bool(1)
       buffer.get_bool(100)
       buffer.get_bool(1000)
   
   def benchmark_get_int32(buffer):
       buffer.get_int32(0)
       buffer.get_int32(1)
       buffer.get_int32(100)
       buffer.get_int32(1000)
   
   
   def benchmark_get_float(buffer):
       buffer.get_float(0)
       buffer.get_float(1)
       buffer.get_float(100)
       buffer.get_float(1000)
   
   
   def benchmark_read(buffer):
       buffer.reader_index = 0
       buffer.read_int8()
       buffer.read_int16()
       buffer.read_int24()
       buffer.read_int32()
       buffer.read_int64()
       buffer.read_float()
       buffer.read_double()
   ```
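
The PR does not show how these functions were driven. The `Mean +- std dev` output and the `--affinity` flag suggest a pyperf runner, but that is an inference. As a rough, standard-library-only sketch of how one such case could be timed, the following uses `timeit` with a hypothetical `FakeBuffer` stand-in (not pyfury's `Buffer`; the little-endian layout is also an assumption):

```python
import struct
import timeit


class FakeBuffer:
    """Hypothetical stand-in for pyfury's Buffer; only what this sketch needs."""

    def __init__(self, size):
        self._data = bytearray(size)

    def get_int32(self, offset):
        # Assumed layout: little-endian signed 32-bit value at an absolute offset.
        return struct.unpack_from("<i", self._data, offset)[0]


def benchmark_get_int32(buffer):
    buffer.get_int32(0)
    buffer.get_int32(1)
    buffer.get_int32(100)
    buffer.get_int32(1000)


def time_ns_per_call(func, arg, number=10_000, repeat=5):
    """Return the best-of-`repeat` time for one func(arg) call, in nanoseconds."""
    timer = timeit.Timer(lambda: func(arg))
    best = min(timer.repeat(repeat=repeat, number=number))
    return best / number * 1e9


if __name__ == "__main__":
    buf = FakeBuffer(2048)
    ns = time_ns_per_call(benchmark_get_int32, buf)
    print(f"benchmark_get_int32: {ns:.0f} ns")
```

Swapping `FakeBuffer` for the real buffer object would measure the Cython code path. The absolute numbers from this sketch are not comparable to the results below.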
   
   Result:
   ```
   # before
   python benchmark.py --affinity 0
   .....................
   benchmark_get_bool: Mean +- std dev: 220 ns +- 4 ns
   .....................
   benchmark_get_int32: Mean +- std dev: 276 ns +- 10 ns
   .....................
   benchmark_get_float: Mean +- std dev: 254 ns +- 2 ns
   .....................
   benchmark_read: Mean +- std dev: 409 ns +- 11 ns
   
   # after
   python benchmark.py --affinity 0
   .....................
   benchmark_get_bool: Mean +- std dev: 215 ns +- 4 ns
   .....................
   benchmark_get_int32: Mean +- std dev: 264 ns +- 9 ns
   .....................
   benchmark_get_float: Mean +- std dev: 243 ns +- 6 ns
   .....................
   benchmark_read: Mean +- std dev: 380 ns +- 4 ns
   ```
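
To put the before/after means in relative terms, here is a small sketch that computes the percentage reduction in mean time for each case. The numbers are copied from the results above; the helper name `improvement_percent` is only illustrative.

```python
# Mean timings in nanoseconds, copied from the "before" and "after" runs.
before = {
    "benchmark_get_bool": 220,
    "benchmark_get_int32": 276,
    "benchmark_get_float": 254,
    "benchmark_read": 409,
}
after = {
    "benchmark_get_bool": 215,
    "benchmark_get_int32": 264,
    "benchmark_get_float": 243,
    "benchmark_read": 380,
}


def improvement_percent(old, new):
    """Relative reduction in mean time, as a percentage of the old value."""
    return (old - new) / old * 100.0


improvements = {
    name: improvement_percent(before[name], after[name]) for name in before
}

for name, pct in improvements.items():
    print(f"{name}: {pct:.1f}% less time")
```

This works out to roughly 2.3% for `benchmark_get_bool`, 4.3% for `benchmark_get_int32` and `benchmark_get_float`, and 7.1% for `benchmark_read`. For `benchmark_get_bool` the 5 ns difference is about the size of the reported standard deviation, so that case in particular should be read with caution.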


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

