pandalee99 opened a new pull request, #1720:
URL: https://github.com/apache/fury/pull/1720

   ## What does this PR do?
   ref: https://arxiv.org/pdf/1902.08318.pdf
   ref: https://github.com/simdutf/simdutf
   While studying SIMD techniques, I read the paper and the simdutf project linked above.
   This PR applies SIMD to string detection.
   The first step is to implement the logic for Latin character detection, starting from a scalar baseline:
   ``` c++
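   #include <string>

   // Baseline implementation: scans one byte at a time and
   // returns false as soon as a byte with value >= 128 is found.
   bool isLatin_Baseline(const std::string& str) {
       for (char c : str) {
           if (static_cast<unsigned char>(c) >= 128) {
               return false;
           }
       }
       return true;
   }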
   ```
   <img width="393" alt="image" src="https://raw.githubusercontent.com/pandalee99/image_store/master/hexo/simd_base_line_test1.png">
   Then I used SSE2 to speed it up, which is a little faster. The idea is to read multiple characters at once and then check them with bitwise operations.
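
To illustrate the idea of reading several characters at once, here is a minimal sketch of an SSE2 check. It is not the code from this PR: the name `isLatin_SSE2Sketch` is hypothetical, and the sketch assumes the check only needs to know whether any byte has its top bit set. It falls back to the scalar loop when SSE2 is not available at compile time.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>  // SSE2 intrinsics
#define SKETCH_HAS_SSE2 1
#endif

// Hypothetical name, for illustration only.
bool isLatin_SSE2Sketch(const std::string& str) {
    const char* data = str.data();
    const std::size_t len = str.size();
    std::size_t i = 0;
#ifdef SKETCH_HAS_SSE2
    // Examine 16 bytes per iteration with an unaligned load.
    for (; i + 16 <= len; i += 16) {
        const __m128i chunk =
            _mm_loadu_si128(reinterpret_cast<const __m128i*>(data + i));
        // movemask collects the top bit of each of the 16 bytes;
        // a non-zero result means at least one byte is >= 128.
        if (_mm_movemask_epi8(chunk) != 0) {
            return false;
        }
    }
    // Fewer than 16 bytes remain for the scalar tail.
    assert(len - i < 16);
#endif
    // Scalar tail (or the whole string when SSE2 is unavailable).
    for (; i < len; ++i) {
        if (static_cast<unsigned char>(data[i]) >= 128) {
            return false;
        }
    }
    return true;
}
```

The scalar tail matters: strings whose length is not a multiple of 16 would otherwise leave their last bytes unchecked.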
   The SSE2 version was faster than the baseline, but I didn't think the gain was enough, so I tried again with AVX2.
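
The same approach widens naturally to 32 bytes per load with AVX2. The following is again only a sketch under assumptions: `isLatin_AVX2Sketch` is a hypothetical name, and the AVX2 path is compiled only when the compiler defines `__AVX2__` (for example with `-mavx2` on GCC or Clang). It does not address runtime CPU detection, which a real build would likely need.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

#if defined(__AVX2__)
#include <immintrin.h>  // AVX2 intrinsics
#endif

// Hypothetical name, for illustration only.
bool isLatin_AVX2Sketch(const std::string& str) {
    const char* data = str.data();
    const std::size_t len = str.size();
    std::size_t i = 0;
#if defined(__AVX2__)
    // Examine 32 bytes per iteration with an unaligned load.
    for (; i + 32 <= len; i += 32) {
        const __m256i chunk =
            _mm256_loadu_si256(reinterpret_cast<const __m256i*>(data + i));
        // One mask bit per byte; any set bit means a byte >= 128.
        if (_mm256_movemask_epi8(chunk) != 0) {
            return false;
        }
    }
    // Fewer than 32 bytes remain for the scalar tail.
    assert(len - i < 32);
#endif
    // Scalar tail (or the whole string when AVX2 is not enabled).
    for (; i < len; ++i) {
        if (static_cast<unsigned char>(data[i]) >= 128) {
            return false;
        }
    }
    return true;
}
```

Compared with a 16-byte loop, this halves the number of iterations on long inputs, while short strings are handled almost entirely by the scalar tail.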
   <img width="493" alt="image" src="https://raw.githubusercontent.com/pandalee99/image_store/master/hexo/simd_test_all_1.png">
   In terms of efficiency, I think this is already much faster than before.
   But how can we show that it is also logically correct?
   I added test cases to verify it.
   
   ``` C++
   TEST(StringUtilTest, TestIsLatinLogic)
   ```
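
One way to check logical correctness is to compare a faster implementation against the baseline on inputs that stress chunk boundaries. The sketch below is an assumption about what such a check could look like, not the body of `TestIsLatinLogic`. It uses plain C++ rather than GoogleTest, and `agreesWithBaseline` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Scalar reference, as in the baseline above.
bool isLatin_Baseline(const std::string& str) {
    for (char c : str) {
        if (static_cast<unsigned char>(c) >= 128) {
            return false;
        }
    }
    return true;
}

// Hypothetical helper: returns true if `candidate` gives the same answer
// as the baseline for every length up to maxLen, both for pure-ASCII input
// and with a single byte 0x80 placed at every possible position.
bool agreesWithBaseline(bool (*candidate)(const std::string&),
                        std::size_t maxLen = 80) {
    for (std::size_t len = 0; len <= maxLen; ++len) {
        std::string s(len, 'a');
        if (candidate(s) != isLatin_Baseline(s)) {
            return false;
        }
        for (std::size_t pos = 0; pos < len; ++pos) {
            s[pos] = static_cast<char>(0x80);  // smallest byte value >= 128
            if (candidate(s) != isLatin_Baseline(s)) {
                return false;
            }
            s[pos] = 'a';  // restore before moving to the next position
        }
    }
    return true;
}
```

Sweeping every length and every position covers the cases where the offending byte falls in the first vector, exactly on a 16- or 32-byte boundary, or in the scalar tail.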
   
   Finally, I ran the tests:
   <img width="493" alt="image" src="https://raw.githubusercontent.com/pandalee99/image_store/master/hexo/simd_ubantu_test_1.png">
   done.
   
   
   <!-- Describe the purpose of this PR. -->
   
   
   ## Related issues
   Closes #313 
   
   <!--
   Is there any related issue? Please attach here.
   
   - #xxxx0
   - #xxxx1
   - #xxxx2
   -->
   
   
   ## Does this PR introduce any user-facing change?
   
   <!--
   If any user-facing interface changes, please [open an 
issue](https://github.com/apache/fury/issues/new/choose) describing the need to 
do so and update the document if necessary.
   -->
   
   - [ ] Does this PR introduce any public API change?
   - [ ] Does this PR introduce any binary protocol compatibility change?
   
   
   ## Benchmark
   
   <!--
   When the PR has an impact on performance (if you don't know whether the PR 
will have an impact on performance, you can submit the PR first, and if it will 
have impact on performance, the code reviewer will explain it), be sure to 
attach a benchmark data here.
   -->
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

