RemHero commented on issue #39365:
URL: https://github.com/apache/arrow/issues/39365#issuecomment-1871199198

   Thank you very much for your suggestions! I conducted some further 
experiments based on your advice and found that the `batchSize` has a 
significant impact on the computation of expressions. 
   
   Additionally, I implemented the computation using 
`arrow::compute::Expression`. However, its performance still falls short of 
the row-based implementation. Regarding the 'memory-bound' behavior you 
mentioned, I understand that my calculation allocates and copies a lot of 
memory because it stores intermediate results. To isolate this, I tested a 
single multiplication, which avoids the unnecessary intermediate result 
copies, and the performance difference between the two approaches was not 
significant. 
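
   To make the intermediate-result cost concrete, here is a minimal sketch in plain C++. It does not use the Arrow API and is not my actual benchmark: the expression `a * b + c` and all function names are hypothetical, chosen only for illustration. The first variant materializes a temporary column for each operator, roughly as a column-at-a-time evaluator would; the second computes the same values in one fused pass.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical column-at-a-time kernels: each one allocates and fills
// a full-length output column.
std::vector<double> MultiplyColumns(const std::vector<double>& a,
                                    const std::vector<double>& b) {
  assert(a.size() == b.size());
  std::vector<double> out(a.size());
  for (std::size_t i = 0; i < a.size(); ++i) out[i] = a[i] * b[i];
  return out;
}

std::vector<double> AddColumns(const std::vector<double>& a,
                               const std::vector<double>& b) {
  assert(a.size() == b.size());
  std::vector<double> out(a.size());
  for (std::size_t i = 0; i < a.size(); ++i) out[i] = a[i] + b[i];
  return out;
}

// a * b + c with a materialized intermediate: two allocations and two
// passes over memory.
std::vector<double> EvalMaterialized(const std::vector<double>& a,
                                     const std::vector<double>& b,
                                     const std::vector<double>& c) {
  std::vector<double> tmp = MultiplyColumns(a, b);  // intermediate column
  return AddColumns(tmp, c);
}

// The same expression fused into one loop: one allocation, one pass.
std::vector<double> EvalFused(const std::vector<double>& a,
                              const std::vector<double>& b,
                              const std::vector<double>& c) {
  assert(a.size() == b.size() && b.size() == c.size());
  std::vector<double> out(a.size());
  for (std::size_t i = 0; i < a.size(); ++i) out[i] = a[i] * b[i] + c[i];
  return out;
}
```

   With a single multiplication there is no intermediate column to materialize, which would be consistent with the small difference I measured in that case.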
   
   Based on this, I'm not certain whether the Arrow compute functions 
use SIMD optimization. Confirming that might require further investigation 
through disassembly. Going forward, I plan to continue experimenting with the 
`Array-wise ('vector') functions` mentioned in the official documentation and 
to follow your advice to hand-code the computation expressions on top of the 
Arrow columnar format. 
   
   I have read the documentation you provided, and I may also try 
setting an appropriate SIMD optimization level during compilation.
   I expect to have results within the next couple of days. @mapleFU 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
