Dandandan commented on pull request #2124:
URL: 
https://github.com/apache/arrow-datafusion/pull/2124#issuecomment-1082673469


   > Thanks again @Dandandan for working on JIT! I hope I can have time to work 
with you on jit soon. 
   > 
   > FWIW, one of our blockers is that Cranelift has no ability to inline assembly, 
so it causes a performance regression for columnar to row conversion (as 
recognized in https://github.com/apache/arrow-datafusion/pull/1975). A possible 
solution is to use LLVM instead of Cranelift, because PostgreSQL and Impala have 
shown that inlining is feasible in LLVM. I would pick it up if it's not taken 
already by then.
   
   Cool, thanks for the context.
   
   I was thinking that for compiling expressions with the Cranelift JIT this 
would not be too bad, as we could generate the whole loop that produces the new 
array contents, so the generated code doesn't need to call any function. I am 
not sure we would run into the same problem as in that thread, as it is about 
accessors for data structures? It seems that if we can operate on arrays of 
primitive data types this wouldn't be problematic?
   
   I was thinking of the following strategy:
   
   * Get the start address, the array length and the byte width of the 
primitive type.
   * Pre-allocate an output array and pass its address.
   * Use this info to generate the full loop: the loop runs over the length, 
the assignment reads and writes through the addresses, and each iteration 
advances the pointers by the byte width.
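   
   As a rough sketch of the strategy above, this is what the generated loop 
would be equivalent to if written by hand in Rust. It is not Cranelift code and 
not DataFusion's API: the function names `add_one_i32_loop` and `run_add_one`, 
the choice of `i32`, and the `+ 1` expression are all hypothetical, and it 
assumes no null bitmap needs to be handled.

```rust
/// Hand-written equivalent of a loop a JIT could emit for `out[i] = in[i] + 1`
/// over an array of a primitive type. Hypothetical example, i32 only.
///
/// # Safety
/// `input` and `output` must each point to at least `len * width` valid bytes,
/// must not overlap, and `width` must equal `size_of::<i32>()`.
unsafe fn add_one_i32_loop(input: *const u8, output: *mut u8, len: usize, width: usize) {
    unsafe {
        let mut src = input;
        let mut dst = output;
        // The loop bound is the array length.
        for _ in 0..len {
            // The assignment goes through the raw addresses.
            let v = (src as *const i32).read_unaligned();
            (dst as *mut i32).write_unaligned(v.wrapping_add(1));
            // Advance both pointers by the byte width of the primitive type.
            src = src.add(width);
            dst = dst.add(width);
        }
    }
}

/// Safe wrapper: pre-allocates the output array and passes its address.
fn run_add_one(values: &[i32]) -> Vec<i32> {
    let mut out = vec![0i32; values.len()];
    unsafe {
        add_one_i32_loop(
            values.as_ptr() as *const u8,
            out.as_mut_ptr() as *mut u8,
            values.len(),
            std::mem::size_of::<i32>(),
        );
    }
    out
}
```

   The loop body contains no function calls, only loads, an add, stores and 
pointer increments, which is the property the strategy relies on. Calling 
`run_add_one(&[1, 2, 3])` exercises it from safe code.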
   
   Is there anywhere this approach would get stuck?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.


