wesm commented on PR #13753:
URL: https://github.com/apache/arrow/pull/13753#issuecomment-1206908421

   I do not think the implementation is less robust — the only change is that 
ephemeral `DataType` instances cannot be used to instantiate `InputType` 
without creating a TypeMatcher that wraps them. If the developer does create a 
KernelSignature / InputType with an ephemeral `shared_ptr<DataType>`, the 
signature would be unusable so any unit test which attempts to use the kernel 
would segfault. The probability of an Arrow user ever facing a bug related to 
this essentially zero — it would amount to us merging untested kernels code 
(because tests would immediately reveal the programming error). 
   
   It also comes down to a cost philosophy in this code:
   
   * Pay the cost of using shared_ptr always — both in generated code (because 
most shared_ptr operations are inlined by the compiler) and in performance
   * Pay the cost only when it's needed
   
   It turns out that in this code, that retaining a `shared_ptr<DataType>` for 
kernel function signatures and type validation is rarely needed. Why pay the 
cost for > 95% of cases (or whatever the number is?) when the shared_ptr is 
only actually needed in < 5% of cases? I recall that we've already seen 
shared_ptr contention show up unfavorably in performance profiles. 
   
   I will take the time when I can to write some benchmarks (for me, the 
present-and-future savings in generated code is evidence enough) that will 
hopefully provide some additional evidence to support this. In the meantime I 
hope that the rebase conflicts will not be too annoying to keep up with. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to