tobixdev commented on issue #17488:
URL: https://github.com/apache/datafusion/issues/17488#issuecomment-3271661746

   Could it be that the function is called more often, on smaller pieces of 
data? If so, this could be the cause of problem 2.
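
To make the question concrete, here is a rough sketch in plain Rust (no DataFusion APIs) of how the number of invocations grows when the same rows arrive in smaller batches. `CountingUdf` and `run_in_batches` are hypothetical names made up for illustration; they are not part of the repro or of DataFusion, and the doubling logic is only a placeholder for real UDF work.

```rust
/// Hypothetical stand-in for a UDF that records how often it is invoked.
struct CountingUdf {
    calls: usize,
}

impl CountingUdf {
    fn new() -> Self {
        Self { calls: 0 }
    }

    /// Processes one batch (placeholder work: doubles every value).
    /// Any fixed per-call setup cost would be paid once per invocation here.
    fn invoke(&mut self, batch: &[i64]) -> Vec<i64> {
        self.calls += 1;
        batch.iter().map(|v| v * 2).collect()
    }
}

/// Feeds `data` to the UDF in chunks of at most `batch_size` rows.
/// Returns the number of invocations and the concatenated output.
/// Panics if `batch_size` is 0 (a requirement of `slice::chunks`).
fn run_in_batches(data: &[i64], batch_size: usize) -> (usize, Vec<i64>) {
    let mut udf = CountingUdf::new();
    let mut out = Vec::with_capacity(data.len());
    for chunk in data.chunks(batch_size) {
        out.extend(udf.invoke(chunk));
    }
    (udf.calls, out)
}
```

With 10,000 input rows, `run_in_batches(&data, 8192)` invokes the function 2 times, while `run_in_batches(&data, 100)` invokes it 100 times for the same output. If each call carries a fixed overhead, that overhead is paid 50 times as often in the second case.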
   
   Here is a minimal example that does something similar (even though I could 
not emulate the UDF yet). Unfortunately, I could only test it in debug mode, as 
I am in a hurry right now. If you need more, I can help you tomorrow or the day after.
   
   Just change the branch to `branch-49` in the Cargo.toml to see the 
difference.
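
For reference, a git dependency pinned to a branch usually looks like the fragment below. This is only a sketch: the exact dependency line in the repro's Cargo.toml may differ (for example, it may enable extra features), so treat everything except the `branch` value as an assumption.

```toml
[dependencies]
# Assumed shape of the dependency entry; only the branch name changes between runs.
datafusion = { git = "https://github.com/apache/datafusion", branch = "branch-49" }
```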
   
   DataFusion 50:
   
   ```
   ________________________________________________________
   Executed in    8.17 secs    fish           external
      usr time    8.05 secs  220.00 micros    8.05 secs
      sys time    0.12 secs  561.00 micros    0.12 secs
   ```
   
   DataFusion 49:
   
   ```
   ________________________________________________________
   Executed in    7.40 secs      fish           external
      usr time  132.61 millis    0.04 millis  132.57 millis
      sys time  111.55 millis    1.08 millis  110.48 millis
   ```
   
   Here are a zip of the source and a profile (debug mode). Hopefully 
this works. Sorry, I couldn't really test it due to time constraints.
   
   Thanks for looking into this!
   
   [repro.zip](https://github.com/user-attachments/files/22239587/repro.zip)
   
   
[profile.json.gz](https://github.com/user-attachments/files/22239569/profile.json.gz)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

