tobixdev commented on issue #17488:
URL: https://github.com/apache/datafusion/issues/17488#issuecomment-3271661746
Could it be that the function is called more often on smaller pieces of
data? If yes, this could be the problem for 2.
Here is a minimal example that does something similar (even though I could
not emulate the UDF yet). Unfortunately, I could only test it in debug mode, as
I am in a hurry right now. If you need more, I can help tomorrow or the day after.
Just change the branch to `branch-49` in the Cargo.toml to see the
difference.
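For reference, switching the branch might look like the fragment below. This is only a sketch: the actual dependency entry in the repro's Cargo.toml is not shown in this comment, so the use of a Git dependency on the upstream repository (and the absence of any feature flags) is an assumption.

```toml
[dependencies]
# Assumed form of the entry; only the branch name `branch-49` comes from the comment.
# Swap the branch value to compare the two release lines.
datafusion = { git = "https://github.com/apache/datafusion", branch = "branch-49" }
```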
DataFusion 50:
```
________________________________________________________
Executed in    8.17 secs    fish           external
   usr time    8.05 secs  220.00 micros    8.05 secs
   sys time    0.12 secs  561.00 micros    0.12 secs
```
DataFusion 49:
```
________________________________________________________
Executed in    7.40 secs      fish           external
   usr time  132.61 millis    0.04 millis  132.57 millis
   sys time  111.55 millis    1.08 millis  110.48 millis
```
Here are a zip of a profile (debug mode) and a zip of the source. Hopefully
this works. Sorry, I couldn't really test it due to time constraints.
Thanks for looking into this!
[repro.zip](https://github.com/user-attachments/files/22239587/repro.zip)
[profile.json.gz](https://github.com/user-attachments/files/22239569/profile.json.gz)