Hi guys, I have written a couple of custom UDFs, specifically WEEK() and WEEKYEAR(), to get that date information out of timestamps.
I sampled two queries on approximately 11 million records stored in Parquet files:

  select count(*) from `table` group by extract(day from `timestamp`)
  750 ms

  select count(*) from `table` group by week(`timestamp`)
  2100 ms

The code for the WEEK() function is not far from the code in the source for the EXTRACT(DAY) function. Furthermore, even if I copy the exact code of the EXTRACT(DAY) function into my UDF, it shows the same slowdown.

My question is: why would a UDF be so much slower? Is this by design, or is there something I'm missing? Happy to attach the source code of the function if that helps.
