Could you include the physical plan generated for each query? Since you say you tried copying the exact code from Drill's EXTRACT function, you should see the same performance, unless for some reason the plan is different. There is no difference whatsoever between UDFs and built-in functions. Built-in functions are simply UDFs that happen to be packaged with Drill; otherwise there is nothing special about them.

On Tue, May 26, 2015 at 8:04 PM, Ted Dunning <[email protected]> wrote:

> On Tue, May 26, 2015 at 7:26 PM, Adam Gilmore <[email protected]>
> wrote:
>
> > The code for the WEEK() function is not far from the code from the source
> > for the EXTRACT(DAY) function. Furthermore, even if I copy the exact code
> > for the EXTRACT(DAY) function into that, it has the same performance
> > detriments.
> >
> > My question is, why would a UDF be so much slower? Is this by design or is
> > there something I'm missing?
> >
> > Happy to attach the source code of the function if that helps.
>
> Well, you might want to try exactly copying the source of the extract
> function. I would expect that you would get just the same performance
> since UDFs use the same mechanism as physical operators.
>
> Two possibilities are:
>
> 1) the Java optimizer has seen something subtle about your code or the
> built-in code that allows for economical implementation
>
> 2) the Drill optimizer has some kind of special trick that it has figured
> out
>
> 3) there is some sort of data type conversion that your code has forced the
> Drill optimizer to insert a conversion
>
> (the third option is a bonus, just for you)
>
> The fourth option that I don't know about is also quite a likely
> possibility.
>
> Seeing your code (put it in a gist, don't attach it) would help a lot.
> Seeing queries and query plans would help as well.

--
Steven Phillips
Software Engineer
mapr.com
