Hi, I've been fiddling around and I have something that seems to work on my machine ...
I did a few things:

1) Split the project into two modules: the function and the tests. I did this because of the special packaging requirements that force the source of the function to be included.
   https://github.com/nielsbasjes/yauaa/blob/DrillUDF/udfs/drill/
   https://github.com/nielsbasjes/yauaa/blob/DrillUDF/udfs/drill-tests/
2) I simplified the function to have two strings in and one string out.

Status: I documented the current status here:
https://github.com/nielsbasjes/yauaa/blob/DrillUDF/README-Drill.md

The UDF works, but I truly dislike the current code; judging from all the examples I have seen, this seems to be the code style that is required.
As a consequence the original method that returns a map is untested.
I have been able to write a kind of test for the "one value" variant but not for the "Map" variant.
See: https://github.com/nielsbasjes/yauaa/blob/DrillUDF/udfs/drill-tests/src/test/java/nl/basjes/parse/useragent/drill/TestParseUserAgentFunctionField.java#L80

Any assistance towards 'clean' code is much appreciated.

Niels Basjes

On Wed, Jan 24, 2018 at 7:40 PM, Paul Rogers <[email protected]> wrote:

> Hi Niels,
>
> Good questions. I can answer a few right off, some others will need a bit more research.
>
> > (simple) UDF Charles created
>
> Charles, do you know if your example still works? Nothing changed in the UDF API so, it probably should... I took a quick look and everything looks OK. (Have not tried to run the UDF, however.)
>
> > Work in progress
>
> Your function uses a complex writer. This is one area I have not yet personally explored. I'll need to do a bit more homework before I can answer.
>
> > consistently fails with this error
>
> Drill uses a specific version of Netty and adds custom classes to the Netty name space. Be sure you are using the same Netty version as Drill. In fact, you should build with Drill's own dependencies: bring in Drill as the dependency for your project and Drill will bring in everything you need.
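
(Inline note on the Netty point above: a minimal sketch, not taken from Drill or from the yauaa repository, of one way to see which jar a class is actually loaded from. It uses only standard JDK calls; the helper name `ClassOrigin` is made up for illustration, and the class names in `main` are the ones that appear in the error messages quoted further down. It assumes it is run with the same classpath as the failing tests.)

```java
import java.security.CodeSource;
import java.security.ProtectionDomain;

// Hypothetical helper (not part of Drill or yauaa): reports where a class comes from.
public class ClassOrigin {

    /** Returns "<class name> -> <location>" for the given fully qualified class name. */
    public static String locate(String className) {
        try {
            // initialize=false: only look the class up, do not run its static initializers.
            Class<?> clazz = Class.forName(className, false, ClassOrigin.class.getClassLoader());
            ProtectionDomain domain = clazz.getProtectionDomain();
            CodeSource source = (domain == null) ? null : domain.getCodeSource();
            if (source == null || source.getLocation() == null) {
                // JDK classes loaded by the bootstrap loader have no code source.
                return className + " -> (bootstrap or unknown location)";
            }
            return className + " -> " + source.getLocation();
        } catch (ClassNotFoundException | LinkageError e) {
            return className + " -> not found on the classpath";
        }
    }

    public static void main(String[] args) {
        // Class names taken from the error messages in this thread.
        System.out.println(locate("io.netty.buffer.PooledByteBufAllocator"));
        System.out.println(locate("io.netty.buffer.PooledByteBufAllocatorL"));
        System.out.println(locate("com.google.common.base.Stopwatch"));
    }
}
```

If the two Netty classes resolve to different jars, or to a jar other than the one Drill brings in, that would be consistent with the NoSuchMethodError, although it does not by itself prove that this is the cause.
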
>
> It is best to think of a UDF as an extension to Drill rather than a separate project built using Drill API. Drill is very sensitive to Netty, Guava and other versions, so your UDF is, essentially, part of Drill. That's why I did my experiments based on Drill sources in the Drill Netty project. Plus, doing that is really the only effective way to unit test your code.
>
> > I could really use examples that show me how to do the same things with NullableVarCharHolder
>
> There is a page for that in the Wiki post that I mentioned previously. Walks through an example using VarChar as input and output. (If anything is missing or unclear, let me know and I'll fix the material.)
>
> > ... and BaseWriter.ComplexWriter
>
> That's what I still need to research.
>
> > I have downloaded the drill codebase
>
> My suggestion for now: follow the suggestions in the Wiki to develop your code within the Drill project (again, let me know if any of the material is missing or unclear). That is known to work and does allow testing. Then, later, move the code to a separate project and work out how to get it to build there. Finally, work out how to add your UDF to a running Drill server.
>
> > it is still very hard.
>
> The "UDF" mechanism is, really, just Drill's own internal function mechanism with a nice name. It was designed to be used within the Drill project. What we're trying to do is figure out how to make creating UDFs as painless as possible, but within the constraints that UDFs are a specialized part of Drill.
>
> Please continue to give us feedback about your experience so we can help you succeed, but also get this written up for other users as well.
>
> Thanks,
>
> - Paul
>
> On Wednesday, January 24, 2018, 3:19:04 AM PST, Niels Basjes <[email protected]> wrote:
>
> Paul, Charles,
>
> Thanks for the pointers.
> These really helped in getting started.
>
> I'm trying to fully understand the (simple) UDF Charles created a while ago for a java library of mine
> https://github.com/cgivre/drill-useragent-function
>
> I noticed that it has not been updated for a long time so I decided to pull it into my project.
> This way I have a reason to dive into Drill and get to know it a bit better and to make the UDF better maintained.
> I already have UDFs for Pig, Hive, Flink and Beam so Drill would be a sensible addition.
>
> So as a first step I pulled the existing code in and tried to create a few tests based on the examples and documentation you guys sent me.
>
> Work in progress (which fails on the tests):
> https://github.com/nielsbasjes/yauaa/tree/DrillUDF/udfs/drill
>
> Some of the things I ran into that confuse me:
> *1)* For some reason the netty version packaged with Drill 1.12.0 consistently fails with this error at the root cause when trying to run something:
> *Caused by: java.lang.NoSuchMethodError: io.netty.buffer.PooledByteBufAllocatorL$InnerAllocator.threadCache()Lio/netty/buffer/PoolThreadCache;*
>
> I found that the io.netty.buffer.PooledByteBufAllocatorL inherits from io.netty.buffer.PooledByteBufAllocator which in the version I get from the drill dependencies does not have that method.
> My workaround was to exclude 'netty-all' from the 'drill-java-exec' dependency and re-add 'netty-all' version 4.0.28.Final to my project.
> This works but feels bad.
>
> My first impression is that this is a bug in Drill 1.12.0.
>
> *2)* I always see this warning when running:
> 11:49:38,556 [WARN ] GuavaPatcher : 40: Unable to patch Guava classes.
> javassist.CannotCompileException: by java.lang.LinkageError: loader (instance of sun/misc/Launcher$AppClassLoader): attempted duplicate class definition for name: "com/google/common/base/Stopwatch"
>     at javassist.ClassPool.toClass(ClassPool.java:1099)
>     at javassist.ClassPool.toClass(ClassPool.java:1042)
>     at javassist.ClassPool.toClass(ClassPool.java:1000)
>     at javassist.CtClass.toClass(CtClass.java:1140)
>     at org.apache.drill.exec.util.GuavaPatcher.patchStopwatch(GuavaPatcher.java:66)
>     at org.apache.drill.exec.util.GuavaPatcher.patch(GuavaPatcher.java:36)
>
> My project does not include any guava so this also seems like a drill issue.
>
> *3)* In the unit test examples I see input/output using simple types like float and int.
> I could really use examples that show me how to do the same things with NullableVarCharHolder, BaseWriter.ComplexWriter.
> Especially how do I create an instance of DrillBuf to set the input of a (Nullable)VarCharHolder?
>
> So far I have downloaded the drill codebase to figure out how to do this but I have to say that at this point it is still very hard.
>
> Any help with these points is greatly appreciated.
> Thanks.
>
> Niels Basjes
>
> On Mon, Jan 22, 2018 at 4:35 PM, Paul Rogers <[email protected]> wrote:
>
> > Hi Niels,
> >
> > You can find detailed suggestions here: https://github.com/paul-rogers/drill/wiki/UDFs-Background-Information
> >
> > In particular, see the page on debugging UDFs.
> >
> > - Paul
> >
> > Sent from my iPhone
> >
> > > On Jan 22, 2018, at 7:09 AM, Charles Givre <[email protected]> wrote:
> > >
> > > Hi Niels,
> > > Take a look at this file:
> > >
> > > https://github.com/cgivre/drill/blob/67804e9d8a4634df8dbf60e848533a62d64dee64/exec/java-exec/src/test/java/org/apache/drill/exec/fn/impl/TestNetworkFunctions.java
> > >
> > > This should be a pretty good template for unit tests for a function.
> > > -- C
> > >
> > >> On Jan 22, 2018, at 10:01, Niels Basjes <[email protected]> wrote:
> > >>
> > >> Hi,
> > >>
> > >> I was reading through the tutorial page on how to write a custom function for Drill and I've also looked at some functions I found on the internet.
> > >>
> > >> There is one thing I would really like to know that I have not yet found so far: How do I unit test if I did it right?
> > >>
> > >> Especially the effects of the various annotations and the fact that you need to specify all classes "fully" without an import.
> > >>
> > >> --
> > >> Best regards / Met vriendelijke groeten,
> > >>
> > >> Niels Basjes
>
> --
> Best regards / Met vriendelijke groeten,
>
> Niels Basjes

--
Best regards / Met vriendelijke groeten,

Niels Basjes
