Hi,

I've been fiddling around and I have something that seems to work on my
machine  ...

I did a few things:
1) Split the project into two modules: the function and the tests. I did
this because of the special packaging requirements that force the source of
the function to be included.
https://github.com/nielsbasjes/yauaa/blob/DrillUDF/udfs/drill/
https://github.com/nielsbasjes/yauaa/blob/DrillUDF/udfs/drill-tests/

2) I simplified the function to have two strings in and one string out.

Status:
I documented the current status here:
https://github.com/nielsbasjes/yauaa/blob/DrillUDF/README-Drill.md

The UDF works but I truly dislike the current code. Looking at all the
examples I have seen, this seems to be the code style that is required.

As a consequence the original method that returns a map is untested. I have
been able to write a kind of test for the "one value" variant but not for
the "Map" variant.

See:
https://github.com/nielsbasjes/yauaa/blob/DrillUDF/udfs/drill-tests/src/test/java/nl/basjes/parse/useragent/drill/TestParseUserAgentFunctionField.java#L80

Any assistance towards 'clean' code is much appreciated.

Niels Basjes


On Wed, Jan 24, 2018 at 7:40 PM, Paul Rogers <[email protected]> wrote:

> Hi Niels,
>
> Good questions. I can answer a few right off, some others will need a bit
> more research.
>
> > (simple) UDF Charles created
>
> Charles, do you know if your example still works? Nothing changed in the
> UDF API so, it probably should... I took a quick look and everything looks
> OK. (Have not tried to run the UDF, however.)
>
> > Work in progress
>
> Your function uses a complex writer. This is one area I have not yet
> personally explored. I'll need to do a bit more homework before I can
> answer.
>
> > consistently fails with this error
>
> Drill uses a specific version of Netty and adds custom classes to the
> Netty name space. Be sure you are using the same Netty version as Drill. In
> fact, you should build with Drill's own dependencies: bring in Drill as the
> dependency for your project and Drill will bring in everything you need.
>
> It is best to think of a UDF as an extension to Drill rather than a
> separate project built using Drill API. Drill is very sensitive to Netty,
> Guava and other versions, so your UDF is, essentially, part of Drill.
> That's why I did my experiments based on Drill sources in the Drill Netty
> project. Plus, doing that is really the only effective way to unit test
> your code.
>
> > I could really use examples that show me how to do the same things with
> > NullableVarCharHolder
>
> There is a page for that in the Wiki post that I mentioned previously.
> Walks through an example using VarChar as input and output. (If anything is
> missing or unclear, let me know and I'll fix the material.)
>
> > ... and BaseWriter.ComplexWriter
>
> That's what I still need to research.
>
> > I have downloaded the drill codebase
>
> My suggestion for now: follow the suggestions in the Wiki to develop your
> code within the Drill project (again, let me know if any of the material is
> missing or unclear). That is known to work and does allow testing. Then,
> later, move the code to a separate project and work out how to get it to
> build there. Finally, work out how to add your UDF to a running Drill
> server.
>
> > it is still very hard.
>
> The "UDF" mechanism is, really, just Drill's own internal function
> mechanism with a nice name. It was designed to be used within the Drill
> project. What we're trying to do is figure out how to make creating UDFs as
> painless as possible, but within the constraints that UDFs are a
> specialized part of Drill.
>
> Please continue to give us feedback about your experience so we can help
> you succeed, but also get this written up for other users as well.
>
> Thanks,
>
> - Paul
>
>
>
> On Wednesday, January 24, 2018, 3:19:04 AM PST, Niels Basjes <
> [email protected]> wrote:
>
>
> Paul, Charles,
>
> Thanks for the pointers.
> These really helped in getting started.
>
> I'm trying to fully understand the (simple) UDF Charles created a while ago
> for a java library of mine
> https://github.com/cgivre/drill-useragent-function
>
> I noticed that it has not been updated for a long time so I decided to pull
> it into my project.
> This way I have a reason to dive into Drill and get to know it a bit better
> and to make the UDF better maintained.
> I already have UDFs for Pig, Hive, Flink and Beam so Drill would be a
> sensible addition.
>
> So as a first step I pulled the existing code in and tried to create a few
> tests based on the examples and documentation you guys sent me.
>
> Work in progress (which fails on the tests):
> https://github.com/nielsbasjes/yauaa/tree/DrillUDF/udfs/drill
>
> Some of the things I ran into that confuse me:
> *1)* For some reason the netty version packaged with Drill 1.12.0
> consistently fails with this error at the root cause when trying to run
> something:
> *Caused by: java.lang.NoSuchMethodError:
> io.netty.buffer.PooledByteBufAllocatorL$InnerAllocator.threadCache()Lio/netty/buffer/PoolThreadCache;*
>
> I found that the io.netty.buffer.PooledByteBufAllocatorL inherits
> from io.netty.buffer.PooledByteBufAllocator which in the version I get
> from the drill dependencies does not have that method.
> My workaround was to exclude 'netty-all' from the 'drill-java-exec'
> dependency and re-add 'netty-all' version 4.0.28.Final to my project.
> This works but feels bad.
>
> My first impression is that this is a bug in Drill 1.12.0.
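
One way to narrow down a NoSuchMethodError like the one above is to ask the JVM which jar a class was actually loaded from and whether the method exists in that copy. The following is only a minimal sketch using plain JDK reflection; the class name ClassOriginProbe and its helper methods are made up for illustration and are not part of Drill or Netty.

```java
import java.lang.reflect.Method;
import java.security.CodeSource;

// Hypothetical helper (not part of Drill or Netty): reports where a class
// comes from and whether it, or a superclass, declares a given method.
public class ClassOriginProbe {

    /** Loads the class without running static initializers; null if it is absent. */
    static Class<?> find(String className) {
        try {
            return Class.forName(className, false, ClassOriginProbe.class.getClassLoader());
        } catch (ClassNotFoundException | LinkageError e) {
            return null;
        }
    }

    /** Location (usually a jar URL) the class was loaded from. */
    public static String originOf(String className) {
        Class<?> clazz = find(className);
        if (clazz == null) {
            return "(not on classpath)";
        }
        CodeSource source = clazz.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "(bootstrap or unknown)";
        }
        return source.getLocation().toString();
    }

    /** True if the class or one of its superclasses declares the method, any visibility. */
    public static boolean declaresMethod(String className, String methodName) {
        for (Class<?> c = find(className); c != null; c = c.getSuperclass()) {
            for (Method m : c.getDeclaredMethods()) {
                if (m.getName().equals(methodName)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Defaults taken from the stack trace above; override via arguments.
        String className = args.length > 0 ? args[0] : "io.netty.buffer.PooledByteBufAllocator";
        String methodName = args.length > 1 ? args[1] : "threadCache";
        System.out.println(className + " loaded from: " + originOf(className));
        System.out.println("declares " + methodName + ": " + declaresMethod(className, methodName));
    }
}
```

Run with the same classpath as the failing test. If the reported jar is not the Netty that Drill expects, a dependency conflict seems more likely than a bug in Drill itself.
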
>
> *2)* I always see this warning when running:
> 11:49:38,556 [WARN ] GuavaPatcher                            :  40: Unable to patch Guava classes.
> javassist.CannotCompileException: by java.lang.LinkageError: loader (instance of  sun/misc/Launcher$AppClassLoader): attempted  duplicate class definition for name: "com/google/common/base/Stopwatch"
> at javassist.ClassPool.toClass(ClassPool.java:1099)
> at javassist.ClassPool.toClass(ClassPool.java:1042)
> at javassist.ClassPool.toClass(ClassPool.java:1000)
> at javassist.CtClass.toClass(CtClass.java:1140)
> at org.apache.drill.exec.util.GuavaPatcher.patchStopwatch(GuavaPatcher.java:66)
> at org.apache.drill.exec.util.GuavaPatcher.patch(GuavaPatcher.java:36)
>
>
> My project does not include any guava so this also seems like a drill
> issue.
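
Even when a project declares no Guava dependency itself, Guava can still reach the test classpath transitively. A quick check is to list every classpath location that provides the class named in the warning. This is only a sketch with plain JDK calls; DuplicateClassFinder is a made-up name, and the result only shows where copies of the class live. It does not explain the LinkageError.

```java
import java.io.IOException;
import java.net.URL;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: lists every location on the classpath that
// provides the .class file for a given class name.
public class DuplicateClassFinder {

    /** Converts "a.b.C" to the resource path "a/b/C.class". */
    public static String resourceNameOf(String className) {
        return className.replace('.', '/') + ".class";
    }

    /** All classpath locations that contain the class; empty if there are none. */
    public static List<URL> copiesOf(String className) {
        ClassLoader loader = ClassLoader.getSystemClassLoader();
        try {
            return Collections.list(loader.getResources(resourceNameOf(className)));
        } catch (IOException e) {
            // Treat an unreadable classpath entry as "nothing found" in this sketch.
            return new ArrayList<>();
        }
    }

    public static void main(String[] args) {
        // Default taken from the warning above; override via the first argument.
        String className = args.length > 0 ? args[0] : "com.google.common.base.Stopwatch";
        List<URL> copies = copiesOf(className);
        System.out.println(copies.size() + " location(s) provide " + className);
        for (URL url : copies) {
            System.out.println("  " + url);
        }
    }
}
```

If this prints one or more jar locations, running "mvn dependency:tree" should show which dependency pulls Guava in.
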
>
> *3)* In the unit test examples I see input/output using simple types like
> float and int.
> I could really use examples that show me how to do the same things with
> NullableVarCharHolder, BaseWriter.ComplexWriter.
> Especially how do I create an instance of DrillBuf to set the input of a
> (Nullable)VarCharHolder?
>
> So far I have downloaded the drill codebase to figure out how to do this
> but I have to say that at this point it is still very hard.
>
>
> Any help with these points is greatly appreciated.
> Thanks.
>
> Niels Basjes
>
>
> On Mon, Jan 22, 2018 at 4:35 PM, Paul Rogers <[email protected]>
> wrote:
>
> > Hi Niels,
> >
> > You can find detailed suggestions here:
> > https://github.com/paul-rogers/drill/wiki/UDFs-Background-Information
> >
> > In particular, see the page on debugging UDFs.
> >
> > - Paul
> >
> > Sent from my iPhone
> >
> > > On Jan 22, 2018, at 7:09 AM, Charles Givre <[email protected]> wrote:
> > >
> > > Hi Niels,
> > > Take a look at this file:
> > >
> > > https://github.com/cgivre/drill/blob/67804e9d8a4634df8dbf60e848533a62d64dee64/exec/java-exec/src/test/java/org/apache/drill/exec/fn/impl/TestNetworkFunctions.java
> > >
> > > This should be a pretty good template for unit tests for a function.
> > > — C
> > >
> > >> On Jan 22, 2018, at 10:01, Niels Basjes <[email protected]> wrote:
> > >>
> > >> Hi,
> > >>
> > >> I was reading through the tutorial page on how to write a custom function
> > >> for Drill and I've also looked at some functions I found on the internet.
> > >>
> > >> There is one thing I would really like to know that I have not yet found so
> > >> far: How do I unit test if I did it right?
> > >>
> > >> Especially the effects of the various annotations and the fact that you
> > >> need to specify all classes "fully" without an import.
> > >>
> > >> --
> > >> Best regards / Met vriendelijke groeten,
> > >>
> > >> Niels Basjes
> > >
> >
>
>
>
> --
> Best regards / Met vriendelijke groeten,
>
> Niels Basjes
>



-- 
Best regards / Met vriendelijke groeten,

Niels Basjes
