Re: Drill not picking up a UDF
Thank you. I understand the import limitation and the whole custom-drillbit premise. I will make the changes to see if this starts running :).

Regards, -Stefan

On Mon, Jul 20, 2015 at 1:25 AM, Jim Bates jba...@maprtech.com wrote:

I pulled out your UDF class and threw it into my package. It worked for me with a few modifications. You cannot have imported classes or method references in your eval or setup methods, as the code will get pulled out and executed somewhere else and it won't be able to find them. With that in mind, Period will need to be org.joda.time.Period, DateTime will need to be org.joda.time.DateTime and roundTimeStamp will need to be com.activitystream.drill.udfs.ASUserDefinedFunctions.roundTimeStamp.

On Sun, Jul 19, 2015 at 7:02 PM, Stefán Baxter ste...@activitystream.com wrote:

Hi, The project can be found here: https://github.com/acmeguy/asdrill

Thank you, -Stefán

On Sun, Jul 19, 2015 at 11:57 PM, Stefán Baxter ste...@activitystream.com wrote:

Hi, I'm more than happy to share the little that is there (I will publish it on GitHub and send a link tomorrow). I ended up copying my UDF (singl-file.java) into the simple-drill-function project, where it got picked up. Then I discovered a whole new set of dependencies/limitations:

- The UDFs are recompiled
- Any imports are invalid, or at least overwritten
  - import org.joda.time.Period; (the Period class is not resolved at runtime)
  - Error: SYSTEM ERROR: CompileException: Line 71, Column 26: Cannot determine simple type name Period
- Calling any outside function, as I was calling a static function of the new class (same file), leads to an error because it is not resolved
  - Error: SYSTEM ERROR: CompileException: Line 70, Column 35: A method named roundTimeStamp is not declared in any enclosing class nor any supertype, nor through a static import

Perhaps this was mentioned in the documentation, but it is, at the very least, neither straightforward nor super-inviting.

Thank you for your assistance, we will keep trying :)

Regards, -Stefan

On Sun, Jul 19, 2015 at 11:34 PM, Tugdual Grall tugd...@gmail.com wrote:

Hi Stefan, Do you think you can share your complete project? This will help to debug it for you.

T

On Sunday, July 19, 2015, Stefán Baxter ste...@activitystream.com wrote:

Hi Ted, I fetched this, built it and deployed it without problems. I cannot see any real difference other than that this deploys two .jar files (I tried that as well earlier). I'm still trying to figure out why Drill is not picking up my UDFs.

Regards, -Stefán

On Sun, Jul 19, 2015 at 10:45 PM, Ted Dunning ted.dunn...@gmail.com wrote:

Stefan, Have you seen this GitHub project: https://github.com/mapr-demos/simple-drill-functions ?

On Sun, Jul 19, 2015 at 2:14 PM, Stefán Baxter ste...@activitystream.com wrote:

Hi Jim, I'm still not able to make this work. Do you have a sample .jar file with a small example that you are running?

Regards, -Stefan

On Sun, Jul 19, 2015 at 6:46 PM, Ted Dunning ted.dunn...@gmail.com wrote:

Sounds like a fine example, not because of sophistication but because it deals with dates. Check the Drill logs. It is likely that Drill is grumpy about something in your UDF or packaging. Also, feel free to snitch the pom from the simple examples in order to get the pieces assembled and packaged correctly.

Sent from my iPhone

On Jul 19, 2015, at 11:25, Stefán Baxter ste...@activitystream.com wrote:

Hi Jim, My UDF is currently so simple that I'm not sure you need it (or want it). It basically just rounds a timestamp value with ISO 8601 periods: asRoundTimestamp(timestampvalue,'PT10M'). I would be more than happy to contribute to your project rather than building our own :). Is the repo public?

Regards, -Stefan

On Sun, Jul 19, 2015 at 6:18 PM, Jim Bates jba...@maprtech.com wrote:

Maven will typically create a jar for class and a jar for source when told to do so. I just include the source files in the same jar as the class files. There is a GitHub example Drill UDF project we are working on to include several examples to simplify the learning curve. If you're interested... I'd
Re: Drill not picking up a UDF
Hi Jim, I have made those changes and I'm wondering if you can run this using the two .jar files that mvn package places in the target directory? I have tried to have Drill pick those up, but the error now is:

Error: SYSTEM ERROR: UnsupportedOperationException Fragment 0:0 [Error Id: da589dd4-4cfd-4659-8b93-219074ab8c72 on localhost:31010] (state=,code=0)

It seems to indicate that Drill is picking up the functions but that they cannot be run.

Regards, - Stefán
Re: Drill not picking up a UDF
Hi, This is working now. This, unless someone corrects me, is the code needed to get a string parameter for a UDF:

- input2.buffer.toString(input2.start, input2.end - input2.start, java.nio.charset.Charset.defaultCharset())

Regards, -Stefan (not the happiest camper)

On Mon, Jul 20, 2015 at 2:07 PM, Stefán Baxter ste...@activitystream.com wrote:

Hi, After going through the log it is clear what is happening (once Drill picked up the UDF a bit earlier this morning). I'm calling VarCharHolder.toString() to get the text value for the parameter, and that is throwing this exception. Two observations:

1. Calling a deprecated function, even though it's not optimal, usually does not cause such drastic results.
2. As far as I can see there is no easy way to get the string value of a property without carving out a piece of the buffer.

What am I missing here?

Regards, -Stefan

On Mon, Jul 20, 2015 at 12:58 PM, Jacques Nadeau jacq...@apache.org wrote:

Can you enable verbose errors at the session level? It may reveal more about what is failing.
Re: Drill not picking up a UDF
That's an apt description. The holders that have fixed-size value objects are simple to get at, but the ones that have variable-length objects are pulled via the buffer. You can also use the StringFunctionHelpers:

org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.toStringFromUTF8(in.start, in.end, in.buffer)

where 'in' is @Param NullableVarCharHolder in or @Param VarCharHolder in.
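The "carve a piece out of the buffer" idea discussed above can be sketched without any Drill dependency. In the sketch below, VarCharSketch is a hypothetical stand-in for a variable-length holder (it is not Drill's VarCharHolder, and a plain byte[] replaces Drill's buffer type); the only point is that the value lives in a shared buffer between start and end, so the length to decode is end - start. UTF-8 is assumed here, in line with the toStringFromUTF8 helper mentioned in the thread.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for a variable-length holder: a shared buffer plus [start, end) offsets. */
final class VarCharSketch {
    final byte[] buffer; // stand-in for the holder's buffer field (not Drill's buffer type)
    final int start;     // offset of the first byte of this value
    final int end;       // offset one past the last byte of this value

    VarCharSketch(byte[] buffer, int start, int end) {
        this.buffer = buffer;
        this.start = start;
        this.end = end;
    }

    /** Decodes only the bytes in [start, end); the length is end - start, not end. */
    static String decode(VarCharSketch in) {
        return new String(in.buffer, in.start, in.end - in.start, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Several values share one buffer; the holder only points at its own slice.
        byte[] shared = "2015-07-20PT10Mxyz".getBytes(StandardCharsets.UTF_8);
        VarCharSketch period = new VarCharSketch(shared, 10, 15);
        System.out.println(decode(period)); // prints PT10M
    }
}
```

Treating the holder as a plain struct of fields, rather than calling methods on it, matches the advice given elsewhere in this thread.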
Re: Drill not picking up a UDF
Hi, Just wanted to thank those that helped. While I'm happy that the UDF is running, I feel like it could have taken a lot less time. I will contribute to the documentation, but here are my main takeaways:

- Source code needs to be included in the jar
  - it's used when the drillbits/queries are built/orchestrated
- Always use full class qualifiers in the eval() segment of your UDF
- Don't forget to add the drill-module.conf to the resources folder of your project (it should end up in the root of the jar)
- Adding your UDF package to drill-override.conf does not seem to matter
  - just copy the jar(s) with the .class and .java files to the jars/3rdparty directory
- Feel free to use this code any way you wish:
  - https://github.com/acmeguy/asdrill
- Know about this project:
  - https://github.com/mapr-demos/simple-drill-functions

Regards, -Stefán
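The "full class qualifiers" rule can be illustrated with a small, Drill-free sketch of the rounding itself. Everything here is an assumption made for illustration: the class and method names are hypothetical, java.time.Duration stands in for the Joda Period used in the original project, and rounding down (rather than to the nearest step) is a guess at the intended behaviour. The body deliberately uses no import statements and spells out every class name, which is the style the thread says a UDF's eval body needs.

```java
/** Hypothetical sketch: floor an epoch-millisecond timestamp to an ISO 8601 duration such as PT10M. */
final class RoundTimestampSketch {

    /** Rounds epochMillis down to a multiple of the given duration, measured from the epoch. */
    static long roundDown(long epochMillis, String isoPeriod) {
        // Fully qualified names only, mirroring the "no imports in eval" rule from the thread.
        long step = java.time.Duration.parse(isoPeriod).toMillis();
        if (step <= 0) {
            throw new java.lang.IllegalArgumentException("period must be positive: " + isoPeriod);
        }
        return java.lang.Math.floorDiv(epochMillis, step) * step;
    }

    public static void main(String[] args) {
        long ts = java.time.Instant.parse("2015-07-20T14:07:42Z").toEpochMilli();
        // prints 2015-07-20T14:00:00Z
        System.out.println(java.time.Instant.ofEpochMilli(roundDown(ts, "PT10M")));
    }
}
```

Note that java.time.Duration only covers time-based amounts (hours, minutes, seconds, and days treated as 24 hours), so calendar periods such as P1M would need different handling.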
Re: Recursive CTE Support in Drill
Thanks for more elaboration Ted, Jacques and Jason!

@Ted that is a very cool idea. I tried the cross join but found that it is not supported in Drill yet, although we have DRILL-786 for it. The new method looks very promising. It seems it is an implicit cross join, isn't it? I just tried it out and it worked like a charm. I will go on with this method.

@Jacques, yes, as Jason said, we discussed this before and I have talked to my colleagues to help me with modifying the ODBC driver so it sends a plan. Also thanks for the query. I tried it out for two tables and it worked fine, but extending it to three tables gives me a syntax error:

select * from ((select column1, 1 as join_keyb from (SELECT column1, 1 as join_key FROM `hive43`.`default`.`double_table`) t1 Join (SELECT column1, 1 as join_key FROM `hive43`.`default`.`double_table`) t2 on t1.join_key=t2.join_key) t12 Join (SELECT column1, 1 as join_key FROM `hive43`.`default`.`double_table`) t3 on t12.join_keyb=t3.join_key)

The other syntax made it easier for me to use the join three times, so I could test it with:

select t1.column1 from (SELECT column1, 1 as join_key FROM `hive43`.`default`.`double_table`) t1, (SELECT column1, 1 as join_key FROM `hive43`.`default`.`double_table`) t2, (SELECT column1, 1 as join_key FROM `hive43`.`default`.`double_table`) t3 where t1.join_key=t2.join_key and t1.join_key=t3.join_key

Thank you very much for your time Ted, Jacques and Jason!

Thanks, Alex

On Fri, Jul 17, 2015 at 2:09 PM, Jason Altekruse altekruseja...@gmail.com wrote:

Jacques, Alexander has brought up this problem previously in one of the hangouts and said that submitting a physical plan was not possible through ODBC. If he is able to modify the driver code to make it possible to submit one, that would be an option, as I believe the C++ client is capable of submitting plans. The issue I seem to recall him mentioning is that the ODBC driver was running a little sanity checking on the SQL query to try to prevent submitting complete garbage queries to a server. I think he had concerns that a JSON-formatted physical plan would fail these checks and he would have to disable them, along with trying to allow submitting two types of queries from ODBC.

On Fri, Jul 17, 2015 at 8:52 AM, Jacques Nadeau jacq...@dremio.com wrote:

Removing cross posting

Alexander, There is currently no way for Drill to generate a large amount of data using SQL. However, you can generate large generic data by using the MockStoragePlugin if you submit a plan. You can find an example plan using this at [1]. I heard someone might be working on extending the MockStoragePlugin to support SQL, which would provide the outcome you requested.

[1] https://github.com/apache/drill/blob/master/exec/java-exec/src/test/resources/mock-scan.json

On Thu, Jul 16, 2015 at 10:16 PM, Ted Dunning ted.dunn...@gmail.com wrote:

Also, just doing a Cartesian join of three copies of 1000 records will give you a billion records with negligible I/O.

Sent from my iPhone

On Jul 16, 2015, at 15:43, Jason Altekruse altekruseja...@gmail.com wrote:

@Alexander If you want to test the speed of the ODBC driver you can do that without a new storage plugin. If you get the entire dataset into memory, it will be returned from Drill as quickly as we can possibly send it to the client. One way to do this is to insert a sort; we cannot send along any of the data until the complete sort is done. As long as you don't read so much data that we will start spilling the sort to disk, all of the records will be in memory. To take the read and sort time out of your test, just make sure to record the time you first receive data from Drill, not the query start time.

There is one gotcha here. To make the BI tools more responsive, we implemented a feature that will send along one empty batch of records with the schema information populated. This schema is generated by applying all of the transformations that happen throughout the query. For example, the join operator handles this schema population by sending along the schema merged from the two sides of the join; project will similarly add or remove columns based on the expressions and columns requested. You will want to make sure you record your start time when you receive the first batch with actual records. This can give you an accurate measurement of the ODBC performance, removing the bottleneck of the disk.

On Thu, Jul 16, 2015 at 3:24 PM, Alexander Zarei alexanderz.si...@gmail.com wrote:

Thanks for the answers. @Ted my only goal is to pump a large amount of data without having to read from the hard disk. I am measuring the ODBC driver performance and I need a higher data transfer rate. So any method that helps
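Why the constant-key join in this thread behaves like a Cartesian join can be shown with a small model. This is only a sketch under stated assumptions: Row, table and joinCount are hypothetical names, and the nested loops model the semantics of the join condition, not how Drill executes it. Because every row carries the same join key, the conditions t1.join_key = t2.join_key and t1.join_key = t3.join_key are always true, so the output size is the product of the input sizes.

```java
import java.util.ArrayList;
import java.util.List;

final class ConstantKeyJoinSketch {
    /** One row of a hypothetical input: a value plus the constant join key. */
    static final class Row {
        final int column1;
        final int joinKey;

        Row(int column1, int joinKey) {
            this.column1 = column1;
            this.joinKey = joinKey;
        }
    }

    /** Builds n rows that all share join key 1, like "SELECT column1, 1 AS join_key". */
    static List<Row> table(int n) {
        List<Row> rows = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            rows.add(new Row(i, 1));
        }
        return rows;
    }

    /** Counts the rows of the join t1.key = t2.key AND t1.key = t3.key using nested loops. */
    static long joinCount(List<Row> t1, List<Row> t2, List<Row> t3) {
        long count = 0;
        for (Row a : t1) {
            for (Row b : t2) {
                if (a.joinKey != b.joinKey) {
                    continue; // never taken when every key is the same constant
                }
                for (Row c : t3) {
                    if (a.joinKey == c.joinKey) {
                        count++;
                    }
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<Row> t = table(100);
        System.out.println(joinCount(t, t, t)); // 100 * 100 * 100 = 1000000 rows
    }
}
```

By the same arithmetic, three copies of 1000 records yield 1000 * 1000 * 1000 = one billion rows, as Ted notes.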
Re: Drill not picking up a UDF
On Mon, Jul 20, 2015 at 8:22 AM, Andrew Brust andrew.br...@bluebadgeinsights.com wrote:

1. It seems to me like Drill is at a point where, if you thread the needle perfectly, things generally work as advertised. That’s certainly an advance over the old, old days, where stuff that should have worked sometimes just didn’t.

2. Threading that needle can be super-hard, even for an experienced Java developer.

This is definitely true for the problem of *extending* Drill. It is much less of a problem for *using* Drill, which is what the team has spent much more effort on. See the Drill in 10 minutes article in the docs.

To my mind, this prioritization is correct. But it shouldn't be exclusive either. This is not to say that user-level fit and finish is done by any means, just that people who extend the system will get lots more splinters.
Re: Drill not picking up a UDF
Yes, agreed. Fair point and apologies for not articulating same in my comment.
combining the results of two queries (union) before grouping (derived grouping / re-grouping)
Hi, Does Drill support grouping on post-union (derived) result sets? I'm fetching data from two sources, and currently every group that is found in both sets appears twice in the final output, understandably with different counts etc.

Regards, -Stefan
Re: Drill not picking up a UDF
Hey Stefan, Can you propose some edits/updates to the documentation? The doc is maintained on GitHub [1].

The key thing to understand is that a UDF is slightly incomplete Java. This is because Drill actually rips apart the functionality of the UDF and recomposes it directly within expression evaluation code. This gives a substantial performance and memory benefit but also creates some challenges. As such, there are some key rules one should follow:

- Don't use imports
- Both class file and source file have to be on the classpath (yes, Drill uses the source of the function)
- Any JAR files holding UDF resources must include a drill-module.conf marker file so that Drill knows to include that JAR in consideration for UDF loading
- ValueHolders should be treated as structs. As such, don't call any methods on ValueHolders

If you fork the documentation and provide a pull request, the doc people (Kristine and Bridget) will work to include your feedback so that others don't run into as many challenges as you did.

thanks, Jacques

[1] https://github.com/apache/drill/tree/gh-pages/_docs/develop-custom-functions