Re: Drill not picking up a UDF

2015-07-20 Thread Stefán Baxter
Thank you.

I understand the import limitation and the whole custom-drillbit premise.
Will make the changes to see if this starts running :).

Regards,
 -Stefan

On Mon, Jul 20, 2015 at 1:25 AM, Jim Bates jba...@maprtech.com wrote:

 I pulled out your udf class and threw it into my package. It worked for me
 with a few modifications.

 You can not have imported classes or method references in your eval or
 setup methods as the code will get pulled out and executed somewhere else
 and it won't be able to find it. With that in mind, Period  will need to be
 org.joda.time.Period, DateTime will need to be org.joda.time.DateTime and
 roundTimeStamp  will need to be
 com.activitystream.drill.udfs.ASUserDefinedFunctions.roundTimeStamp.



 On Sun, Jul 19, 2015 at 7:02 PM, Stefán Baxter ste...@activitystream.com
 wrote:

  Hi,
 
  The project can be found here:
  https://github.com/acmeguy/asdrill
 
  Thank you,
   -Stefán
 
  On Sun, Jul 19, 2015 at 11:57 PM, Stefán Baxter ste...@activitystream.com
  wrote:

   Hi,

   I'm more than happy to share the little that is there (I will publish it
   on github and send link tomorrow).

   I ended up copying my UDF (a single .java file) into the
   simple-drill-function project where it got picked up.

   Then I discovered a whole new set of dependencies/limitations:

   - The UDFs are recompiled - any imports are invalid or at least
     overwritten
     - import org.joda.time.Period; (means that the Period class is not
       resolved at runtime)
     - Error: SYSTEM ERROR: CompileException: Line 71, Column 26: Cannot
       determine simple type name Period

   - Calling any outside functions, like I was calling a static function of
     the new class (same file), leads to errors as they are not resolved
     - Error: SYSTEM ERROR: CompileException: Line 70, Column 35: A method
       named roundTimeStamp is not declared in any enclosing class nor any
       supertype, nor through a static import

   Perhaps this was mentioned in the documentation but this is, at the very
   least, neither straightforward nor super-inviting.
  
   Thank you for your assistance, we will keep trying :)
  
   Regards,
-Stefan
  
  
   On Sun, Jul 19, 2015 at 11:34 PM, Tugdual Grall tugd...@gmail.com wrote:

    Hi Stefan,

    Do you think you can share your complete project?

    This will help to debug it for you.

    T

    On Sunday, July 19, 2015, Stefán Baxter ste...@activitystream.com wrote:

     Hi Ted,

     I fetched this, built it and deployed it without problems.
     I can not see any real difference other than this deploys two .jar (I
     tried that as well earlier).

     I'm still trying to figure out why Drill is not picking up my UDFs

     Regards,
      -Stefán

     On Sun, Jul 19, 2015 at 10:45 PM, Ted Dunning ted.dunn...@gmail.com
     wrote:

      Stefan,

      Have you seen this github project:

      https://github.com/mapr-demos/simple-drill-functions

      ?

      On Sun, Jul 19, 2015 at 2:14 PM, Stefán Baxter
      ste...@activitystream.com wrote:

       Hi Jim,

       I'm still not able to make this work. Do you have a sample .jar file
       with a small example that you are running?

       Regards,
        -Stefan

       On Sun, Jul 19, 2015 at 6:46 PM, Ted Dunning ted.dunn...@gmail.com
       wrote:

        Sounds like a fine example, not because of sophistication but
        because it deals with dates.

        Check the drill logs.  It is likely that drill is grumpy about
        something in your udf or packaging.

        Also, feel free to snitch the pom from the simple examples in order
        to get the pieces assembled and packaged correctly.

        Sent from my iPhone

        On Jul 19, 2015, at 11:25, Stefán Baxter ste...@activitystream.com
        wrote:

         Hi Jim,

         My UDF is currently so simple that I'm not sure you need it (or
         want it).

         It basically just rounds a timestamp value with ISO 8601 periods
         asRoundTimestamp(timestampvalue,'PT10M').

         I would be more than happy to contribute to your project rather
         than building our own :).

         Is the repo public?

         Regards,
         -Stefan

         On Sun, Jul 19, 2015 at 6:18 PM, Jim Bates jba...@maprtech.com
         wrote:

          Maven will typically create a jar for class and a jar for source
          when told to do so. I just include the source files in the same
          jar as the class files. There is a github example drill udf
          project we are working on to include several examples to simplify
          the learning curve. If you're interested... I'd

Re: Drill not picking up a UDF

2015-07-20 Thread Stefán Baxter
Hi Jim,

I have made those changes and I'm wondering if you can run this by using
the two .jar files that mvn package places in the target directory?

I have tried to have Drill pick those up but the error now is:

Error: SYSTEM ERROR: UnsupportedOperationException
Fragment 0:0
[Error Id: da589dd4-4cfd-4659-8b93-219074ab8c72 on localhost:31010]
(state=,code=0)

It seems to indicate that it's picking up the functions but that they can
not be run.

Regards,
 - Stefán

Re: Drill not picking up a UDF

2015-07-20 Thread Stefán Baxter
Hi,

This is working now.

This, unless someone corrects me, is the code needed to get a string
parameter for a UDF:


   - input2.buffer.toString(input2.start, input2.end - input2.start,
     java.nio.charset.Charset.defaultCharset())

Regards,
 -Stefan (not the happiest camper)


On Mon, Jul 20, 2015 at 2:07 PM, Stefán Baxter ste...@activitystream.com
wrote:

 Hi,

 After going through the log it is clear what is happening (once Drill
 picked up the UDF a bit earlier this morning).

 I'm calling VarCharHolder.toString() to get the text value for the
 parameter and that is throwing this exception.

 Two observations:

 1. Calling a deprecated function, even though it's not optimal,
    usually does not cause such drastic results.
 2. As far as I can see there is no easy way to get the string value of
    a property without carving out a piece of the buffer.

 What am I missing here?

 Regards,
  -Stefan

 On Mon, Jul 20, 2015 at 12:58 PM, Jacques Nadeau jacq...@apache.org
 wrote:

  Can you enable verbose errors at the session level? It may reveal more
  about what is failing.

Re: Drill not picking up a UDF

2015-07-20 Thread Jim Bates
That's an apt description. The Holders that have fixed-size value objects
are simple to get at but the ones that have variable-length objects are
pulled via the buffer.

You can also use the StringFunctionHelpers.

org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.toStringFromUTF8(in.start, in.end, in.buffer)

Where 'in' is

@Param NullableVarCharHolder in

or

@Param VarCharHolder in
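
To make the "pulled via the buffer" point concrete, here is a minimal sketch that runs on a plain JDK. VarCharHolderStandIn and HolderSliceDemo are hypothetical stand-ins invented for illustration, and java.nio.ByteBuffer takes the place of Drill's own buffer type. The sketch assumes the bytes are UTF-8, as the helper name toStringFromUTF8 suggests; it only shows the idea of decoding the bytes between start and end using the holder's fields, and is not Drill's implementation.

```java
// Hypothetical stand-in for a variable-length holder: a plain struct with
// public fields. It is NOT Drill's VarCharHolder.
class VarCharHolderStandIn {
    int start;                   // offset of the first byte of the value
    int end;                     // offset one past the last byte of the value
    java.nio.ByteBuffer buffer;  // shared buffer (stand-in for Drill's buffer type)
}

class HolderSliceDemo {
    // Decode bytes [start, end) of the shared buffer as UTF-8, touching the
    // holder only through its fields. All types are fully qualified.
    static String text(VarCharHolderStandIn in) {
        byte[] bytes = new byte[in.end - in.start];
        for (int i = 0; i < bytes.length; i++) {
            // Absolute read: does not move the buffer's position.
            bytes[i] = in.buffer.get(in.start + i);
        }
        return new String(bytes, java.nio.charset.StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        VarCharHolderStandIn in = new VarCharHolderStandIn();
        // The value sits in the middle of a larger buffer, as it would when
        // many values share one buffer.
        in.buffer = java.nio.ByteBuffer.wrap(
                "xxPT10Myy".getBytes(java.nio.charset.StandardCharsets.UTF_8));
        in.start = 2;
        in.end = 7;
        System.out.println(text(in));
    }
}
```

Using an explicit charset rather than the platform default keeps the result the same on every machine, which matters when the same function body runs on several nodes.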

Re: Drill not picking up a UDF

2015-07-20 Thread Stefán Baxter
Hi,

Just wanted to thank those that helped.

While I'm happy that the UDF is running I feel like it could have taken a
lot shorter.

I will contribute to the documentation but here are my main takeaways:

   - Source code needs to be included in the jar
      - it's used when the drillbits/queries are built/orchestrated

   - Always use fully qualified class names in the eval() segment of your UDF

   - Don't forget to add the drill-module.conf to the resources folder of
     your project (it should end up in the root of the jar)

   - Adding your udf package to drill-override.conf does not seem to matter
      - just copy the jar(s) with the .class and .java files to the
        jars/3rdparty directory

   - Feel free to use this code any way you wish:
      - https://github.com/acmeguy/asdrill

   - Know about this project:
      - https://github.com/mapr-demos/simple-drill-functions
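
The packaging points in this list can be checked before copying anything to jars/3rdparty. The following is a rough sketch using only the JDK; UdfJarCheck is a hypothetical name, and the check assumes the single-jar layout where sources are bundled with the classes. With a separate sources jar, the class and source checks would have to look at both jars.

```java
class UdfJarCheck {
    // Returns a list of problems for one jar's entry names; an empty list
    // means the jar has the marker file, compiled classes and sources.
    static java.util.List<String> problems(java.util.Collection<String> entryNames) {
        java.util.List<String> found = new java.util.ArrayList<>();
        boolean hasClass = false;
        boolean hasSource = false;
        for (String name : entryNames) {
            if (name.endsWith(".class")) hasClass = true;
            if (name.endsWith(".java")) hasSource = true;
        }
        // Entry names are relative to the jar root, so an exact match means
        // the marker file sits at the root.
        if (!entryNames.contains("drill-module.conf")) {
            found.add("drill-module.conf is missing from the jar root");
        }
        if (!hasClass) found.add("no .class files found");
        if (!hasSource) found.add("no .java sources found");
        return found;
    }

    // Reads the entry names of a jar on disk.
    static java.util.List<String> entryNames(String jarPath) throws java.io.IOException {
        java.util.List<String> names = new java.util.ArrayList<>();
        try (java.util.jar.JarFile jar = new java.util.jar.JarFile(jarPath)) {
            java.util.Enumeration<java.util.jar.JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                names.add(entries.nextElement().getName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws java.io.IOException {
        if (args.length != 1) {
            System.out.println("usage: java UdfJarCheck <path-to-jar>");
            return;
        }
        for (String problem : problems(entryNames(args[0]))) {
            System.out.println(problem);
        }
    }
}
```

Run against the jar produced by mvn package, it prints nothing when all three checks pass.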


Regards,
 -Stefán

Re: Recursive CTE Support in Drill

2015-07-20 Thread Alexander Zarei
Thanks for more elaboration Ted, Jacques and Jason!

@Ted that is a very cool idea. I tried the cross join but found that cross
join is not supported in Drill yet, though we have DRILL-786 for it. The new
method looks very promising. It seems it is an implicit cross join, isn't
it? I just tried it out and it worked like a charm. I will go on with this
method.

@Jacques, yes as Jason said, we discussed this before and I have talked to
my colleagues to help me with modifying the ODBC driver so it sends a plan.
Also thanks for the query. I tried it out for two tables and it worked fine
but extending it to three tables gives me a syntax error.

 select * from
 ((select column1, 1 as join_keyb from
  (SELECT column1, 1 as join_key FROM `hive43`.`default`.`double_table`) t1
   Join
  (SELECT column1, 1 as join_key FROM `hive43`.`default`.`double_table`) t2
   on t1.join_key=t2.join_key) t12
 Join
 (SELECT column1, 1 as join_key FROM `hive43`.`default`.`double_table`) t3
  on t12.join_keyb=t3.join_key)

The other syntax was easier for me to use for joining three times, so I
could test it with:

 select t1.column1 from
 (SELECT column1, 1 as join_key FROM `hive43`.`default`.`double_table`) t1,
 (SELECT column1, 1 as join_key FROM `hive43`.`default`.`double_table`) t2,
 (SELECT column1, 1 as join_key FROM `hive43`.`default`.`double_table`) t3
 where
 t1.join_key=t2.join_key and t1.join_key=t3.join_key


Thank you very much for your time Ted, Jacques and Jason!

Thanks,
Alex

On Fri, Jul 17, 2015 at 2:09 PM, Jason Altekruse altekruseja...@gmail.com
wrote:

 Jacques,

 Alexander has brought up this problem previously in one of the hangouts and
 said that submitting a physical plan was not possible through ODBC. If he
 is able to modify the driver code to make it possible to submit one, that
 would be an option, as I believe the C++ client is capable of submitting
 plans. The issue I seem to recall him mentioning is that the ODBC driver
 was running a little sanity checking on the SQL query to try to prevent
 submitting complete garbage queries to a server. I think he had concerns
 that a JSON formatted physical plan would fail these checks and he would
 have to disable them along with trying to allow submitting two types of
 queries from ODBC.

 On Fri, Jul 17, 2015 at 8:52 AM, Jacques Nadeau jacq...@dremio.com
 wrote:

  Removing cross posting
 
  Alexander,
 
  There is currently no way for Drill to generate a large amount of data
  using SQL.  However, you can generate large generic data by using the
  MockStoragePlugin if you submit a plan.  You can find an example plan
  using this at [1].

  I heard someone might be working on extending the MockStoragePlugin to
  support SQL which would provide the outcome you requested.

  [1]
  https://github.com/apache/drill/blob/master/exec/java-exec/src/test/resources/mock-scan.json
 
  On Thu, Jul 16, 2015 at 10:16 PM, Ted Dunning ted.dunn...@gmail.com
  wrote:

   Also, just doing a Cartesian join of three copies of 1000 records will
   give you a billion records with negligible I/O.

   Sent from my iPhone

   On Jul 16, 2015, at 15:43, Jason Altekruse altekruseja...@gmail.com
   wrote:

    @Alexander If you want to test the speed of the ODBC driver you can do
    that without a new storage plugin.

    If you get the entire dataset into memory, it will be returned from
    Drill as quickly as we can possibly send it to the client. One way to
    do this is to insert a sort; we cannot send along any of the data until
    the complete sort is done. As long as you don't read so much data that
    we will start spilling the sort to disk, all of the records will be in
    memory. To take the read and sort time out of your test, just make sure
    to record the time you first receive data from Drill, not the query
    start time.

    There is one gotcha here. To make the BI tools more responsive, we
    implemented a feature that will send along one empty batch of records
    with the schema information populated. This schema is generated by
    applying all of the transformations that happen throughout the query.
    For example, the join operator handles this schema population by
    sending along the schema merged from the two sides of the join, project
    will similarly add or remove columns based on the expressions and
    columns requested. You will want to make sure you record your start
    time when you receive the first batch with actual records. This can
    give you an accurate measurement of the ODBC performance, removing the
    bottleneck of the disk.

    On Thu, Jul 16, 2015 at 3:24 PM, Alexander Zarei
    alexanderz.si...@gmail.com wrote:

     Thanks for the answers.

     @Ted my only goal is to pump a large amount of data without having to
     read from Hard Disk. I am measuring the ODBC driver performance and I
     need a higher data transfer rate. So any method that helps

Re: Drill not picking up a UDF

2015-07-20 Thread Ted Dunning
On Mon, Jul 20, 2015 at 8:22 AM, Andrew Brust 
andrew.br...@bluebadgeinsights.com wrote:

 1. It seems to me like Drill is at a point where, if you thread the needle
 perfectly, things generally work as advertised.  That’s certainly an
 advance over the old, old days, where stuff that should have worked
 sometimes just didn’t.
 2. Threading that needle can be super-hard, even for an experienced Java
 developer.


This is definitely true for the problem of *extending* Drill.

It is much less of a problem for *using* Drill, which is what the team has
spent much more effort on. See the Drill in 10 minutes article in the docs.

To my mind, this prioritization is correct.  But it shouldn't be exclusive
either.

This is not to say that user-level fit and finish is done by any means,
just that people who extend the system will get lots more splinters.


Re: Drill not picking up a UDF

2015-07-20 Thread Andrew Brust
Yes, agreed.  Fair point and apologies for not articulating same in my comment.





combining the results of two queries (union) before grouping (derived grouping / re-grouping)

2015-07-20 Thread Stefán Baxter
Hi,

Does Drill support grouping on post union result sets (derived)?

I'm fetching data from two sources and currently all groups that can be
found in both sets appear twice in the final output, understandably with
different counts etc.

Regards,
 -Stefan


Re: Drill not picking up a UDF

2015-07-20 Thread Jacques Nadeau
Hey Stefan,

Can you propose some edits/updates to the documentation?  The doc is
maintained on github [1].

The key thing to understand is that a UDF is slightly incomplete Java.
This is because Drill actually rips apart the functionality of the UDF and
recomposes it directly within expression evaluation code.  This gives a
substantial performance and memory benefit but also creates some
challenges.  As such, there are some key rules one should follow:

 - Don't use imports
 - Both class file and source file have to be on the classpath (yes, Drill
   uses the source of the function)
 - Any JAR file holding UDF resources must include a drill-module.conf
   marker file so that Drill knows to include that JAR in consideration for
   UDF loading
 - ValueHolders should be treated as structs.  As such, don't call any
   methods on ValueHolders
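
As an illustration of the first and last rules, here is a small sketch of a timestamp-rounding body written in that style: no imports, every type fully qualified, and holders touched only through their fields. It is not a real Drill UDF. TimeStampHolderStandIn and RoundTimestampSketch are hypothetical names, the Drill annotations and interfaces are left out so that it runs on a plain JDK, and java.time.Duration is swapped in for the Joda-Time classes mentioned in this thread, so it only handles time-based ISO 8601 periods such as PT10M.

```java
// Hypothetical stand-in for a timestamp holder: a plain struct with one
// public field. It is NOT Drill's holder class.
class TimeStampHolderStandIn {
    long value;  // epoch milliseconds
}

class RoundTimestampSketch {
    // Written the way an eval() body has to be written under the rules
    // above: fully qualified names only, field access only on holders.
    static void eval(TimeStampHolderStandIn in, String isoPeriod, TimeStampHolderStandIn out) {
        long periodMs = java.time.Duration.parse(isoPeriod).toMillis();
        // floorMod keeps the remainder non-negative, so timestamps before
        // the epoch are also rounded down to the start of their period.
        out.value = in.value - java.lang.Math.floorMod(in.value, periodMs);
    }

    public static void main(String[] args) {
        TimeStampHolderStandIn in = new TimeStampHolderStandIn();
        TimeStampHolderStandIn out = new TimeStampHolderStandIn();
        in.value = 1_250_000L;
        eval(in, "PT10M", out);
        System.out.println(out.value);
    }
}
```

Because the body refers to nothing by simple name except its own parameters, it still compiles if it is lifted out of its class and pasted elsewhere, which is the situation the rules are meant to handle.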

If you fork the documentation and provide a pull request, doc people
(Kristine and Bridget) will work to include your feedback so that others
don't find as many challenges as you did.

thanks,
Jacques

[1]
https://github.com/apache/drill/tree/gh-pages/_docs/develop-custom-functions
