Re: Mismatched output with 2 UDFs in a query

Anchal Agrawal Tue, 11 Aug 2015 18:08:46 -0700

Hi all,
To reiterate, I've been getting mismatched output when I use two different UDFs 
on the pk column in the same query statement. If two UDFs are in the SELECT 
clause, only the first one is picked up. If there's a UDF in the WHERE clause, 
it's picked up instead of the UDF in the SELECT clause. Details are at 
PHOENIX-2151.
Can anyone reproduce this issue? Rajeshbabu, I'd appreciate your input.
Thank you,
Anchal



     On Thursday, August 6, 2015 2:40 PM, Anchal Agrawal <[email protected]> 
wrote:
   

 Hi Nicholas,
Do you have any updates about this issue? I noticed that Rajeshbabu reopened 
the ticket yesterday.
Thanks,
Anchal
 


     On Wednesday, August 5, 2015 7:02 PM, Nicholas Whitehead 
<[email protected]> wrote:
   

 Not sure, actually. I think it has something to do with differences in the 
implementation of the UDF's evaluate method. In retrospect, my test case 
implemented a practical passthrough, whereas my original observation was on a 
UDF that did something more complex. I'll dig it up.
//Nicholas
On Wed, Aug 5, 2015 at 9:35 PM, Anchal Agrawal <[email protected]> wrote:

Sure, Nicholas. I'll add in the details. According to the ticket, it looks like 
you were able to reproduce the issue initially. Do you know what changed 
between the time you reproduced it and when you couldn't? Or was it just a 
false positive?
Thanks,
Anchal


     On Wednesday, August 5, 2015 5:48 PM, Nicholas Whitehead 
<[email protected]> wrote:
   

 I will go back and redo the test case, hopefully reproduce the issue, and 
reopen the ticket. Perhaps you could attach the details of your case too. Plan?

On Aug 5, 2015 8:04 PM, "Anchal Agrawal" <[email protected]> wrote:

Hi Nicholas,
Yes, I'm getting the same issue (HBase 0.98.8). On my setup, if I run:
    select pk, udf1(pk), udf2(pk) from "mytable"
I get pk, udf1(pk), udf1(pk).

And if I run:
    select pk, udf2(pk), udf1(pk) from "mytable"
I get pk, udf2(pk), udf2(pk).

It appears to be picking up the first UDF. However, in Query 3 (from my 
previous email), when the second UDF is in the WHERE clause, the second UDF is 
picked up instead of the first one.
Sincerely,
Anchal
 


     On Wednesday, August 5, 2015 4:45 PM, Nicholas Whitehead 
<[email protected]> wrote:
   

 Hmm... I opened a jira ticket on that, but then my simplified test case could 
not reproduce.
https://issues.apache.org/jira/browse/PHOENIX-2151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
Look the same?

Hi,
I'm using v4.4.0. I'm getting mismatched output when I use two UDFs in a query.

Phoenix view of existing HBase table:
    create view "mytable" (pk VARBINARY PRIMARY KEY, "cf"."col" UNSIGNED_LONG);

UDF1: create function udf1(VARBINARY) returns UNSIGNED_LONG as 'mypkg.GetX';
UDF2: create function udf2(VARBINARY) returns INTEGER as 'mypkg.GetY';

Query1: select udf1(pk), udf2(pk) from "mytable";
Query2: select udf2(pk), udf1(pk) from "mytable";
Query3: select udf1(pk), "col" from "mytable" where udf2(pk) > 0;

Query 1: The output has two columns, but they're both udf1(pk), so both columns 
have the same rows in the output.
Query 2: Same as Query 1, except that both columns are udf2(pk).
Query 3: The output has two columns, udf2(pk) and "col", instead of udf1(pk) 
and "col".

If I have just one UDF in a query, like so: select pk, udf2(pk) from "mytable"; 
then the output is as expected.

I'm not sure what I'm missing. Rajeshbabu, is there a caveat associated with 
using two UDFs in one query? I appreciate your help.

Thank you,
Anchal
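
For anyone trying to reproduce this, the symptom in Query 1 and Query 2 (both 
output columns carrying the same values) can be checked mechanically once the 
rows have been fetched. The following is only a minimal sketch under stated 
assumptions: UdfColumnCheck and columnsIdentical are hypothetical names, not 
part of Phoenix, and the rows are assumed to have been copied out of a JDBC 
ResultSet into Object[] arrays beforehand. Numeric values are compared by 
their long value because udf1 is declared to return UNSIGNED_LONG and udf2 
INTEGER, so the two columns may arrive as different Java number types.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

/** Hypothetical helper, not part of Phoenix: flags the symptom described in this thread. */
final class UdfColumnCheck {

    /** Compares two cell values, treating numbers of different Java types as equal if their long values match. */
    private static boolean sameValue(Object x, Object y) {
        if (x instanceof Number && y instanceof Number) {
            return ((Number) x).longValue() == ((Number) y).longValue();
        }
        return Objects.equals(x, y);
    }

    /**
     * Returns true when columns a and b hold the same value in every row.
     * An empty result is reported as false, since it shows nothing either way.
     */
    static boolean columnsIdentical(List<Object[]> rows, int a, int b) {
        if (rows.isEmpty()) {
            return false;
        }
        for (Object[] row : rows) {
            if (!sameValue(row[a], row[b])) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Made-up sample values for illustration; not output from the original report.
        List<Object[]> suspicious = Arrays.asList(
            new Object[] {10L, 10L},
            new Object[] {20L, 20L});
        List<Object[]> healthy = Arrays.asList(
            new Object[] {10L, 3},
            new Object[] {20L, 7});
        System.out.println(columnsIdentical(suspicious, 0, 1)); // prints true
        System.out.println(columnsIdentical(healthy, 0, 1));    // prints false
    }
}
```

A true result is only a hint: two different UDFs could legitimately return 
the same values for a given table, so the check is most useful when the UDFs 
are known to differ on at least one row.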
