Spark SQL UDF returning a list?

2014-12-03 Thread Jerry Raj

Hi,
Can a UDF return a list of values that can be used in a WHERE clause? 
Something like:


sqlCtx.registerFunction("myudf", {
  Array(1, 2, 3)
})

val sql = "select doc_id, doc_value from doc_table where doc_id in myudf()"


This does not work:

Exception in thread "main" java.lang.RuntimeException: [1.57] failure:
``('' expected but identifier myudf found


I also tried returning a List of Ints, but that did not work either. Is
there a way to write a UDF that returns a list?


Thanks
-Jerry




Re: Spark SQL UDF returning a list?

2014-12-03 Thread Tobias Pfeiffer
Hi,

On Wed, Dec 3, 2014 at 4:31 PM, Jerry Raj <jerry@gmail.com> wrote:

> Exception in thread "main" java.lang.RuntimeException: [1.57] failure:
> ``('' expected but identifier myudf found
>
> I also tried returning a List of Ints, that did not work either. Is there
> a way to write a UDF that returns a list?


You seem to be hitting a parser limitation before your function is even
called. The message says that an opening parenthesis is required at this
position, right after "in", and I am afraid you won't get around this
whatever function you write (though maybe the HiveContext provides a
possibility).
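
One possible workaround, sketched here under assumptions: compute the values
on the driver and splice them into the query as a parenthesized literal list,
which is the form the parser accepts after "in". The helper names allowedIds
and inClause are hypothetical, and the sketch assumes a non-empty list of
Ints (string values would need proper quoting and escaping).

```scala
// Hypothetical helper standing in for the UDF: the ids are computed on the
// driver rather than inside the query.
def allowedIds(): Seq[Int] = Seq(1, 2, 3)

// Renders "column in (v1, v2, ...)". Assumes a non-empty list of Ints,
// since an empty "in ()" would not parse.
def inClause(column: String, ids: Seq[Int]): String = {
  require(ids.nonEmpty, "IN list must not be empty")
  s"$column in (${ids.mkString(", ")})"
}

val clause = inClause("doc_id", allowedIds())
val sql = s"select doc_id, doc_value from doc_table where $clause"
```

The resulting string can then be passed to sqlCtx.sql(sql) in place of the
original query.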

Tobias


RE: Spark SQL UDF returning a list?

2014-12-03 Thread Cheng, Hao
Yes, I agree, and the semantics may also be ambiguous: a list of objects vs. a
list containing a single List object.

I’ve also tested this, and it seems that:

a.  There is a bug in registerFunction: it doesn’t support a UDF without
arguments. (I just created a PR for this:
https://github.com/apache/spark/pull/3595 )

b.  It expects the function's return type to be immutable.Seq[XX] for a list,
immutable.Map[X, X] for a map, scala.Product for a struct, and only Array[Byte]
for binary. Array[_] is not supported.
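
To illustrate point b, here is a minimal sketch in plain Scala (no Spark
dependency) of turning an Array-returning function into one that returns an
immutable Seq. The names asArray and asSeq are hypothetical, and each function
takes a dummy Int argument only because of the zero-argument issue in point a.

```scala
import scala.collection.immutable

// Returns Array[Int]: the return type reported above as unsupported.
val asArray: Int => Array[Int] = _ => Array(1, 2, 3)

// Same values as an immutable.Seq[Int]; List is an immutable Seq.
val asSeq: Int => immutable.Seq[Int] = x => asArray(x).toList
```

A function shaped like asSeq could then be passed to registerFunction in place
of the Array version. That step is untested here, and it does not lift the
parser restriction on "in" discussed earlier in the thread.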

Cheng Hao
