If you scroll further down in the documentation, you will see that callUDF
does have a version which takes (String, Column...) as arguments:

  callUDF(java.lang.String udfName, Column... cols)
  https://spark.apache.org/docs/latest/api/java/org/apache/spark/sql/functions.html#callUDF(java.lang.String,%20org.apache.spark.sql.Column...)

Unfortunately the link I posted above doesn't seem to work because of the
punctuation in the URL, but the method is there. If you use the deprecated
"callUdf" from Java with a String argument, which is what you seem to be
doing, it expects a Seq<Column> because of the way it is defined in Scala.
That method is deprecated anyway, so there is little reason to use it.
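
For reference, a rough sketch of what the non-deprecated varargs form looks
like when called from Java (the DataFrame variable df, the column name "C1"
and the static imports are assumptions based on your snippet, not a tested
drop-in; the function still has to be resolvable by name, see below):

  import org.apache.spark.sql.DataFrame;
  import static org.apache.spark.sql.functions.callUDF;
  import static org.apache.spark.sql.functions.col;
  import static org.apache.spark.sql.functions.lit;

  // callUDF(String, Column...) is a plain Java varargs method, so no Scala
  // Seq<Column> has to be built on the caller's side:
  DataFrame result =
      df.select(callUDF("percentile_approx", col("C1"), lit(0.25)));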

The reason you're getting the exception is not because that's the wrong
method to call. It's because the percentile_approx UDF is never registered.
If you're passing in a UDF by name, you must register it with your SQL
context, as in this example taken from the documentation of the method
referenced above:

  import org.apache.spark.sql._

  val df = Seq(("id1", 1), ("id2", 4), ("id3", 5)).toDF("id", "value")
  val sqlContext = df.sqlContext
  sqlContext.udf.register("simpleUDF", (v: Int) => v * v)
  df.select($"id", callUDF("simpleUDF", $"value"))




On Mon, Dec 28, 2015 at 11:08 AM Umesh Kacha <umesh.ka...@gmail.com> wrote:

> Hi, thanks, but you understood the question incorrectly. First of all, I am
> passing the UDF name as a String, and if you look at the callUDF arguments,
> it does not take a String as the first argument; if I use callUDF it throws
> an exception saying the percentile_approx function is not found. Another
> thing I mentioned is that the same call works in the Spark Scala console, so
> the problem is not that I am calling it in an unexpected way. I hope the
> question is clear now.
>
> On Mon, Dec 28, 2015 at 9:21 PM, Hamel Kothari <hamelkoth...@gmail.com>
> wrote:
>
>> Also, if I'm reading correctly, it looks like you're calling "callUdf"
>> when what you probably want is "callUDF" (notice the subtle capitalization
>> difference). Docs:
>> https://spark.apache.org/docs/latest/api/java/org/apache/spark/sql/functions.html#callUDF(java.lang.String,%20org.apache.spark.sql.Column...)
>>
>> On Mon, Dec 28, 2015 at 10:48 AM Hamel Kothari <hamelkoth...@gmail.com>
>> wrote:
>>
>>> Would you mind sharing more of your code? I can't really see the code
>>> that well from the attached screenshot but it appears that "Lit" is
>>> capitalized. Not sure what this method actually refers to but the
>>> definition in functions.scala is lowercased.
>>>
>>> Even if that's not it, some more code would be helpful in solving this.
>>> Also, since it's a compilation error, it would be very useful if you could
>>> share the compilation error itself.
>>>
>>> -Hamel
>>>
>>> On Mon, Dec 28, 2015 at 10:26 AM unk1102 <umesh.ka...@gmail.com> wrote:
>>>
>>>> <http://apache-spark-user-list.1001560.n3.nabble.com/file/n25821/Screen_Shot_2015-12-28_at_8.jpg>
>>>>
>>>> Hi, I am trying to invoke a Hive UDF using
>>>> dataframe.select(callUdf("percentile_approx", col("C1"), lit(0.25))), but
>>>> it does not compile, even though the same call works in the Spark Scala
>>>> console, and I don't understand why. I am using the Spark 1.5.2 Maven
>>>> source in my Java code. I have also explicitly added the Maven dependency
>>>> hive-exec-1.2.1.spark.jar, where percentile_approx is located, but the
>>>> code still does not compile. Please check the attached code image. Please
>>>> guide. Thanks in advance.
>>>>
>>>>
>>>>
>
