If you scroll further down in the documentation, you will see that callUDF does have an overload which takes (String, Column...) as arguments: callUDF(java.lang.String udfName, Column... cols), documented at https://spark.apache.org/docs/latest/api/java/org/apache/spark/sql/functions.html#callUDF(java.lang.String,%20org.apache.spark.sql.Column...)
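From the Java side that overload would be used roughly like this (a minimal sketch only, assuming a Spark version where the varargs overload is visible to Java as in the linked docs, and reusing the "simpleUDF" example and the df with "id" and "value" columns from below):

    import static org.apache.spark.sql.functions.callUDF;
    import static org.apache.spark.sql.functions.col;

    import org.apache.spark.sql.DataFrame;

    // Assumes `df` already exists with columns "id" and "value", and that a UDF
    // named "simpleUDF" has been registered on its SQLContext (see the example below).
    DataFrame result = df.select(col("id"), callUDF("simpleUDF", col("value")));
    result.show();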
Unfortunately the link I posted above doesn't seem to work because of the punctuation in the URL, but it is there. If you use "callUdf" from Java with a string argument, which is what you seem to be doing, it expects a Seq<Column> because of the way it is defined in Scala. That's also a deprecated method anyway.

The reason you're getting the exception is not that you're calling the wrong method; it's that the percentile_approx UDF is never registered. If you're passing in a UDF by name, you must register it with your SQLContext first, as follows (example taken from the documentation of the method referenced above):

    import org.apache.spark.sql._
    val df = Seq(("id1", 1), ("id2", 4), ("id3", 5)).toDF("id", "value")
    val sqlContext = df.sqlContext
    sqlContext.udf.register("simpleUDF", (v: Int) => v * v)
    df.select($"id", callUDF("simpleUDF", $"value"))

On Mon, Dec 28, 2015 at 11:08 AM Umesh Kacha <umesh.ka...@gmail.com> wrote:

> Hi, thanks, but you have understood the question incorrectly. First of all,
> I am passing the UDF name as a String, and if you look at the callUDF
> arguments, it does not take a string as the first argument; and if I use
> callUDF, it throws an exception saying the percentile_approx function is
> not found. Another thing I mentioned is that the same call works in the
> Spark Scala console, so there is no problem with calling it in an
> unexpected way. I hope the question is clear now.
>
> On Mon, Dec 28, 2015 at 9:21 PM, Hamel Kothari <hamelkoth...@gmail.com>
> wrote:
>
>> Also, if I'm reading correctly, it looks like you're calling "callUdf"
>> when what you probably want is "callUDF" (notice the subtle capitalization
>> difference). Docs:
>> https://spark.apache.org/docs/latest/api/java/org/apache/spark/sql/functions.html#callUDF(java.lang.String,%20org.apache.spark.sql.Column...)
>>
>> On Mon, Dec 28, 2015 at 10:48 AM Hamel Kothari <hamelkoth...@gmail.com>
>> wrote:
>>
>>> Would you mind sharing more of your code? I can't really see the code
>>> that well from the attached screenshot, but it appears that "Lit" is
>>> capitalized. I'm not sure what that method refers to, but the definition
>>> in functions.scala is lowercase.
>>>
>>> Even if that's not it, some more code would be helpful in solving this.
>>> Also, since it's a compilation error, sharing the compilation error
>>> itself would be very useful.
>>>
>>> -Hamel
>>>
>>> On Mon, Dec 28, 2015 at 10:26 AM unk1102 <umesh.ka...@gmail.com> wrote:
>>>
>>>> <http://apache-spark-user-list.1001560.n3.nabble.com/file/n25821/Screen_Shot_2015-12-28_at_8.jpg>
>>>>
>>>> Hi, I am trying to invoke a Hive UDF using
>>>> dataframe.select(callUdf("percentile_approx", col("C1"), lit(0.25))),
>>>> but it does not compile, although the same call works in the Spark
>>>> Scala console; I don't understand why. I am using the Spark 1.5.2
>>>> Maven source in my Java code. I have also explicitly added the Maven
>>>> dependency hive-exec-1.2.1.spark.jar, where percentile_approx is
>>>> located, but the code still does not compile. Please check the
>>>> attached code image. Please guide. Thanks in advance.
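For completeness, one possible shape for the original percentile_approx call from Java is sketched below. This is an untested sketch, not a verified fix for 1.5.2: it assumes a HiveContext (so Hive's built-in functions such as percentile_approx can be resolved by name, which is how the spark-shell case would work), and "my_table" and column "C1" are placeholders:

    import static org.apache.spark.sql.functions.callUDF;
    import static org.apache.spark.sql.functions.col;
    import static org.apache.spark.sql.functions.lit;

    import org.apache.spark.SparkContext;
    import org.apache.spark.sql.DataFrame;
    import org.apache.spark.sql.hive.HiveContext;

    // Assumes an existing SparkContext `sc`; "my_table" and "C1" are placeholder
    // names for whatever the real data set is.
    HiveContext hiveContext = new HiveContext(sc);
    DataFrame df = hiveContext.table("my_table");

    // percentile_approx is a Hive UDAF; with a HiveContext it should be resolvable
    // by name, otherwise it has to be registered as described in the reply above.
    DataFrame quartile = df.select(callUDF("percentile_approx", col("C1"), lit(0.25)));
    quartile.show();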