Hi Ted,

I checked that hive-exec-1.2.1.spark.jar contains the following required
classes, but my code still doesn't compile. I don't understand why this jar is
getting overwritten in scope.

org/apache/hadoop/hive/ql/udf/generic/GenericUDAFPercentileApprox$GenericUDAFMultiplePercentileApproxEvaluator.class

Please guide.

On Mon, Oct 19, 2015 at 4:30 PM, Umesh Kacha <umesh.ka...@gmail.com> wrote:

> Hi Ted, thanks much for your help, I really appreciate it. I tried to use
> the Maven dependencies you mentioned, but callUdf still does not compile.
> Please find a snapshot of my IntelliJ editor. I am sorry, you may have to
> zoom the pictures as I can't share code. Thanks again.
>
> On Oct 19, 2015 8:32 AM, "Ted Yu" <yuzhih...@gmail.com> wrote:
>
>> Umesh:
>>
>> $ jar tvf /home/hbase/.m2/repository/org/spark-project/hive/hive-exec/1.2.1.spark/hive-exec-1.2.1.spark.jar | grep GenericUDAFPercentile
>> 2143 Fri Jul 31 23:51:48 PDT 2015 org/apache/hadoop/hive/ql/udf/generic/GenericUDAFPercentileApprox$1.class
>> 4602 Fri Jul 31 23:51:48 PDT 2015 org/apache/hadoop/hive/ql/udf/generic/GenericUDAFPercentileApprox$GenericUDAFMultiplePercentileApproxEvaluator.class
>>
>> As long as the following dependency is in your pom.xml:
>> [INFO] +- org.spark-project.hive:hive-exec:jar:1.2.1.spark:compile
>>
>> You should be able to invoke percentile_approx
>>
>> Cheers
>>
>> On Sun, Oct 18, 2015 at 8:58 AM, Umesh Kacha <umesh.ka...@gmail.com> wrote:
>>
>>> Thanks much Ted, so when do we get to use this Spark UDF in Java code
>>> using Maven dependencies? You said JIRA 10671 is not pushed as part of
>>> 1.5.1, so it should be released in 1.6.0 as mentioned in the JIRA, right?
>>>
>>> On Sun, Oct 18, 2015 at 9:20 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>>>
>>>> The udf is defined in GenericUDAFPercentileApprox of hive.
>>>>
>>>> When spark-shell runs, it has access to the above class, which is packaged
>>>> in assembly/target/scala-2.10/spark-assembly-1.6.0-SNAPSHOT-hadoop2.7.0.jar:
>>>>
>>>> 2143 Fri Oct 16 15:02:26 PDT 2015 org/apache/hadoop/hive/ql/udf/generic/GenericUDAFPercentileApprox$1.class
>>>> 4602 Fri Oct 16 15:02:26 PDT 2015 org/apache/hadoop/hive/ql/udf/generic/GenericUDAFPercentileApprox$GenericUDAFMultiplePercentileApproxEvaluator.class
>>>> 1697 Fri Oct 16 15:02:26 PDT 2015 org/apache/hadoop/hive/ql/udf/generic/GenericUDAFPercentileApprox$GenericUDAFPercentileApproxEvaluator$PercentileAggBuf.class
>>>> 6570 Fri Oct 16 15:02:26 PDT 2015 org/apache/hadoop/hive/ql/udf/generic/GenericUDAFPercentileApprox$GenericUDAFPercentileApproxEvaluator.class
>>>> 4334 Fri Oct 16 15:02:26 PDT 2015 org/apache/hadoop/hive/ql/udf/generic/GenericUDAFPercentileApprox$GenericUDAFSinglePercentileApproxEvaluator.class
>>>> 6293 Fri Oct 16 15:02:26 PDT 2015 org/apache/hadoop/hive/ql/udf/generic/GenericUDAFPercentileApprox.class
>>>>
>>>> That was the cause for different behavior.
>>>>
>>>> FYI
>>>>
>>>> On Sun, Oct 18, 2015 at 12:10 AM, unk1102 <umesh.ka...@gmail.com> wrote:
>>>>
>>>>> Hi, I am starting a new thread following the old one. It looks like the
>>>>> code needed to compile
>>>>> callUdf("percentile_approx",col("mycol"),lit(0.25)) is not merged in
>>>>> the Spark 1.5.1 source, but I don't understand why this function call
>>>>> works in Spark 1.5.1 spark-shell/bin. Please guide.
>>>>>
>>>>> ---------- Forwarded message ----------
>>>>> From: "Ted Yu" <yuzhih...@gmail.com>
>>>>> Date: Oct 14, 2015 3:26 AM
>>>>> Subject: Re: How to calculate percentile of a column of DataFrame?
>>>>> To: "Umesh Kacha" <umesh.ka...@gmail.com>
>>>>> Cc: "Michael Armbrust" <mich...@databricks.com>,
>>>>> "<saif.a.ell...@wellsfargo.com>" <saif.a.ell...@wellsfargo.com>,
>>>>> "user" <user@spark.apache.org>
>>>>>
>>>>> I modified DataFrameSuite, in master branch, to call percentile_approx
>>>>> instead of simpleUDF:
>>>>>
>>>>> - deprecated callUdf in SQLContext
>>>>> - callUDF in SQLContext *** FAILED ***
>>>>> org.apache.spark.sql.AnalysisException: undefined function percentile_approx;
>>>>> at org.apache.spark.sql.catalyst.analysis.SimpleFunctionRegistry$$anonfun$2.apply(FunctionRegistry.scala:64)
>>>>> at org.apache.spark.sql.catalyst.analysis.SimpleFunctionRegistry$$anonfun$2.apply(FunctionRegistry.scala:64)
>>>>> at scala.Option.getOrElse(Option.scala:120)
>>>>> at org.apache.spark.sql.catalyst.analysis.SimpleFunctionRegistry.lookupFunction(FunctionRegistry.scala:63)
>>>>> at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$$anonfun$apply$10$$anonfun$applyOrElse$5$$anonfun$applyOrElse$24.apply(Analyzer.scala:506)
>>>>> at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$$anonfun$apply$10$$anonfun$applyOrElse$5$$anonfun$applyOrElse$24.apply(Analyzer.scala:506)
>>>>> at org.apache.spark.sql.catalyst.analysis.package$.withPosition(package.scala:48)
>>>>> at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$$anonfun$apply$10$$anonfun$applyOrElse$5.applyOrElse(Analyzer.scala:505)
>>>>> at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$$anonfun$apply$10$$anonfun$applyOrElse$5.applyOrElse(Analyzer.scala:502)
>>>>> at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:227)
>>>>>
>>>>> SPARK-10671 is included.
>>>>> For 1.5.1, I guess the absence of SPARK-10671 means that SparkSQL treats
>>>>> percentile_approx as normal UDF.
>>>>>
>>>>> Experts can correct me, if there is any misunderstanding.
>>>>>
>>>>> Cheers
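
For the question at the top of the thread about which jar is actually in scope, one way to check from inside the application is to ask the JVM where it loaded a class from. The following is a minimal sketch that uses only JDK calls. ClasspathProbe and locate are hypothetical names introduced here for illustration, and the class name in main is the one discussed in the thread. It reports which jar wins on the runtime classpath; it does not by itself explain a compile error in callUdf.

```java
import java.security.CodeSource;

/** Hypothetical helper: reports which jar or directory supplied a class. */
final class ClasspathProbe {

    private ClasspathProbe() {}

    /**
     * Returns the URL the class was loaded from, "bootstrap" for JDK classes
     * that carry no code source, or "missing" if the class cannot be loaded.
     */
    static String locate(String className) {
        try {
            // initialize=false: resolve the class without running static initializers.
            Class<?> cls = Class.forName(
                    className, false, ClasspathProbe.class.getClassLoader());
            CodeSource source = cls.getProtectionDomain().getCodeSource();
            if (source == null || source.getLocation() == null) {
                return "bootstrap";
            }
            return source.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "missing";
        }
    }

    public static void main(String[] args) {
        // Class name taken from the jar listing in this thread.
        String name = "org.apache.hadoop.hive.ql.udf.generic.GenericUDAFPercentileApprox";
        System.out.println(name + " -> " + locate(name));
    }
}
```

Run with the same classpath as the application. If the printed location is not the hive-exec-1.2.1.spark.jar you expect, another dependency is supplying the class; if it prints "missing", the jar is not on the runtime classpath at all.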