Re: [Spark Sql/ UDFs] Spark and Hive UDFs parity

2017-06-18 Thread Yong Zhang
time, which can control your UDF's behavior. If you have a concrete example that you cannot do in a Spark Scala UDF, you can post it here.

Yong

From: RD
Sent: Friday, June 16, 2017 11:37 AM
To: Georg Heiler
Cc: user@spark.apache.org
Subject: Re: [Spark Sql/ UDFs]

Re: [Spark Sql/ UDFs] Spark and Hive UDFs parity

2017-06-16 Thread Georg Heiler
I assume you want this life cycle in order to create big, heavy, or complex objects only once (per partition). mapPartitions should fit this use case pretty well.

RD wrote on Fri, June 16, 2017 at 17:37:
> Thanks Georg. But I'm not sure how mapPartitions is relevant here. Can
> you elabora
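To illustrate the once-per-partition idea, here is a minimal sketch in plain Python. It is not taken from the thread: `HeavyResource` and `process_partition` are hypothetical names, and the partitions are modeled as ordinary lists so the snippet runs without a Spark installation.

```python
from typing import Iterator, List


class HeavyResource:
    """Hypothetical stand-in for an expensive object (parser, model, connection)."""

    instances_created = 0  # counts constructions, to show how often setup runs

    def __init__(self) -> None:
        HeavyResource.instances_created += 1

    def transform(self, value: str) -> str:
        # Placeholder for the real per-row work.
        return value.upper()


def process_partition(rows: Iterator[str]) -> Iterator[str]:
    """Partition-level function: setup happens once, then every row reuses it."""
    resource = HeavyResource()  # built once per partition, not once per row
    for row in rows:
        yield resource.transform(row)


# Local stand-in for a dataset split into two partitions (assumed data).
partitions: List[List[str]] = [["a", "b"], ["c", "d", "e"]]
results = [list(process_partition(iter(p))) for p in partitions]
```

With PySpark, a function of this shape (iterator in, iterator out) is what `rdd.mapPartitions(process_partition)` expects, so the expensive setup plays a role similar to a per-task `initialize()` step. Whether that covers a given Hive UDF depends on what its lifecycle methods actually do.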

Re: [Spark Sql/ UDFs] Spark and Hive UDFs parity

2017-06-16 Thread RD
Thanks Georg. But I'm not sure how mapPartitions is relevant here. Can you elaborate?

On Thu, Jun 15, 2017 at 4:18 AM, Georg Heiler wrote:
> What about using map partitions instead?
>
> RD wrote on Thu, June 15, 2017 at 06:52:
>
>> Hi Spark folks,
>>
>> Is there any plan to support the

Re: [Spark Sql/ UDFs] Spark and Hive UDFs parity

2017-06-15 Thread Georg Heiler
What about using map partitions instead?

RD wrote on Thu, June 15, 2017 at 06:52:
> Hi Spark folks,
>
> Is there any plan to support the richer UDF API that Hive supports for
> Spark UDFs? Hive supports the GenericUDF API which has, among others,
> methods like initialize(), configure() (cal

[Spark Sql/ UDFs] Spark and Hive UDFs parity

2017-06-14 Thread RD
Hi Spark folks,

Is there any plan to support the richer UDF API that Hive supports for Spark UDFs? Hive supports the GenericUDF API, which has, among others, methods like initialize(), configure() (called once on the cluster), etc., which a lot of our users use. We now have a lot of UDFs in Hive