OK. For plain SQL, I have no idea other than defining a UDAF.
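
If stepping outside plain SQL for one step is acceptable, one option is to run the relational part in SQL, switch to the typed Dataset API for `mapPartitions`, and register the result as a view again. The following is only a minimal sketch under assumptions: `nativeBulk` is a hypothetical stand-in for the JNI-wrapped function, the view names `input` and `scored` and the batch size of 1000 are made up for illustration, and it runs against a local master.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}

object BulkNativeCall {
  // Hypothetical stand-in for the JNI-wrapped native function: it takes a
  // whole batch of inputs and returns one result per input, in order.
  def nativeBulk(inputs: Array[String]): Array[Int] = inputs.map(_.length)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("bulk-native-call")
      .getOrCreate()
    import spark.implicits._

    // Illustrative input data registered as a SQL view.
    Seq("a", "bb", "ccc", "dddd").toDF("value").createOrReplaceTempView("input")

    // Do the relational work in SQL, then switch to a typed Dataset.
    val selected: Dataset[String] = spark.sql("SELECT value FROM input").as[String]

    // mapPartitions passes an iterator over one partition. grouped() keeps
    // memory bounded by handing the native call fixed-size batches.
    val scored = selected.mapPartitions { rows =>
      rows.grouped(1000).flatMap { batch =>
        val arr = batch.toArray
        arr.iterator.zip(nativeBulk(arr).iterator)
      }
    }.toDF("value", "score")

    // Back to SQL for whatever comes next.
    scored.createOrReplaceTempView("scored")
    spark.sql("SELECT value, score FROM scored ORDER BY value").show()

    spark.stop()
  }
}
```

The bulk call itself still lives in Scala rather than in the SQL text, so this does not answer the pure-SQL question; it only keeps the rest of the pipeline in SQL on either side of the `mapPartitions` step.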

On Mon, Jun 26, 2017 at 10:59 AM, jeff saremi <jeffsar...@hotmail.com> wrote:

> My specific and immediate need is this: We have a native function wrapped
> in JNI. To increase performance we'd like to avoid calling it record by
> record. mapPartitions() give us the ability to invoke this in bulk. We're
> looking for a similar approach in SQL.
>
> ------------------------------
> *From:* Ryan <ryan.hd....@gmail.com>
> *Sent:* Sunday, June 25, 2017 7:18:32 PM
> *To:* jeff saremi
> *Cc:* user@spark.apache.org
> *Subject:* Re: What is the equivalent of mapPartitions in SpqrkSQL?
>
> Why would you like to do so? I think there's no need for us to explicitly
> ask for a forEachPartition in spark sql because tungsten is smart enough to
> figure out whether a sql operation could be applied on each partition or
> there has to be a shuffle.
>
> On Sun, Jun 25, 2017 at 11:32 PM, jeff saremi <jeffsar...@hotmail.com>
> wrote:
>
>> You can do a map() using a select and functions/UDFs. But how do you
>> process a partition using SQL?