Thanks Tathagata!
I did mean using the transformation in the form of a UDF in Spark SQL. The
function I envision works on individual records, as you described.
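
To make that concrete, here is a minimal sketch of registering such a
per-record function as a Spark SQL UDF. It assumes the sqlContext.udf.register
API (Spark 1.3+ style), and the "encrypt" body is only a placeholder, not real
encryption:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object UdfSketch {
  // Placeholder record-level "encryption"; swap in a real cipher as needed.
  def encrypt(s: String): String = s.reverse

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("udf-sketch").setMaster("local[*]"))
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    // Register the record-level function for use from SQL queries.
    sqlContext.udf.register("encrypt", encrypt _)

    // Hypothetical table: two (user, payload) rows registered as "events".
    val df = sc.parallelize(Seq(("alice", "secret1"), ("bob", "secret2")))
      .toDF("user", "payload")
    df.registerTempTable("events")
    sqlContext.sql("SELECT user, encrypt(payload) AS payload FROM events").show()

    sc.stop()
  }
}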


On Fri, Aug 8, 2014 at 6:48 PM, Tathagata Das <tathagata.das1...@gmail.com>
wrote:

> You can always define an arbitrary RDD-to-RDD function, use it from both
> Spark and Spark Streaming. For example,
>
> def myTransformation(rdd: RDD[X]): RDD[Y] = { .... }
>
> In Spark you can obviously apply it to an RDD. In Spark Streaming, you can
> apply it to the RDDs of a DStream with
>
> myDStream.transform(rdd => myTransformation(rdd))
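>
> As a fuller, self-contained sketch of that pattern (the app name, local
> master, and socket source below are placeholders, not part of the original
> example):
>
> import org.apache.spark.{SparkConf, SparkContext}
> import org.apache.spark.rdd.RDD
> import org.apache.spark.streaming.{Seconds, StreamingContext}
>
> object SharedTransformSketch {
>   // One RDD-to-RDD function, reusable from batch Spark and Spark Streaming.
>   def myTransformation(rdd: RDD[String]): RDD[String] = rdd.map(_.reverse)
>
>   def main(args: Array[String]): Unit = {
>     val sc = new SparkContext(
>       new SparkConf().setAppName("shared-transform").setMaster("local[2]"))
>
>     // Batch Spark: apply it directly to an RDD.
>     myTransformation(sc.parallelize(Seq("alpha", "beta"))).collect().foreach(println)
>
>     // Spark Streaming: apply the same function to every RDD of a DStream.
>     val ssc = new StreamingContext(sc, Seconds(5))
>     val lines = ssc.socketTextStream("localhost", 9999) // hypothetical source
>     lines.transform(rdd => myTransformation(rdd)).print()
>
>     ssc.start()
>     ssc.awaitTermination()
>   }
> }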
>
> I am not sure what you mean by reusing that transformation through Spark
> SQL. Do you mean from a SQL query? In Spark SQL you can register a
> function that operates on each record (so only a map-like function), but
> not an arbitrary transformation on tables. That said, it is easy to mix
> Spark and Spark SQL together: you can run sqlContext.sql("sql query"), get
> back the result RDD, and then apply myTransformation to that RDD.
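>
> A short sketch of that mix, assuming the myTransformation above takes an
> RDD[String] and that a table named "events" with a string column "payload"
> has already been registered (both are placeholder names):
>
> val rows = sqlContext.sql("SELECT payload FROM events")
> val payloads = rows.rdd.map(_.getString(0)) // .rdd on the 1.3+ DataFrame API
> val transformed = myTransformation(payloads)
> transformed.collect().foreach(println)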
>
> Hope this clarifies things.
>
> TD
>
>
> On Fri, Aug 8, 2014 at 11:10 AM, Jeevak Kasarkod <jee...@gmail.com> wrote:
>
>> Is it possible to create custom transformations in Spark? For example,
>> data security transforms such as encrypt and decrypt. Ideally it is
>> something one would like to reuse across Spark Streaming, Spark SQL, and
>> core Spark.
>>
>>
>


-- 
Cheers,
Jeevak
