Hello Jayant,

Thanks for the great OSS contribution :)
On Thu, Jul 12, 2018 at 1:36 PM, Jayant Shekhar <jayantbaya...@gmail.com> wrote:

> Hello Chetan,
>
> Sorry I missed replying earlier. You can find some sample code here:
>
> http://sparkflows.readthedocs.io/en/latest/user-guide/python/pipe-python.html
>
> We will continue adding more there.
>
> Feel free to ping me directly in case of questions.
>
> Thanks,
> Jayant
>
> On Mon, Jul 9, 2018 at 9:56 PM, Chetan Khatri <chetan.opensou...@gmail.com> wrote:
>
>> Hello Jayant,
>>
>> Thank you so much for the suggestion. My idea was to use a Python function
>> as a transformation which can take a couple of column names and return an
>> object, which is what you explained. Would it be possible to point me to a
>> similar codebase example?
>>
>> Thanks.
>>
>> On Fri, Jul 6, 2018 at 2:56 AM, Jayant Shekhar <jayantbaya...@gmail.com> wrote:
>>
>>> Hello Chetan,
>>>
>>> We have currently done it with .pipe(.py), as Prem suggested.
>>>
>>> That passes the RDD as CSV strings to the Python script. The Python
>>> script can either process it line by line, create the result, and return
>>> it back, or create things like a Pandas DataFrame for processing and
>>> finally write the results back.
>>>
>>> In the Spark/Scala/Java code, you get back an RDD of strings, which we
>>> convert back to a DataFrame.
>>>
>>> Feel free to ping me directly in case of questions.
>>>
>>> Thanks,
>>> Jayant
>>>
>>> On Thu, Jul 5, 2018 at 3:39 AM, Chetan Khatri <chetan.opensou...@gmail.com> wrote:
>>>
>>>> Sure, Prem. Thanks for the suggestion.
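The pipe script Jayant describes, reading CSV lines from stdin, transforming them, and writing results to stdout, could look like the minimal sketch below. The two-column `name,value` layout and the doubling transform are illustrative assumptions, not part of the thread:

```python
import sys

def transform(line):
    # Hypothetical transformation: parse a CSV line "name,value"
    # and emit "name,value*2". The column layout is an assumption.
    name, value = line.rstrip("\n").split(",")
    return "{},{}".format(name, int(value) * 2)

if __name__ == "__main__":
    # Spark's RDD.pipe feeds each partition's elements to this script's
    # stdin, one element per line, and collects whatever is printed
    # to stdout as the resulting RDD[String].
    for line in sys.stdin:
        print(transform(line))
```

On the Scala side, the strings collected from stdout come back as an `RDD[String]`, which (as the thread notes) can then be parsed and converted back into a DataFrame.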
>>>>
>>>> On Wed, Jul 4, 2018 at 8:38 PM, Prem Sure <sparksure...@gmail.com> wrote:
>>>>
>>>>> Try .pipe(.py) on the RDD.
>>>>>
>>>>> Thanks,
>>>>> Prem
>>>>>
>>>>> On Wed, Jul 4, 2018 at 7:59 PM, Chetan Khatri <chetan.opensou...@gmail.com> wrote:
>>>>>
>>>>>> Can someone please advise? Thanks.
>>>>>>
>>>>>> On Tue 3 Jul, 2018, 5:28 PM Chetan Khatri, <chetan.opensou...@gmail.com> wrote:
>>>>>>
>>>>>>> Hello Dear Spark Users / Devs,
>>>>>>>
>>>>>>> I would like to pass a Python user-defined function to a Spark job
>>>>>>> developed in Scala, with the return value of that function returned
>>>>>>> to the DataFrame / Dataset API.
>>>>>>>
>>>>>>> Can someone please guide me on the best approach to do this? The
>>>>>>> Python function would mostly be a transformation function. I would
>>>>>>> also like to pass a Java function as a string to the Spark / Scala
>>>>>>> job, to be applied to an RDD / DataFrame, returning an RDD /
>>>>>>> DataFrame.
>>>>>>>
>>>>>>> Thank you.
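Outside of Spark, the mechanics of `RDD.pipe` can be simulated with a plain subprocess: CSV strings go to a child process on stdin, and the transformed lines come back on stdout. This is only a local sketch of the pattern discussed in the thread; in a real job the child script would be the external `.py` file passed to `rdd.pipe(...)`, and the doubling transform here is an assumption for illustration:

```python
import subprocess
import sys

# Inline stand-in for the external pipe script. In Spark this would be
# a separate .py file shipped to the executors.
child_script = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    name, value = line.strip().split(',')\n"
    "    print(name + ',' + str(int(value) * 2))\n"
)

def pipe_like(lines):
    """Feed each element on stdin, one per line, and collect stdout lines,
    mimicking what RDD.pipe does per partition."""
    proc = subprocess.run(
        [sys.executable, "-c", child_script],
        input="\n".join(lines),
        capture_output=True,
        text=True,
    )
    return proc.stdout.splitlines()

# e.g. pipe_like(["a,1", "b,2"]) yields ["a,2", "b,4"]
```

This keeps the Scala driver and the Python transformation fully decoupled: the only contract between them is the line-oriented text format flowing through stdin/stdout.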