Re: use WithColumn with external function in a java jar
Thanks, I'll check it out.

On Mon, Aug 28, 2017 at 10:22 PM Praneeth Gayam wrote:

> You can create a UDF which will invoke your java lib
>
> def calculateExpense: UserDefinedFunction = udf((pexpense: String, cexpense:
> String) => new MyJava().calculateExpense(pexpense.toDouble,
> cexpense.toDouble))
Re: use WithColumn with external function in a java jar
You can create a UDF which will invoke your Java lib:

    import org.apache.spark.sql.expressions.UserDefinedFunction
    import org.apache.spark.sql.functions.udf

    // Both columns are Strings, so convert them before calling the Java method.
    def calculateExpense: UserDefinedFunction =
      udf((pexpense: String, cexpense: String) =>
        new MyJava().calculateExpense(pexpense.toDouble, cexpense.toDouble))

On Tue, Aug 29, 2017 at 6:53 AM, purna pradeep wrote:

> I have data in a DataFrame with below columns
>
> 1)Fileformat is csv
> 2)All below column datatypes are String
>
> employeeid,pexpense,cexpense
>
> Now I need to create a new DataFrame which has new column called
> `expense`, which is calculated based on columns `pexpense`, `cexpense`.
>
> The tricky part is the calculation algorithm is not an **UDF** function
> which I created, but it's an external function that needs to be imported
> from a Java library which takes primitive types as arguments - in this case
> `pexpense`, `cexpense` - to calculate the value required for new column.
>
> The external function signature
>
> public class MyJava
> {
>     public Double calculateExpense(Double pexpense, Double cexpense) {
>         // calculation
>     }
> }
>
> So how can I invoke that external function to create a new calculated
> column. Can I register that external function as UDF in my Spark
> application?
>
> Stackoverflow reference
>
> https://stackoverflow.com/questions/45928007/use-withcolumn-with-external-function
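A fuller sketch of the UDF approach suggested above, wired into `withColumn`. The thread does not show the body of `MyJava.calculateExpense`, so the class below is only a stand-in with a made-up calculation (a plain sum). `ExpenseUdfSketch`, `expenseOf`, and the inline sample rows are likewise hypothetical names chosen for illustration. Unlike the one-liner, this version assumes the String columns may contain nulls or non-numeric text and maps those rows to a null `expense` instead of throwing.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.UserDefinedFunction
import org.apache.spark.sql.functions.{col, udf}
import scala.util.Try

// Stand-in for the class shipped in the external jar. The real body is not
// shown in the thread; the sum below is a hypothetical placeholder.
class MyJava {
  def calculateExpense(pexpense: java.lang.Double, cexpense: java.lang.Double): java.lang.Double =
    java.lang.Double.valueOf(pexpense.doubleValue + cexpense.doubleValue)
}

object ExpenseUdfSketch {
  // Parse both String columns and call the library method.
  // Returns None (a SQL NULL) if either input is null or not numeric,
  // or if the library itself returns null.
  def expenseOf(pexpense: String, cexpense: String): Option[Double] =
    for {
      p <- Option(pexpense).flatMap(s => Try(s.trim.toDouble).toOption)
      c <- Option(cexpense).flatMap(s => Try(s.trim.toDouble).toOption)
      r <- Option(new MyJava().calculateExpense(p, c))
    } yield r.doubleValue

  // Wrap the plain function as a Spark UDF usable in Column expressions.
  val calculateExpense: UserDefinedFunction = udf(expenseOf _)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("expense-udf-sketch")
      .master("local[*]") // assumption: local run for illustration only
      .getOrCreate()
    import spark.implicits._

    // Inline rows stand in for the CSV file; every column is a String,
    // as described in the question. The second row has a non-numeric value.
    val df = Seq(("e1", "100.5", "20.0"), ("e2", "n/a", "5.0"))
      .toDF("employeeid", "pexpense", "cexpense")

    // Add the computed column by applying the UDF to the two source columns.
    val withExpense =
      df.withColumn("expense", calculateExpense(col("pexpense"), col("cexpense")))

    withExpense.show()
    spark.stop()
  }
}
```

Creating `MyJava` inside the function means the instance never has to be serialized with the closure. That choice assumes the constructor is cheap; if it is not, holding one instance in a Scala `object` is a common alternative.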
use WithColumn with external function in a java jar
I have data in a DataFrame with below columns

1)Fileformat is csv
2)All below column datatypes are String

employeeid,pexpense,cexpense

Now I need to create a new DataFrame which has new column called
`expense`, which is calculated based on columns `pexpense`, `cexpense`.

The tricky part is the calculation algorithm is not an **UDF** function
which I created, but it's an external function that needs to be imported
from a Java library which takes primitive types as arguments - in this case
`pexpense`, `cexpense` - to calculate the value required for new column.

The external function signature

public class MyJava
{
    public Double calculateExpense(Double pexpense, Double cexpense) {
        // calculation
    }
}

So how can I invoke that external function to create a new calculated
column. Can I register that external function as UDF in my Spark
application?

Stackoverflow reference

https://stackoverflow.com/questions/45928007/use-withcolumn-with-external-function
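On the last question, whether the external function can be registered as a UDF: one possible sketch is to register it by name with `spark.udf.register`, which makes it callable from SQL expressions. The `MyJava` body here is a hypothetical stand-in (the thread only gives the signature), and `RegisterUdfSketch`, `expense`, and the sample row are illustrative names. The String columns are cast to DOUBLE in the expression. How a failed cast behaves (null versus an error) depends on the Spark version and its ANSI settings, so the sample data is kept numeric.

```scala
import org.apache.spark.sql.SparkSession

// Stand-in for the class in the external jar; the calculation is a
// hypothetical placeholder, since the thread only shows the signature.
class MyJava {
  def calculateExpense(pexpense: java.lang.Double, cexpense: java.lang.Double): java.lang.Double =
    java.lang.Double.valueOf(pexpense.doubleValue + cexpense.doubleValue)
}

object RegisterUdfSketch {
  // Thin Scala adapter around the Java method, taking primitive doubles.
  def expense(pexpense: Double, cexpense: Double): Double =
    new MyJava().calculateExpense(pexpense, cexpense).doubleValue

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("register-udf-sketch")
      .master("local[*]") // assumption: local run for illustration only
      .getOrCreate()
    import spark.implicits._

    // Register the adapter under a name that SQL expressions can reference.
    spark.udf.register("calculateExpense", expense _)

    // Inline row standing in for the CSV data; all columns are Strings.
    val df = Seq(("e1", "100.5", "20.0"))
      .toDF("employeeid", "pexpense", "cexpense")

    // Cast the String columns to DOUBLE, then call the registered UDF by name.
    val withExpense = df.selectExpr(
      "employeeid",
      "pexpense",
      "cexpense",
      "calculateExpense(CAST(pexpense AS DOUBLE), CAST(cexpense AS DOUBLE)) AS expense"
    )

    withExpense.show()
    spark.stop()
  }
}
```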