Hi,

I load a CSV file as follows:

val df = sqlContext.read
  .format("com.databricks.spark.csv")
  .option("inferSchema", "true")
  .option("header", "true")
  .load("/data/stg/table2")

I have defined this UDF

def ChangeDate(word: String): String = {
  word.substring(6, 10) + "-" + word.substring(3, 5) + "-" + word.substring(0, 2)
}
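For reference, the transformation can be checked standalone outside Spark (a minimal sketch; the object name and the sample value are just for illustration):

```scala
object DateCheck {
  // Same logic as the UDF above: dd/MM/yyyy -> yyyy-MM-dd
  def ChangeDate(word: String): String =
    word.substring(6, 10) + "-" + word.substring(3, 5) + "-" + word.substring(0, 2)

  def main(args: Array[String]): Unit = {
    val sample = "10/02/2014"
    println(ChangeDate(sample)) // prints 2014-02-10
  }
}
```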

I use the following mapping

scala> df.map(x => (x(1).toString, x(1).toString.substring(6,10)+"-"+x(1).toString.substring(3,5)+"-"+x(1).toString.substring(0,2))).take(1)
res20: Array[(String, String)] = Array((10/02/2014,2014-02-10))

Now, rather than using that long-winded chain of substring calls, can I use some variation of the UDF above?

This does not work

scala> df.map(x => (x(1).toString, changeDate(x(1).toString))
     | )
<console>:22: error: not found: value changeDate
              df.map(x => (x(1).toString, changeDate(x(1).toString))

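One thing I notice while re-reading: Scala is case-sensitive, and the method was defined as ChangeDate but called as changeDate, which would explain the "not found: value changeDate" error. A minimal sketch assuming that is the issue (the method is repeated here, and the row value simulated, so the example runs outside Spark):

```scala
object UdfNameCheck {
  // Same method as defined above; note the capital "C"
  def ChangeDate(word: String): String =
    word.substring(6, 10) + "-" + word.substring(3, 5) + "-" + word.substring(0, 2)

  def main(args: Array[String]): Unit = {
    // Stand-in for a row value; in Spark the call would be
    // df.map(x => (x(1).toString, ChangeDate(x(1).toString))).take(1)
    val cell = "10/02/2014"
    println((cell, ChangeDate(cell))) // prints (10/02/2014,2014-02-10)
  }
}
```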
Any ideas from experts?

Thanks

Dr Mich Talebzadeh



LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com
