Hi,
I have the following CSV load:
val df = sqlContext.read
  .format("com.databricks.spark.csv")
  .option("inferSchema", "true")
  .option("header", "true")
  .load("/data/stg/table2")
I have defined this UDF (note: in Scala a bare `return` on its own line returns Unit and makes the next line dead code, so the expression must stay on the same line, or the `return` keyword dropped):
def ChangeDate(word: String): String = {
  word.substring(6, 10) + "-" + word.substring(3, 5) + "-" + word.substring(0, 2)
}
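As an aside, the same dd/MM/yyyy to yyyy-MM-dd conversion can be sketched with the JDK's java.time API instead of raw substring offsets, which also rejects malformed input. This is a sketch; `changeDateSafe` is a hypothetical name, not from the thread:

```scala
import java.time.LocalDate
import java.time.format.DateTimeFormatter

// Parse dd/MM/yyyy and emit ISO yyyy-MM-dd without manual substring arithmetic
def changeDateSafe(word: String): String = {
  val in = DateTimeFormatter.ofPattern("dd/MM/yyyy")
  // LocalDate.toString renders the ISO-8601 form yyyy-MM-dd
  LocalDate.parse(word, in).toString
}
```

Unlike the substring version, `LocalDate.parse` throws on strings that are not valid dates rather than silently producing garbage.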
I use the following mapping:
scala> df.map(x => (x(1).toString,
x(1).toString.substring(6,10)+"-"+x(1).toString.substring(3,5)+"-"+x(1).toString.substring(0,2))).take(1)
res20: Array[(String, String)] = Array((10/02/2014,2014-02-10))
Now, rather than using that long-winded substring chain, can I use some variation of that UDF?
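For what it's worth, the intended variation can be sketched on a plain collection (Spark aside), using the function name exactly as defined above; the sample rows here are hypothetical:

```scala
// Function as defined earlier in this message
def ChangeDate(word: String): String =
  word.substring(6, 10) + "-" + word.substring(3, 5) + "-" + word.substring(0, 2)

// Simulate df.map over column values with a plain Seq of sample dates
val rows   = Seq("10/02/2014", "01/12/2015")
val mapped = rows.map(d => (d, ChangeDate(d)))
```

The call compiles only when the name matches the definition's spelling, including case.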
This does not work:
scala> df.map(x => (x(1).toString, changeDate(x(1).toString))
| )
<console>:22: error: not found: value changeDate
df.map(x => (x(1).toString, changeDate(x(1).toString))
Any ideas from experts?
Thanks
Dr Mich Talebzadeh
LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
http://talebzadehmich.wordpress.com