Hi,
I am trying to add a column to a DataFrame that I created from a JSON file
like this:
val input =
hiveCtx.jsonFile("wasb://[email protected]/json/*").toDF().persist(StorageLevel.MEMORY_AND_DISK)
I have a function that is generating the values for the new column:
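import java.util.{Calendar, Date}

def determineDayPartID(evntStDate: String, evntStHour: String) : Int = {
  val stFormat = new java.text.SimpleDateFormat("yyMMdd")
  // characters 2..7 of the date string are read as yyMMdd
  var stDateStr:String = evntStDate.substring(2,8)
  val stDate:Date = stFormat.parse(stDateStr)
  // characters 1..2 of the hour string are the hour of day
  val stHour = evntStHour.substring(1,3).toDouble + 0.1
  // eight 3-hour buckets per day
  var bucket = Math.ceil(stHour/3.0).toInt
  val cal:Calendar = Calendar.getInstance
  cal.setTime(stDate)
  var dayOfWeek = cal.get(Calendar.DAY_OF_WEEK)
  // Sunday (1) is remapped to 8 so that Saturday (7) and Sunday both shift by 8
  if (dayOfWeek == 1) dayOfWeek = 8
  if (dayOfWeek > 6) bucket = bucket + 8
  return bucket
}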
When I try:
input.withColumn("DayPartID", callUDF(determineDayPartID, IntegerType,
col("StartDate"), col("EventStartHour")))
I am getting the error:
missing arguments for
method determineDayPartID in object rating; follow this method with `_' if you
want to treat it as a partially applied function
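For reference, here is a minimal plain-Scala sketch (no Spark involved) of what the compiler hint seems to be pointing at: a method defined with def is not itself a value, and following its name with `_` turns it into a function value that can be passed as an argument. The object name DayPartSketch, the value name dayPartFn and the sample inputs are hypothetical, and the method body is a condensed copy of the logic above.

```scala
import java.text.SimpleDateFormat
import java.util.Calendar

object DayPartSketch {
  // Condensed copy of the method from the post (same logic, no vars).
  def determineDayPartID(evntStDate: String, evntStHour: String): Int = {
    val stDate = new SimpleDateFormat("yyMMdd").parse(evntStDate.substring(2, 8))
    val stHour = evntStHour.substring(1, 3).toDouble + 0.1
    val bucket = Math.ceil(stHour / 3.0).toInt
    val cal = Calendar.getInstance
    cal.setTime(stDate)
    val dayOfWeek = cal.get(Calendar.DAY_OF_WEEK)
    // Sunday (1) and Saturday (7) are shifted into buckets 9..16.
    if (dayOfWeek == 1 || dayOfWeek == 7) bucket + 8 else bucket
  }

  // Writing the bare method name where a value is expected gives the
  // "missing arguments" error; the trailing underscore converts the
  // method into a Function2 value.
  val dayPartFn: (String, String) => Int = determineDayPartID _
}
```

With hypothetical inputs in the shape the substring calls expect, DayPartSketch.dayPartFn("20150706", "T14") can then be called like any other function value. If this reading of the hint is right, the same `determineDayPartID _` form would presumably be what goes in the first argument of callUDF, but I have not verified that against this Spark version.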
Can you please help?
Stefan Panayotov, PhD
Home: 610-355-0919
Cell: 610-517-5586
email: [email protected]