Hi all,

I have defined a UDF with multiple parameters, but I don't know how to call it from the DataFrame API:

    def ssplit2 = udf { (sentence: String, delNum: Boolean, delEn: Boolean, minTermLen: Int) =>
      val terms = HanLP.segment(sentence).asScala
      .....

Call:

    scala> val output = input.select(ssplit2($"text",true,true,2).as('words))
    <console>:40: error: type mismatch;
     found   : Boolean(true)
     required: org.apache.spark.sql.Column
    val output = input.select(ssplit2($"text",true,true,2).as('words))
                                              ^
    <console>:40: error: type mismatch;
     found   : Boolean(true)
     required: org.apache.spark.sql.Column
    val output = input.select(ssplit2($"text",true,true,2).as('words))
                                                   ^
    <console>:40: error: type mismatch;
     found   : Int(2)
     required: org.apache.spark.sql.Column
    val output = input.select(ssplit2($"text",true,true,2).as('words))
                                                        ^

    scala> val output = input.select(ssplit2($"text",$"true",$"true",$"2").as('words))
    org.apache.spark.sql.AnalysisException: cannot resolve '`true`' given input columns: [id, text];;
    'Project [UDF(text#6, 'true, 'true, '2) AS words#16]
    +- Project [_1#2 AS id#5, _2#3 AS text#6]
       +- LocalRelation [_1#2, _2#3]

I need help!!

2017-06-16
lk_spark
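
One possible way around both errors, sketched under assumptions: a function created with `udf` only accepts `Column` arguments, and `$"true"` is read as a reference to a column named `true`, so constant arguments are usually wrapped with `org.apache.spark.sql.functions.lit`. In the sketch below, the body of `ssplit2` is a hypothetical whitespace-based stand-in for the elided HanLP logic, and the local `SparkSession` and the sample row are illustrative only.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{lit, udf}

object LitUdfSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: a throwaway local session, used only for this illustration.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("lit-udf-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical stand-in for the original ssplit2: it splits on whitespace
    // instead of calling HanLP.segment, whose post-processing was elided.
    // Note: a null sentence would throw here; the sample data has none.
    val ssplit2 = udf { (sentence: String, delNum: Boolean, delEn: Boolean, minTermLen: Int) =>
      sentence.split("\\s+").toSeq
        .filter(t => !(delNum && t.forall(_.isDigit)))                  // drop all-digit terms
        .filter(t => !(delEn && t.forall(c => c.isLetter && c < 128)))  // drop ASCII-letter terms
        .filter(_.length >= minTermLen)                                 // drop short terms
    }

    // Illustrative input with the same column names as in the error message.
    val input = Seq((1, "spark 2017 数据框 a")).toDF("id", "text")

    // lit(...) turns each constant into a literal Column, which is the
    // argument type the UDF's apply method expects.
    val output = input.select(ssplit2($"text", lit(true), lit(true), lit(2)).as("words"))
    output.show(false)

    spark.stop()
  }
}
```

In the REPL session above, the same idea would presumably reduce to `input.select(ssplit2($"text", lit(true), lit(true), lit(2)).as('words))` after `import org.apache.spark.sql.functions.lit`.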