Hi,
I define a udf to mark the empty string in java like that:
public class MarkUnknown implements UDF2<String,String,String> {
@Override
public String call(String processor,String fillContent){
if(processor.trim().equals("")){
logger.info("find empty string");
return fillContent;
}
else{
return processor;
}
}
}
and register by sparkSession:
spark.udf().register("markUnknown",markUnknown,StringType);
but when I use the udf in sql : "select markUnknown(useId,'unknown') FROM
table", I got a exception:
Exception in thread "main" java.lang.ClassCastException:
org.apache.spark.sql.UDFRegistration$$anonfun$27 cannot be cast to
scala.Function2
at
org.apache.spark.sql.catalyst.expressions.ScalaUDF.<init>(ScalaUDF.scala:97)
at
org.apache.spark.sql.UDFRegistration$$anonfun$register$26.apply(UDFRegistration.scala:515)
at
org.apache.spark.sql.UDFRegistration$$anonfun$register$26.apply(UDFRegistration.scala:515)
at
org.apache.spark.sql.catalyst.analysis.SimpleFunctionRegistry.lookupFunction(FunctionRegistry.scala:91)
at
org.apache.spark.sql.catalyst.catalog.SessionCatalog.lookupFunction(SessionCatalog.scala:1165)
at
org.apache.spark.sql.hive.HiveSessionCatalog.org$apache$spark$sql$hive$HiveSessionCatalog$$super$lookupFunction(HiveSessionCatalog.scala:129)
at
org.apache.spark.sql.hive.HiveSessionCatalog$$anonfun$3.apply(HiveSessionCatalog.scala:129)
at
org.apache.spark.sql.hive.HiveSessionCatalog$$anonfun$3.apply(HiveSessionCatalog.scala:129)
at scala.util.Try$.apply(Try.scala:192)
at
org.apache.spark.sql.hive.HiveSessionCatalog.lookupFunction0(HiveSessionCatalog.scala:129)
at
org.apache.spark.sql.hive.HiveSessionCatalog.lookupFunction(HiveSessionCatalog.scala:122)
at
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$$anonfun$apply$16$$anonfun$applyOrElse$7$$anonfun$applyOrElse$53.apply(Analyzer.scala:1189)
at
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$$anonfun$apply$16$$anonfun$applyOrElse$7$$anonfun$applyOrElse$53.apply(Analyzer.scala:1189)
I replaced the String "unknown" with other column : "select
markUnknown(useId,companyId) FROM table" , still got the same exception.
so how to define the udf in java?
thanks for any reply