[jira] [Created] (SPARK-11688) UDF's doesn't work when it has a default arguments

2015-11-11 Thread M Bharat lal (JIRA)
M Bharat lal created SPARK-11688:


 Summary: UDF's doesn't work when it has a default arguments
 Key: SPARK-11688
 URL: https://issues.apache.org/jira/browse/SPARK-11688
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Reporter: M Bharat lal
Priority: Minor


Use case:

Suppose we have a function that accepts three parameters: string, subString,
and frmIndex, which has a default value of 0.

def hasSubstring(string:String, subString:String, frmIndex:Int = 0): Long = 
string.indexOf(subString, frmIndex)

The above function works as expected if I don't pass the frmIndex parameter:

scala> hasSubstring("Scala", "la")
res0: Long = 3

But when I register the above function as a UDF (registration succeeds) and
call it without passing the frmIndex parameter, I get the exception below:

scala> val df  = 
sqlContext.createDataFrame(Seq(("scala","Spark","MLlib"),("abc", "def", 
"gfh"))).toDF("c1", "c2", "c3")
df: org.apache.spark.sql.DataFrame = [c1: string, c2: string, c3: string]

scala> df.show
+-----+-----+-----+
|   c1|   c2|   c3|
+-----+-----+-----+
|scala|Spark|MLlib|
|  abc|  def|  gfh|
+-----+-----+-----+

scala> sqlContext.udf.register("hasSubstring", hasSubstring _ )
res3: org.apache.spark.sql.UserDefinedFunction = 
UserDefinedFunction(<function3>,LongType,List())

scala> val result = df.as("i0").withColumn("subStringIndex", 
callUDF("hasSubstring", $"i0.c1", lit("la")))

org.apache.spark.sql.AnalysisException: undefined function hasSubstring;
    at org.apache.spark.sql.hive.HiveFunctionRegistry$$anonfun$lookupFunction$2$$anonfun$1.apply(hiveUDFs.scala:58)
    at org.apache.spark.sql.hive.HiveFunctionRegistry$$anonfun$lookupFunction$2$$anonfun$1.apply(hiveUDFs.scala:58)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.sql.hive.HiveFunctionRegistry$$anonfun$lookupFunction$2.apply(hiveUDFs.scala:57)
    at org.apache.spark.sql.hive.HiveFunctionRegistry$$anonfun$lookupFunction$2.apply(hiveUDFs.scala:53)
    at scala.util.Try.getOrElse(Try.scala:77)
    at org.apache.spark.sql.hive.HiveFunctionRegistry.lookupFunction(hiveUDFs.scala:53)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$$anonfun$apply$10$$anonfun$applyOrElse$5$$anonfun$applyOrElse$23.apply(Analyzer.scala:490)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$$anonfun$apply$10$$anonfun$applyOrElse$5$$anonfun$applyOrElse$23.apply(Analyzer.scala:490)
    at org.apache.spark.sql.catalyst.analysis.package$.withPosition(package.scala:48)
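
A possible explanation, sketched in plain Scala without Spark: default
argument values belong to the method, not to the function value that
`hasSubstring _` produces, so the registered UDF still expects three
arguments. The object name DefaultArgSketch and the values threeArgs and
twoArgs are hypothetical names used only for illustration. This is a sketch
of one possible workaround, not a statement of how Spark resolves the call
internally.

```scala
object DefaultArgSketch {
  // Function from the report: frmIndex defaults to 0.
  def hasSubstring(string: String, subString: String, frmIndex: Int = 0): Long =
    string.indexOf(subString, frmIndex)

  // Eta-expansion gives a three-argument function value.
  // The default for frmIndex is not carried over to it.
  val threeArgs: (String, String, Int) => Long = hasSubstring _

  // Hypothetical two-argument wrapper: the default is applied here,
  // at an ordinary method call site, before anything is registered.
  val twoArgs: (String, String) => Long =
    (string, subString) => hasSubstring(string, subString)

  def main(args: Array[String]): Unit = {
    println(threeArgs("Scala", "la", 0)) // frmIndex must be passed explicitly
    println(twoArgs("Scala", "la"))      // default frmIndex = 0 used inside
  }
}
```

Assuming the same sqlContext as above, registering the two-argument wrapper
(for example under a separate name) and calling that name with two columns
may avoid the exception, since the argument count then matches.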



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-11203) UDF doesn't support charType column and lit function doesn't allow charType as argument

2015-10-20 Thread M Bharat lal (JIRA)
M Bharat lal created SPARK-11203:


 Summary: UDF doesn't support charType column and lit function 
doesn't allow charType as argument
 Key: SPARK-11203
 URL: https://issues.apache.org/jira/browse/SPARK-11203
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 1.5.0
Reporter: M Bharat lal
Priority: Minor


We have two issues:

1) We cannot add a column of Char type to a DataFrame. See the example below:

scala> val employee = 
sqlContext.createDataFrame(Seq((1,"John"))).toDF("id","name")
employee: org.apache.spark.sql.DataFrame = [id: int, name: string]

scala> employee.withColumn("grade",lit('A'))
java.lang.RuntimeException: Unsupported literal type class java.lang.Character A
    at org.apache.spark.sql.catalyst.expressions.Literal$.apply(literals.scala:49)
at org.apache.spark.sql.functions$.lit(functions.scala:89)


2) We have a function that takes a string and a char as input parameters and
returns the position of the char in the given string.

We registered the function as a UDF and called it with a character literal,
which gave the exception below. The lit function doesn't support a character
as an argument.


scala> def strPos(x:String,c:Char):Integer = {x.indexOf(c)}
strPos: (x: String, c: Char)Integer

scala> sqlContext.udf.register("strPos",strPos _)
res13: org.apache.spark.sql.UserDefinedFunction = 
UserDefinedFunction(<function2>,IntegerType,List())

scala> df.select( callUDF("strPos",$"name",lit('J')))
java.lang.RuntimeException: Unsupported literal type class java.lang.Character J
    at org.apache.spark.sql.catalyst.expressions.Literal$.apply(literals.scala:49)

Can you please add this support, or let us know if there is a workaround to
achieve this?
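
One workaround that may be worth trying until Char is supported is to pass the
character as a one-character String, since String literals are accepted by
lit. The sketch below shows the idea in plain Scala without Spark. The object
name CharArgSketch and the helper charArg are hypothetical, and strPos here is
a String-based variant of the function above, not the original.

```scala
object CharArgSketch {
  // Variant of strPos that takes the character as a one-character String.
  // Returns the index of the first occurrence, or -1 if it is absent.
  def strPos(x: String, c: String): Int = x.indexOf(c)

  // Hypothetical helper: convert a Char to the String form passed to the UDF.
  def charArg(c: Char): String = c.toString

  def main(args: Array[String]): Unit = {
    println(strPos("John", charArg('J')))
    println(strPos("John", charArg('z')))
  }
}
```

Assuming this variant is registered the same way as above, the call would
then use a String literal such as lit("J") in place of lit('J').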



