[ 
https://issues.apache.org/jira/browse/FLINK-9430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shuyi Chen updated FLINK-9430:
------------------------------
    Description: 
We want to add a SQL UDF that accesses a specific element in a JSON string 
using a JSON path. However, the JSON element can be of different types, e.g. 
Int, Float, Double, String, or Boolean. Since the return type is not part of 
the method signature, we cannot overload on it. We would therefore end up 
writing one UDF per type, e.g. GetFloatFromJSON, GetIntFromJSON, etc., which 
leads to a lot of duplication. 

One way to unify all these UDFs is to implement a single UDF that returns 
java.lang.Object and, in the SQL statement, use CAST ... AS to cast the 
returned Object to the correct type. Below is an example:

 
{code:java}
object JsonPathUDF extends ScalarFunction {
  // Returns the element at `path` as an untyped Object; the SQL query casts it.
  def eval(jsonStr: String, path: String): Object = {
    JSONParser.parse(jsonStr).read(path)
  }
}{code}
{code:java}
SELECT CAST(jsonpath(json, '$.store.book.title') AS VARCHAR(32)) AS bookTitle 
FROM table1{code}

The current Flink SQL cast implementation does not support casting from 
GenericTypeInfo<java.lang.Object> to another type. I already have a local 
branch that fixes this. Please comment if there are alternative solutions to 
the problem above.
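
To illustrate what casting the returned Object to a concrete type could mean at 
runtime, here is a minimal plain-Java sketch. The class ObjectCasts and its 
methods are hypothetical names used only for illustration; they are not part of 
Flink and do not reflect the fix in the local branch. The sketch assumes that 
numeric JSON values arrive as java.lang.Number, that strings may be parsed into 
the target type, and that NULL is passed through unchanged.

```java
/** Hypothetical helper (not a Flink API): narrows an untyped UDF result to a target type. */
final class ObjectCasts {
    private ObjectCasts() {}

    /** Roughly CAST(value AS VARCHAR): any non-null value is rendered with toString(). */
    static String toVarchar(Object value) {
        return value == null ? null : value.toString();
    }

    /** Roughly CAST(value AS INT): accepts numbers and numeric strings. */
    static Integer toInt(Object value) {
        if (value == null) {
            return null;
        }
        if (value instanceof Number) {
            return ((Number) value).intValue();
        }
        if (value instanceof String) {
            return Integer.valueOf(((String) value).trim());
        }
        throw new ClassCastException("Cannot cast " + value.getClass().getName() + " to INT");
    }

    /** Roughly CAST(value AS DOUBLE): accepts numbers and numeric strings. */
    static Double toDouble(Object value) {
        if (value == null) {
            return null;
        }
        if (value instanceof Number) {
            return ((Number) value).doubleValue();
        }
        if (value instanceof String) {
            return Double.valueOf(((String) value).trim());
        }
        throw new ClassCastException("Cannot cast " + value.getClass().getName() + " to DOUBLE");
    }

    /** Roughly CAST(value AS BOOLEAN): accepts booleans and the strings "true"/"false". */
    static Boolean toBoolean(Object value) {
        if (value == null) {
            return null;
        }
        if (value instanceof Boolean) {
            return (Boolean) value;
        }
        if (value instanceof String) {
            return Boolean.valueOf(((String) value).trim());
        }
        throw new ClassCastException("Cannot cast " + value.getClass().getName() + " to BOOLEAN");
    }
}
```

With such a conversion step behind CAST, a single Object-returning UDF could 
serve every target type, and unsupported combinations would fail with an 
explicit error instead of requiring one UDF per type.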

  was:
We want to add a SQL UDF that accesses a specific element in a JSON string 
using a JSON path. However, the JSON element can be of different types, e.g. 
Int, Float, Double, String, or Boolean. Since the return type is not part of 
the method signature, we cannot overload on it. We would therefore end up 
writing one UDF per type, e.g. GetFloatFromJSON, GetIntFromJSON, etc., which 
leads to a lot of duplication. 

One way to unify all these UDFs is to implement a single UDF that returns 
java.lang.Object and, in the SQL statement, use CAST ... AS to cast the 
returned Object to the correct type. Below is an example:

 
{code:java}
object JsonPathUDF extends ScalarFunction {
  // Returns the element at `path` as an untyped Object; the SQL query casts it.
  def eval(jsonStr: String, path: String): Object = {
    JSONParser.parse(jsonStr).read(path)
  }
}{code}
{code:java}
SELECT CAST(jsonpath(json, '$.store.book.title') AS VARCHAR(32)) AS bookTitle 
FROM table1{code}

I already have a local branch working. Please comment if there are 
alternatives.


> Support Casting of Object to Primitive types for Flink SQL UDF
> --------------------------------------------------------------
>
>                 Key: FLINK-9430
>                 URL: https://issues.apache.org/jira/browse/FLINK-9430
>             Project: Flink
>          Issue Type: New Feature
>            Components: Table API & SQL
>            Reporter: Shuyi Chen
>            Assignee: Shuyi Chen
>            Priority: Major



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
