Maxim Gekk created SPARK-24642:
----------------------------------

             Summary: Add a function which infers schema from a JSON column
                 Key: SPARK-24642
                 URL: https://issues.apache.org/jira/browse/SPARK-24642
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 2.3.1
            Reporter: Maxim Gekk


We need to add a new aggregate function, *infer_schema()*. The function should 
infer a schema from a set of JSON strings. The result of the function is a 
schema in DDL format (or JSON format).

One use case is passing the output of *infer_schema()* to *from_json()*. 
Currently, the from_json() function requires a schema as a mandatory argument. 
In Scala/Python it is possible to infer the schema programmatically and pass it 
as the second argument, but in SQL it is not: a user has to pass the schema as 
a string literal. The new function should make this possible in SQL, as in the 
following example:

{code:sql}
select from_json(json_col, infer_schema(json_col))
from json_table;
{code}
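
To make the intended aggregate semantics concrete, here is a rough sketch in 
plain Python (standard library only, no Spark) of how per-record types could be 
merged into a single DDL string. The names infer_ddl_schema, infer_type and 
merge_types are hypothetical, and the merge rules are simplified assumptions 
(integers widen to DOUBLE, any other conflict falls back to STRING). This is 
not Spark's actual JSON schema inference.

```python
import json


def merge_types(a, b):
    """Merge two inferred type names; None means 'no information yet'."""
    if a is None:
        return b
    if b is None or a == b:
        return a
    if {a, b} == {"BIGINT", "DOUBLE"}:
        return "DOUBLE"  # assumption: integers widen to doubles
    return "STRING"      # assumption: any other conflict falls back to STRING


def infer_type(value):
    """Map a parsed JSON value to a DDL-style type name (simplified)."""
    if value is None:
        return None  # JSON null carries no type information
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "BOOLEAN"
    if isinstance(value, int):
        return "BIGINT"
    if isinstance(value, float):
        return "DOUBLE"
    if isinstance(value, str):
        return "STRING"
    if isinstance(value, list):
        elem = None
        for item in value:
            elem = merge_types(elem, infer_type(item))
        return "ARRAY<{}>".format(elem or "STRING")
    # Remaining case: a nested JSON object.
    fields = ", ".join(
        "{}: {}".format(k, infer_type(v) or "STRING") for k, v in value.items()
    )
    return "STRUCT<{}>".format(fields)


def infer_ddl_schema(json_strings):
    """Aggregate over all JSON strings and return one DDL schema string."""
    fields = {}  # field name -> merged type, kept in first-seen order
    for s in json_strings:
        record = json.loads(s)
        if not isinstance(record, dict):
            continue  # this sketch only handles top-level JSON objects
        for name, value in record.items():
            fields[name] = merge_types(fields.get(name), infer_type(value))
    return ", ".join("{} {}".format(n, t or "STRING") for n, t in fields.items())


rows = ['{"a": 1, "b": "x"}', '{"a": 2.5, "c": [1, 2]}', '{"b": null}']
print(infer_ddl_schema(rows))  # prints: a DOUBLE, b STRING, c ARRAY<BIGINT>
```

The point of the sketch is that the function must see all rows before it can 
produce a result, which is why it is proposed as an aggregate.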



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
