Maxim Gekk created SPARK-24643:
----------------------------------

             Summary: from_json should accept an aggregate function as schema
                 Key: SPARK-24643
                 URL: https://issues.apache.org/jira/browse/SPARK-24643
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 2.3.1
            Reporter: Maxim Gekk


Currently, the *from_json()* function accepts only a string literal as its schema argument:
 - The schema argument is checked inside JsonToStructs:
[https://github.com/apache/spark/blob/b8f27ae3b34134a01998b77db4b7935e7f82a4fe/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala#L530]

 - Only a string literal is accepted:
[https://github.com/apache/spark/blob/b8f27ae3b34134a01998b77db4b7935e7f82a4fe/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala#L749-L752]
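
To illustrate the restriction described above, here is a minimal sketch. The view name *schema_demo* is hypothetical and exists only for this illustration, and the second query is expected to be rejected during analysis on 2.3.1 (the exact error text is not reproduced here):
{code:sql}
-- Accepted today: the schema is a string literal in DDL form.
select from_json('{"a":1}', 'a INT');

-- Hypothetical view, used only for this illustration.
create temporary view schema_demo(schema) as select * from values
  ('a INT');

-- Expected to fail analysis on 2.3.1: the schema argument is a
-- column reference, not a string literal.
select from_json('{"a":1}', schema) from schema_demo;
{code}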

JsonToStructs should be modified to accept the results of aggregate functions such as
*infer_schema* (see SPARK-24642). It should be possible to write SQL like:
{code:sql}
select from_json(json_col, infer_schema(json_col)) from json_table
{code}
Here is a test case that uses an existing aggregate function, *first()*:
{code:sql}
create temporary view schemas(schema) as select * from values
  ('struct<a:int>'),
  ('map<string,int>');

select from_json('{"a":1}', first(schema)) from schemas;
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
