[GitHub] spark pull request #21686: [SPARK-24709][SQL] schema_of_json() - schema infe...

HyukjinKwon Tue, 03 Jul 2018 08:44:18 -0700

Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21686#discussion_r199856590
  
    --- Diff: python/pyspark/sql/functions.py ---
    @@ -2235,6 +2240,28 @@ def to_json(col, options={}):
         return Column(jc)
     
     
    +@ignore_unicode_prefix
    +@since(2.4)
    +def schema_of_json(col):
    +    """
    +    Parses a column containing a JSON string and infers its schema in DDL format.
    +
    +    :param col: string column in json format
    +
    +    >>> from pyspark.sql.types import *
    +    >>> data = [(1, '''{"a": 1}''')]
    +    >>> df = spark.createDataFrame(data, ("key", "value"))
    +    >>> df.select(schema_of_json(df.value).alias("json")).collect()
    +    [Row(json=u'struct<a:bigint>')]
    +    >>> df.select(schema_of_json(lit('''{"a": 0}''')).alias("json")).collect()
    --- End diff --
    
    Minor nit: `'''{"a": 0}'''` -> `'{"a": 0}'` (triple quotes aren't needed for a single-line literal).
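
    To illustrate why the nit is purely cosmetic, here is a minimal plain-Python sketch (no Spark session needed). The variable names are made up for illustration; the point is only that both spellings produce the same string, so the doctest would behave the same either way.

```python
import json

# Both literals denote the same str object value; triple quotes only matter
# when the text spans several lines or mixes both quote characters.
triple_quoted = '''{"a": 0}'''
single_quoted = '{"a": 0}'

# Illustrative names, not part of the pull request.
same_literal = triple_quoted == single_quoted
parsed = json.loads(single_quoted)

print(same_literal)
print(parsed)
```

    Since the two literals compare equal, the shorter form should be a drop-in replacement in the doctest.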


