[GitHub] spark pull request #22379: [SPARK-25393][SQL] Adding new function from_csv()

HyukjinKwon Sun, 14 Oct 2018 02:30:01 -0700

Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22379#discussion_r224985980
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ---
    @@ -3854,6 +3854,38 @@ object functions {
       @scala.annotation.varargs
       def map_concat(cols: Column*): Column = withExpr { MapConcat(cols.map(_.expr)) }
     
    +  /**
    +   * Parses a column containing a CSV string into a `StructType` with the specified schema.
    +   * Returns `null`, in the case of an unparseable string.
    +   *
    +   * @param e a string column containing CSV data.
    +   * @param schema the schema to use when parsing the CSV string
    +   * @param options options to control how the CSV is parsed. accepts the same options and the
    +   *                CSV data source.
    +   *
    +   * @group collection_funcs
    +   * @since 3.0.0
    +   */
    +  def from_csv(e: Column, schema: StructType, options: Map[String, String]): Column = withExpr {
    +    CsvToStructs(schema, options, e.expr)
    +  }
    +
    +  /**
    +   * (Java-specific) Parses a column containing a CSV string into a `StructType`
    +   * with the specified schema. Returns `null`, in the case of an unparseable string.
    +   *
    +   * @param e a string column containing CSV data.
    +   * @param schema the schema to use when parsing the CSV string
    +   * @param options options to control how the CSV is parsed. accepts the same options and the
    +   *                CSV data source.
    +   *
    +   * @group collection_funcs
    +   * @since 3.0.0
    +   */
    +  def from_csv(e: Column, schema: String, options: java.util.Map[String, String]): Column = {
    --- End diff --
    
    Eh, I wasn't following either. Is the problem related to the function's parameters? We can just define `characterOrColumn` and use it. We can do something like:
    
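    ```
    if (is.character(schema)) {
      schema <- lit(schema)   # wrap a character string in a literal Column
    } else if (is(schema, "Column")) {
      schema <- schema        # already a Column; use it as is
    } else {
      stop("schema should be a Column or a character string")
    }
    ```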
    
    as you did on the Python side.
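
For comparison, the same dispatch can be sketched in Python. This is only a minimal illustration of the idea, not the code in the PR: `Column` and `lit` here are hypothetical stand-ins so the snippet runs without Spark, and `normalize_schema` is a made-up helper name.

```python
class Column:
    """Hypothetical stand-in for a Spark Column (not the real pyspark class)."""

    def __init__(self, expr):
        self.expr = expr


def lit(value):
    """Hypothetical stand-in for lit(): wrap a literal value in a Column."""
    return Column(("literal", value))


def normalize_schema(schema):
    """Accept a schema given as a string or a Column; always return a Column."""
    if isinstance(schema, str):
        # A plain string: wrap it in a literal Column.
        return lit(schema)
    if isinstance(schema, Column):
        # Already a Column: pass it through unchanged.
        return schema
    raise TypeError("schema should be a Column or a string")
```

The point of the sketch is that the type check happens once at the top of the function, so the rest of the body only ever sees a column.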


