cloud-fan commented on code in PR #42462:
URL: https://github.com/apache/spark/pull/42462#discussion_r1300806528


##########
connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/functions.scala:
##########
@@ -7227,6 +7227,112 @@ object functions {
    */
   def to_csv(e: Column): Column = to_csv(e, Collections.emptyMap())
 
+  // scalastyle:off line.size.limit
+  /**
+   * Parses a column containing a XML string into the data type corresponding to the specified
+   * schema. Returns `null`, in the case of an unparseable string.
+   *
+   * @param e
+   *   a string column containing XML data.
+   * @param schema
+   *   the schema to use when parsing the XML string
+   * @param options
+   *   options to control how the XML is parsed. accepts the same options and the XML data source.
+   *   See <a href=
+   *   "https://spark.apache.org/docs/latest/sql-data-sources-xml.html#data-source-option"> Data
+   *   Source Option</a> in the version you use.
+   * @group collection_funcs
+   *
+   * @since 4.0.0
+   */
+  // scalastyle:on line.size.limit
+  def from_xml(e: Column, schema: StructType, options: Map[String, String]): Column =

Review Comment:
   The schema parameter can be a `StructType` or a `Column`, and the options parameter
   can be a Scala map, a Java map, or omitted. This means we need 6 overloads of
   `from_xml`.
   
   Is it really worth it? I know we did the same thing for `from_json`, but
   this is really convoluted.
   
   How about something like
   ```
   TextParsingFunction.newBuilder()
     .withSchema(...) // It has multiple overloads
     .withOptions(...) // It has multiple overloads
     .xml() // returns a Column
   ```
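   
   A minimal, self-contained sketch of how such a builder might look. Everything here is hypothetical: `TextParsingFunction` is not an existing Spark API, `Column` and `StructType` are small stand-ins for the real Spark classes so the snippet compiles on its own, and passing the input column to `newBuilder` is an assumption the outline above does not specify. The point is only that 2 `withSchema` overloads plus 2 `withOptions` overloads cover the same 6 combinations.

   ```scala
   // Stand-ins for Spark's Column and StructType (assumptions for illustration only).
   final case class Column(expr: String)
   final case class StructType(fieldNames: Seq[String])

   // Hypothetical immutable builder; each with* call returns a new instance.
   final class TextParsingFunction private (
       input: Column,
       schema: Option[Either[StructType, Column]],
       options: Map[String, String]) {

     // Two schema overloads: a static StructType or a schema given as a Column.
     def withSchema(s: StructType): TextParsingFunction =
       new TextParsingFunction(input, Some(Left(s)), options)

     def withSchema(s: Column): TextParsingFunction =
       new TextParsingFunction(input, Some(Right(s)), options)

     // Two option overloads: Scala map or Java map. Omitting the call means no options.
     def withOptions(o: Map[String, String]): TextParsingFunction =
       new TextParsingFunction(input, schema, o)

     def withOptions(o: java.util.Map[String, String]): TextParsingFunction = {
       var converted = Map.empty[String, String]
       val it = o.entrySet().iterator()
       while (it.hasNext) {
         val e = it.next()
         converted += (e.getKey -> e.getValue)
       }
       new TextParsingFunction(input, schema, converted)
     }

     // Terminal call. Here it only renders a description of the expression;
     // a real implementation would build the actual from_xml Column.
     def xml(): Column = {
       val s = schema.getOrElse(throw new IllegalStateException("schema is required"))
       val schemaText = s match {
         case Left(st) => st.fieldNames.mkString("struct<", ",", ">")
         case Right(c) => c.expr
       }
       val opts = options.toSeq.sorted.map { case (k, v) => s"$k=$v" }.mkString(",")
       Column(s"from_xml(${input.expr}, $schemaText, {$opts})")
     }
   }

   object TextParsingFunction {
     def newBuilder(input: Column): TextParsingFunction =
       new TextParsingFunction(input, None, Map.empty)
   }

   // Usage: a StructType schema with Scala-map options.
   val parsed: Column = TextParsingFunction
     .newBuilder(Column("value"))
     .withSchema(StructType(Seq("id", "name")))
     .withOptions(Map("rowTag" -> "book"))
     .xml()
   ```

   With this shape, adding another format would mean adding one more terminal method next to `xml()` rather than another set of overloads.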
   
   Anyway, it's unrelated to this PR. We can do it later. cc @HyukjinKwon 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

