shrirangmhalgi commented on PR #56310: URL: https://github.com/apache/spark/pull/56310#issuecomment-4620335879
The existing `levenshtein()` function is exposed in `functions.scala` (the Scala/Java DataFrame API) and in `pyspark.sql.functions` (the PySpark API), in addition to its SQL registration. This PR registers `jaro_winkler_similarity` only in `FunctionRegistry`, so it is accessible via SQL but not via `df.select(jaro_winkler_similarity(...))` in Scala or `F.jaro_winkler_similarity(...)` in PySpark. Is the DataFrame API planned as a follow-up, or should it be included here for consistency?

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
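To illustrate the gap described above: with only the `FunctionRegistry` entry, PySpark callers would presumably have to go through a SQL expression string rather than a typed function. The following is a minimal sketch, not part of the PR. The helper names `jaro_winkler_sql` and `jaro_winkler_column` are hypothetical, and it assumes the function is registered under the name `jaro_winkler_similarity` and takes two string columns, as the comment implies.

```python
def jaro_winkler_sql(left_col: str, right_col: str) -> str:
    """Build the SQL expression text for the SQL-only function (hypothetical helper)."""

    def quote(name: str) -> str:
        # Backtick-quote the name and double any embedded backticks,
        # so the whole string is parsed as a single column name.
        return "`" + name.replace("`", "``") + "`"

    return f"jaro_winkler_similarity({quote(left_col)}, {quote(right_col)})"


def jaro_winkler_column(left_col: str, right_col: str):
    """Wrap the expression as a Column via F.expr.

    Assumption: pyspark is installed and the Spark build includes the
    function registered by this PR; otherwise analysis fails at query time.
    """
    # Imported lazily so jaro_winkler_sql stays usable without Spark.
    from pyspark.sql import functions as F

    return F.expr(jaro_winkler_sql(left_col, right_col))
```

A caller would then write something like `df.select(jaro_winkler_column("a", "b"))`, whereas `levenshtein` can already be called directly as `F.levenshtein("a", "b")`. A first-class wrapper in `functions.scala` and `pyspark.sql.functions` would remove the need for this string-based detour.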
