zero323 commented on a change in pull request #27406:
[SPARK-30681][PYSPARK][SQL] Add higher order functions API to PySpark
URL: https://github.com/apache/spark/pull/27406#discussion_r375704703
##########
File path: python/pyspark/sql/functions.py
##########
@@ -2840,6 +2840,463 @@ def from_csv(col, schema, options={}):
     return Column(jc)
+def _unresolved_named_lambda_variable(*name_parts):
+    """
+    Create `o.a.s.sql.expressions.UnresolvedNamedLambdaVariable`,
+    convert it to o.s.sql.Column and wrap in Python `Column`
+
+    :param name_parts: str
+    """
+    sc = SparkContext._active_spark_context
+    name_parts_seq = _to_seq(sc, name_parts)
+    expressions = sc._jvm.org.apache.spark.sql.catalyst.expressions
+    return Column(
+        sc._jvm.Column(
+            expressions.UnresolvedNamedLambdaVariable(name_parts_seq)
+        )
+    )
+
+
+def _get_lambda_parameters(f):
Review comment:
Sorry, but I don't see this.
- Argument type validation is not covered by the planner, and some combinations
of argument types will lead to either cryptic exceptions or ambiguous behavior.
How would you generate placeholders for, say, `*args` (or, similarly, `...`
in R)?
- Having placeholders consistent with the Scala counterpart is good. Even
getting to `x1`, `x2`, `...` cannot be done without inspecting the signature.
And even if we did only that (which still keeps most of the complexity),
calling things `_(**kwargs)` or `_(*args)` or different keyword variants would
be rather sloppy.
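
To make the two points above concrete, here is a minimal sketch of what "only inspecting the signature" might look like. The helper name `placeholder_names` and the `x1`, `x2`, ... naming scheme are assumptions for illustration only; this is not the code in the PR. It relies solely on the standard-library `inspect` module.

```python
import inspect


def placeholder_names(f):
    """Hypothetical helper (not part of the PR): map f's parameters to x1, x2, ..."""
    params = list(inspect.signature(f).parameters.values())
    supported = (
        inspect.Parameter.POSITIONAL_ONLY,
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
    )
    # Reject *args, **kwargs and keyword-only parameters: there is no
    # unambiguous way to turn them into a fixed list of placeholders.
    if not params or any(p.kind not in supported for p in params):
        raise ValueError(
            "f should take one or more positional arguments "
            "and must not use *args, **kwargs or keyword-only parameters"
        )
    # Assumed naming scheme: one placeholder per declared parameter.
    return ["x{}".format(i) for i in range(1, len(params) + 1)]
```

With this sketch, `placeholder_names(lambda a, b: a + b)` yields two placeholders, while `placeholder_names(lambda *args: args)` raises `ValueError` instead of silently guessing an arity. Even this reduced form still needs the signature, which is the complexity the comment refers to.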