HyukjinKwon commented on a change in pull request #27406:
[SPARK-30681][PYSPARK][SQL] Add higher order functions API to PySpark
URL: https://github.com/apache/spark/pull/27406#discussion_r376327514
##########
File path: python/pyspark/sql/functions.py
##########
@@ -2840,6 +2840,463 @@ def from_csv(col, schema, options={}):
return Column(jc)
+def _unresolved_named_lambda_variable(*name_parts):
+ """
+ Create `o.a.s.sql.expressions.UnresolvedNamedLambdaVariable`,
+ convert it to o.a.s.sql.Column and wrap in Python `Column`
+
+ :param name_parts: str
+ """
+ sc = SparkContext._active_spark_context
+ name_parts_seq = _to_seq(sc, name_parts)
+ expressions = sc._jvm.org.apache.spark.sql.catalyst.expressions
+ return Column(
+ sc._jvm.Column(
+ expressions.UnresolvedNamedLambdaVariable(name_parts_seq)
+ )
+ )
+
+
+def _get_lambda_parameters(f):
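
For context, a minimal sketch (not part of the PR diff above) of how a helper like `_unresolved_named_lambda_variable` could be combined with a user-supplied Python function to build a Catalyst `LambdaFunction` expression. The helper name `_create_lambda`, the fixed variable names, and the exact JVM constructor call are assumptions made for illustration only:

```python
from pyspark import SparkContext
from pyspark.sql.column import Column, _to_seq


def _create_lambda(f):
    # Hypothetical helper name; shown only to illustrate how
    # _unresolved_named_lambda_variable might be used.
    sc = SparkContext._active_spark_context
    expressions = sc._jvm.org.apache.spark.sql.catalyst.expressions

    # One unresolved named lambda variable per parameter of f.
    arg_names = ["x", "y", "z"][:f.__code__.co_argcount]
    args = [_unresolved_named_lambda_variable(name) for name in arg_names]

    # Apply the user function to the Column-wrapped variables; the result
    # is the lambda body as a Column expression.
    result = f(*args)
    if not isinstance(result, Column):
        raise ValueError("f should return Column, got {}".format(type(result)))

    jargs = _to_seq(sc, [arg._jc.expr() for arg in args])
    # Assumed Catalyst constructor: LambdaFunction(function, arguments, hidden).
    return expressions.LambdaFunction(result._jc.expr(), jargs, False)
```

A public wrapper such as `transform(col, f)` could then pass the resulting expression, together with the target column, to the corresponding JVM higher order function.
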
Review comment:
I am not particularly against this code. I just thought it would be easier to
logically separate this into two PRs: this PR only adds the higher order
functions and relies on the JVM-side checks, while the other PR adds argument
checking on the Python (or R) side. For the former, I am fully supportive and
can sign off. For the latter, I am not completely sure and would like to
collect feedback from a few more people. So .. asking to remove it is just to
keep this rolling quicker :-).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services