zero323 commented on a change in pull request #27406: 
[SPARK-30681][PYSPARK][SQL] Add higher order functions API to PySpark
URL: https://github.com/apache/spark/pull/27406#discussion_r373494544
 
 

 ##########
 File path: python/pyspark/sql/column.py
 ##########
 @@ -129,6 +129,103 @@ def _(self, other):
     return _
 
 
+def _unresolved_named_lambda_variable(*name_parts):
+    """
+    Create o.a.s.sql.expressions.UnresolvedNamedLambdaVariable and
+    convert it to o.a.s.sql.Column
+
+    :param name_parts: str
+    """
+    sc = SparkContext._active_spark_context
+    name_parts_seq = _to_seq(sc, name_parts)
+    expressions = sc._jvm.org.apache.spark.sql.catalyst.expressions
+    return Column(
+        sc._jvm.Column(
+            expressions.UnresolvedNamedLambdaVariable(name_parts_seq)
+        )
+    )
+
+
+def _get_lambda_parameters(f):
+    import inspect
+
+    signature = inspect.signature(f)
+    parameters = signature.parameters.values()
+
+    # We should exclude functions that use
+    # variable args and keyword argnames
+    # as well as keyword only args
+    supported_parmeter_types = {
+        inspect.Parameter.POSITIONAL_OR_KEYWORD,
+        inspect.Parameter.POSITIONAL_ONLY,
+    }
+
+    # Validate that
+    # function arity is between 1 and 3
+    if not (1 <= len(parameters) <= 3):
 
 Review comment:
   If you don't have a strong opinion about it @HyukjinKwon, I'd keep it as is. 
While `AnalysisException` is good enough for seasoned Spark users, I see that 
many beginners are intimidated by the JVM stack trace that comes with it, and often 
don't even try to figure out what's going on.
   
   Also, Python-side checks can prevent some naive mistakes early in cases when 
expressions are defined upfront. This is good, because the failure doesn't wait for the 
middle of the job (obviously I'll add type hints on my side, to address 
that 😁), and there is really not much added code or complexity here, since we have to 
check arguments and arity anyway.
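
   To illustrate the early-failure point, here is a minimal sketch of such a Python-side check. It is not the code from this PR: `check_lambda_signature` and `SUPPORTED_KINDS` are hypothetical names, and the error messages are made up for the example. It only assumes the rules visible in the diff above (1 to 3 parameters, positional or positional-or-keyword only) and needs neither a `SparkContext` nor the JVM.

```python
import inspect

# Hypothetical names for illustration only; not part of the PR.
SUPPORTED_KINDS = {
    inspect.Parameter.POSITIONAL_OR_KEYWORD,
    inspect.Parameter.POSITIONAL_ONLY,
}


def check_lambda_signature(f):
    """Validate a Python callable and return its parameter names.

    Raises a plain ValueError as soon as the function is inspected,
    so a bad signature is reported without any JVM stack trace.
    """
    parameters = list(inspect.signature(f).parameters.values())

    # Arity must be between 1 and 3, as in the diff above.
    if not (1 <= len(parameters) <= 3):
        raise ValueError(
            "f should take between 1 and 3 arguments, "
            "but the provided function takes {}".format(len(parameters))
        )

    # Reject *args, **kwargs and keyword-only parameters.
    if not all(p.kind in SUPPORTED_KINDS for p in parameters):
        raise ValueError(
            "f should use only positional or positional-or-keyword arguments"
        )

    return [p.name for p in parameters]


if __name__ == "__main__":
    print(check_lambda_signature(lambda x, y: x + y))  # ['x', 'y']
    try:
        check_lambda_signature(lambda *args: args)
    except ValueError as e:
        print("rejected:", e)
```

   With a check like this, an expression defined upfront with an unsupported signature is rejected with a short Python exception at definition time.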
