zero323 commented on a change in pull request #27406: 
[SPARK-30681][PYSPARK][SQL] Add higher order functions API to PySpark
URL: https://github.com/apache/spark/pull/27406#discussion_r374074657
 
 

 ##########
 File path: python/pyspark/sql/column.py
 ##########
 @@ -129,6 +129,103 @@ def _(self, other):
     return _
 
 
+def _unresolved_named_lambda_variable(*name_parts):
+    """
+    Create o.a.s.sql.catalyst.expressions.UnresolvedNamedLambdaVariable and
+    convert it to o.a.s.sql.Column
+
+    :param name_parts: str
+    """
+    sc = SparkContext._active_spark_context
+    name_parts_seq = _to_seq(sc, name_parts)
+    expressions = sc._jvm.org.apache.spark.sql.catalyst.expressions
+    return Column(
+        sc._jvm.Column(
+            expressions.UnresolvedNamedLambdaVariable(name_parts_seq)
+        )
+    )
+
+
+def _get_lambda_parameters(f):
+    import inspect
+
+    signature = inspect.signature(f)
+    parameters = signature.parameters.values()
+
+    # We should exclude functions that use
+    # variable positional args (*args) and variable keyword args (**kwargs)
+    # as well as keyword-only args
+    supported_parameter_types = {
+        inspect.Parameter.POSITIONAL_OR_KEYWORD,
+        inspect.Parameter.POSITIONAL_ONLY,
+    }
+
+    # Validate that
+    # function arity is between 1 and 3
+    if not (1 <= len(parameters) <= 3):
 
 Review comment:
   Not to reiterate 
https://github.com/apache/spark/pull/27433#discussion_r374069388, but a captured 
JVM exception is simply not as good as a guest language exception. It cannot be 
inspected, it cannot be debugged, and it doesn't directly connect to user 
objects. It doesn't really matter how nicely it is wrapped and processed 
(please remember that these are interpreted languages, and it is not unusual for 
a user to drop into an interactive debugger in the middle of a session).
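
   To illustrate the point, here is a minimal sketch of a Python-side check that fails before anything reaches the JVM. It mirrors the idea of the hunk above but is not the PR's code: `check_lambda_arity`, its defaults, and the error messages are hypothetical names chosen for this example.

```python
import inspect


def check_lambda_arity(f, min_args=1, max_args=3):
    """Hypothetical helper: validate a user callable on the Python side."""
    parameters = inspect.signature(f).parameters.values()

    # Only plain positional parameters are accepted; *args, **kwargs
    # and keyword-only parameters are rejected.
    supported = {
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.POSITIONAL_ONLY,
    }
    if not all(p.kind in supported for p in parameters):
        raise ValueError(
            "f should use only positional or positional-or-keyword parameters"
        )

    if not (min_args <= len(parameters) <= max_args):
        raise ValueError(
            "f should take between {} and {} arguments, but takes {}".format(
                min_args, max_args, len(parameters)
            )
        )
    return len(parameters)


try:
    check_lambda_arity(lambda *args: args)
except ValueError as e:
    # A native Python exception: it can be caught, inspected,
    # or examined in a post-mortem debugger session.
    error = e
```

   Because the failure is an ordinary `ValueError`, the traceback points at the user's own callable rather than at a wrapped JVM stack trace.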
   
   If the consensus is that this is not the way to go, I'll of course drop the 
checks. But please forgive me if I don't do it without putting up a fight. 
   
   > Also, to be complete, we should also check the input and output after f 
execution in Python side then..
   
   I am not sure if I get the point. Input is generated so we know it is valid, 
and output type is validated. That's the best we can do until placeholders are 
resolved, which is out of Python's scope.
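
   A rough sketch of what "input is generated, output type is validated" could look like. This assumes a stand-in `Column` class instead of the real `pyspark.sql.Column`, and `apply_to_placeholders` plus the placeholder names `x`, `y`, `z` are hypothetical, used only for illustration.

```python
import inspect


class Column:
    """Hypothetical stand-in for pyspark.sql.Column (illustration only)."""

    def __init__(self, name):
        self.name = name


def apply_to_placeholders(f):
    """Call f with generated placeholder columns and check what it returns."""
    n = len(inspect.signature(f).parameters)

    # The arguments are created here, so they are valid by construction.
    args = [Column(name) for name in ["x", "y", "z"][:n]]

    result = f(*args)

    # The only thing left to verify on the Python side is the return type.
    if not isinstance(result, Column):
        raise ValueError(
            "f should return Column, got {}".format(type(result).__name__)
        )
    return result
```

   Anything beyond the return type depends on how the placeholders are resolved, which happens outside Python.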
