cloud-fan commented on code in PR #45377:
URL: https://github.com/apache/spark/pull/45377#discussion_r1565491650
##########
sql/core/src/main/scala/org/apache/spark/sql/Column.scala:
##########
@@ -171,6 +171,29 @@ class Column(val expr: Expression) extends Logging {
Column.fn(name, this, lit(other))
}
+ /**
+ * A version of the `fn` method specifically designed for binary operations in PySpark
+ * that require logging information.
+ * This method is used when the operation involves another Column.
+ *
+ * @param name The name of the operation to be performed.
+ * @param other The value to be used in the operation, which will be converted to a
+ *              Column if not already one.
+ * @param pysparkFragment A string representing the 'fragment' of the PySpark error context,
+ *                        typically indicating the name of the PySpark function.
+ * @param pysparkCallSite A string representing the 'callSite' of the PySpark error context,
+ *                        providing the exact location within the PySpark code where the
+ *                        operation originated.
+ * @return A Column resulting from the operation.
+ */
+ private def fn(
Review Comment:
@HyukjinKwon This probably can't cover all the cases, and we may need to add
more overloads for certain functions that require non-expression parameters,
but there shouldn't be many of those.
I think it's better than using a ThreadLocal, which can be quite fragile for
passing values between Python and the JVM.
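
To illustrate the design choice being discussed (error context passed as explicit
parameters to an overload, rather than stashed in a ThreadLocal), here is a minimal,
self-contained sketch. It is not Spark's actual Column class: the Expr type, its
origin field, the plus methods, and the toString are made up for illustration; only
the parameter names (name, other, pysparkFragment, pysparkCallSite) come from the
doc comment quoted above.

object PySparkContextSketch {

  // Hypothetical stand-in for an expression that can carry an origin/context.
  final case class Expr(sql: String, origin: Option[String] = None)

  final class Column(val expr: Expr) {

    // Existing-style binary helper: just the function name and the two operands.
    private def fn(name: String, other: Column): Column =
      new Column(Expr(s"$name(${expr.sql}, ${other.expr.sql})"))

    // Sketch of the overload under review: the PySpark error context travels
    // with the call itself, so no thread-local hand-off between Python and
    // the JVM is needed.
    private def fn(
        name: String,
        other: Column,
        pysparkFragment: String,
        pysparkCallSite: String): Column = {
      val origin = s"$pysparkFragment @ $pysparkCallSite"
      new Column(Expr(s"$name(${expr.sql}, ${other.expr.sql})", Some(origin)))
    }

    def plus(other: Column): Column = fn("plus", other)

    def plus(other: Column, fragment: String, callSite: String): Column =
      fn("plus", other, fragment, callSite)

    override def toString: String =
      expr.origin.fold(expr.sql)(o => s"${expr.sql} /* $o */")
  }

  def main(args: Array[String]): Unit = {
    val a = new Column(Expr("a"))
    val b = new Column(Expr("b"))
    println(a.plus(b))                             // plus(a, b)
    println(a.plus(b, "__add__", "example.py:10")) // plus(a, b) /* __add__ @ example.py:10 */
  }
}

Because the fragment and call site are ordinary arguments, the context is tied to
this exact call rather than to whichever thread happens to execute it, which is the
fragility concern with the ThreadLocal approach.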