itholic commented on code in PR #49054:
URL: https://github.com/apache/spark/pull/49054#discussion_r1868865917


##########
python/pyspark/sql/classic/column.py:
##########
@@ -175,12 +174,16 @@ def _reverse_op(
     return Column(jc)
 
 
-@with_origin_to_class
 class Column(ParentColumn):
     def __new__(
         cls,
         jc: "JavaObject",
     ) -> "Column":
+        from pyspark.errors.utils import with_origin_to_class
+
+        if not hasattr(cls, "_with_origin_applied"):
+            cls = with_origin_to_class(cls)

Review Comment:
   I agree that the `hasattr` check adds some overhead, but I think 
`with_origin_to_class` is still called only once (since it is applied to the 
class, not to each instance), so the overhead should not be significant.
   
   For example, I created 5 different Columns with a print statement added 
to `__new__` and to `with_origin_to_class`, and the output is identical 
Before and After:
   
   **Before**
   ```python
   >>> F.col("name1")
   Call with_origin_to_class
   Call Column.__new__
   Column<'name1'>
   >>> F.col("name2")
   Call Column.__new__
   Column<'name2'>
   >>> F.col("name3")
   Call Column.__new__
   Column<'name3'>
   >>> F.col("name4")
   Call Column.__new__
   Column<'name4'>
   >>> F.col("name5")
   Call Column.__new__
   Column<'name5'>
   ```
   
   **After**
   ```python
   >>> F.col("name1")
   Call with_origin_to_class
   Call Column.__new__
   Column<'name1'>
   >>> F.col("name2")
   Call Column.__new__
   Column<'name2'>
   >>> F.col("name3")
   Call Column.__new__
   Column<'name3'>
   >>> F.col("name4")
   Call Column.__new__
   Column<'name4'>
   >>> F.col("name5")
   Call Column.__new__
   Column<'name5'>
   ```
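   
   The same once-per-class behavior can be reproduced outside PySpark with a 
minimal sketch. Note that `with_origin_stub` below is a hypothetical stand-in, 
not the real `with_origin_to_class`; I am assuming the real decorator patches 
the class in place and leaves a `_with_origin_applied` marker on it, which is 
what the `hasattr` guard in the diff relies on:

```python
decorator_calls = 0  # how many times the stand-in decorator actually runs


def with_origin_stub(cls):
    """Hypothetical stand-in for with_origin_to_class (assumption, not the real API).

    It only sets the marker attribute that the guard in __new__ looks for.
    """
    global decorator_calls
    decorator_calls += 1
    cls._with_origin_applied = True
    return cls


class Column:
    def __new__(cls, name):
        # The hasattr check runs on every instantiation, but the decorator
        # itself only runs while the marker is still missing on the class.
        if not hasattr(cls, "_with_origin_applied"):
            cls = with_origin_stub(cls)
        return super().__new__(cls)

    def __init__(self, name):
        self.name = name


# Create 5 columns, mirroring the manual test above.
columns = [Column(f"name{i}") for i in range(1, 6)]
```

   After the loop, `decorator_calls` is 1 even though `__new__` ran 5 times, so 
the only recurring cost is the attribute lookup done by `hasattr`.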
   
   Please let me know if I missed something, or if we still need to find 
another way to avoid the overhead of the `hasattr` check.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
