itholic commented on code in PR #49054:
URL: https://github.com/apache/spark/pull/49054#discussion_r1868865917
##########
python/pyspark/sql/classic/column.py:
##########
@@ -175,12 +174,16 @@ def _reverse_op(
return Column(jc)
-@with_origin_to_class
class Column(ParentColumn):
def __new__(
cls,
jc: "JavaObject",
) -> "Column":
+ from pyspark.errors.utils import with_origin_to_class
+
+ if not hasattr(cls, "_with_origin_applied"):
+ cls = with_origin_to_class(cls)
Review Comment:
I agree that the `hasattr` check adds some overhead on every `__new__` call,
but `with_origin_to_class` itself is still called only once (it is applied to
the class, not to each instance), so I think the overhead is not significant?
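One rough way to put a number on the per-call cost of the guard is a `timeit` comparison. This is only a sketch with plain Python classes: `Plain` and `Guarded` are hypothetical stand-ins, not the real `Column`, and the flag is set by hand instead of by `with_origin_to_class`.

```python
import timeit


class Plain:
    """Baseline: __new__ does no extra work."""

    def __new__(cls):
        return object.__new__(cls)


class Guarded:
    """__new__ evaluates the same kind of hasattr guard as in the diff."""

    # Assumption: pretend the decorator already ran and set the flag.
    _with_origin_applied = True

    def __new__(cls):
        if not hasattr(cls, "_with_origin_applied"):
            raise AssertionError("not reached in this sketch")
        return object.__new__(cls)


N = 200_000
plain_s = timeit.timeit(Plain, number=N)      # N constructions without the guard
guarded_s = timeit.timeit(Guarded, number=N)  # N constructions with the guard
per_call_ns = (guarded_s - plain_s) / N * 1e9

print(f"plain:   {plain_s:.4f}s for {N} calls")
print(f"guarded: {guarded_s:.4f}s for {N} calls")
print(f"approx. extra cost per call: {per_call_ns:.1f} ns")
```

The numbers depend on the machine and Python version, and this does not include the cost of the `import` statement that the diff also places inside `__new__`.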
For example, I created 5 different Columns with a printout added for each call
to `__new__` and `with_origin_to_class`, and the output is the same
Before and After the change:
**Before**
```python
>>> F.col("name1")
Call with_origin_to_class
Call Column.__new__
Column<'name1'>
>>> F.col("name2")
Call Column.__new__
Column<'name2'>
>>> F.col("name3")
Call Column.__new__
Column<'name3'>
>>> F.col("name4")
Call Column.__new__
Column<'name4'>
>>> F.col("name5")
Call Column.__new__
Column<'name5'>
```
**After**
```python
>>> F.col("name1")
Call with_origin_to_class
Call Column.__new__
Column<'name1'>
>>> F.col("name2")
Call Column.__new__
Column<'name2'>
>>> F.col("name3")
Call Column.__new__
Column<'name3'>
>>> F.col("name4")
Call Column.__new__
Column<'name4'>
>>> F.col("name5")
Call Column.__new__
Column<'name5'>
```
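The same once-per-class behavior can be reproduced outside Spark. Below is a minimal sketch, assuming a hypothetical `mark_once` decorator as a stand-in for `with_origin_to_class` (it only sets the flag and records the call, it is not the real implementation) and a toy `Column` class:

```python
calls = []  # records every time the hypothetical decorator runs


def mark_once(cls):
    """Hypothetical stand-in for with_origin_to_class: flags the class in place."""
    calls.append(cls.__name__)
    cls._with_origin_applied = True
    return cls


class Column:
    """Toy class, not pyspark's Column."""

    def __new__(cls, name):
        # The guard runs on every construction; the decorator only on the first.
        if not hasattr(cls, "_with_origin_applied"):
            cls = mark_once(cls)
        self = object.__new__(cls)
        self.name = name
        return self


cols = [Column(f"name{i}") for i in range(1, 6)]
print(len(cols), "columns created,", len(calls), "decorator call(s)")
```

Because the flag is stored on the class, every construction after the first takes the `hasattr` fast path, which matches the single `Call with_origin_to_class` line in the output above.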
Please let me know if I missed something, or if we still need to find
another way to avoid the overhead of the `hasattr` check.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]