samredai commented on PR #4466:
URL: https://github.com/apache/iceberg/pull/4466#issuecomment-1107157069

   The ValueError getting raised comes from the fact that `__init__` is always 
called after `__new__` when you call the standard class constructor. The first 
argument to `__init__` (meaning `self`) is the value returned from `__new__`.
   
   However for the `if rest` case that's in `__new__` here, a fully constructed 
instance is returned by `return reduce(And, (left, right, *rest))` and so 
`__init__` doesn't need to do anything, which is why it was only doing 
something `if not rest`. So whether or not `rest` is provided is being used as 
a sort of signal to `__init__` on whether or not the class has already been 
fully initialized in `__new__`. Adding the `ValueError` disrupts the "already 
initialized path" by causing `__init__` to panic.
   
   Two other alternatives I can think of is:
   1. Move the initialization entirely into `__new__` and get rid of `__init__` 
completely:
   ```py
   class And(BooleanExpression):
       """AND operation expression - logical conjunction"""
   
       def __new__(cls, left: BooleanExpression, right: BooleanExpression, 
*rest: BooleanExpression):
           if rest:
               return reduce(And, (left, right, *rest))
           if left is AlwaysFalse() or right is AlwaysFalse():
               return AlwaysFalse()
           elif left is AlwaysTrue():
               return right
           elif right is AlwaysTrue():
               return left
           new_expression = super().__new__(cls)
           new_expression.args = (left, right)
           vars(new_expression)["left"] = left
           vars(new_expression)["right"] = right
           return new_expression
   ```
   2. Move the reduction logic out into a standalone function and simplify the 
`And` class to only accept a `left` and `right` arg:
   ```py
   def and_expression(left: BooleanExpression, right: BooleanExpression, *rest: 
BooleanExpression) -> Union[And, AlwaysTrue, AlwaysFalse]:
       if rest:
           return reduce(And, (left, right, *rest))
       if left is AlwaysFalse() or right is AlwaysFalse():
           return AlwaysFalse()
       elif left is AlwaysTrue():
           return right
       elif right is AlwaysTrue():
           return left
       return And(left, right)
   ```
   A note on option 1, someone can still override `__new__` in a subclass as 
@rdblue pointed out but I think the expectation holds that they should call 
`super().__new__(cls, *args, **kwargs)`, just as it's expected to call 
`super().__init__(*args, **kwargs)` if you override `__init__` in a subclass.
   
   A note about option 2 is that it may make checking if something is an 
instance of `And` less intuitive, in other words:
   ```py
   from iceberg.expressions.base import and_expression, And
   
   expr = and_expression(left, right, rest)
   isinstance(expr, And)
   ```
   as opposed to:
   ```py
   from iceberg.expressions.base import And
   
   expr = And(left, right, rest)
   isinstance(expr, And)
   ```
   
   (sorry for the super long comment...)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to