samredai commented on a change in pull request #3399: URL: https://github.com/apache/iceberg/pull/3399#discussion_r740320039
##########
File path: python/src/iceberg/expressions.py
##########
@@ -0,0 +1,106 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from enum import Enum
+
+from iceberg.exceptions import OperationError
+
+
+class Operation(Enum):
+    """Operations to be used as components in expressions
+
+    Various operations can be negated or reversed. Negating an
+    operation is as simple as using the built-in subtraction operator:
+
+    >>> print(-Operation.TRUE)
+    Operation.FALSE
+    >>> print(-Operation.IS_NULL)
+    Operation.NOT_NULL
+
+    Reversing an operation can be done using the built-in reversed() method:
+    >>> print(reversed(Operation.LT))
+    Operation.GT
+    >>> print(reversed(Operation.EQ))
+    Operation.NOT_EQ
+
+    Raises:
+        OperationError: This is raised when attempting to negate or reverse
+            an operation that cannot be negated or reversed.
+    """
+    TRUE = "TRUE"

Review comment:
It looks like Enum comparisons are always done by identity ([docs](https://docs.python.org/3.11/howto/enum.html#comparisons)) and the values don't matter in that regard. I verified this by making an enum where the values were very long arrays and there was no impact to the performance of comparing the enums (comparing the long arrays directly takes ages).
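For anyone who wants to reproduce the identity behaviour, here is a minimal sketch of that kind of check. It is an illustration rather than the exact script used: the `Big` enum and its member names are hypothetical, and tuples stand in for the long arrays so that the values are hashable.

```py
from enum import Enum


class Big(Enum):
    # Hypothetical members with long, distinct values.
    # (Two members with equal values would make the second an alias of the first.)
    A = tuple(range(100_000))
    B = tuple(range(100_000)) + (0,)


# Each member is a singleton, so identity and equality agree.
assert Big.A is Big["A"]
assert Big.A == Big.A
assert Big.A != Big.B

# A member never compares equal to its own raw value,
# so the size of the value plays no part in member comparisons.
assert Big.A != Big.A.value
```

Since `Enum` does not define its own `__eq__`, comparing two members falls back to the default identity check, which is why the length of the values should not matter.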
However I did find these alternatives in the docs that remove the requirement for hard-coding literals completely:

## Auto (option 1)
```py
from enum import Enum, auto

class Operation(Enum):
    TRUE = auto()  # <- resolves to 1
    FALSE = auto()  # <- resolves to 2
```

## Auto as name (option 2)
```py
from enum import Enum, auto

class AutoName(Enum):
    def _generate_next_value_(name, start, count, last_values):
        return name

class Operation(AutoName):
    TRUE = auto()  # <- resolves to "TRUE"
    FALSE = auto()  # <- resolves to "FALSE"
```

Option 1 looks very clean but for the few extra lines in option 2, the Enum has readable values that are actually meaningful. Thoughts on which one we should use (if either)?
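To make the trade-off concrete, here is a small side-by-side sketch of the two options. The class names `NumberedOperation` and `NamedOperation` are hypothetical and only exist so both variants can live in one snippet; they are not names from the PR.

```py
from enum import Enum, auto


class NumberedOperation(Enum):
    # Option 1: auto() assigns increasing integers starting at 1.
    TRUE = auto()
    FALSE = auto()


class AutoName(Enum):
    # Option 2: make auto() return the member's own name as its value.
    def _generate_next_value_(name, start, count, last_values):
        return name


class NamedOperation(AutoName):
    TRUE = auto()
    FALSE = auto()


# Option 1 values are opaque integers.
assert NumberedOperation.TRUE.value == 1
assert NumberedOperation.FALSE.value == 2

# Option 2 values mirror the member names, so lookup by value reads naturally.
assert NamedOperation.TRUE.value == "TRUE"
assert NamedOperation("FALSE") is NamedOperation.FALSE
```

With option 2, a value that ends up in logs or serialized output stays readable without having to map an integer back to a member.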
