dramaticlly commented on code in PR #4717:
URL: https://github.com/apache/iceberg/pull/4717#discussion_r872689598
##########
python/src/iceberg/table/partitioning.py:
##########
@@ -14,8 +14,13 @@
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
+from typing import Dict, Iterable, List, Tuple
+
+from iceberg.schema import Schema
from iceberg.transforms import Transform
+_PARTITION_DATA_ID_START: int = 1000
+
class PartitionField:
Review Comment:
Yeah, I agree: PartitionField is immutable after construction, so a
dataclass with both `eq` and `frozen` sounds fair to me.
For reference, this is what an immutable PartitionField would look like,
with all test cases passing (there is a small ordering change in the repr, but I
think the default one is very close to what we have today in the Java impl):
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PartitionField:
    """
    PartitionField is a single element with name and unique id.
    It represents how one partition value is derived from the source column
    via transformation.

    Attributes:
        source_id(int): The source column id of table's schema
        field_id(int): The partition field id across all the table
            metadata's partition specs
        transform(Transform): The transform used to produce partition values
            from source column
        name(str): The name of this partition field
    """

    source_id: int
    field_id: int
    transform: Transform
    name: str

    def __str__(self):
        # Keep the whole f-string on one line; a plain string literal cannot span lines.
        return f"{self.field_id}: {self.name}: {self.transform}({self.source_id})"
```
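To make the immutability point concrete, here is a small self-contained check of what `frozen=True` (plus the default `eq=True`) gives us. This is only a sketch: `FakeTransform` is a hypothetical stand-in for `iceberg.transforms.Transform` so the snippet runs without the iceberg package, and the field values are made up.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class FakeTransform:
    """Hypothetical stand-in for iceberg.transforms.Transform."""

    name: str

    def __str__(self) -> str:
        return self.name


@dataclass(frozen=True)
class PartitionField:
    source_id: int
    field_id: int
    transform: FakeTransform
    name: str

    def __str__(self) -> str:
        return f"{self.field_id}: {self.name}: {self.transform}({self.source_id})"


a = PartitionField(3, 1000, FakeTransform("bucket[16]"), "id_bucket")
b = PartitionField(3, 1000, FakeTransform("bucket[16]"), "id_bucket")

# eq=True is the dataclass default, so equality compares field by field.
fields_equal = a == b

# frozen=True combined with eq=True also generates __hash__ from the fields.
hashes_equal = hash(a) == hash(b)

# Assigning to a field after construction raises FrozenInstanceError.
try:
    a.name = "renamed"
    mutation_blocked = False
except FrozenInstanceError:
    mutation_blocked = True

print(str(a))   # custom __str__
print(repr(a))  # generated repr, fields in declaration order
```

Because the instances are hashable, they can also be used as dict keys or set members, which may be handy when checking for duplicate fields.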
On the other hand, I think the biggest benefit of the dataclass is the
[`__post_init__`](https://docs.python.org/3.7/library/dataclasses.html#post-init-processing)
method, which allows processing equivalent to the Java-like builder pattern when we
build the PartitionSpec. There's a collection of validations that need to happen, and I
am discussing this with @samredai in
https://github.com/apache/iceberg/issues/4631#issuecomment-1113632408.
From what I can tell, we will need a `PartitionSpecBuilder` class with a
convenient way to construct the PartitionSpec, but we also want to make sure we
avoid duplicating the builder logic in an overly complex init method for
PartitionSpec.
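
A rough sketch of how that split could look, with validation living in `__post_init__` and the builder only accumulating fields. Everything here is an assumption for illustration: `Field`, `Spec`, and `SpecBuilder` are hypothetical, trimmed-down stand-ins, and the two duplicate checks are placeholders for whatever validations we settle on in the issue. Only `_PARTITION_DATA_ID_START` comes from this diff.

```python
from dataclasses import dataclass
from typing import List, Tuple

_PARTITION_DATA_ID_START = 1000  # constant taken from the diff in this PR


@dataclass(frozen=True)
class Field:
    """Hypothetical, trimmed-down stand-in for PartitionField."""

    source_id: int
    field_id: int
    name: str


@dataclass(frozen=True)
class Spec:
    """Hypothetical stand-in for PartitionSpec; all checks live in __post_init__."""

    spec_id: int
    fields: Tuple[Field, ...]

    def __post_init__(self) -> None:
        # Called by the generated __init__, so every construction path is validated.
        names = [f.name for f in self.fields]
        if len(names) != len(set(names)):
            raise ValueError(f"Duplicate partition field names: {names}")
        ids = [f.field_id for f in self.fields]
        if len(ids) != len(set(ids)):
            raise ValueError(f"Duplicate partition field ids: {ids}")


class SpecBuilder:
    """Hypothetical builder: collects fields and assigns sequential field ids."""

    def __init__(self, spec_id: int = 0) -> None:
        self._spec_id = spec_id
        self._fields: List[Field] = []
        self._next_id = _PARTITION_DATA_ID_START

    def add(self, source_id: int, name: str) -> "SpecBuilder":
        self._fields.append(Field(source_id, self._next_id, name))
        self._next_id += 1
        return self

    def build(self) -> Spec:
        # No validation here: Spec.__post_init__ is the single place for it.
        return Spec(self._spec_id, tuple(self._fields))


spec = SpecBuilder().add(1, "id_bucket").add(2, "ts_day").build()
```

With this shape, constructing `Spec(...)` directly and going through the builder hit the same checks, so the builder stays thin and the init logic is not duplicated.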
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]