Fokko commented on PR #7751: URL: https://github.com/apache/iceberg/pull/7751#issuecomment-1577410577
@rdblue The idea is that you'll have the changes in the open-api spec and also the changes in the Python code alongside. Small changes can have a significant impact on the code itself. Let's take https://github.com/apache/iceberg/pull/7710#discussion_r1216646858 ```yaml ReportMetricsRequest: anyOf: - $ref: '#/components/schemas/ScanReport' - $ref: '#/components/schemas/CommitReport' required: - report-type properties: report-type: type: string ``` Above is the the current code on master, and that generates: ```python class ReportMetricsRequest(ScanReport): report_type: str = Field(..., alias='report-type') class ReportMetricsRequest1(CommitReport): report_type: str = Field(..., alias='report-type') class ReportMetricsRequest2(BaseModel): __root__: Union[ReportMetricsRequest, ReportMetricsRequest1] ``` This looks weird, and it is. `ReportMetricsRequest` is `anyOf` (validates the value against any (one or more) of the subschemas) of both `ScanReport` and `CommitReport`. It is actually the other way around. `ScanReport` and `CommitReport` is `allOf` `ReportMetricsRequest`. And based on the `report-type` it is `oneOf` the subtypes based on the `report-type`: ```yaml ReportMetricsRequest: type: object discriminator: propertyName: report-type mapping: commit-report: '#/components/schemas/CommitReport' scan-report: '#/components/schemas/ScanReport' required: - report-type properties: report-type: type: string ``` The point that I'm trying to make here is that is very easy to get confused in the `allOf`, `anyOf`, and `oneOf`. And for me, it really helps to see the code alongside the PR. The idea is that we keep the Python code up to sync, and when you update the open-api code, you'll see the impact on the code. If you do it the other way around, you'll end up with: ```python class ReportMetricsRequest(BaseModel): report_type: str = Field(..., alias='report-type') class CommitReport(ReportMetricsRequest): report_type: Literal['commit-report'] = Field(..., alias='report-type') table_name: str = Field(..., alias='table-name') snapshot_id: int = Field(..., alias='snapshot-id') sequence_number: int = Field(..., alias='sequence-number') operation: str metrics: Metrics metadata: Optional[Dict[str, str]] = None class ScanReport(ReportMetricsRequest): report_type: Literal['scan-report'] = Field(..., alias='report-type') table_name: str = Field(..., alias='table-name') snapshot_id: int = Field(..., alias='snapshot-id') filter: Expression schema_id: int = Field(..., alias='schema-id') projected_field_ids: List[int] = Field(..., alias='projected-field-ids') projected_field_names: List[str] = Field(..., alias='projected-field-names') metrics: Metrics metadata: Optional[Dict[str, str]] = None ``` If we want to generate clients in other languages as well, it helps to keep the open-API as close to the supposed implementation as possible. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
