rdblue commented on code in PR #4685:
URL: https://github.com/apache/iceberg/pull/4685#discussion_r864990932
##########
python/src/iceberg/schema.py:
##########
@@ -503,21 +507,34 @@ class _BuildPositionAccessors(SchemaVisitor[Dict[int,
"Accessor"]]):
def __init__(self) -> None:
self._index: Dict[int, Accessor] = {}
- def schema(self, schema, result: Dict[int, Accessor]) -> Dict[int,
Accessor]:
- return self._index
+ def schema(self, schema: Schema, field_results: Dict[int, Accessor]) ->
Dict[int, Accessor]:
+ return field_results
+
+ def struct(self, struct: StructType, field_results: List[Dict[int,
Accessor]]) -> Dict[int, Accessor]:
+ fields = struct.fields
+ for idx, field in enumerate(fields):
+ result = field_results[idx]
+ if result:
+ self._index.update(result)
+ else:
+ self._index[field.field_id] = Accessor(field.field_id)
Review Comment:
Yeah, it looks like this is creating accessors based on the field ID, but
what we need is to create accessors that use the current schema's positions and
return a map of them by field ID.
For example, this schema:
```
table {
2: int id,
1: string data,
3: struct location {
5: float lat,
6: float long
}
}
```
Should produce the following accessor map:
```
1 -> Accessor(position=1)
2 -> Accessor(position=0)
3 -> Accessor(position=2)
5 -> Accessor(position=2, inner=Accessor(position=0))
6 -> Accessor(position=2, inner=Accessor(position=1))
```
That way when an expression uses, `location.lat`, the accessor will call
`row.get(2).get(0)` to return the value.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]