cfrancois7 commented on issue #953: URL: https://github.com/apache/iceberg-python/issues/953#issuecomment-2250346701
The first issue regarding the parsing is resolved by the PR. But the second issue related to the pyarrow command is still there: `ArrowInvalid: No match for FieldRef.Name(status) in id: int32`

```python
File ~/projects/mpdata/my_proj/notebooks/pyiceberg/io/pyarrow.py:1195, in _task_to_record_batches(fs, task, bound_row_filter, projected_schema, projected_field_ids, positional_deletes, case_sensitive, name_mapping)
   1192 if file_schema is None:
   1193     raise ValueError(f"Missing Iceberg schema in Metadata for file: {path}")
-> 1195 fragment_scanner = ds.Scanner.from_fragment(
   1196     fragment=fragment,
   1197     # With PyArrow 16.0.0 there is an issue with casting record-batches:
   1198     # https://github.com/apache/arrow/issues/41884
   1199     # https://github.com/apache/arrow/issues/43183
   1200     # Would be good to remove this later on
   1201     schema=_pyarrow_schema_ensure_large_types(physical_schema),
   1202     # This will push down the query to Arrow.
   1203     # But in case there are positional deletes, we have to apply them first
   1204     filter=pyarrow_filter if not positional_deletes else None,
   1205     columns=[col.name for col in file_project_schema.columns],
   1206 )
   1208 current_index = 0
   1209 batches = fragment_scanner.to_batches()

File ~/.anaconda3/envs/my_proj/lib/python3.12/site-packages/pyarrow/_dataset.pyx:3558, in pyarrow._dataset.Scanner.from_fragment()

File ~/.anaconda3/envs/my_proj/lib/python3.12/site-packages/pyarrow/_dataset.pyx:3327, in pyarrow._dataset._populate_builder()

File ~/.anaconda3/envs/my_proj/lib/python3.12/site-packages/pyarrow/_compute.pyx:2700, in pyarrow._compute._bind()

File ~/.anaconda3/envs/my_proj/lib/python3.12/site-packages/pyarrow/error.pxi:154, in pyarrow.lib.pyarrow_internal_check_status()

File ~/.anaconda3/envs/my_proj/lib/python3.12/site-packages/pyarrow/error.pxi:91, in pyarrow.lib.check_status()

ArrowInvalid: No match for FieldRef.Name(status) in id: int32
name: large_string
age: int32
address: struct<street: large_string, city: large_string, postal_code: large_string>
contact: struct<email: large_string, phone: large_string>
employment: struct<status: large_string, position: large_string, company: struct<name: large_string, location: large_string>>
preferences: struct<newsletter: bool, notifications: struct<email: bool, sms: bool>>
```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
