zhangfengcdt opened a new pull request, #44:
URL: https://github.com/apache/sedona-db/pull/44
Adds an optional `options` parameter to the `read_parquet()` function to
support S3 anonymous access and other DataFusion table configuration options.
- Add `options: Optional[Dict[str, Any]]` parameter to Python
`read_parquet()` function
- Extend `GeoParquetReadOptions` to handle custom table options via
`from_table_options()`
- Update Python-Rust bindings to convert Python dict to Rust
`HashMap<String, String>`
- Maintain full backward compatibility: all existing code continues to
work unchanged
- Support S3 anonymous access via `{"aws.nosign": True}` and similar
options
- Comprehensive unit test coverage (23 tests) for all usage scenarios
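
To illustrate the dict-to-`HashMap<String, String>` conversion listed above, here is a minimal Python sketch of how option values might be flattened to strings before crossing the binding boundary. The helper name `stringify_options` is hypothetical and not part of this PR, and the lowercase `"true"`/`"false"` rendering of booleans is an assumption about what the Rust side expects.

```python
from typing import Any, Dict, Optional


def stringify_options(options: Optional[Dict[str, Any]] = None) -> Dict[str, str]:
    """Hypothetical helper (not part of this PR): flatten option values to strings.

    Assumption: booleans are rendered as lowercase "true"/"false" so that a
    Rust-side string parser can read them; all other values go through str().
    """
    if not options:
        # None and {} are treated the same: no custom options.
        return {}

    result: Dict[str, str] = {}
    for key, value in options.items():
        if isinstance(value, bool):
            # str(True) would give "True"; lowercase it instead.
            result[str(key)] = "true" if value else "false"
        else:
            result[str(key)] = str(value)
    return result
```

Under these assumptions, `None` and `{}` both produce an empty mapping, which matches the stated behavior that empty options are equivalent to no options.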
## Usage Examples
```python
import sedonadb

sd = sedonadb.connect()

# Anonymous S3 access to public buckets
df = sd.read_parquet(
    "s3://public-bucket/data.parquet",
    options={"aws.nosign": True, "aws.region": "us-west-2"},
)

# Backward compatibility: existing code is unchanged
df = sd.read_parquet("local/file.parquet")  # Still works

# Empty options (equivalent to no options)
df = sd.read_parquet("file.parquet", options={})
```
## Test Coverage
- /tests/io/test_parquet_options.py (14 tests)
- /tests/test_context.py (5 integration tests)