Hi all,
I'm a first-time contributor to NumPy and would appreciate community feedback
on a proposed small API addition, currently in PR #29294:
TL;DR:
`np.savez_compressed` currently always uses `zipfile.ZIP_DEFLATED` at the
default level. The PR lets users choose any compression method or level that
`zipfile.ZipFile` supports, in a backwards-compatible way.
### What would change?
A new optional keyword argument:
```python
np.savez_compressed(
    "data.npz",
    a=array0,
    b=array1,
    zipfile_kwargs={"compression": "lzma", "compresslevel": 9},
)
```
* `zipfile_kwargs` (default `None`) is forwarded directly to `zipfile.ZipFile`.
NumPy does not interpret its contents, except to map the human-friendly,
case-insensitive aliases `"stored"`, `"deflated"`, `"bzip2"`, and `"lzma"`
to the corresponding compression methods.
* If `zipfile_kwargs` is not used, behavior remains identical to current
`np.savez_compressed`.
* No new top-level keywords like `compression=` are added, so existing code
like `np.savez_compressed(file, compression=my_array)` remains valid.
* Tests are included in the new class `TestSavezCompressed`.
### Why this addition?
There have been recurring requests for:
* Controlling deflate level,
* Using `bzip2` or `lzma` in `.npz` files,
* Improving compression ratio for large arrays.
Without this, users must manually rewrite `.npz` archives after saving them,
which is an inconvenient and inefficient workaround.
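
For reference, the manual workaround looks roughly like the sketch below, which uses only the standard library. The function name `recompress_npz` is hypothetical, and the demo archive is a stand-in built in memory rather than a real file written by `np.savez_compressed`. Every member has to be decompressed and compressed a second time, which is the inefficiency mentioned above.

```python
import io
import zipfile


def recompress_npz(src, dst, compression=zipfile.ZIP_LZMA, compresslevel=None):
    """Copy every member of archive `src` into `dst` with new compression.

    `src` and `dst` may be paths or binary file objects.
    """
    with zipfile.ZipFile(src, "r") as zin, zipfile.ZipFile(
        dst, "w", compression=compression, compresslevel=compresslevel
    ) as zout:
        for info in zin.infolist():
            # Each member is fully decompressed, then compressed again.
            zout.writestr(info.filename, zin.read(info.filename))


# Demo: an in-memory archive standing in for an existing .npz file.
payload = b"\x00" * 10_000
original = io.BytesIO()
with zipfile.ZipFile(original, "w", compression=zipfile.ZIP_STORED) as zf:
    zf.writestr("a.npy", payload)
original.seek(0)

rewritten = io.BytesIO()
recompress_npz(original, rewritten, zipfile.ZIP_DEFLATED, compresslevel=9)
rewritten.seek(0)
with zipfile.ZipFile(rewritten) as zf:
    new_info = zf.getinfo("a.npy")
    roundtrip = zf.read("a.npy")
```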
By forwarding `zipfile.ZipFile` kwargs, users can leverage all its options
cleanly, including any future compression methods added to the standard library.
### Risk and Compatibility
* No changes to default behavior.
* Only one new reserved keyword, `zipfile_kwargs`, is introduced, which
minimizes potential collisions with array names passed as keywords.
* Python version requirements respected: the PR only targets Python ≥3.11.
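
The collision argument can be illustrated with a small stand-in function. This is a sketch under assumptions: `savez_like` is hypothetical and writes nothing. It only mimics the calling convention, in which positional arrays are named `arr_0`, `arr_1`, and so on, and every keyword other than the reserved one is treated as an array name.

```python
def savez_like(file, *args, zipfile_kwargs=None, **kwds):
    """Hypothetical stand-in: report which names would be saved as arrays."""
    names = [f"arr_{i}" for i in range(len(args))] + list(kwds)
    return names, zipfile_kwargs


# An array saved under the name "compression" is still just an array.
names, opts = savez_like("data.npz", compression=[1, 2, 3])

# Only the single reserved keyword is intercepted.
names2, opts2 = savez_like(
    "data.npz", [0], level=[9], zipfile_kwargs={"compression": "bzip2"}
)
```

In the first call `compression` is captured by `**kwds`, so existing code that uses it as an array name behaves as before. Only an array that was itself named `zipfile_kwargs` would be affected.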
### Open Question
Is `zipfile_kwargs` an acceptable keyword name? It makes intent clear, but
alternative suggestions are welcome.
### Call for Feedback
I'd appreciate hearing from maintainers and contributors:
* Is there support for exposing this functionality?
* Any objections to the proposed API shape?
Thank you for your time and guidance!
Sajjad Ali
PR link: https://github.com/numpy/numpy/pull/29294