Script 'mail_helper' called by obssrc
Hello community,
here is the log from the commit of package python-h5netcdf for openSUSE:Factory
checked in at 2023-01-07 17:19:57
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Comparing /work/SRC/openSUSE:Factory/python-h5netcdf (Old)
and /work/SRC/openSUSE:Factory/.python-h5netcdf.new.1563 (New)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Package is "python-h5netcdf"
Sat Jan 7 17:19:57 2023 rev:7 rq:1056763 version:1.1.0
Changes:
--------
--- /work/SRC/openSUSE:Factory/python-h5netcdf/python-h5netcdf.changes 2022-08-09 15:43:50.436272948 +0200
+++ /work/SRC/openSUSE:Factory/.python-h5netcdf.new.1563/python-h5netcdf.changes 2023-01-07 17:23:19.047445675 +0100
@@ -1,0 +2,12 @@
+Sat Jan 7 12:08:07 UTC 2023 - Dirk Müller <[email protected]>
+
+- update to 1.1.0:
+ * Rework adding _FillValue-attribute, add tests.
+ * Add special add_phony method for creating phony dimensions, add test.
+ * Rewrite _unlabeled_dimension_mix (labeled/unlabeled), add tests.
+ * Add default netcdf fillvalues, pad only if necessary, adapt tests.
+ * Fix regression in padding algorithm, add test.
+ * Set ``track_order=True`` by default in created files if h5py 3.7.0 or
+ greater is detected to help compatibility with netCDF4-c programs.
+
+-------------------------------------------------------------------
Old:
----
h5netcdf-1.0.2.tar.gz
New:
----
h5netcdf-1.1.0.tar.gz
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Other differences:
------------------
++++++ python-h5netcdf.spec ++++++
--- /var/tmp/diff_new_pack.G7VZgM/_old 2023-01-07 17:23:19.647449254 +0100
+++ /var/tmp/diff_new_pack.G7VZgM/_new 2023-01-07 17:23:19.651449278 +0100
@@ -1,7 +1,7 @@
#
# spec file for package python-h5netcdf
#
-# Copyright (c) 2022 SUSE LLC
+# Copyright (c) 2023 SUSE LLC
#
# All modifications and additions to the file contributed by third parties
# remain the property of their copyright owners, unless otherwise agreed
@@ -20,7 +20,7 @@
%define skip_python2 1
%define skip_python36 1
Name: python-h5netcdf
-Version: 1.0.2
+Version: 1.1.0
Release: 0
Summary: A Python library to use netCDF4 files via h5py
License: BSD-3-Clause
++++++ h5netcdf-1.0.2.tar.gz -> h5netcdf-1.1.0.tar.gz ++++++
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn'
'--exclude=.svnignore' old/h5netcdf-1.0.2/CHANGELOG.rst
new/h5netcdf-1.1.0/CHANGELOG.rst
--- old/h5netcdf-1.0.2/CHANGELOG.rst 2022-08-02 11:33:39.000000000 +0200
+++ new/h5netcdf-1.1.0/CHANGELOG.rst 2022-11-23 07:40:05.000000000 +0100
@@ -1,5 +1,21 @@
Change Log
----------
+Version 1.1.0 (November 23rd, 2022):
+
+- Rework adding _FillValue-attribute, add tests.
+ By `Kai Mühlbauer <https://github.com/kmuehlbauer>`_.
+- Add special add_phony method for creating phony dimensions, add test.
+ By `Kai Mühlbauer <https://github.com/kmuehlbauer>`_.
+- Rewrite _unlabeled_dimension_mix (labeled/unlabeled), add tests.
+ By `Kai Mühlbauer <https://github.com/kmuehlbauer>`_.
+- Add default netcdf fillvalues, pad only if necessary, adapt tests.
+ By `Kai Mühlbauer <https://github.com/kmuehlbauer>`_.
+- Fix regression in padding algorithm, add test.
+ By `Kai Mühlbauer <https://github.com/kmuehlbauer>`_.
+- Set ``track_order=True`` by default in created files if h5py 3.7.0 or
+ greater is detected to help compatibility with netCDF4-c programs.
+ By `Mark Harfouche <https://github.com/hmaarrfk>`_.
+
Version 1.0.2 (August 2nd, 2022):
- Adapt boolean indexing as h5py 3.7.0 started supporting it.
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn'
'--exclude=.svnignore' old/h5netcdf-1.0.2/PKG-INFO new/h5netcdf-1.1.0/PKG-INFO
--- old/h5netcdf-1.0.2/PKG-INFO 2022-08-02 11:34:00.711885000 +0200
+++ new/h5netcdf-1.1.0/PKG-INFO 2022-11-23 07:40:28.608608200 +0100
@@ -1,6 +1,6 @@
Metadata-Version: 2.1
Name: h5netcdf
-Version: 1.0.2
+Version: 1.1.0
Summary: netCDF4 via h5py
Home-page: https://h5netcdf.org
Author: h5netcdf developers
@@ -265,15 +265,30 @@
Track Order
~~~~~~~~~~~
-In h5netcdf version 0.12.0 and earlier, `order tracking`_ was disabled in
-HDF5 file. As this is a requirement for the current netCDF4 standard,
-it has been enabled without deprecation as of version 0.13.0 `[*]`_.
+As of h5netcdf 1.1.0, if h5py 3.7.0 or greater is detected, the ``track_order``
+parameter is set to ``True`` enabling `order tracking`_ for newly created
+netCDF4 files. This helps ensure that files created with the h5netcdf library
+can be modified by the netCDF4-c and netCDF4-python implementations used in
+other software stacks. Since this change should be transparent to most users,
+it was made without deprecation.
+
+Since track_order is set at creation time, any dataset that was created with
+``track_order=False`` (h5netcdf version 1.0.2 and older except for 0.13.0) will
+continue to be opened with order tracking disabled.
+
+The following describes the behavior of h5netcdf with respect to order tracking
+for a few key versions:
+
+- In version 0.12.0 and earlier, the ``track_order`` parameter was missing
+  and thus order tracking was implicitly set to ``False``.
+- Version 0.13.0 enabled order tracking by setting the parameter
+ ``track_order`` to ``True`` by default without deprecation.
+- Versions 0.13.1 to 1.0.2 set ``track_order`` to ``False`` due to an
+  `upstream bug`_ in h5py, a core dependency of h5netcdf, which was resolved
+  in h5py 3.7.0 with the help of the h5netcdf team.
+- In version 1.1.0, if h5py 3.7.0 or above is detected, the ``track_order``
+ parameter is set to ``True`` by default.
-However in version 0.13.1 this has been reverted due to a bug in a core
-dependency of h5netcdf, h5py `upstream bug`_.
-
-Datasets created with h5netcdf version 0.12.0 that are opened with
-newer versions of h5netcdf will continue to disable order tracker.
.. _order tracking:
https://docs.unidata.ucar.edu/netcdf-c/current/file_format_specifications.html#creation_order
.. _upstream bug: https://github.com/h5netcdf/h5netcdf/issues/136
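The version gate described above (order tracking enabled by default for h5py 3.7.0 or greater) can be sketched in plain Python. This is an illustrative stand-in, not the h5netcdf implementation, which uses ``packaging.version``; ``default_track_order`` is a hypothetical name, and only plain ``X.Y.Z`` version strings are handled:

```python
# Illustrative sketch: choose the track_order default from the h5py version,
# mirroring the rule described above (True for h5py >= 3.7.0).
# Assumes a plain "X.Y.Z" version string; real code should use packaging.version.
def default_track_order(h5py_version):
    """Return True when order tracking should be enabled by default."""
    parts = tuple(int(p) for p in h5py_version.split(".")[:3])
    return parts >= (3, 7, 0)
```

For example, ``default_track_order("3.7.0")`` is ``True`` while ``default_track_order("3.6.0")`` is ``False``.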
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn'
'--exclude=.svnignore' old/h5netcdf-1.0.2/README.rst
new/h5netcdf-1.1.0/README.rst
--- old/h5netcdf-1.0.2/README.rst 2022-08-02 11:33:39.000000000 +0200
+++ new/h5netcdf-1.1.0/README.rst 2022-11-23 07:40:05.000000000 +0100
@@ -241,15 +241,30 @@
Track Order
~~~~~~~~~~~
-In h5netcdf version 0.12.0 and earlier, `order tracking`_ was disabled in
-HDF5 file. As this is a requirement for the current netCDF4 standard,
-it has been enabled without deprecation as of version 0.13.0 `[*]`_.
+As of h5netcdf 1.1.0, if h5py 3.7.0 or greater is detected, the ``track_order``
+parameter is set to ``True`` enabling `order tracking`_ for newly created
+netCDF4 files. This helps ensure that files created with the h5netcdf library
+can be modified by the netCDF4-c and netCDF4-python implementations used in
+other software stacks. Since this change should be transparent to most users,
+it was made without deprecation.
-However in version 0.13.1 this has been reverted due to a bug in a core
-dependency of h5netcdf, h5py `upstream bug`_.
+Since track_order is set at creation time, any dataset that was created with
+``track_order=False`` (h5netcdf version 1.0.2 and older except for 0.13.0) will
+continue to be opened with order tracking disabled.
+
+The following describes the behavior of h5netcdf with respect to order tracking
+for a few key versions:
+
+- In version 0.12.0 and earlier, the ``track_order`` parameter was missing
+  and thus order tracking was implicitly set to ``False``.
+- Version 0.13.0 enabled order tracking by setting the parameter
+ ``track_order`` to ``True`` by default without deprecation.
+- Versions 0.13.1 to 1.0.2 set ``track_order`` to ``False`` due to an
+  `upstream bug`_ in h5py, a core dependency of h5netcdf, which was resolved
+  in h5py 3.7.0 with the help of the h5netcdf team.
+- In version 1.1.0, if h5py 3.7.0 or above is detected, the ``track_order``
+ parameter is set to ``True`` by default.
-Datasets created with h5netcdf version 0.12.0 that are opened with
-newer versions of h5netcdf will continue to disable order tracker.
.. _order tracking:
https://docs.unidata.ucar.edu/netcdf-c/current/file_format_specifications.html#creation_order
.. _upstream bug: https://github.com/h5netcdf/h5netcdf/issues/136
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn'
'--exclude=.svnignore' old/h5netcdf-1.0.2/h5netcdf/_version.py
new/h5netcdf-1.1.0/h5netcdf/_version.py
--- old/h5netcdf-1.0.2/h5netcdf/_version.py 2022-08-02 11:34:00.000000000 +0200
+++ new/h5netcdf-1.1.0/h5netcdf/_version.py 2022-11-23 07:40:28.000000000 +0100
@@ -1,5 +1,5 @@
# coding: utf-8
# file generated by setuptools_scm
# don't change, don't track in version control
-__version__ = version = '1.0.2'
-__version_tuple__ = version_tuple = (1, 0, 2)
+__version__ = version = '1.1.0'
+__version_tuple__ = version_tuple = (1, 1, 0)
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn'
'--exclude=.svnignore' old/h5netcdf-1.0.2/h5netcdf/core.py
new/h5netcdf-1.1.0/h5netcdf/core.py
--- old/h5netcdf-1.0.2/h5netcdf/core.py 2022-08-02 11:33:39.000000000 +0200
+++ new/h5netcdf-1.1.0/h5netcdf/core.py 2022-11-23 07:40:05.000000000 +0100
@@ -51,12 +51,17 @@
def _transform_1d_boolean_indexers(key):
"""Find and transform 1D boolean indexers to int"""
- key = [
- np.asanyarray(k).nonzero()[0]
- if isinstance(k, (np.ndarray, list)) and type(k[0]) in (bool, np.bool_)
- else k
- for k in key
- ]
+ # return key, if not iterable
+ try:
+ key = [
+ np.asanyarray(k).nonzero()[0]
+            if isinstance(k, (np.ndarray, list)) and type(k[0]) in (bool, np.bool_)
+ else k
+ for k in key
+ ]
+ except TypeError:
+ return key
+
return tuple(key)
@@ -145,10 +150,14 @@
# normal variable carrying DIMENSION_LIST
# extract hdf5 file references and get objects name
if "DIMENSION_LIST" in attrs:
- return tuple(
- self._root._h5file[ref[0]].name.split("/")[-1]
- for ref in list(self._h5ds.attrs.get("DIMENSION_LIST", []))
- )
+ # check if malformed variable and raise
+ if _unlabeled_dimension_mix(self._h5ds) == "labeled":
+            # If a dimension has attached more than one scale for some reason,
+            # then take the last one. This is in line with netcdf-c and
+            # netcdf4-python.
+ return tuple(
+ self._root._h5file[ref[-1]].name.split("/")[-1]
+ for ref in list(self._h5ds.attrs.get("DIMENSION_LIST", []))
+ )
# need to use the h5ds name here to distinguish from collision dimensions
child_name = self._h5ds.name.split("/")[-1]
@@ -271,6 +280,39 @@
"""Return NumPy dtype object giving the variable's type."""
return self._h5ds.dtype
+ def _get_padding(self, key):
+ """Return padding if needed, defaults to False."""
+ padding = False
+ if self.dtype != str and self.dtype.kind in ["f", "i", "u"]:
+ key0 = _expanded_indexer(key, self.ndim)
+ key0 = _transform_1d_boolean_indexers(key0)
+ # extract max shape of key vs hdf5-shape
+ h5ds_shape = self._h5ds.shape
+ shape = self.shape
+
+ # check for ndarray and list
+ # see https://github.com/pydata/xarray/issues/7154
+ # first get maximum index
+ max_index = [
+ max(k) + 1 if isinstance(k, (np.ndarray, list)) else k.stop
+ for k in key0
+ ]
+ # second convert to max shape
+ max_shape = tuple(
+ [
+ shape[i] if k is None else max(h5ds_shape[i], k)
+ for i, k in enumerate(max_index)
+ ]
+ )
+
+ # check if hdf5 dataset dimensions are smaller than
+ # their respective netcdf dimensions
+ sdiff = [d0 - d1 for d0, d1 in zip(max_shape, h5ds_shape)]
+        # create padding only if hdf5 dataset is smaller than netcdf dimension
+ if sum(sdiff):
+ padding = [(0, s) for s in sdiff]
+ return padding
+
def __array__(self, *args, **kwargs):
return self._h5ds.__array__(*args, **kwargs)
@@ -279,7 +321,6 @@
if isinstance(self._parent._root, Dataset):
# this is only for legacyapi
- key = _expanded_indexer(key, self.ndim)
# fix boolean indexing for affected versions
# https://github.com/h5py/h5py/pull/2079
# https://github.com/h5netcdf/h5netcdf/pull/125/
@@ -292,18 +333,17 @@
if string_info and string_info.length is None:
return self._h5ds.asstr()[key]
- # return array padded with fillvalue (both api)
- if self.dtype != str and self.dtype.kind in ["f", "i", "u"]:
- sdiff = [d0 - d1 for d0, d1 in zip(self.shape, self._h5ds.shape)]
- if sum(sdiff):
- fv = self.dtype.type(self._h5ds.fillvalue)
- padding = [(0, s) for s in sdiff]
- return np.pad(
- self._h5ds,
- pad_width=padding,
- mode="constant",
- constant_values=fv,
- )[key]
+ # get padding
+ padding = self._get_padding(key)
+ # apply padding with fillvalue (both api)
+ if padding:
+ fv = self.dtype.type(self._h5ds.fillvalue)
+ return np.pad(
+ self._h5ds,
+ pad_width=padding,
+ mode="constant",
+ constant_values=fv,
+ )[key]
return self._h5ds[key]
@@ -406,15 +446,26 @@
def _unlabeled_dimension_mix(h5py_dataset):
- dims = sum([len(j) for j in h5py_dataset.dims])
- if dims:
- if dims != h5py_dataset.ndim:
+ # check if dataset has dims and get it
+ dimlist = getattr(h5py_dataset, "dims", [])
+ if not dimlist:
+ status = "nodim"
+ else:
+ dimset = set([len(j) for j in dimlist])
+ # either all dimensions have exactly one scale
+ # or all dimensions have no scale
+ if dimset ^ {0} == set():
+ status = "unlabeled"
+ elif dimset & {0}:
name = h5py_dataset.name.split("/")[-1]
raise ValueError(
"malformed variable {0} has mixing of labeled and "
"unlabeled dimensions.".format(name)
)
- return dims
+ else:
+ status = "labeled"
+
+ return status
class Group(Mapping):
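The reworked ``_unlabeled_dimension_mix`` above classifies a variable by the number of dimension scales attached to each of its dimensions. The same classification can be sketched over a plain list of per-dimension scale counts (``dimension_status`` is a hypothetical standalone helper, not part of h5netcdf):

```python
# Sketch of the classification logic in _unlabeled_dimension_mix, operating
# on a plain list of per-dimension scale counts instead of an h5py dataset.
# A mix of labeled and unlabeled dimensions is considered malformed.
def dimension_status(scale_counts):
    """Return 'nodim', 'unlabeled' or 'labeled'; raise on a mix."""
    if not scale_counts:
        return "nodim"
    counts = set(scale_counts)
    if counts == {0}:
        return "unlabeled"  # no dimension has a scale attached
    if 0 in counts:
        raise ValueError("mix of labeled and unlabeled dimensions")
    return "labeled"  # every dimension has at least one scale
```

For example, ``dimension_status([0, 0])`` yields ``"unlabeled"`` and ``dimension_status([1, 0])`` raises ``ValueError``, matching the error path shown in the diff.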
@@ -462,9 +513,8 @@
self._dimensions.add(k)
else:
if self._root._phony_dims_mode is not None:
-
- # check if malformed variable
- if not _unlabeled_dimension_mix(v):
+ # check if malformed variable and raise
+ if _unlabeled_dimension_mix(v) == "unlabeled":
# if unscaled variable, get phony dimensions
phony_dims |= Counter(v.shape)
@@ -486,7 +536,7 @@
if self._root._phony_dims_mode == "sort":
name += self._root._max_dim_id + 1
name = "phony_dim_{}".format(name)
- self._dimensions[name] = size
+ self._dimensions.add_phony(name, size)
self._initialized = True
@@ -675,6 +725,14 @@
if self._root._h5py.__name__ == "h5py":
kwargs.update(dict(track_order=self._parent._track_order))
+ # handling default fillvalues for legacyapi
+ # see https://github.com/h5netcdf/h5netcdf/issues/182
+ from .legacyapi import Dataset, _get_default_fillvalue
+
+ fillval = fillvalue
+ if fillvalue is None and isinstance(self._parent._root, Dataset):
+ fillval = _get_default_fillvalue(dtype)
+
# create hdf5 variable
self._h5group.create_dataset(
h5name,
@@ -682,7 +740,7 @@
dtype=dtype,
data=data,
chunks=chunks,
- fillvalue=fillvalue,
+ fillvalue=fillval,
**kwargs,
)
@@ -712,7 +770,20 @@
variable._ensure_dim_id()
if fillvalue is not None:
- value = variable.dtype.type(fillvalue)
+ # trying to create correct type of fillvalue
+ if variable.dtype is str:
+ value = fillvalue
+ else:
+            string_info = self._root._h5py.check_string_dtype(variable.dtype)
+ if (
+ string_info
+ and string_info.length is not None
+ and string_info.length > 1
+ ):
+ value = fillvalue
+ else:
+ value = variable.dtype.type(fillvalue)
+
variable.attrs._h5attrs["_FillValue"] = value
return variable
@@ -773,6 +844,12 @@
"""
# if root-variable
if name.startswith("/"):
+ # handling default fillvalues for legacyapi
+ # see https://github.com/h5netcdf/h5netcdf/issues/182
+ from .legacyapi import Dataset, _get_default_fillvalue
+
+ if fillvalue is None and isinstance(self._parent._root, Dataset):
+ fillvalue = _get_default_fillvalue(dtype)
return self._root.create_variable(
name[1:],
dimensions,
@@ -911,6 +988,16 @@
phony_dims: 'sort', 'access'
See :ref:`phony dims` for more details.
+ track_order: bool
+ Corresponds to the h5py.File `track_order` parameter. Unless
+ specified, the library will choose a default that enhances
+ compatibility with netCDF4-c. If h5py version 3.7.0 or greater is
+ installed, this parameter will be set to True by default.
+          track_order is required to be True for netCDF4-c libraries to
+ append to a file. If an older version of h5py is detected, this
+ parameter will be set to False by default to work around a bug in
+ h5py limiting the number of attributes for a given variable.
+
**kwargs:
Additional keyword arguments to be passed to the ``h5py.File``
constructor.
@@ -930,22 +1017,14 @@
# standard
# https://github.com/Unidata/netcdf-c/issues/2054
# https://github.com/h5netcdf/h5netcdf/issues/128
- # 2022/01/20: hmaarrfk
- # However, it was found that this causes issues with attrs and h5py
- # https://github.com/h5netcdf/h5netcdf/issues/136
- # https://github.com/h5py/h5py/issues/1385
- track_order = kwargs.pop("track_order", False)
-
- # When the issues with track_order in h5py are resolved, we
- # can consider uncommenting the code below
- # if not track_order:
- # self._closed = True
- # raise ValueError(
-    #     f"track_order, if specified must be set to to True (got {track_order})"
- # "to conform to the netCDF4 file format. Please see "
- # "https://github.com/h5netcdf/h5netcdf/issues/130 "
- # "for more details."
- # )
+ # h5py versions less than 3.7.0 had a bug that limited the number of
+ # attributes when track_order was set to true by default.
+ # However, setting track_order to True helps with compatibility
+ # with netcdf4-c and generally, keeping track of how things were added
+ # to the dataset.
+    # https://github.com/h5netcdf/h5netcdf/issues/136#issuecomment-1017457067
+    track_order_default = version.parse(h5py.__version__) >= version.parse("3.7.0")
+ track_order = kwargs.pop("track_order", track_order_default)
if version.parse(h5py.__version__) >= version.parse("3.0.0"):
self.decode_vlen_strings = kwargs.pop("decode_vlen_strings", None)
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn'
'--exclude=.svnignore' old/h5netcdf-1.0.2/h5netcdf/dimensions.py
new/h5netcdf-1.1.0/h5netcdf/dimensions.py
--- old/h5netcdf-1.0.2/h5netcdf/dimensions.py 2022-08-02 11:33:39.000000000 +0200
+++ new/h5netcdf-1.1.0/h5netcdf/dimensions.py 2022-11-23 07:40:05.000000000 +0100
@@ -21,14 +21,18 @@
def __setitem__(self, name, size):
# creating new dimensions
- phony = "phony_dim" in name
- if not self._group._root._writable and not phony:
+ if not self._group._root._writable:
raise RuntimeError("H5NetCDF: Write to read only")
if name in self._objects:
raise ValueError("dimension %r already exists" % name)
self._objects[name] = Dimension(self._group, name, size, create_h5ds=True)
+ def add_phony(self, name, size):
+ self._objects[name] = Dimension(
+ self._group, name, size, create_h5ds=False, phony=True
+ )
+
def add(self, name):
# adding dimensions which are already created in the file
self._objects[name] = Dimension(self._group, name)
@@ -56,7 +60,7 @@
class Dimension(object):
- def __init__(self, parent, name, size=None, create_h5ds=False):
+    def __init__(self, parent, name, size=None, create_h5ds=False, phony=False):
"""NetCDF4 Dimension constructor.
Parameters
@@ -69,9 +73,11 @@
Size of the Netcdf4 Dimension. Defaults to None (unlimited).
create_h5ds : bool
For internal use only.
+ phony : bool
+ For internal use only.
"""
self._parent_ref = weakref.ref(parent)
- self._phony = "phony_dim" in name
+ self._phony = phony
self._root_ref = weakref.ref(parent._root)
self._h5path = _join_h5paths(parent.name, name)
self._name = name
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn'
'--exclude=.svnignore' old/h5netcdf-1.0.2/h5netcdf/legacyapi.py
new/h5netcdf-1.1.0/h5netcdf/legacyapi.py
--- old/h5netcdf-1.0.2/h5netcdf/legacyapi.py 2022-08-02 11:33:39.000000000 +0200
+++ new/h5netcdf-1.1.0/h5netcdf/legacyapi.py 2022-11-23 07:40:05.000000000 +0100
@@ -5,6 +5,30 @@
from . import core
+#: default netcdf fillvalues
+default_fillvals = {
+ "S1": "\x00",
+ "i1": -127,
+ "u1": 255,
+ "i2": -32767,
+ "u2": 65535,
+ "i4": -2147483647,
+ "u4": 4294967295,
+ "i8": -9223372036854775806,
+ "u8": 18446744073709551614,
+ "f4": 9.969209968386869e36,
+ "f8": 9.969209968386869e36,
+}
+
+
+def _get_default_fillvalue(dtype):
+ kind = np.dtype(dtype).kind
+ fillvalue = None
+ if kind in ["u", "i", "f"]:
+ size = np.dtype(dtype).itemsize
+ fillvalue = default_fillvals[f"{kind}{size}"]
+ return fillvalue
+
def _check_return_dtype_endianess(endian="native"):
little_endian = sys.byteorder == "little"
@@ -204,7 +228,7 @@
fletcher32=fletcher32,
chunks=chunksizes,
fillvalue=fill_value,
- **kwds
+ **kwds,
)
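The ``default_fillvals`` table and ``_get_default_fillvalue`` added above resolve a dtype to the netCDF default fill value. A minimal, numpy-free sketch of the same lookup, keyed on a numpy-style dtype kind plus itemsize (``default_fillvalue`` is a hypothetical standalone helper, not the h5netcdf function):

```python
# Standalone sketch of the lookup performed by _get_default_fillvalue above,
# keyed on numpy-style kind ("i", "u", "f") plus itemsize in bytes.
# Non-numeric kinds fall through to None, as in the original.
DEFAULT_FILLVALS = {
    "i1": -127, "u1": 255,
    "i2": -32767, "u2": 65535,
    "i4": -2147483647, "u4": 4294967295,
    "i8": -9223372036854775806, "u8": 18446744073709551614,
    "f4": 9.969209968386869e36, "f8": 9.969209968386869e36,
}

def default_fillvalue(kind, itemsize):
    """Return the netCDF default fill value, or None for non-numeric kinds."""
    if kind in ("i", "u", "f"):
        return DEFAULT_FILLVALS[f"{kind}{itemsize}"]
    return None
```

So a 2-byte signed integer maps to -32767, while a string dtype gets no default fill value.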
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn'
'--exclude=.svnignore' old/h5netcdf-1.0.2/h5netcdf/tests/test_h5netcdf.py
new/h5netcdf-1.1.0/h5netcdf/tests/test_h5netcdf.py
--- old/h5netcdf-1.0.2/h5netcdf/tests/test_h5netcdf.py 2022-08-02 11:33:39.000000000 +0200
+++ new/h5netcdf-1.1.0/h5netcdf/tests/test_h5netcdf.py 2022-11-23 07:40:05.000000000 +0100
@@ -677,6 +677,13 @@
pass
+def test_fake_phony_dims(tmp_local_or_remote_netcdf):
+ # tests writing of dimension with phony naming scheme
+ # see https://github.com/h5netcdf/h5netcdf/issues/178
+ with h5netcdf.File(tmp_local_or_remote_netcdf, mode="w") as ds:
+ ds.dimensions["phony_dim_0"] = 3
+
+
def check_invalid_netcdf4_mixed(var, i):
pdim = "phony_dim_{}".format(i)
assert var["foo1"].dimensions[0] == "y1"
@@ -761,8 +768,14 @@
f["foo1"].dims[0].attach_scale(f["x"])
with raises(ValueError):
+ with h5netcdf.File(tmp_local_or_remote_netcdf, "r") as ds:
+ assert ds
+ print(ds)
+
+ with raises(ValueError):
with h5netcdf.File(tmp_local_or_remote_netcdf, "r", phony_dims="sort") as ds:
assert ds
+ print(ds)
def test_hierarchical_access_auto_create(tmp_local_or_remote_netcdf):
@@ -1142,6 +1155,7 @@
assert f["dummy2"].shape == (3, 2, 2)
f.groups["test"]["dummy3"].shape == (3, 3)
f.groups["test"]["dummy4"].shape == (0, 0)
+ assert f["dummy5"].shape == (2, 3)
def test_reading_unused_unlimited_dimension(tmp_local_or_remote_netcdf):
@@ -1163,10 +1177,12 @@
def test_nc4_non_coord(tmp_local_netcdf):
- # Track order True is the new default for versions after 0.12.0
- # 0.12.0 defaults to `track_order=False`
- # Ensure that the tests order the variables in their creation order
- # not alphabetical order
+ # Here we generate a few variables and coordinates
+ # The default should be to track the order of creation
+ # Thus, on reopening the file, the order in which
+ # the variables are listed should be maintained
+ # y -- refers to the coordinate y
+ # _nc4_non_coord_y -- refers to the data y
with h5netcdf.File(tmp_local_netcdf, "w") as f:
f.dimensions = {"x": None, "y": 2}
f.create_variable("test", dimensions=("x",), dtype=np.int64)
@@ -1177,8 +1193,23 @@
assert f.dimensions["x"].size == 0
assert f.dimensions["x"].isunlimited()
assert f.dimensions["y"].size == 2
- assert list(f.variables) == ["y", "test"]
-        assert list(f._h5group.keys()) == ["_nc4_non_coord_y", "test", "x", "y"]
+ if version.parse(h5py.__version__) >= version.parse("3.7.0"):
+ assert list(f.variables) == ["test", "y"]
+            assert list(f._h5group.keys()) == ["x", "y", "test", "_nc4_non_coord_y"]
+
+ with h5netcdf.File(tmp_local_netcdf, "w") as f:
+ f.dimensions = {"x": None, "y": 2}
+ f.create_variable("y", dimensions=("x",), dtype=np.int64)
+ f.create_variable("test", dimensions=("x",), dtype=np.int64)
+
+ with h5netcdf.File(tmp_local_netcdf, "r") as f:
+ assert list(f.dimensions) == ["x", "y"]
+ assert f.dimensions["x"].size == 0
+ assert f.dimensions["x"].isunlimited()
+ assert f.dimensions["y"].size == 2
+ if version.parse(h5py.__version__) >= version.parse("3.7.0"):
+ assert list(f.variables) == ["y", "test"]
+            assert list(f._h5group.keys()) == ["x", "y", "_nc4_non_coord_y", "test"]
def test_overwrite_existing_file(tmp_local_netcdf):
@@ -1472,6 +1503,9 @@
def test_expanded_variables_netcdf4(tmp_local_netcdf, netcdf_write_module):
+    # partially reimplemented due to performance reasons in edge cases
+ # https://github.com/h5netcdf/h5netcdf/issues/182
+
with netcdf_write_module.Dataset(tmp_local_netcdf, "w") as ds:
f = ds.createGroup("test")
f.createDimension("x", None)
@@ -1483,8 +1517,8 @@
dummy4 = f.createVariable("dummy4", float, ("x", "y"))
dummy1[:] = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
- dummy2[:] = [[1, 2, 3]]
- dummy3[:] = [[1, 2, 3], [4, 5, 6]]
+ dummy2[1, :] = [4, 5, 6]
+ dummy3[0:2, :] = [[1, 2, 3], [4, 5, 6]]
# don't mask, since h5netcdf doesn't do masking
if netcdf_write_module == netCDF4:
@@ -1503,10 +1537,16 @@
f = ds["test"]
np.testing.assert_allclose(f.variables["dummy1"][:], res1)
+        np.testing.assert_allclose(f.variables["dummy1"][1, :], [4.0, 5.0, 6.0])
+        np.testing.assert_allclose(f.variables["dummy1"][1:2, :], [[4.0, 5.0, 6.0]])
assert f.variables["dummy1"].shape == (3, 3)
np.testing.assert_allclose(f.variables["dummy2"][:], res2)
+        np.testing.assert_allclose(f.variables["dummy2"][1, :], [4.0, 5.0, 6.0])
+        np.testing.assert_allclose(f.variables["dummy2"][1:2, :], [[4.0, 5.0, 6.0]])
assert f.variables["dummy2"].shape == (3, 3)
np.testing.assert_allclose(f.variables["dummy3"][:], res3)
+        np.testing.assert_allclose(f.variables["dummy3"][1, :], [4.0, 5.0, 6.0])
+        np.testing.assert_allclose(f.variables["dummy3"][1:2, :], [[4.0, 5.0, 6.0]])
assert f.variables["dummy3"].shape == (3, 3)
np.testing.assert_allclose(f.variables["dummy4"][:], res4)
assert f.variables["dummy4"].shape == (3, 3)
@@ -1514,12 +1554,22 @@
with legacyapi.Dataset(tmp_local_netcdf, "r") as ds:
f = ds["test"]
np.testing.assert_allclose(f.variables["dummy1"][:], res1)
+        np.testing.assert_allclose(f.variables["dummy1"][1, :], [4.0, 5.0, 6.0])
+        np.testing.assert_allclose(f.variables["dummy1"][1:2, :], [[4.0, 5.0, 6.0]])
+        np.testing.assert_allclose(f.variables["dummy1"]._h5ds[1, :], [4.0, 5.0, 6.0])
+ np.testing.assert_allclose(
+ f.variables["dummy1"]._h5ds[1:2, :], [[4.0, 5.0, 6.0]]
+ )
assert f.variables["dummy1"].shape == (3, 3)
assert f.variables["dummy1"]._h5ds.shape == (3, 3)
np.testing.assert_allclose(f.variables["dummy2"][:], res2)
+        np.testing.assert_allclose(f.variables["dummy2"][1, :], [4.0, 5.0, 6.0])
+        np.testing.assert_allclose(f.variables["dummy2"][1:2, :], [[4.0, 5.0, 6.0]])
assert f.variables["dummy2"].shape == (3, 3)
- assert f.variables["dummy2"]._h5ds.shape == (1, 3)
+ assert f.variables["dummy2"]._h5ds.shape == (2, 3)
np.testing.assert_allclose(f.variables["dummy3"][:], res3)
+        np.testing.assert_allclose(f.variables["dummy3"][1, :], [4.0, 5.0, 6.0])
+        np.testing.assert_allclose(f.variables["dummy3"][1:2, :], [[4.0, 5.0, 6.0]])
assert f.variables["dummy3"].shape == (3, 3)
assert f.variables["dummy3"]._h5ds.shape == (2, 3)
np.testing.assert_allclose(f.variables["dummy4"][:], res4)
@@ -1529,12 +1579,19 @@
with h5netcdf.File(tmp_local_netcdf, "r") as ds:
f = ds["test"]
np.testing.assert_allclose(f.variables["dummy1"][:], res1)
+ np.testing.assert_allclose(f.variables["dummy1"][:, :], res1)
+        np.testing.assert_allclose(f.variables["dummy1"][1, :], [4.0, 5.0, 6.0])
+        np.testing.assert_allclose(f.variables["dummy1"][1:2, :], [[4.0, 5.0, 6.0]])
assert f.variables["dummy1"].shape == (3, 3)
assert f.variables["dummy1"]._h5ds.shape == (3, 3)
np.testing.assert_allclose(f.variables["dummy2"][:], res2)
+        np.testing.assert_allclose(f.variables["dummy2"][1, :], [4.0, 5.0, 6.0])
+        np.testing.assert_allclose(f.variables["dummy2"][1:2, :], [[4.0, 5.0, 6.0]])
assert f.variables["dummy2"].shape == (3, 3)
- assert f.variables["dummy2"]._h5ds.shape == (1, 3)
+ assert f.variables["dummy2"]._h5ds.shape == (2, 3)
np.testing.assert_allclose(f.variables["dummy3"][:], res3)
+        np.testing.assert_allclose(f.variables["dummy3"][1, :], [4.0, 5.0, 6.0])
+        np.testing.assert_allclose(f.variables["dummy3"][1:2, :], [[4.0, 5.0, 6.0]])
assert f.variables["dummy3"].shape == (3, 3)
assert f.variables["dummy3"]._h5ds.shape == (2, 3)
np.testing.assert_allclose(f.variables["dummy4"][:], res4)
@@ -1573,13 +1630,14 @@
np.testing.assert_array_equal(variable[...].data, 10)
-# https://github.com/h5netcdf/h5netcdf/issues/136
[email protected](reason="h5py bug with track_order")
-def test_track_order_false(tmp_local_netcdf):
- # track_order must be specified as True or not specified at all
- # https://github.com/h5netcdf/h5netcdf/issues/130
- with pytest.raises(ValueError):
- h5netcdf.File(tmp_local_netcdf, "w", track_order=False)
+def test_track_order_specification(tmp_local_netcdf):
+    # While netcdf4-c has historically only allowed track_order to be True,
+    # there doesn't seem to be a good reason for this:
+    # https://github.com/Unidata/netcdf-c/issues/2054. Historically, h5netcdf
+    # has not specified this parameter (leaving it implicitly as False).
+    # We want to make sure we allow both here.
+ with h5netcdf.File(tmp_local_netcdf, "w", track_order=False):
+ pass
with h5netcdf.File(tmp_local_netcdf, "w", track_order=True):
pass
@@ -1607,8 +1665,8 @@
# We don't expect any errors. This is effectively a void context manager
expected_errors = memoryview(b"")
- with expected_errors:
-        with h5netcdf.File(tmp_local_netcdf, "w", track_order=track_order) as h5file:
+    with h5netcdf.File(tmp_local_netcdf, "w", track_order=track_order) as h5file:
+ with expected_errors:
for i in range(100):
h5file.attrs[f"key{i}"] = i
h5file.attrs[f"key{i}"] = 0
@@ -1710,6 +1768,25 @@
ds["hello"][bool_slice, :]
+def test_fancy_indexing(tmp_local_netcdf):
+ # regression test for https://github.com/pydata/xarray/issues/7154
+ with h5netcdf.legacyapi.Dataset(tmp_local_netcdf, "w") as ds:
+ ds.createDimension("x", None)
+ ds.createDimension("y", None)
+ ds.createVariable("hello", int, ("x", "y"), fill_value=0)
+ ds["hello"][:5, :10] = np.arange(5 * 10, dtype="int").reshape((5, 10))
+ ds.createVariable("hello2", int, ("x", "y"))
+        ds["hello2"][:10, :20] = np.arange(10 * 20, dtype="int").reshape((10, 20))
+
+ with legacyapi.Dataset(tmp_local_netcdf, "a") as ds:
+ np.testing.assert_array_equal(ds["hello"][1, [7, 8, 9]], [17, 18, 19])
+ np.testing.assert_array_equal(ds["hello"][1, [9, 10, 11]], [19, 0, 0])
+ np.testing.assert_array_equal(ds["hello"][1, slice(9, 12)], [19, 0, 0])
+ np.testing.assert_array_equal(ds["hello"][[2, 3, 4], 1], [21, 31, 41])
+ np.testing.assert_array_equal(ds["hello"][[4, 5, 6], 1], [41, 0, 0])
+ np.testing.assert_array_equal(ds["hello"][slice(4, 7), 1], [41, 0, 0])
+
+
def test_h5py_chunking(tmp_local_netcdf):
with h5netcdf.File(tmp_local_netcdf, "w") as ds:
ds.dimensions = {"x": 10, "y": 10, "z": 10, "t": None}
@@ -1789,7 +1866,7 @@
def test_create_invalid_netcdf_catch_error(tmp_local_netcdf):
# see https://github.com/h5netcdf/h5netcdf/issues/138
- with h5netcdf.File("test.nc", "w") as f:
+ with h5netcdf.File(tmp_local_netcdf, "w") as f:
try:
f.create_variable("test", ("x", "y"), data=np.ones((10, 10), dtype="bool"))
except CompatibilityError:
@@ -1797,8 +1874,8 @@
assert repr(f.dimensions) == "<h5netcdf.Dimensions: >"
-def test_dimensions_in_parent_groups():
- with netCDF4.Dataset("test_netcdf.nc", mode="w") as ds:
+def test_dimensions_in_parent_groups(tmpdir):
+ with netCDF4.Dataset(tmpdir.join("test_netcdf.nc"), mode="w") as ds:
ds0 = ds
for i in range(10):
ds = ds.createGroup(f"group{i:02d}")
@@ -1808,7 +1885,7 @@
var = ds0["group00"].createVariable("x", float, ("x", "y"))
var[:] = np.ones((10, 20))
- with legacyapi.Dataset("test_legacy.nc", mode="w") as ds:
+ with legacyapi.Dataset(tmpdir.join("test_legacy.nc"), mode="w") as ds:
ds0 = ds
for i in range(10):
ds = ds.createGroup(f"group{i:02d}")
@@ -1818,8 +1895,8 @@
var = ds0["group00"].createVariable("x", float, ("x", "y"))
var[:] = np.ones((10, 20))
- with h5netcdf.File("test_netcdf.nc", mode="r") as ds0:
- with h5netcdf.File("test_legacy.nc", mode="r") as ds1:
+ with h5netcdf.File(tmpdir.join("test_netcdf.nc"), mode="r") as ds0:
+ with h5netcdf.File(tmpdir.join("test_legacy.nc"), mode="r") as ds1:
assert repr(ds0.dimensions["x"]) == repr(ds1.dimensions["x"])
assert repr(ds0.dimensions["y"]) == repr(ds1.dimensions["y"])
assert repr(ds0["group00"]) == repr(ds1["group00"])
@@ -2025,3 +2102,78 @@
np.testing.assert_equal(ds.int_array, np.arange(10))
np.testing.assert_equal(ds.empty_list, np.array([]))
np.testing.assert_equal(ds.empty_array, np.array([]))
+
+
[email protected](
+ version.parse(h5py.__version__) < version.parse("3.7.0"),
+ reason="does not work with h5py < 3.7.0",
+)
+def test_vlen_string_dataset_fillvalue(tmp_local_netcdf, decode_vlen_strings):
+ # check _FillValue for VLEN string datasets
+ # only works for h5py >= 3.7.0
+
+ # first with new API
+ with h5netcdf.File(tmp_local_netcdf, "w") as ds:
+ ds.dimensions = {"string": 10}
+ dt0 = h5py.string_dtype()
+ fill_value0 = "bár"
+ ds.create_variable("x0", ("string",), dtype=dt0, fillvalue=fill_value0)
+ dt1 = h5py.string_dtype("ascii")
+ fill_value1 = "bar"
+ ds.create_variable("x1", ("string",), dtype=dt1, fillvalue=fill_value1)
+
+ # check, if new API can read them
+ with h5netcdf.File(tmp_local_netcdf, "r", **decode_vlen_strings) as ds:
+ decode_vlen = decode_vlen_strings["decode_vlen_strings"]
+ fvalue0 = fill_value0 if decode_vlen else fill_value0.encode("utf-8")
+ fvalue1 = fill_value1 if decode_vlen else fill_value1.encode("utf-8")
+ assert ds["x0"][0] == fvalue0
+ assert ds["x0"].attrs["_FillValue"] == fill_value0
+ assert ds["x1"][0] == fvalue1
+ assert ds["x1"].attrs["_FillValue"] == fill_value1
+
+ # check if legacyapi can read them
+ with legacyapi.Dataset(tmp_local_netcdf, "r") as ds:
+ assert ds["x0"][0] == fill_value0
+ assert ds["x0"]._FillValue == fill_value0
+ assert ds["x1"][0] == fill_value1
+ assert ds["x1"]._FillValue == fill_value1
+
+ # check if netCDF4-python can read them
+ with netCDF4.Dataset(tmp_local_netcdf, "r") as ds:
+ assert ds["x0"][0] == fill_value0
+ assert ds["x0"]._FillValue == fill_value0
+ assert ds["x1"][0] == fill_value1
+ assert ds["x1"]._FillValue == fill_value1
+
+ # second with legacyapi
+ with legacyapi.Dataset(tmp_local_netcdf, "w") as ds:
+ ds.createDimension("string", 10)
+ fill_value0 = "bár"
+ ds.createVariable("x0", str, ("string",), fill_value=fill_value0)
+ fill_value1 = "bar"
+ ds.createVariable("x1", str, ("string",), fill_value=fill_value1)
+
+ # check if new API can read them
+ with h5netcdf.File(tmp_local_netcdf, "r", **decode_vlen_strings) as ds:
+ decode_vlen = decode_vlen_strings["decode_vlen_strings"]
+ fvalue0 = fill_value0 if decode_vlen else fill_value0.encode("utf-8")
+ fvalue1 = fill_value1 if decode_vlen else fill_value1.encode("utf-8")
+ assert ds["x0"][0] == fvalue0
+ assert ds["x0"].attrs["_FillValue"] == fill_value0
+ assert ds["x1"][0] == fvalue1
+ assert ds["x1"].attrs["_FillValue"] == fill_value1
+
+ # check if legacyapi can read them
+ with legacyapi.Dataset(tmp_local_netcdf, "r") as ds:
+ assert ds["x0"][0] == fill_value0
+ assert ds["x0"]._FillValue == fill_value0
+ assert ds["x1"][0] == fill_value1
+ assert ds["x1"]._FillValue == fill_value1
+
+ # check if netCDF4-python can read them
+ with netCDF4.Dataset(tmp_local_netcdf, "r") as ds:
+ assert ds["x0"][0] == fill_value0
+ assert ds["x0"]._FillValue == fill_value0
+ assert ds["x1"][0] == fill_value1
+ assert ds["x1"]._FillValue == fill_value1
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/h5netcdf-1.0.2/h5netcdf.egg-info/PKG-INFO new/h5netcdf-1.1.0/h5netcdf.egg-info/PKG-INFO
--- old/h5netcdf-1.0.2/h5netcdf.egg-info/PKG-INFO	2022-08-02 11:34:00.000000000 +0200
+++ new/h5netcdf-1.1.0/h5netcdf.egg-info/PKG-INFO	2022-11-23 07:40:28.000000000 +0100
@@ -1,6 +1,6 @@
Metadata-Version: 2.1
Name: h5netcdf
-Version: 1.0.2
+Version: 1.1.0
Summary: netCDF4 via h5py
Home-page: https://h5netcdf.org
Author: h5netcdf developers
@@ -265,15 +265,30 @@
Track Order
~~~~~~~~~~~
-In h5netcdf version 0.12.0 and earlier, `order tracking`_ was disabled in
-HDF5 file. As this is a requirement for the current netCDF4 standard,
-it has been enabled without deprecation as of version 0.13.0 `[*]`_.
+As of h5netcdf 1.1.0, if h5py 3.7.0 or greater is detected, the ``track_order``
+parameter is set to ``True``, enabling `order tracking`_ for newly created
+netCDF4 files. This helps ensure that files created with the h5netcdf library
+can be modified by the netCDF4-c and netCDF4-python implementations used in
+other software stacks. Since this change should be transparent to most users,
+it was made without deprecation.
+
+Since ``track_order`` is set at creation time, any dataset that was created
+with ``track_order=False`` (h5netcdf version 1.0.2 and older, except for
+0.13.0) will continue to be opened with order tracking disabled.
+
+The following describes the behavior of h5netcdf with respect to order tracking
+for a few key versions:
+
+- In version 0.12.0 and earlier, the ``track_order`` parameter was missing
+  and thus order tracking was implicitly set to ``False``.
+- Version 0.13.0 enabled order tracking by setting the parameter
+ ``track_order`` to ``True`` by default without deprecation.
+- Versions 0.13.1 to 1.0.2 set ``track_order`` to ``False`` due to an
+  `upstream bug`_ in h5py, a core dependency of h5netcdf, which was resolved
+  in h5py 3.7.0 with the help of the h5netcdf team.
+- In version 1.1.0, if h5py 3.7.0 or above is detected, the ``track_order``
+ parameter is set to ``True`` by default.
-However in version 0.13.1 this has been reverted due to a bug in a core
-dependency of h5netcdf, h5py `upstream bug`_.
-
-Datasets created with h5netcdf version 0.12.0 that are opened with
-newer versions of h5netcdf will continue to disable order tracker.
.. _order tracking:
https://docs.unidata.ucar.edu/netcdf-c/current/file_format_specifications.html#creation_order
.. _upstream bug: https://github.com/h5netcdf/h5netcdf/issues/136