Script 'mail_helper' called by obssrc
Hello community,
here is the log from the commit of package python-h5netcdf for openSUSE:Factory
checked in at 2023-01-07 17:19:57
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Comparing /work/SRC/openSUSE:Factory/python-h5netcdf (Old)
and /work/SRC/openSUSE:Factory/.python-h5netcdf.new.1563 (New)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Package is "python-h5netcdf"
Sat Jan 7 17:19:57 2023 rev:7 rq:1056763 version:1.1.0
Changes:
--------
--- /work/SRC/openSUSE:Factory/python-h5netcdf/python-h5netcdf.changes 2022-08-09 15:43:50.436272948 +0200
+++ /work/SRC/openSUSE:Factory/.python-h5netcdf.new.1563/python-h5netcdf.changes 2023-01-07 17:23:19.047445675 +0100
@@ -1,0 +2,12 @@
+Sat Jan 7 12:08:07 UTC 2023 - Dirk Müller <[email protected]>
+
+- update to 1.1.0:
+ * Rework adding _FillValue-attribute, add tests.
+ * Add special add_phony method for creating phony dimensions, add test.
+ * Rewrite _unlabeled_dimension_mix (labeled/unlabeled), add tests.
+ * Add default netcdf fillvalues, pad only if necessary, adapt tests.
+ * Fix regression in padding algorithm, add test.
+ * Set ``track_order=True`` by default in created files if h5py 3.7.0 or
+ greater is detected to help compatibility with netCDF4-c programs.
+
+-------------------------------------------------------------------
Old:
----
h5netcdf-1.0.2.tar.gz
New:
----
h5netcdf-1.1.0.tar.gz
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Other differences:
------------------
++++++ python-h5netcdf.spec ++++++
--- /var/tmp/diff_new_pack.G7VZgM/_old 2023-01-07 17:23:19.647449254 +0100
+++ /var/tmp/diff_new_pack.G7VZgM/_new 2023-01-07 17:23:19.651449278 +0100
@@ -1,7 +1,7 @@
#
# spec file for package python-h5netcdf
#
-# Copyright (c) 2022 SUSE LLC
+# Copyright (c) 2023 SUSE LLC
#
# All modifications and additions to the file contributed by third parties
# remain the property of their copyright owners, unless otherwise agreed
@@ -20,7 +20,7 @@
%define skip_python2 1
%define skip_python36 1
Name: python-h5netcdf
-Version: 1.0.2
+Version: 1.1.0
Release: 0
Summary: A Python library to use netCDF4 files via h5py
License: BSD-3-Clause
++++++ h5netcdf-1.0.2.tar.gz -> h5netcdf-1.1.0.tar.gz ++++++
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn'
'--exclude=.svnignore' old/h5netcdf-1.0.2/CHANGELOG.rst
new/h5netcdf-1.1.0/CHANGELOG.rst
--- old/h5netcdf-1.0.2/CHANGELOG.rst 2022-08-02 11:33:39.000000000 +0200
+++ new/h5netcdf-1.1.0/CHANGELOG.rst 2022-11-23 07:40:05.000000000 +0100
@@ -1,5 +1,21 @@
Change Log
----------
+Version 1.1.0 (November 23rd, 2022):
+
+- Rework adding _FillValue-attribute, add tests.
+ By `Kai Mühlbauer <https://github.com/kmuehlbauer>`_.
+- Add special add_phony method for creating phony dimensions, add test.
+ By `Kai Mühlbauer <https://github.com/kmuehlbauer>`_.
+- Rewrite _unlabeled_dimension_mix (labeled/unlabeled), add tests.
+ By `Kai Mühlbauer <https://github.com/kmuehlbauer>`_.
+- Add default netcdf fillvalues, pad only if necessary, adapt tests.
+ By `Kai Mühlbauer <https://github.com/kmuehlbauer>`_.
+- Fix regression in padding algorithm, add test.
+ By `Kai Mühlbauer <https://github.com/kmuehlbauer>`_.
+- Set ``track_order=True`` by default in created files if h5py 3.7.0 or
+ greater is detected to help compatibility with netCDF4-c programs.
+ By `Mark Harfouche <https://github.com/hmaarrfk>`_.
+
Version 1.0.2 (August 2nd, 2022):
- Adapt boolean indexing as h5py 3.7.0 started supporting it.
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn'
'--exclude=.svnignore' old/h5netcdf-1.0.2/PKG-INFO new/h5netcdf-1.1.0/PKG-INFO
--- old/h5netcdf-1.0.2/PKG-INFO 2022-08-02 11:34:00.711885000 +0200
+++ new/h5netcdf-1.1.0/PKG-INFO 2022-11-23 07:40:28.608608200 +0100
@@ -1,6 +1,6 @@
Metadata-Version: 2.1
Name: h5netcdf
-Version: 1.0.2
+Version: 1.1.0
Summary: netCDF4 via h5py
Home-page: https://h5netcdf.org
Author: h5netcdf developers
@@ -265,15 +265,30 @@
Track Order
~~~~~~~~~~~
-In h5netcdf version 0.12.0 and earlier, `order tracking`_ was disabled in
-HDF5 file. As this is a requirement for the current netCDF4 standard,
-it has been enabled without deprecation as of version 0.13.0 `[*]`_.
+As of h5netcdf 1.1.0, if h5py 3.7.0 or greater is detected, the ``track_order``
+parameter is set to ``True`` enabling `order tracking`_ for newly created
+netCDF4 files. This helps ensure that files created with the h5netcdf library
+can be modified by the netCDF4-c and netCDF4-python implementations used in
+other software stacks. Since this change should be transparent to most users,
+it was made without deprecation.
+
+Since track_order is set at creation time, any dataset that was created with
+``track_order=False`` (h5netcdf version 1.0.2 and older except for 0.13.0) will
+continue to be opened with order tracking disabled.
+
+The following describes the behavior of h5netcdf with respect to order tracking
+for a few key versions:
+
+- In version 0.12.0 and earlier, the ``track_order`` parameter was missing
+  and thus order tracking was implicitly set to ``False``.
+- Version 0.13.0 enabled order tracking by setting the parameter
+ ``track_order`` to ``True`` by default without deprecation.
+- Versions 0.13.1 to 1.0.2 set ``track_order`` to ``False`` due to an
+  `upstream bug`_ in h5py, a core dependency of h5netcdf, which was resolved
+  in h5py 3.7.0 with the help of the h5netcdf team.
+- In version 1.1.0, if h5py 3.7.0 or above is detected, the ``track_order``
+ parameter is set to ``True`` by default.
-However in version 0.13.1 this has been reverted due to a bug in a core
-dependency of h5netcdf, h5py `upstream bug`_.
-
-Datasets created with h5netcdf version 0.12.0 that are opened with
-newer versions of h5netcdf will continue to disable order tracker.
.. _order tracking:
https://docs.unidata.ucar.edu/netcdf-c/current/file_format_specifications.html#creation_order
.. _upstream bug: https://github.com/h5netcdf/h5netcdf/issues/136
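The version gate described above (order tracking enabled by default for h5py 3.7.0 or greater) can be sketched in plain Python. This is an illustrative stand-in, not the h5netcdf implementation, which uses ``packaging.version``; ``default_track_order`` is a hypothetical name, and only plain ``X.Y.Z`` version strings are handled:

```python
# Illustrative sketch: choose the track_order default from the h5py version,
# mirroring the rule described above (True for h5py >= 3.7.0).
# Assumes a plain "X.Y.Z" version string; real code should use packaging.version.
def default_track_order(h5py_version):
    """Return True when order tracking should be enabled by default."""
    parts = tuple(int(p) for p in h5py_version.split(".")[:3])
    return parts >= (3, 7, 0)
```

For example, ``default_track_order("3.7.0")`` is ``True`` while ``default_track_order("3.6.0")`` is ``False``.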
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn'
'--exclude=.svnignore' old/h5netcdf-1.0.2/README.rst
new/h5netcdf-1.1.0/README.rst
--- old/h5netcdf-1.0.2/README.rst 2022-08-02 11:33:39.000000000 +0200
+++ new/h5netcdf-1.1.0/README.rst 2022-11-23 07:40:05.000000000 +0100
@@ -241,15 +241,30 @@
Track Order
~~~~~~~~~~~
-In h5netcdf version 0.12.0 and earlier, `order tracking`_ was disabled in
-HDF5 file. As this is a requirement for the current netCDF4 standard,
-it has been enabled without deprecation as of version 0.13.0 `[*]`_.
+As of h5netcdf 1.1.0, if h5py 3.7.0 or greater is detected, the ``track_order``
+parameter is set to ``True`` enabling `order tracking`_ for newly created
+netCDF4 files. This helps ensure that files created with the h5netcdf library
+can be modified by the netCDF4-c and netCDF4-python implementations used in
+other software stacks. Since this change should be transparent to most users,
+it was made without deprecation.
-However in version 0.13.1 this has been reverted due to a bug in a core
-dependency of h5netcdf, h5py `upstream bug`_.
+Since track_order is set at creation time, any dataset that was created with
+``track_order=False`` (h5netcdf version 1.0.2 and older except for 0.13.0) will
+continue to be opened with order tracking disabled.
+
+The following describes the behavior of h5netcdf with respect to order tracking
+for a few key versions:
+
+- In version 0.12.0 and earlier, the ``track_order`` parameter was missing
+  and thus order tracking was implicitly set to ``False``.
+- Version 0.13.0 enabled order tracking by setting the parameter
+ ``track_order`` to ``True`` by default without deprecation.
+- Versions 0.13.1 to 1.0.2 set ``track_order`` to ``False`` due to an
+  `upstream bug`_ in h5py, a core dependency of h5netcdf, which was resolved
+  in h5py 3.7.0 with the help of the h5netcdf team.
+- In version 1.1.0, if h5py 3.7.0 or above is detected, the ``track_order``
+ parameter is set to ``True`` by default.
-Datasets created with h5netcdf version 0.12.0 that are opened with
-newer versions of h5netcdf will continue to disable order tracker.
.. _order tracking:
https://docs.unidata.ucar.edu/netcdf-c/current/file_format_specifications.html#creation_order
.. _upstream bug: https://github.com/h5netcdf/h5netcdf/issues/136
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn'
'--exclude=.svnignore' old/h5netcdf-1.0.2/h5netcdf/_version.py
new/h5netcdf-1.1.0/h5netcdf/_version.py
--- old/h5netcdf-1.0.2/h5netcdf/_version.py 2022-08-02 11:34:00.000000000 +0200
+++ new/h5netcdf-1.1.0/h5netcdf/_version.py 2022-11-23 07:40:28.000000000 +0100
@@ -1,5 +1,5 @@
# coding: utf-8
# file generated by setuptools_scm
# don't change, don't track in version control
-__version__ = version = '1.0.2'
-__version_tuple__ = version_tuple = (1, 0, 2)
+__version__ = version = '1.1.0'
+__version_tuple__ = version_tuple = (1, 1, 0)
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn'
'--exclude=.svnignore' old/h5netcdf-1.0.2/h5netcdf/core.py
new/h5netcdf-1.1.0/h5netcdf/core.py
--- old/h5netcdf-1.0.2/h5netcdf/core.py 2022-08-02 11:33:39.000000000 +0200
+++ new/h5netcdf-1.1.0/h5netcdf/core.py 2022-11-23 07:40:05.000000000 +0100
@@ -51,12 +51,17 @@
def _transform_1d_boolean_indexers(key):
"""Find and transform 1D boolean indexers to int"""
- key = [
- np.asanyarray(k).nonzero()[0]
- if isinstance(k, (np.ndarray, list)) and type(k[0]) in (bool, np.bool_)
- else k
- for k in key
- ]
+ # return key, if not iterable
+ try:
+ key = [
+ np.asanyarray(k).nonzero()[0]
+            if isinstance(k, (np.ndarray, list)) and type(k[0]) in (bool, np.bool_)
+ else k
+ for k in key
+ ]
+ except TypeError:
+ return key
+
return tuple(key)
@@ -145,10 +150,14 @@
# normal variable carrying DIMENSION_LIST
# extract hdf5 file references and get objects name
if "DIMENSION_LIST" in attrs:
- return tuple(
- self._root._h5file[ref[0]].name.split("/")[-1]
- for ref in list(self._h5ds.attrs.get("DIMENSION_LIST", []))
- )
+ # check if malformed variable and raise
+ if _unlabeled_dimension_mix(self._h5ds) == "labeled":
+            # If a dimension has attached more than one scale for some reason,
+            # then take the last one. This is in line with netcdf-c and
+            # netcdf4-python.
+ return tuple(
+ self._root._h5file[ref[-1]].name.split("/")[-1]
+ for ref in list(self._h5ds.attrs.get("DIMENSION_LIST", []))
+ )
# need to use the h5ds name here to distinguish from collision dimensions
child_name = self._h5ds.name.split("/")[-1]
@@ -271,6 +280,39 @@
"""Return NumPy dtype object giving the variable's type."""
return self._h5ds.dtype
+ def _get_padding(self, key):
+ """Return padding if needed, defaults to False."""
+ padding = False
+ if self.dtype != str and self.dtype.kind in ["f", "i", "u"]:
+ key0 = _expanded_indexer(key, self.ndim)
+ key0 = _transform_1d_boolean_indexers(key0)
+ # extract max shape of key vs hdf5-shape
+ h5ds_shape = self._h5ds.shape
+ shape = self.shape
+
+ # check for ndarray and list
+ # see https://github.com/pydata/xarray/issues/7154
+ # first get maximum index
+ max_index = [
+ max(k) + 1 if isinstance(k, (np.ndarray, list)) else k.stop
+ for k in key0
+ ]
+ # second convert to max shape
+ max_shape = tuple(
+ [
+ shape[i] if k is None else max(h5ds_shape[i], k)
+ for i, k in enumerate(max_index)
+ ]
+ )
+
+ # check if hdf5 dataset dimensions are smaller than
+ # their respective netcdf dimensions
+ sdiff = [d0 - d1 for d0, d1 in zip(max_shape, h5ds_shape)]
+        # create padding only if hdf5 dataset is smaller than netcdf dimension
+ if sum(sdiff):
+ padding = [(0, s) for s in sdiff]
+ return padding
+
def __array__(self, *args, **kwargs):
return self._h5ds.__array__(*args, **kwargs)
@@ -279,7 +321,6 @@
if isinstance(self._parent._root, Dataset):
# this is only for legacyapi
- key = _expanded_indexer(key, self.ndim)
# fix boolean indexing for affected versions
# https://github.com/h5py/h5py/pull/2079
# https://github.com/h5netcdf/h5netcdf/pull/125/
@@ -292,18 +333,17 @@
if string_info and string_info.length is None:
return self._h5ds.asstr()[key]
- # return array padded with fillvalue (both api)
- if self.dtype != str and self.dtype.kind in ["f", "i", "u"]:
- sdiff = [d0 - d1 for d0, d1 in zip(self.shape, self._h5ds.shape)]
- if sum(sdiff):
- fv = self.dtype.type(self._h5ds.fillvalue)
- padding = [(0, s) for s in sdiff]
- return np.pad(
- self._h5ds,
- pad_width=padding,
- mode="constant",
- constant_values=fv,
- )[key]
+ # get padding
+ padding = self._get_padding(key)
+ # apply padding with fillvalue (both api)
+ if padding:
+ fv = self.dtype.type(self._h5ds.fillvalue)
+ return np.pad(
+ self._h5ds,
+ pad_width=padding,
+ mode="constant",
+ constant_values=fv,
+ )[key]
return self._h5ds[key]
@@ -406,15 +446,26 @@
def _unlabeled_dimension_mix(h5py_dataset):
- dims = sum([len(j) for j in h5py_dataset.dims])
- if dims:
- if dims != h5py_dataset.ndim:
+ # check if dataset has dims and get it
+ dimlist = getattr(h5py_dataset, "dims", [])
+ if not dimlist:
+ status = "nodim"
+ else:
+ dimset = set([len(j) for j in dimlist])
+ # either all dimensions have exactly one scale
+ # or all dimensions have no scale
+ if dimset ^ {0} == set():
+ status = "unlabeled"
+ elif dimset & {0}:
name = h5py_dataset.name.split("/")[-1]
raise ValueError(
"malformed variable {0} has mixing of labeled and "
"unlabeled dimensions.".format(name)
)
- return dims
+ else:
+ status = "labeled"
+
+ return status
class Group(Mapping):
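The reworked ``_unlabeled_dimension_mix`` above classifies a variable by the number of dimension scales attached to each of its dimensions. The same classification can be sketched over a plain list of per-dimension scale counts (``dimension_status`` is a hypothetical standalone helper, not part of h5netcdf):

```python
# Sketch of the classification logic in _unlabeled_dimension_mix, operating
# on a plain list of per-dimension scale counts instead of an h5py dataset.
# A mix of labeled and unlabeled dimensions is considered malformed.
def dimension_status(scale_counts):
    """Return 'nodim', 'unlabeled' or 'labeled'; raise on a mix."""
    if not scale_counts:
        return "nodim"
    counts = set(scale_counts)
    if counts == {0}:
        return "unlabeled"  # no dimension has a scale attached
    if 0 in counts:
        raise ValueError("mix of labeled and unlabeled dimensions")
    return "labeled"  # every dimension has at least one scale
```

For example, ``dimension_status([0, 0])`` yields ``"unlabeled"`` and ``dimension_status([1, 0])`` raises ``ValueError``, matching the error path shown in the diff.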
@@ -462,9 +513,8 @@
self._dimensions.add(k)
else:
if self._root._phony_dims_mode is not None:
-
- # check if malformed variable
- if not _unlabeled_dimension_mix(v):
+ # check if malformed variable and raise
+ if _unlabeled_dimension_mix(v) == "unlabeled":
# if unscaled variable, get phony dimensions
phony_dims |= Counter(v.shape)
@@ -486,7 +536,7 @@
if self._root._phony_dims_mode == "sort":
name += self._root._max_dim_id + 1
name = "phony_dim_{}".format(name)
- self._dimensions[name] = size
+ self._dimensions.add_phony(name, size)
self._initialized = True
@@ -675,6 +725,14 @@
if self._root._h5py.__name__ == "h5py":
kwargs.update(dict(track_order=self._parent._track_order))
+ # handling default fillvalues for legacyapi
+ # see https://github.com/h5netcdf/h5netcdf/issues/182
+ from .legacyapi import Dataset, _get_default_fillvalue
+
+ fillval = fillvalue
+ if fillvalue is None and isinstance(self._parent._root, Dataset):
+ fillval = _get_default_fillvalue(dtype)
+
# create hdf5 variable
self._h5group.create_dataset(
h5name,
@@ -682,7 +740,7 @@
dtype=dtype,
data=data,
chunks=chunks,
- fillvalue=fillvalue,
+ fillvalue=fillval,
**kwargs,
)
@@ -712,7 +770,20 @@
variable._ensure_dim_id()
if fillvalue is not None:
- value = variable.dtype.type(fillvalue)
+ # trying to create correct type of fillvalue
+ if variable.dtype is str:
+ value = fillvalue
+ else:
+            string_info = self._root._h5py.check_string_dtype(variable.dtype)
+ if (
+ string_info
+ and string_info.length is not None
+ and string_info.length > 1
+ ):
+ value = fillvalue
+ else:
+ value = variable.dtype.type(fillvalue)
+
variable.attrs._h5attrs["_FillValue"] = value
return variable
@@ -773,6 +844,12 @@
"""
# if root-variable
if name.startswith("/"):
+ # handling default fillvalues for legacyapi
+ # see https://github.com/h5netcdf/h5netcdf/issues/182
+ from .legacyapi import Dataset, _get_default_fillvalue
+
+ if fillvalue is None and isinstance(self._parent._root, Dataset):
+ fillvalue = _get_default_fillvalue(dtype)
return self._root.create_variable(
name[1:],
dimensions,
@@ -911,6 +988,16 @@
phony_dims: 'sort', 'access'
See :ref:`phony dims` for more details.
+ track_order: bool
+ Corresponds to the h5py.File `track_order` parameter. Unless
+ specified, the library will choose a default that enhances
+ compatibility with netCDF4-c. If h5py version 3.7.0 or greater is
+ installed, this parameter will be set to True by default.
+          track_order is required to be True for netCDF4-c libraries to
+ append to a file. If an older version of h5py is detected, this
+ parameter will be set to False by default to work around a bug in
+ h5py limiting the number of attributes for a given variable.
+
**kwargs:
Additional keyword arguments to be passed to the ``h5py.File``
constructor.
@@ -930,22 +1017,14 @@
# standard
# https://github.com/Unidata/netcdf-c/issues/2054
# https://github.com/h5netcdf/h5netcdf/issues/128
- # 2022/01/20: hmaarrfk
- # However, it was found that this causes issues with attrs and h5py
- # https://github.com/h5netcdf/h5netcdf/issues/136
- # https://github.com/h5py/h5py/issues/1385
- track_order = kwargs.pop("track_order", False)
-
- # When the issues with track_order in h5py are resolved, we
- # can consider uncommenting the code below
- # if not track_order:
- # self._closed = True
- # raise ValueError(
-    #     f"track_order, if specified must be set to to True (got {track_order})"
- # "to conform to the netCDF4 file format. Please see "
- # "https://github.com/h5netcdf/h5netcdf/issues/130 "
- # "for more details."
- # )
+ # h5py versions less than 3.7.0 had a bug that limited the number of
+ # attributes when track_order was set to true by default.
+ # However, setting track_order to True helps with compatibility
+ # with netcdf4-c and generally, keeping track of how things were added
+ # to the dataset.
+    # https://github.com/h5netcdf/h5netcdf/issues/136#issuecomment-1017457067
+    track_order_default = version.parse(h5py.__version__) >= version.parse("3.7.0")
+ track_order = kwargs.pop("track_order", track_order_default)
if version.parse(h5py.__version__) >= version.parse("3.0.0"):
self.decode_vlen_strings = kwargs.pop("decode_vlen_strings", None)
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn'
'--exclude=.svnignore' old/h5netcdf-1.0.2/h5netcdf/dimensions.py
new/h5netcdf-1.1.0/h5netcdf/dimensions.py
--- old/h5netcdf-1.0.2/h5netcdf/dimensions.py 2022-08-02 11:33:39.000000000 +0200
+++ new/h5netcdf-1.1.0/h5netcdf/dimensions.py 2022-11-23 07:40:05.000000000 +0100
@@ -21,14 +21,18 @@
def __setitem__(self, name, size):
# creating new dimensions
- phony = "phony_dim" in name
- if not self._group._root._writable and not phony:
+ if not self._group._root._writable:
raise RuntimeError("H5NetCDF: Write to read only")
if name in self._objects:
raise ValueError("dimension %r already exists" % name)
self._objects[name] = Dimension(self._group, name, size, create_h5ds=True)
+ def add_phony(self, name, size):
+ self._objects[name] = Dimension(
+ self._group, name, size, create_h5ds=False, phony=True
+ )
+
def add(self, name):
# adding dimensions which are already created in the file
self._objects[name] = Dimension(self._group, name)
@@ -56,7 +60,7 @@
class Dimension(object):
- def __init__(self, parent, name, size=None, create_h5ds=False):
+    def __init__(self, parent, name, size=None, create_h5ds=False, phony=False):
"""NetCDF4 Dimension constructor.
Parameters
@@ -69,9 +73,11 @@
Size of the Netcdf4 Dimension. Defaults to None (unlimited).
create_h5ds : bool
For internal use only.
+ phony : bool
+ For internal use only.
"""
self._parent_ref = weakref.ref(parent)
- self._phony = "phony_dim" in name
+ self._phony = phony
self._root_ref = weakref.ref(parent._root)
self._h5path = _join_h5paths(parent.name, name)
self._name = name
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn'
'--exclude=.svnignore' old/h5netcdf-1.0.2/h5netcdf/legacyapi.py
new/h5netcdf-1.1.0/h5netcdf/legacyapi.py
--- old/h5netcdf-1.0.2/h5netcdf/legacyapi.py 2022-08-02 11:33:39.000000000 +0200
+++ new/h5netcdf-1.1.0/h5netcdf/legacyapi.py 2022-11-23 07:40:05.000000000 +0100
@@ -5,6 +5,30 @@
from . import core
+#: default netcdf fillvalues
+default_fillvals = {
+ "S1": "\x00",
+ "i1": -127,
+ "u1": 255,
+ "i2": -32767,
+ "u2": 65535,
+ "i4": -2147483647,
+ "u4": 4294967295,
+ "i8": -9223372036854775806,
+ "u8": 18446744073709551614,
+ "f4": 9.969209968386869e36,
+ "f8": 9.969209968386869e36,
+}
+
+
+def _get_default_fillvalue(dtype):
+ kind = np.dtype(dtype).kind
+ fillvalue = None
+ if kind in ["u", "i", "f"]:
+ size = np.dtype(dtype).itemsize
+ fillvalue = default_fillvals[f"{kind}{size}"]
+ return fillvalue
+
def _check_return_dtype_endianess(endian="native"):
little_endian = sys.byteorder == "little"
@@ -204,7 +228,7 @@
fletcher32=fletcher32,
chunks=chunksizes,
fillvalue=fill_value,
- **kwds
+ **kwds,
)
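The ``default_fillvals`` table and ``_get_default_fillvalue`` added above resolve a dtype to the netCDF default fill value. A minimal, numpy-free sketch of the same lookup, keyed on a numpy-style dtype kind plus itemsize (``default_fillvalue`` is a hypothetical standalone helper, not the h5netcdf function):

```python
# Standalone sketch of the lookup performed by _get_default_fillvalue above,
# keyed on numpy-style kind ("i", "u", "f") plus itemsize in bytes.
# Non-numeric kinds fall through to None, as in the original.
DEFAULT_FILLVALS = {
    "i1": -127, "u1": 255,
    "i2": -32767, "u2": 65535,
    "i4": -2147483647, "u4": 4294967295,
    "i8": -9223372036854775806, "u8": 18446744073709551614,
    "f4": 9.969209968386869e36, "f8": 9.969209968386869e36,
}

def default_fillvalue(kind, itemsize):
    """Return the netCDF default fill value, or None for non-numeric kinds."""
    if kind in ("i", "u", "f"):
        return DEFAULT_FILLVALS[f"{kind}{itemsize}"]
    return None
```

So a 2-byte signed integer maps to -32767, while a string dtype gets no default fill value.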
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn'
'--exclude=.svnignore' old/h5netcdf-1.0.2/h5netcdf/tests/test_h5netcdf.py
new/h5netcdf-1.1.0/h5netcdf/tests/test_h5netcdf.py
--- old/h5netcdf-1.0.2/h5netcdf/tests/test_h5netcdf.py 2022-08-02 11:33:39.000000000 +0200
+++ new/h5netcdf-1.1.0/h5netcdf/tests/test_h5netcdf.py 2022-11-23 07:40:05.000000000 +0100
@@ -677,6 +677,13 @@
pass
+def test_fake_phony_dims(tmp_local_or_remote_netcdf):
+ # tests writing of dimension with phony naming scheme
+ # see https://github.com/h5netcdf/h5netcdf/issues/178
+ with h5netcdf.File(tmp_local_or_remote_netcdf, mode="w") as ds:
+ ds.dimensions["phony_dim_0"] = 3
+
+
def check_invalid_netcdf4_mixed(var, i):
pdim = "phony_dim_{}".format(i)
assert var["foo1"].dimensions[0] == "y1"
@@ -761,8 +768,14 @@
f["foo1"].dims[0].attach_scale(f["x"])
with raises(ValueError):
+ with h5netcdf.File(tmp_local_or_remote_netcdf, "r") as ds:
+ assert ds
+ print(ds)
+
+ with raises(ValueError):
with h5netcdf.File(tmp_local_or_remote_netcdf, "r", phony_dims="sort") as ds:
assert ds
+ print(ds)
def test_hierarchical_access_auto_create(tmp_local_or_remote_netcdf):
@@ -1142,6 +1155,7 @@
assert f["dummy2"].shape == (3, 2, 2)
f.groups["test"]["dummy3"].shape == (3, 3)
f.groups["test"]["dummy4"].shape == (0, 0)
+ assert f["dummy5"].shape == (2, 3)
def test_reading_unused_unlimited_dimension(tmp_local_or_remote_netcdf):
@@ -1163,10 +1177,12 @@
def test_nc4_non_coord(tmp_local_netcdf):
- # Track order True is the new default for versions after 0.12.0
- # 0.12.0 defaults to `track_order=False`
- # Ensure that the tests order the variables in their creation order
- # not alphabetical order
+ # Here we generate a few variables and coordinates
+ # The default should be to track the order of creation
+ # Thus, on reopening the file, the order in which
+ # the variables are listed should be maintained
+ # y -- refers to the coordinate y
+ # _nc4_non_coord_y -- refers to the data y
with h5netcdf.File(tmp_local_netcdf, "w") as f:
f.dimensions = {"x": None, "y": 2}
f.create_variable("test", dimensions=("x",), dtype=np.int64)
@@ -1177,8 +1193,23 @@
assert f.dimensions["x"].size == 0
assert f.dimensions["x"].isunlimited()
assert f.dimensions["y"].size == 2
- assert list(f.variables) == ["y", "test"]
-        assert list(f._h5group.keys()) == ["_nc4_non_coord_y", "test", "x", "y"]
+ if version.parse(h5py.__version__) >= version.parse("3.7.0"):
+ assert list(f.variables) == ["test", "y"]
+            assert list(f._h5group.keys()) == ["x", "y", "test", "_nc4_non_coord_y"]
+
+ with h5netcdf.File(tmp_local_netcdf, "w") as f:
+ f.dimensions = {"x": None, "y": 2}
+ f.create_variable("y", dimensions=("x",), dtype=np.int64)
+ f.create_variable("test", dimensions=("x",), dtype=np.int64)
+
+ with h5netcdf.File(tmp_local_netcdf, "r") as f:
+ assert list(f.dimensions) == ["x", "y"]
+ assert f.dimensions["x"].size == 0
+ assert f.dimensions["x"].isunlimited()
+ assert f.dimensions["y"].size == 2
+ if version.parse(h5py.__version__) >= version.parse("3.7.0"):
+ assert list(f.variables) == ["y", "test"]
+            assert list(f._h5group.keys()) == ["x", "y", "_nc4_non_coord_y", "test"]
def test_overwrite_existing_file(tmp_local_netcdf):
@@ -1472,6 +1503,9 @@
def test_expanded_variables_netcdf4(tmp_local_netcdf, netcdf_write_module):
+    # partially reimplemented due to performance reasons in edge cases
+ # https://github.com/h5netcdf/h5netcdf/issues/182
+
with netcdf_write_module.Dataset(tmp_local_netcdf, "w") as ds:
f = ds.createGroup("test")
f.createDimension("x", None)
@@ -1483,8 +1517,8 @@
dummy4 = f.createVariable("dummy4", float, ("x", "y"))
dummy1[:] = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
- dummy2[:] = [[1, 2, 3]]
- dummy3[:] = [[1, 2, 3], [4, 5, 6]]
+ dummy2[1, :] = [4, 5, 6]
+ dummy3[0:2, :] = [[1, 2, 3], [4, 5, 6]]
# don't mask, since h5netcdf doesn't do masking
if netcdf_write_module == netCDF4:
@@ -1503,10 +1537,16 @@
f = ds["test"]
np.testing.assert_allclose(f.variables["dummy1"][:], res1)
+        np.testing.assert_allclose(f.variables["dummy1"][1, :], [4.0, 5.0, 6.0])
+        np.testing.assert_allclose(f.variables["dummy1"][1:2, :], [[4.0, 5.0, 6.0]])
assert f.variables["dummy1"].shape == (3, 3)
np.testing.assert_allclose(f.variables["dummy2"][:], res2)
+        np.testing.assert_allclose(f.variables["dummy2"][1, :], [4.0, 5.0, 6.0])
+        np.testing.assert_allclose(f.variables["dummy2"][1:2, :], [[4.0, 5.0, 6.0]])
assert f.variables["dummy2"].shape == (3, 3)
np.testing.assert_allclose(f.variables["dummy3"][:], res3)
+        np.testing.assert_allclose(f.variables["dummy3"][1, :], [4.0, 5.0, 6.0])
+        np.testing.assert_allclose(f.variables["dummy3"][1:2, :], [[4.0, 5.0, 6.0]])
assert f.variables["dummy3"].shape == (3, 3)
np.testing.assert_allclose(f.variables["dummy4"][:], res4)
assert f.variables["dummy4"].shape == (3, 3)
@@ -1514,12 +1554,22 @@
with legacyapi.Dataset(tmp_local_netcdf, "r") as ds:
f = ds["test"]
np.testing.assert_allclose(f.variables["dummy1"][:], res1)
+        np.testing.assert_allclose(f.variables["dummy1"][1, :], [4.0, 5.0, 6.0])
+        np.testing.assert_allclose(f.variables["dummy1"][1:2, :], [[4.0, 5.0, 6.0]])
+        np.testing.assert_allclose(f.variables["dummy1"]._h5ds[1, :], [4.0, 5.0, 6.0])
+ np.testing.assert_allclose(
+ f.variables["dummy1"]._h5ds[1:2, :], [[4.0, 5.0, 6.0]]
+ )
assert f.variables["dummy1"].shape == (3, 3)
assert f.variables["dummy1"]._h5ds.shape == (3, 3)
np.testing.assert_allclose(f.variables["dummy2"][:], res2)
+        np.testing.assert_allclose(f.variables["dummy2"][1, :], [4.0, 5.0, 6.0])
+        np.testing.assert_allclose(f.variables["dummy2"][1:2, :], [[4.0, 5.0, 6.0]])
assert f.variables["dummy2"].shape == (3, 3)
- assert f.variables["dummy2"]._h5ds.shape == (1, 3)
+ assert f.variables["dummy2"]._h5ds.shape == (2, 3)
np.testing.assert_allclose(f.variables["dummy3"][:], res3)
+        np.testing.assert_allclose(f.variables["dummy3"][1, :], [4.0, 5.0, 6.0])
+        np.testing.assert_allclose(f.variables["dummy3"][1:2, :], [[4.0, 5.0, 6.0]])
assert f.variables["dummy3"].shape == (3, 3)
assert f.variables["dummy3"]._h5ds.shape == (2, 3)
np.testing.assert_allclose(f.variables["dummy4"][:], res4)
@@ -1529,12 +1579,19 @@
with h5netcdf.File(tmp_local_netcdf, "r") as ds:
f = ds["test"]
np.testing.assert_allclose(f.variables["dummy1"][:], res1)
+ np.testing.assert_allclose(f.variables["dummy1"][:, :], res1)
+        np.testing.assert_allclose(f.variables["dummy1"][1, :], [4.0, 5.0, 6.0])
+        np.testing.assert_allclose(f.variables["dummy1"][1:2, :], [[4.0, 5.0, 6.0]])
assert f.variables["dummy1"].shape == (3, 3)
assert f.variables["dummy1"]._h5ds.shape == (3, 3)
np.testing.assert_allclose(f.variables["dummy2"][:], res2)
+        np.testing.assert_allclose(f.variables["dummy2"][1, :], [4.0, 5.0, 6.0])
+        np.testing.assert_allclose(f.variables["dummy2"][1:2, :], [[4.0, 5.0, 6.0]])
assert f.variables["dummy2"].shape == (3, 3)
- assert f.variables["dummy2"]._h5ds.shape == (1, 3)
+ assert f.variables["dummy2"]._h5ds.shape == (2, 3)
np.testing.assert_allclose(f.variables["dummy3"][:], res3)
+        np.testing.assert_allclose(f.variables["dummy3"][1, :], [4.0, 5.0, 6.0])
+        np.testing.assert_allclose(f.variables["dummy3"][1:2, :], [[4.0, 5.0, 6.0]])
assert f.variables["dummy3"].shape == (3, 3)
assert f.variables["dummy3"]._h5ds.shape == (2, 3)
np.testing.assert_allclose(f.variables["dummy4"][:], res4)
@@ -1573,13 +1630,14 @@
np.testing.assert_array_equal(variable[...].data, 10)
-# https://github.com/h5netcdf/h5netcdf/issues/136
[email protected](reason="h5py bug with track_order")
-def test_track_order_false(tmp_local_netcdf):
- # track_order must be specified as True or not specified at all
- # https://github.com/h5netcdf/h5netcdf/issues/130
- with pytest.raises(ValueError):
- h5netcdf.File(tmp_local_netcdf, "w", track_order=False)
+def test_track_order_specification(tmp_local_netcdf):
+    # While netcdf4-c has historically only allowed track_order to be True,
+    # there doesn't seem to be a good reason for this:
+    # https://github.com/Unidata/netcdf-c/issues/2054. Historically, h5netcdf
+    # has not specified this parameter (leaving it implicitly as False).
+    # We want to make sure we allow both here.
+ with h5netcdf.File(tmp_local_netcdf, "w", track_order=False):
+ pass
with h5netcdf.File(tmp_local_netcdf, "w", track_order=True):
pass
@@ -1607,8 +1665,8 @@
# We don't expect any errors. This is effectively a void context manager
expected_errors = memoryview(b"")
- with expected_errors:
-        with h5netcdf.File(tmp_local_netcdf, "w", track_order=track_order) as h5file:
+    with h5netcdf.File(tmp_local_netcdf, "w", track_order=track_order) as h5file:
+ with expected_errors:
for i in range(100):
h5file.attrs[f"key{i}"] = i
h5file.attrs[f"key{i}"] = 0
@@ -1710,6 +1768,25 @@
ds["hello"][bool_slice, :]
+def test_fancy_indexing(tmp_local_netcdf):
+ # regression test for https://github.com/pydata/xarray/issues/7154
+ with h5netcdf.legacyapi.Dataset(tmp_local_netcdf, "w") as ds:
+ ds.createDimension("x", None)
+ ds.createDimension("y", None)
+ ds.createVariable("hello", int, ("x", "y"), fill_value=0)
+ ds["hello"][:5, :10] = np.arange(5 * 10, dtype="int").reshape((5, 10))
+ ds.createVariable("hello2", int, ("x", "y"))
+        ds["hello2"][:10, :20] = np.arange(10 * 20, dtype="int").reshape((10, 20))
+
+ with legacyapi.Dataset(tmp_local_netcdf, "a") as ds:
+ np.testing.assert_array_equal(ds["hello"][1, [7, 8, 9]], [17, 18, 19])
+ np.testing.assert_array_equal(ds["hello"][1, [9, 10, 11]], [19, 0, 0])
+ np.testing.assert_array_equal(ds["hello"][1, slice(9, 12)], [19, 0, 0])
+ np.testing.assert_array_equal(ds["hello"][[2, 3, 4], 1], [21, 31, 41])
+ np.testing.assert_array_equal(ds["hello"][[4, 5, 6], 1], [41, 0, 0])
+ np.testing.assert_array_equal(ds["hello"][slice(4, 7), 1], [41, 0, 0])
+
+
def test_h5py_chunking(tmp_local_netcdf):
with h5netcdf.File(tmp_local_netcdf, "w") as ds:
ds.dimensions = {"x": 10, "y": 10, "z": 10, "t": None}
@@ -1789,7 +1866,7 @@
def test_create_invalid_netcdf_catch_error(tmp_local_netcdf):
# see https://github.com/h5netcdf/h5netcdf/issues/138
- with h5netcdf.File("test.nc", "w") as f:
+ with h5netcdf.File(tmp_local_netcdf, "w") as f:
try:
f.create_variable("test", ("x", "y"), data=np.ones((10, 10), dtype="bool"))
except CompatibilityError:
@@ -1797,8 +1874,8 @@
assert repr(f.dimensions) == "<h5netcdf.Dimensions: >"
-def test_dimensions_in_parent_groups():
- with netCDF4.Dataset("test_netcdf.nc", mode="w") as ds:
+def test_dimensions_in_parent_groups(tmpdir):
+ with netCDF4.Dataset(tmpdir.join("test_netcdf.nc"), mode="w") as ds:
ds0 = ds
for i in range(10):
ds = ds.createGroup(f"group{i:02d}")
@@ -1808,7 +1885,7 @@
var = ds0["group00"].createVariable("x", float, ("x", "y"))
var[:] = np.ones((10, 20))
- with legacyapi.Dataset("test_legacy.nc", mode="w") as ds:
+ with legacyapi.Dataset(tmpdir.join("test_legacy.nc"), mode="w") as ds:
ds0 = ds
for i in range(10):
ds = ds.createGroup(f"group{i:02d}")
@@ -1818,8 +1895,8 @@
var = ds0["group00"].createVariable("x", float, ("x", "y"))
var[:] = np.ones((10, 20))
- with h5netcdf.File("test_netcdf.nc", mode="r") as ds0:
- with h5netcdf.File("test_legacy.nc", mode="r") as ds1:
+ with h5netcdf.File(tmpdir.join("test_netcdf.nc"), mode="r") as ds0:
+ with h5netcdf.File(tmpdir.join("test_legacy.nc"), mode="r") as ds1:
assert repr(ds0.dimensions["x"]) == repr(ds1.dimensions["x"])
assert repr(ds0.dimensions["y"]) == repr(ds1.dimensions["y"])
assert repr(ds0["group00"]) == repr(ds1["group00"])
@@ -2025,3 +2102,78 @@
np.testing.assert_equal(ds.int_array, np.arange(10))
np.testing.assert_equal(ds.empty_list, np.array([]))
np.testing.assert_equal(ds.empty_array, np.array([]))
+
+
[email protected](
+ version.parse(h5py.__version__) < version.parse("3.7.0"),
+ reason="does not work with h5py < 3.7.0",
+)
+def test_vlen_string_dataset_fillvalue(tmp_local_netcdf, decode_vlen_strings):
+ # check _FillValue for VLEN string datasets
+ # only works for h5py >= 3.7.0
+
+ # first with new API
+ with h5netcdf.File(tmp_local_netcdf, "w") as ds:
+ ds.dimensions = {"string": 10}
+ dt0 = h5py.string_dtype()
+ fill_value0 = "bár"
+ ds.create_variable("x0", ("string",), dtype=dt0, fillvalue=fill_value0)
+ dt1 = h5py.string_dtype("ascii")
+ fill_value1 = "bar"
+ ds.create_variable("x1", ("string",), dtype=dt1, fillvalue=fill_value1)
+
+ # check, if new API can read them
+ with h5netcdf.File(tmp_local_netcdf, "r", **decode_vlen_strings) as ds:
+ decode_vlen = decode_vlen_strings["decode_vlen_strings"]
+ fvalue0 = fill_value0 if decode_vlen else fill_value0.encode("utf-8")
+ fvalue1 = fill_value1 if decode_vlen else fill_value1.encode("utf-8")
+ assert ds["x0"][0] == fvalue0
+ assert ds["x0"].attrs["_FillValue"] == fill_value0
+ assert ds["x1"][0] == fvalue1
+ assert ds["x1"].attrs["_FillValue"] == fill_value1
+
+ # check if legacyapi can read them
+ with legacyapi.Dataset(tmp_local_netcdf, "r") as ds:
+ assert ds["x0"][0] == fill_value0
+ assert ds["x0"]._FillValue == fill_value0
+ assert ds["x1"][0] == fill_value1
+ assert ds["x1"]._FillValue == fill_value1
+
+ # check if netCDF4-python can read them
+ with netCDF4.Dataset(tmp_local_netcdf, "r") as ds:
+ assert ds["x0"][0] == fill_value0
+ assert ds["x0"]._FillValue == fill_value0
+ assert ds["x1"][0] == fill_value1
+ assert ds["x1"]._FillValue == fill_value1
+
+ # second with legacyapi
+ with legacyapi.Dataset(tmp_local_netcdf, "w") as ds:
+ ds.createDimension("string", 10)
+ fill_value0 = "bár"
+ ds.createVariable("x0", str, ("string",), fill_value=fill_value0)
+ fill_value1 = "bar"
+ ds.createVariable("x1", str, ("string",), fill_value=fill_value1)
+
+ # check if new API can read them
+ with h5netcdf.File(tmp_local_netcdf, "r", **decode_vlen_strings) as ds:
+ decode_vlen = decode_vlen_strings["decode_vlen_strings"]
+ fvalue0 = fill_value0 if decode_vlen else fill_value0.encode("utf-8")
+ fvalue1 = fill_value1 if decode_vlen else fill_value1.encode("utf-8")
+ assert ds["x0"][0] == fvalue0
+ assert ds["x0"].attrs["_FillValue"] == fill_value0
+ assert ds["x1"][0] == fvalue1
+ assert ds["x1"].attrs["_FillValue"] == fill_value1
+
+ # check if legacyapi can read them
+ with legacyapi.Dataset(tmp_local_netcdf, "r") as ds:
+ assert ds["x0"][0] == fill_value0
+ assert ds["x0"]._FillValue == fill_value0
+ assert ds["x1"][0] == fill_value1
+ assert ds["x1"]._FillValue == fill_value1
+
+ # check if netCDF4-python can read them
+ with netCDF4.Dataset(tmp_local_netcdf, "r") as ds:
+ assert ds["x0"][0] == fill_value0
+ assert ds["x0"]._FillValue == fill_value0
+ assert ds["x1"][0] == fill_value1
+ assert ds["x1"]._FillValue == fill_value1
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/h5netcdf-1.0.2/h5netcdf.egg-info/PKG-INFO new/h5netcdf-1.1.0/h5netcdf.egg-info/PKG-INFO
--- old/h5netcdf-1.0.2/h5netcdf.egg-info/PKG-INFO	2022-08-02 11:34:00.000000000 +0200
+++ new/h5netcdf-1.1.0/h5netcdf.egg-info/PKG-INFO	2022-11-23 07:40:28.000000000 +0100
@@ -1,6 +1,6 @@
Metadata-Version: 2.1
Name: h5netcdf
-Version: 1.0.2
+Version: 1.1.0
Summary: netCDF4 via h5py
Home-page: https://h5netcdf.org
Author: h5netcdf developers
@@ -265,15 +265,30 @@
Track Order
~~~~~~~~~~~
-In h5netcdf version 0.12.0 and earlier, `order tracking`_ was disabled in
-HDF5 file. As this is a requirement for the current netCDF4 standard,
-it has been enabled without deprecation as of version 0.13.0 `[*]`_.
+As of h5netcdf 1.1.0, if h5py 3.7.0 or greater is detected, the ``track_order``
+parameter is set to ``True``, enabling `order tracking`_ for newly created
+netCDF4 files. This helps ensure that files created with the h5netcdf library
+can be modified by the netCDF4-c and netCDF4-python implementations used in
+other software stacks. Since this change should be transparent to most users,
+it was made without deprecation.
+
+Since ``track_order`` is set at creation time, any dataset that was created
+with ``track_order=False`` (h5netcdf version 1.0.2 and older, except for
+0.13.0) will continue to be opened with order tracking disabled.
+
+The following describes the behavior of h5netcdf with respect to order tracking
+for a few key versions:
+
+- In version 0.12.0 and earlier, the ``track_order`` parameter was missing
+  and thus order tracking was implicitly set to ``False``.
+- Version 0.13.0 enabled order tracking by setting the parameter
+ ``track_order`` to ``True`` by default without deprecation.
+- Versions 0.13.1 to 1.0.2 set ``track_order`` to ``False`` due to an
+  `upstream bug`_ in h5py, a core dependency of h5netcdf, which was resolved
+  in h5py 3.7.0 with the help of the h5netcdf team.
+- In version 1.1.0, if h5py 3.7.0 or above is detected, the ``track_order``
+ parameter is set to ``True`` by default.
-However in version 0.13.1 this has been reverted due to a bug in a core
-dependency of h5netcdf, h5py `upstream bug`_.
-
-Datasets created with h5netcdf version 0.12.0 that are opened with
-newer versions of h5netcdf will continue to disable order tracker.
.. _order tracking:
https://docs.unidata.ucar.edu/netcdf-c/current/file_format_specifications.html#creation_order
.. _upstream bug: https://github.com/h5netcdf/h5netcdf/issues/136