Hello community,

here is the log from the commit of package python-fsspec for openSUSE:Factory checked in at 2022-04-28 23:07:47
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Comparing /work/SRC/openSUSE:Factory/python-fsspec (Old)
 and      /work/SRC/openSUSE:Factory/.python-fsspec.new.1538 (New)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Package is "python-fsspec"

Thu Apr 28 23:07:47 2022 rev:19 rq:973298 version:2022.3.0

Changes:
--------
--- /work/SRC/openSUSE:Factory/python-fsspec/python-fsspec.changes      2022-02-24 18:24:04.398648791 +0100
+++ /work/SRC/openSUSE:Factory/.python-fsspec.new.1538/python-fsspec.changes    2022-04-28 23:07:50.324679041 +0200
@@ -1,0 +2,18 @@
+Mon Apr  4 09:08:29 UTC 2022 - John Paul Adrian Glaubitz <[email protected]>
+
+- Update to 2022.3.0
+  Enhancements
+  * tqdm example callback with simple methods (#931, 902)
+  * Allow empty root in get_mapper (#930)
+  * implement real info for reference FS (#919)
+  * list known implementations and compressions (#913)
+  Fixes
+  * git branch for testing git backend (#929)
+  * maintaine mem FS's root (#926)
+  * kargs to FS in parquet module (#921)
+  * fix on_error in references (#917)
+  * tar ls consistency (#9114)
+  * pyarrow: don't decompress twice (#911)
+  * fix FUSE tests (#905)
+
+-------------------------------------------------------------------

Old:
----
  fsspec-2022.02.0.tar.gz

New:
----
  fsspec-2022.3.0.tar.gz

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Other differences:
------------------
++++++ python-fsspec.spec ++++++
--- /var/tmp/diff_new_pack.Ft2AVc/_old  2022-04-28 23:07:50.848679613 +0200
+++ /var/tmp/diff_new_pack.Ft2AVc/_new  2022-04-28 23:07:50.852679617 +0200
@@ -26,9 +26,9 @@
 %bcond_with test
 %endif
 %define         skip_python2 1
-%define ghversion 2022.02.0
+%define ghversion 2022.3.0
 Name:           python-fsspec%{psuffix}
-Version:        2022.2.0
+Version:        2022.3.0
 Release:        0
 Summary:        Filesystem specification package
 License:        BSD-3-Clause

++++++ fsspec-2022.02.0.tar.gz -> fsspec-2022.3.0.tar.gz ++++++
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/filesystem_spec-2022.02.0/README.md new/filesystem_spec-2022.3.0/README.md
--- old/filesystem_spec-2022.02.0/README.md     2022-02-22 18:44:54.000000000 +0100
+++ new/filesystem_spec-2022.3.0/README.md      2022-03-31 19:47:04.000000000 +0200
@@ -40,7 +40,7 @@
 used to configure a development environment and run tests.
 
 First, setup a development conda environment via ``tox -e {env}`` where ``env`` is one of ``{py36,py37,py38,py39}``.
-This will install fspec dependencies, test & dev tools, and install fsspec in develop
+This will install fsspec dependencies, test & dev tools, and install fsspec in develop
 mode. You may activate the dev environment under ``.tox/{env}`` via ``conda activate .tox/{env}``.
 
 ### Testing
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/filesystem_spec-2022.02.0/docs/source/api.rst new/filesystem_spec-2022.3.0/docs/source/api.rst
--- old/filesystem_spec-2022.02.0/docs/source/api.rst   2022-02-22 18:44:54.000000000 +0100
+++ new/filesystem_spec-2022.3.0/docs/source/api.rst    2022-03-31 19:47:04.000000000 +0200
@@ -10,6 +10,8 @@
    fsspec.open_files
    fsspec.open
    fsspec.open_local
+   fsspec.available_compressions
+   fsspec.available_protocols
    fsspec.filesystem
    fsspec.get_filesystem_class
    fsspec.get_mapper
@@ -19,6 +21,8 @@
 .. autofunction:: fsspec.open_files
 .. autofunction:: fsspec.open
 .. autofunction:: fsspec.open_local
+.. autofunction:: fsspec.available_compressions
+.. autofunction:: fsspec.available_protocols
 .. autofunction:: fsspec.filesystem
 .. autofunction:: fsspec.get_filesystem_class
 .. autofunction:: fsspec.get_mapper
@@ -47,6 +51,7 @@
    fsspec.callbacks.Callback
    fsspec.callbacks.NoOpCallback
    fsspec.callbacks.DotPrinterCallback
+   fsspec.callbacks.TqdmCallback
 
 .. autoclass:: fsspec.spec.AbstractFileSystem
    :members:
@@ -92,6 +97,9 @@
 .. autoclass:: fsspec.callbacks.DotPrinterCallback
    :members:
 
+.. autoclass:: fsspec.callbacks.TqdmCallback
+   :members:
+
 .. _implementations:
 
 Built-in Implementations
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/filesystem_spec-2022.02.0/docs/source/changelog.rst new/filesystem_spec-2022.3.0/docs/source/changelog.rst
--- old/filesystem_spec-2022.02.0/docs/source/changelog.rst     2022-02-22 18:44:54.000000000 +0100
+++ new/filesystem_spec-2022.3.0/docs/source/changelog.rst      2022-03-31 19:47:04.000000000 +0200
@@ -1,6 +1,27 @@
 Changelog
 =========
 
+2022.03.0
+---------
+
+Enhancements
+
+- tqdm example callback with simple methods (#931, 902)
+- Allow empty root in get_mapper (#930)
+- implement real info for reference FS (#919)
+- list known implementations and compressions (#913)
+
+Fixes
+
+- git branch for testing git backend (#929)
+- maintaine mem FS's root (#926)
+- kargs to FS in parquet module (#921)
+- fix on_error in references (#917)
+- tar ls consistency (#9114)
+- pyarrow: don't decompress twice (#911)
+- fix FUSE tests (#905)
+
+
 2022.02.0
 ---------
 
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/filesystem_spec-2022.02.0/docs/source/features.rst new/filesystem_spec-2022.3.0/docs/source/features.rst
--- old/filesystem_spec-2022.02.0/docs/source/features.rst      2022-02-22 18:44:54.000000000 +0100
+++ new/filesystem_spec-2022.3.0/docs/source/features.rst       2022-03-31 19:47:04.000000000 +0200
@@ -68,8 +68,10 @@
 
 As mentioned above, the ``OpenFile`` class allows for the opening of files on a binary store,
 which appear to be in text mode and/or allow for a compression/decompression layer between the
-caller and the back-end storage system. From the user's point of view, this is achieved simply
-by passing arguments to the :func:`fsspec.open_files` or :func:`fsspec.open` functions, and
+caller and the back-end storage system. The list of ``fsspec`` supported codec
+can be retrieved using :func:`fsspec.available_compressions`.
+From the user's point of view, this is achieved simply by passing arguments to
+the :func:`fsspec.open_files` or :func:`fsspec.open` functions, and
 thereafter happens transparently.
 
 Key-value stores
@@ -397,3 +399,5 @@
 backends.
 
 See the docstrings in the callbacks module for further details.
+``fsspec.callbacks.TqdmCallback`` can be used to display a progress bar using
+tqdm.
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/filesystem_spec-2022.02.0/docs/source/usage.rst new/filesystem_spec-2022.3.0/docs/source/usage.rst
--- old/filesystem_spec-2022.02.0/docs/source/usage.rst 2022-02-22 18:44:54.000000000 +0100
+++ new/filesystem_spec-2022.3.0/docs/source/usage.rst  2022-03-31 19:47:04.000000000 +0200
@@ -39,6 +39,8 @@
 
     fs = fsspec.filesystem('ftp', host=host, port=port, username=user, password=pw)
 
+The list of implemented ``fsspec`` protocols can be retrieved using :func:`fsspec.available_protocols`.
+
 Use a file-system
 -----------------
 
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/filesystem_spec-2022.02.0/fsspec/__init__.py new/filesystem_spec-2022.3.0/fsspec/__init__.py
--- old/filesystem_spec-2022.02.0/fsspec/__init__.py    2022-02-22 18:44:54.000000000 +0100
+++ new/filesystem_spec-2022.3.0/fsspec/__init__.py     2022-03-31 19:47:04.000000000 +0200
@@ -9,10 +9,12 @@
 
 from . import _version, caching
 from .callbacks import Callback
+from .compression import available_compressions
 from .core import get_fs_token_paths, open, open_files, open_local
 from .exceptions import FSTimeoutError
 from .mapping import FSMap, get_mapper
 from .registry import (
+    available_protocols,
     filesystem,
     get_filesystem_class,
     register_implementation,
@@ -37,6 +39,8 @@
     "registry",
     "caching",
     "Callback",
+    "available_protocols",
+    "available_compressions",
 ]
 
 
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/filesystem_spec-2022.02.0/fsspec/_version.py new/filesystem_spec-2022.3.0/fsspec/_version.py
--- old/filesystem_spec-2022.02.0/fsspec/_version.py    2022-02-22 18:44:54.000000000 +0100
+++ new/filesystem_spec-2022.3.0/fsspec/_version.py     2022-03-31 19:47:04.000000000 +0200
@@ -22,9 +22,9 @@
     # setup.py/versioneer.py will grep for the variable names, so they must
     # each be defined on a line of their own. _version.py will just call
     # get_keywords().
-    git_refnames = " (HEAD -> master, tag: 2022.02.0)"
-    git_full = "f9089f5ce97e1e52ab70ce1f372fc4c0feed5132"
-    git_date = "2022-02-22 12:44:54 -0500"
+    git_refnames = " (tag: 2022.3.0)"
+    git_full = "a8829696d341e62ca420fcde166434bf10dc68d4"
+    git_date = "2022-03-31 13:47:04 -0400"
     keywords = {"refnames": git_refnames, "full": git_full, "date": git_date}
     return keywords
 
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/filesystem_spec-2022.02.0/fsspec/asyn.py new/filesystem_spec-2022.3.0/fsspec/asyn.py
--- old/filesystem_spec-2022.02.0/fsspec/asyn.py        2022-02-22 18:44:54.000000000 +0100
+++ new/filesystem_spec-2022.3.0/fsspec/asyn.py 2022-03-31 19:47:04.000000000 +0200
@@ -413,7 +413,7 @@
     ):
         # TODO: on_error
         if max_gap is not None:
-            # to be implemented in utils
+            # use utils.merge_offset_ranges
             raise NotImplementedError
         if not isinstance(paths, list):
             raise TypeError
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/filesystem_spec-2022.02.0/fsspec/callbacks.py new/filesystem_spec-2022.3.0/fsspec/callbacks.py
--- old/filesystem_spec-2022.02.0/fsspec/callbacks.py   2022-02-22 18:44:54.000000000 +0100
+++ new/filesystem_spec-2022.3.0/fsspec/callbacks.py    2022-03-31 19:47:04.000000000 +0200
@@ -177,4 +177,44 @@
         print(self.chr, end="")
 
 
+class TqdmCallback(Callback):
+    """
+    A callback to display a progress bar using tqdm
+
+    Examples
+    --------
+    >>> import fsspec
+    >>> from fsspec.callbacks import TqdmCallback
+    >>> fs = fsspec.filesystem("memory")
+    >>> path2distant_data = "/your-path"
+    >>> fs.upload(
+            ".",
+            path2distant_data,
+            recursive=True,
+            callback=TqdmCallback(),
+        )
+    """
+
+    def __init__(self, *args, **kwargs):
+        try:
+            import tqdm
+
+            self._tqdm = tqdm
+        except ImportError as exce:
+            raise ImportError(
+                "Using TqdmCallback requires tqdm to be installed"
+            ) from exce
+        super().__init__(*args, **kwargs)
+
+    def set_size(self, size):
+        self.tqdm = self._tqdm.tqdm(desc="test", total=size)
+
+    def relative_update(self, inc=1):
+        self.tqdm.update(inc)
+
+    def __del__(self):
+        self.tqdm.close()
+        self.tqdm = None
+
+
 _DEFAULT_CALLBACK = NoOpCallback()
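
Reviewer note on the callbacks.py hunk above: the new class relies on only two hooks, set_size and relative_update. The sketch below mirrors that pair without importing tqdm or fsspec, so it can be run anywhere. The class name TextProgress and its summary method are hypothetical and not part of fsspec. It also keeps a defined state before set_size is called, whereas the __del__ in the hunk reads self.tqdm, which is only assigned in set_size.

```python
class TextProgress:
    """Hypothetical stand-in mirroring the set_size/relative_update pair."""

    def __init__(self):
        self.total = None  # unknown until set_size() is called
        self.done = 0

    def set_size(self, size):
        # Called once the total number of items is known.
        self.total = size

    def relative_update(self, inc=1):
        # Called after each completed item (or batch of `inc` items).
        self.done += inc

    def summary(self):
        # Safe to call even if set_size() was never invoked.
        if self.total is None:
            return f"{self.done}/?"
        return f"{self.done}/{self.total}"


progress = TextProgress()
progress.set_size(3)
for _ in range(3):
    progress.relative_update()
print(progress.summary())
```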
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/filesystem_spec-2022.02.0/fsspec/compression.py new/filesystem_spec-2022.3.0/fsspec/compression.py
--- old/filesystem_spec-2022.02.0/fsspec/compression.py 2022-02-22 18:44:54.000000000 +0100
+++ new/filesystem_spec-2022.3.0/fsspec/compression.py  2022-03-31 19:47:04.000000000 +0200
@@ -166,3 +166,8 @@
     register_compression("zstd", zstandard_file, "zst")
 except ImportError:
     pass
+
+
+def available_compressions():
+    """Return a list of the implemented compressions."""
+    return list(compr)
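
Reviewer note on the compression.py hunk above: available_compressions simply lists the keys of the module-level registry. The following is a rough stdlib-only sketch of that registry pattern. The names registry, register and available are hypothetical stand-ins, not the fsspec functions, and the real register_compression takes more arguments than shown here.

```python
import bz2
import gzip

# Hypothetical registry: compression name -> callable that opens a file.
registry = {}


def register(name, opener):
    """Add an opener under `name`, refusing silent overwrites."""
    if name in registry:
        raise ValueError(f"duplicate compression name: {name!r}")
    registry[name] = opener


def available():
    """Return the registered names, in registration order."""
    return list(registry)


register("gzip", gzip.open)
register("bz2", bz2.open)
print(available())
```

Because optional codecs are registered inside try/except ImportError blocks (as with zstd in the context lines), the returned list depends on what is installed at import time.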
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/filesystem_spec-2022.02.0/fsspec/conftest.py new/filesystem_spec-2022.3.0/fsspec/conftest.py
--- old/filesystem_spec-2022.02.0/fsspec/conftest.py    2022-02-22 18:44:54.000000000 +0100
+++ new/filesystem_spec-2022.3.0/fsspec/conftest.py     2022-03-31 19:47:04.000000000 +0200
@@ -17,7 +17,7 @@
     """
     m = fsspec.filesystem("memory")
     m.store.clear()
-    m.pseudo_dirs.clear()
+    m.pseudo_dirs = [""]
     try:
         yield m
     finally:
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/filesystem_spec-2022.02.0/fsspec/implementations/arrow.py new/filesystem_spec-2022.3.0/fsspec/implementations/arrow.py
--- old/filesystem_spec-2022.02.0/fsspec/implementations/arrow.py       2022-02-22 18:44:54.000000000 +0100
+++ new/filesystem_spec-2022.3.0/fsspec/implementations/arrow.py        2022-03-31 19:47:04.000000000 +0200
@@ -28,6 +28,9 @@
     return wrapper
 
 
+PYARROW_VERSION = None
+
+
 class ArrowFSWrapper(AbstractFileSystem):
     """FSSpec-compatible wrapper of pyarrow.fs.FileSystem.
 
@@ -40,6 +43,10 @@
     root_marker = "/"
 
     def __init__(self, fs, **kwargs):
+        from pyarrow import __version__
+
+        global PYARROW_VERSION
+        PYARROW_VERSION = tuple(map(int, __version__.split(".")))
         self.fs = fs
         super().__init__(**kwargs)
 
@@ -139,12 +146,18 @@
     @wrap_exceptions
     def _open(self, path, mode="rb", block_size=None, **kwargs):
         if mode == "rb":
-            stream = self.fs.open_input_stream(path)
+            method = self.fs.open_input_stream
         elif mode == "wb":
-            stream = self.fs.open_output_stream(path)
+            method = self.fs.open_output_stream
         else:
             raise ValueError(f"unsupported mode for Arrow filesystem: {mode!r}")
 
+        _kwargs = {}
+        if PYARROW_VERSION[0] >= 4:
+            # disable compression auto-detection
+            _kwargs["compression"] = None
+        stream = method(path, **_kwargs)
+
         return ArrowFile(self, stream, path, mode, block_size, **kwargs)
 
     @wrap_exceptions
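
Reviewer note on the arrow.py hunk above: the version check only needs the major number, but the added expression converts every dot-separated field to int. The sketch below contrasts the two. The function names are hypothetical, and the ".dev1" version string is an illustrative assumption; this diff does not establish that pyarrow reports such strings at runtime.

```python
def parse_strict(version):
    # Same expression as in the hunk: every field must parse as an int.
    return tuple(map(int, version.split(".")))


def major_version(version):
    # Only the leading field is needed for a ">= 4" style comparison.
    return int(version.split(".", 1)[0])


print(parse_strict("7.0.0"))        # (7, 0, 0)
print(major_version("8.0.0.dev1"))  # 8

try:
    parse_strict("8.0.0.dev1")
except ValueError as exc:
    print("strict parse failed:", exc)
```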
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/filesystem_spec-2022.02.0/fsspec/implementations/memory.py new/filesystem_spec-2022.3.0/fsspec/implementations/memory.py
--- old/filesystem_spec-2022.02.0/fsspec/implementations/memory.py      2022-02-22 18:44:54.000000000 +0100
+++ new/filesystem_spec-2022.3.0/fsspec/implementations/memory.py       2022-03-31 19:47:04.000000000 +0200
@@ -116,6 +116,9 @@
 
     def rmdir(self, path):
         path = self._strip_protocol(path)
+        if path == "":
+            # silently avoid deleting FS root
+            return
         if path in self.pseudo_dirs:
             if not self.ls(path):
                 self.pseudo_dirs.remove(path)
@@ -124,7 +127,7 @@
         else:
             raise FileNotFoundError(path)
 
-    def exists(self, path):
+    def exists(self, path, **kwargs):
         path = self._strip_protocol(path)
         return path in self.store or path in self.pseudo_dirs
 
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/filesystem_spec-2022.02.0/fsspec/implementations/reference.py new/filesystem_spec-2022.3.0/fsspec/implementations/reference.py
--- old/filesystem_spec-2022.02.0/fsspec/implementations/reference.py   2022-02-22 18:44:54.000000000 +0100
+++ new/filesystem_spec-2022.3.0/fsspec/implementations/reference.py    2022-03-31 19:47:04.000000000 +0200
@@ -14,13 +14,32 @@
 
 from ..asyn import AsyncFileSystem, sync
 from ..callbacks import _DEFAULT_CALLBACK
-from ..core import filesystem, open
-from ..mapping import get_mapper
+from ..core import filesystem, open, split_protocol
 from ..spec import AbstractFileSystem
 
 logger = logging.getLogger("fsspec.reference")
 
 
+def _first(d):
+    return list(d.values())[0]
+
+
+def _prot_in_references(path, references):
+    ref = references.get(path)
+    if isinstance(ref, (list, tuple)):
+        return split_protocol(ref[0])[0] if ref[0] else ref[0]
+
+
+def _protocol_groups(paths, references):
+    if isinstance(paths, str):
+        return {_prot_in_references(paths, references): [paths]}
+    out = {}
+    for path in paths:
+        protocol = _prot_in_references(path, references)
+        out.setdefault(protocol, []).append(path)
+    return out
+
+
 class ReferenceFileSystem(AsyncFileSystem):
     """View byte ranges of some other file as a file system
 
@@ -57,7 +76,6 @@
         template_overrides=None,
         simple_templates=True,
         loop=None,
-        ref_type=None,
         **kwargs,
     ):
         """
@@ -85,15 +103,16 @@
             order.
         remote_options : dict
             kwargs to go with remote_protocol
-        fs : file system instance
-            Directly provide a file system, if you want to configure it beforehand. This
-            takes precedence over target_protocol/target_options
+        fs : AbstractFileSystem | dict(str, (AbstractFileSystem | dict))
+            Directly provide a file system(s):
+                - a single filesystem instance
+                - a dict of protocol:filesystem, where each value is either a filesystem
+                  instance, or a dict of kwargs that can be used to create in
+                  instance for the given protocol
+            If this is given, remote_options and remote_protocol are ignored.
         template_overrides : dict
             Swap out any templates in the references file with these - useful for
             testing.
-        ref_type : "json" | "parquet" | "zarr"
-            If None, guessed from URL suffix, defaulting to JSON. Ignored if fo
-            is not a string.
         simple_templates: bool
             Whether templates can be processed with simple replace (True) or if
             jinja  is needed (False, much slower). All reference sets produced by
@@ -106,6 +125,7 @@
         self.template_overrides = template_overrides
         self.simple_templates = simple_templates
         self.templates = {}
+        self.fss = {}
         if hasattr(fo, "read"):
             text = fo.read()
         elif isinstance(fo, str):
@@ -114,45 +134,35 @@
             else:
                 extra = {}
             dic = dict(**(ref_storage_args or target_options or {}), **extra)
-            if ref_type == "zarr" or fo.endswith("zarr"):
-                import pandas as pd
-                import zarr
-
-                self.dataframe = True
-                m = get_mapper(fo, **dic)
-                z = zarr.open_group(m)
-                assert z.attrs["version"] == 1
-                self.templates = z.attrs["templates"]
-                self.gen = z.attrs.get("gen", None)
-                self.df = pd.DataFrame(
-                    {k: z[k][:] for k in ["key", "data", "url", "offset", "size"]}
-                ).set_index("key")
-            elif ref_type == "parquet" or fo.endswith("parquet"):
-                import fastparquet as fp
-
-                self.dataframe = True
-                with open(fo, "rb", **dic) as f:
-                    pf = fp.ParquetFile(f)
-                    assert pf.key_value_metadata["version"] == 1
-                    self.templates = json.loads(pf.key_value_metadata["templates"])
-                    self.gen = json.loads(pf.key_value_metadata.get("gen", "[]"))
-                    self.df = pf.to_pandas(index="key")
-            else:
-                # text JSON
-                with open(fo, "rb", **dic) as f:
-                    logger.info("Read reference from URL %s", fo)
-                    text = f.read()
+            # text JSON
+            with open(fo, "rb", **dic) as f:
+                logger.info("Read reference from URL %s", fo)
+                text = f.read()
         else:
-            # dictionaries; TODO: allow dataframe here?
+            # dictionaries
             text = fo
         if self.dataframe:
             self._process_dataframe()
         else:
             self._process_references(text, template_overrides)
-        if fs is not None:
-            self.fs = fs
+        if isinstance(fs, dict):
+            self.fss = {
+                k: (
+                    fsspec.filesystem(k.split(":", 1)[0], **opts)
+                    if isinstance(opts, dict)
+                    else opts
+                )
+                for k, opts in fs.items()
+            }
             return
+        if fs is not None:
+            # single remote FS
+            remote_protocol = (
+                fs.protocol[0] if isinstance(fs.protocol, tuple) else fs.protocol
+            )
+
         if remote_protocol is None:
+            # get single protocol from any templates
             for ref in self.templates.values():
                 if callable(ref):
                     ref = ref()
@@ -161,6 +171,7 @@
                     remote_protocol = protocol
                     break
         if remote_protocol is None:
+            # get single protocol from references
             for ref in self.references.values():
                 if callable(ref):
                     ref = ref()
@@ -172,11 +183,14 @@
         if remote_protocol is None:
             remote_protocol = target_protocol
 
-        self.fs = filesystem(remote_protocol, loop=loop, **(remote_options or {}))
+        fs = fs or filesystem(remote_protocol, loop=loop, **(remote_options or {}))
+        self.fss[remote_protocol] = fs
+        self.fss[None] = fs  # default one
 
     @property
     def loop(self):
-        return self.fs.loop if self.fs.async_impl else self._loop
+        inloop = [fs.loop for fs in self.fss.values() if fs.async_impl]
+        return inloop[0] if inloop else self._loop
 
     def _cat_common(self, path):
         path = self._strip_protocol(path)
@@ -215,14 +229,21 @@
         part_or_url, start0, end0 = self._cat_common(path)
         if isinstance(part_or_url, bytes):
             return part_or_url[start:end]
-        return (await self.fs._cat_file(part_or_url, start=start0, end=end0))[start:end]
+        protocol, _ = split_protocol(part_or_url)
+        # TODO: start and end should be passed to cat_file, not sliced
+        return (
+            await self.fss[protocol]._cat_file(part_or_url, start=start0, end=end0)
+        )[start:end]
 
     def cat_file(self, path, start=None, end=None, **kwargs):
         part_or_url, start0, end0 = self._cat_common(path)
         if isinstance(part_or_url, bytes):
             return part_or_url[start:end]
-        # TODO: update start0, end0 if start/end given, instead of slicing
-        return self.fs.cat_file(part_or_url, start=start0, end=end0)[start:end]
+        protocol, _ = split_protocol(part_or_url)
+        # TODO: start and end should be passed to cat_file, not sliced
+        return self.fss[protocol].cat_file(part_or_url, start=start0, end=end0)[
+            start:end
+        ]
 
     def pipe_file(self, path, value, **_):
         """Temporarily add binary data or reference as a file"""
@@ -245,19 +266,62 @@
         callback.absolute_update(len(data))
 
     def get(self, rpath, lpath, recursive=False, **kwargs):
-        if self.fs.async_impl:
-            return sync(self.loop, self._get, rpath, lpath, recursive, **kwargs)
-        return AbstractFileSystem.get(self, rpath, lpath, recursive=recursive, **kwargs)
-
-    def cat(self, path, recursive=False, **kwargs):
-        if self.fs.async_impl:
-            return sync(self.loop, self._cat, path, recursive, **kwargs)
-        elif isinstance(path, list):
-            if recursive or any("*" in p for p in path):
-                raise NotImplementedError
-            return {p: AbstractFileSystem.cat_file(self, p, **kwargs) for p in path}
-        else:
-            return AbstractFileSystem.cat_file(self, path)
+        if isinstance(lpath, list):
+            # because we have to figure out here which lpath goes with which path
+            # after grouping
+            raise NotImplementedError
+        proto_dict = _protocol_groups(rpath, self.references)
+        for proto, paths in proto_dict.items():
+            if self.fss[proto].async_impl:
+                sync(self.loop, self._get, paths, lpath, recursive, **kwargs)
+            else:
+                AbstractFileSystem.get(
+                    self, paths, lpath, recursive=recursive, **kwargs
+                )
+
+    def cat(self, path, recursive=False, on_error="raise", **kwargs):
+        proto_dict = _protocol_groups(path, self.references)
+        out = {}
+        for proto, paths in proto_dict.items():
+            if proto is None:
+                # binary/string
+                for p in paths:
+                    try:
+                        out[p] = AbstractFileSystem.cat_file(self, p, **kwargs)
+                    except Exception as e:
+                        if on_error == "raise":
+                            raise
+                        if on_error == "return":
+                            out[p] = e
+
+            elif self.fss[proto].async_impl:
+                # TODO: asyncio.gather on multiple async FSs
+                out.update(
+                    sync(
+                        self.loop,
+                        self._cat,
+                        paths,
+                        recursive,
+                        on_error=on_error,
+                        **kwargs,
+                    )
+                )
+            elif isinstance(paths, list):
+                if recursive or any("*" in p for p in paths):
+                    raise NotImplementedError
+                for p in paths:
+                    try:
+                        out[p] = AbstractFileSystem.cat_file(self, p, **kwargs)
+                    except Exception as e:
+                        if on_error == "raise":
+                            raise
+                        if on_error == "return":
+                            out[p] = e
+            else:
+                out.update(AbstractFileSystem.cat_file(self, paths))
+        if len(out) == 1 and isinstance(path, str) and "*" not in path:
+            return _first(out)
+        return out
 
     def _process_dataframe(self):
         self._process_templates(self.templates)
@@ -462,6 +526,10 @@
         out0 = [o for o in out if o["name"] == path]
         if not out0:
             return {"name": path, "type": "directory", "size": 0}
+        if out0[0]["size"] is None:
+            # if this is a whole remote file, update size using remote FS
+            prot, _ = split_protocol(self.references[path][0])
+            out0[0]["size"] = self.fss[prot].size(self.references[path][0])
         return out0[0]
 
     async def _info(self, path, **kwargs):  # calls fast sync code
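
Reviewer note on the reference.py changes above: cat and get now dispatch per protocol by first grouping the requested paths with _protocol_groups. The sketch below reproduces that grouping logic in isolation. The split_protocol here is a simplified stand-in written for this note (an assumption about the helper's behaviour, not the fsspec implementation), and the reference dict is modelled on the one in the new tests.

```python
def split_protocol(url):
    """Simplified stand-in: 'proto://rest' -> ('proto', 'rest')."""
    if "://" in url:
        protocol, rest = url.split("://", 1)
        return protocol, rest
    return None, url


def prot_in_references(path, references):
    # Inline bytes (or unknown paths) have no protocol and map to None.
    ref = references.get(path)
    if isinstance(ref, (list, tuple)):
        return split_protocol(ref[0])[0] if ref[0] else ref[0]
    return None


def protocol_groups(paths, references):
    # Bucket paths by the protocol of the URL each reference points to.
    if isinstance(paths, str):
        return {prot_in_references(paths, references): [paths]}
    out = {}
    for path in paths:
        out.setdefault(prot_in_references(path, references), []).append(path)
    return out


refs = {
    "a": b"data",
    "b": ("file:///tmp/file", 0, 5),
    "c/e": ["memory://afile"],
}
groups = protocol_groups(["c/e", "a", "b"], refs)
print(groups)
```

Each key of the result is then looked up in self.fss, which is why the constructor also stores the default filesystem under the None key.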
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/filesystem_spec-2022.02.0/fsspec/implementations/tar.py new/filesystem_spec-2022.3.0/fsspec/implementations/tar.py
--- old/filesystem_spec-2022.02.0/fsspec/implementations/tar.py 2022-02-22 18:44:54.000000000 +0100
+++ new/filesystem_spec-2022.3.0/fsspec/implementations/tar.py  2022-03-31 19:47:04.000000000 +0200
@@ -88,7 +88,7 @@
         out = {}
         for ti in self.tar:
             info = ti.get_info()
-            info["type"] = typemap[info["type"]]
+            info["type"] = typemap.get(info["type"], "file")
             name = ti.get_info()["name"].rstrip("/")
             out[name] = (info, ti.offset_data)
 
@@ -96,14 +96,17 @@
         # TODO: save index to self.index_store here, if set
 
     def _get_dirs(self):
-
         if self.dir_cache is not None:
             return
 
-        self.dir_cache = {}
+        # This enables ls to get directories as children as well as files
+        self.dir_cache = {
+            dirname + "/": {"name": dirname + "/", "size": 0, "type": "directory"}
+            for dirname in self._all_dirnames(self.tar.getnames())
+        }
         for member in self.tar.getmembers():
             info = member.get_info()
-            info["type"] = typemap[info["type"]]
+            info["type"] = typemap.get(info["type"], "file")
             self.dir_cache[info["name"]] = info
 
     def _open(self, path, mode="rb", **kwargs):
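
Reviewer note on the tar.py hunks above: two things change, namely unknown member types now fall back to "file", and dir_cache is pre-seeded with an entry for every parent directory. The sketch below illustrates both with the stdlib only. The all_dirnames function is a hypothetical re-implementation written for this note, and the contents of typemap are an assumption about the module-level mapping, which the diff does not show.

```python
import posixpath

# Assumed shape of the mapping from tar type flags to fsspec type names.
typemap = {b"0": "file", b"5": "directory"}


def all_dirnames(names):
    """Collect every parent directory implied by the member names."""
    dirs = set()
    for name in names:
        parent = posixpath.dirname(name.rstrip("/"))
        while parent and parent not in dirs:
            dirs.add(parent)
            parent = posixpath.dirname(parent)
    return dirs


names = ["a", "b", "deeply/nested/path"]
dir_cache = {
    d + "/": {"name": d + "/", "size": 0, "type": "directory"}
    for d in sorted(all_dirnames(names))
}
print(sorted(dir_cache))

# A type flag missing from the mapping no longer raises KeyError.
print(typemap.get(b"2", "file"))
```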
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/filesystem_spec-2022.02.0/fsspec/implementations/tests/test_archive.py new/filesystem_spec-2022.3.0/fsspec/implementations/tests/test_archive.py
--- old/filesystem_spec-2022.02.0/fsspec/implementations/tests/test_archive.py  2022-02-22 18:44:54.000000000 +0100
+++ new/filesystem_spec-2022.3.0/fsspec/implementations/tests/test_archive.py   2022-03-31 19:47:04.000000000 +0200
@@ -249,7 +249,7 @@
     def test_mapping(self, scenario: ArchiveTestScenario):
         with scenario.provider(archive_data) as archive:
             fs = fsspec.filesystem(scenario.protocol, fo=archive)
-            m = fs.get_mapper("")
+            m = fs.get_mapper()
             assert list(m) == ["a", "b", "deeply/nested/path"]
             assert m["b"] == archive_data["b"]
 
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/filesystem_spec-2022.02.0/fsspec/implementations/tests/test_git.py new/filesystem_spec-2022.3.0/fsspec/implementations/tests/test_git.py
--- old/filesystem_spec-2022.02.0/fsspec/implementations/tests/test_git.py      2022-02-22 18:44:54.000000000 +0100
+++ new/filesystem_spec-2022.3.0/fsspec/implementations/tests/test_git.py       2022-03-31 19:47:04.000000000 +0200
@@ -17,8 +17,8 @@
     d = tempfile.mkdtemp()
     try:
         os.chdir(d)
-        subprocess.call("git init", shell=True, cwd=d)
-        subprocess.call("git init", shell=True, cwd=d)
+        subprocess.call("git init -b master", shell=True, cwd=d)
+        subprocess.call("git init -b master", shell=True, cwd=d)
         subprocess.call('git config user.email "[email protected]"', shell=True, cwd=d)
         subprocess.call('git config user.name "Your Name"', shell=True, cwd=d)
         open(os.path.join(d, "file1"), "wb").write(b"data0")
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/filesystem_spec-2022.02.0/fsspec/implementations/tests/test_memory.py new/filesystem_spec-2022.3.0/fsspec/implementations/tests/test_memory.py
--- old/filesystem_spec-2022.02.0/fsspec/implementations/tests/test_memory.py   2022-02-22 18:44:54.000000000 +0100
+++ new/filesystem_spec-2022.3.0/fsspec/implementations/tests/test_memory.py    2022-03-31 19:47:04.000000000 +0200
@@ -9,7 +9,7 @@
     files = m.find("")
     assert files == ["/afiles/and/another", "/somefile"]
 
-    files = sorted(m.get_mapper("/"))
+    files = sorted(m.get_mapper())
     assert files == ["afiles/and/another", "somefile"]
 
 
@@ -150,3 +150,9 @@
     f.seek(1)
     assert f.read(1) == "a"
     assert f.tell() == 2
+
+
+def test_remove_all(m):
+    m.touch("afile")
+    m.rm("/", recursive=True)
+    assert not m.ls("/")
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/filesystem_spec-2022.02.0/fsspec/implementations/tests/test_reference.py new/filesystem_spec-2022.3.0/fsspec/implementations/tests/test_reference.py
--- old/filesystem_spec-2022.02.0/fsspec/implementations/tests/test_reference.py        2022-02-22 18:44:54.000000000 +0100
+++ new/filesystem_spec-2022.3.0/fsspec/implementations/tests/test_reference.py 2022-03-31 19:47:04.000000000 +0200
@@ -37,6 +37,21 @@
     assert fs.find("", withdirs=True) == ["a", "b", "c", "c/d"]
 
 
+def test_info(server):  # noqa: F811
+    refs = {
+        "a": b"data",
+        "b": (realfile, 0, 5),
+        "c/d": (realfile, 1, 6),
+        "e": (realfile,),
+    }
+    h = fsspec.filesystem("http", headers={"give_length": "true", "head_ok": "true"})
+    fs = fsspec.filesystem("reference", fo=refs, fs=h)
+    assert fs.size("a") == 4
+    assert fs.size("b") == 5
+    assert fs.size("c/d") == 6
+    assert fs.info("e")["size"] == len(data)
+
+
 def test_defaults(server):  # noqa: F811
     refs = {"a": b"data", "b": (None, 0, 5)}
     fs = fsspec.filesystem(
@@ -231,3 +246,71 @@
     fs.get("c", str(tmpdir / "c"), recursive=True)
     assert (tmpdir / "c").isdir()
     assert (tmpdir / "c" / "d").read_binary() == b"123456"
+
+
+def test_multi_fs_provided(m, tmpdir):
+    localfs = LocalFileSystem()
+
+    real = tmpdir / "file"
+    real.write_binary(b"0123456789")
+
+    m.pipe("afile", b"hello")
+
+    # local URLs are file:// by default
+    refs = {
+        "a": b"data",
+        "b": ("file://" + str(real), 0, 5),
+        "c/d": ("file://" + str(real), 1, 6),
+        "c/e": ["memory://afile"],
+    }
+
+    fs = fsspec.filesystem("reference", fo=refs, fs={"file": localfs, "memory": m})
+    assert fs.cat("c/e") == b"hello"
+    assert fs.cat(["c/e", "a", "b"]) == {
+        "a": b"data",
+        "b": b"01234",
+        "c/e": b"hello",
+    }
+
+
+def test_multi_fs_created(m, tmpdir):
+    real = tmpdir / "file"
+    real.write_binary(b"0123456789")
+
+    m.pipe("afile", b"hello")
+
+    # local URLs are file:// by default
+    refs = {
+        "a": b"data",
+        "b": ("file://" + str(real), 0, 5),
+        "c/d": ("file://" + str(real), 1, 6),
+        "c/e": ["memory://afile"],
+    }
+
+    fs = fsspec.filesystem("reference", fo=refs, fs={"file": {}, "memory": {}})
+    assert fs.cat("c/e") == b"hello"
+    assert fs.cat(["c/e", "a", "b"]) == {
+        "a": b"data",
+        "b": b"01234",
+        "c/e": b"hello",
+    }
+
+
+def test_missing_nonasync(m):
+    zarr = pytest.importorskip("zarr")
+    zarray = {
+        "chunks": [1],
+        "compressor": None,
+        "dtype": "<f8",
+        "fill_value": "NaN",
+        "filters": [],
+        "order": "C",
+        "shape": [10],
+        "zarr_format": 2,
+    }
+    refs = {".zarray": json.dumps(zarray)}
+
+    m = fsspec.get_mapper("reference://", fo=refs, remote_protocol="memory")
+
+    a = zarr.open_array(m)
+    assert str(a[0]) == "nan"
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/filesystem_spec-2022.02.0/fsspec/implementations/tests/test_tar.py new/filesystem_spec-2022.3.0/fsspec/implementations/tests/test_tar.py
--- old/filesystem_spec-2022.02.0/fsspec/implementations/tests/test_tar.py	2022-02-22 18:44:54.000000000 +0100
+++ new/filesystem_spec-2022.3.0/fsspec/implementations/tests/test_tar.py	2022-03-31 19:47:04.000000000 +0200
@@ -1,12 +1,17 @@
 import os
 import shutil
+import tarfile
 import tempfile
+from io import BytesIO
+from pathlib import Path
+from typing import Dict
 
 import pytest
 
 import fsspec
 from fsspec.core import OpenFile
 from fsspec.implementations.cached import WholeFileCacheFileSystem
+from fsspec.implementations.tar import TarFileSystem
 from fsspec.implementations.tests.test_archive import archive_data, temptar
 
 
@@ -171,7 +176,6 @@
     ids=["tar", "tar-gz", "tar-bz2", "tar-xz"],
 )
 def test_url_to_fs_direct(recipe, tmpdir):
-
     with temptar(archive_data, mode=recipe["mode"], suffix=recipe["suffix"]) as tf:
         url = f"tar://inner::file://{tf}"
         fs, url = fsspec.core.url_to_fs(url=url)
@@ -189,8 +193,48 @@
     ids=["tar", "tar-gz", "tar-bz2", "tar-xz"],
 )
 def test_url_to_fs_cached(recipe, tmpdir):
-
     with temptar(archive_data, mode=recipe["mode"], suffix=recipe["suffix"]) as tf:
         url = f"tar://inner::simplecache::file://{tf}"
         fs, url = fsspec.core.url_to_fs(url=url)
         assert fs.cat("b") == b"hello"
+
+
[email protected](
+    "compression", ["", "gz", "bz2", "xz"], ids=["tar", "tar-gz", "tar-bz2", "tar-xz"]
+)
+def test_ls_with_folders(compression: str, tmp_path: Path):
+    """
+    Create a tar file that doesn't include the intermediate folder structure,
+    but make sure that the reading filesystem is still able to resolve the
+    intermediate folders, like the ZipFileSystem.
+    """
+    tar_data: Dict[str, bytes] = {
+        "a.pdf": b"Hello A!",
+        "b/c.pdf": b"Hello C!",
+        "d/e/f.pdf": b"Hello F!",
+        "d/g.pdf": b"Hello G!",
+    }
+    if compression:
+        temp_archive_file = tmp_path / f"test_tar_file.tar.{compression}"
+    else:
+        temp_archive_file = tmp_path / "test_tar_file.tar"
+    with open(temp_archive_file, "wb") as fd:
+        # We need to manually write the tarfile here, because temptar
+        # creates intermediate directories which is not how tars are always created
+        with tarfile.open(fileobj=fd, mode=f"w:{compression}") as tf:
+            for tar_file_path, data in tar_data.items():
+                content = data
+                info = tarfile.TarInfo(name=tar_file_path)
+                info.size = len(content)
+                tf.addfile(info, BytesIO(content))
+    with open(temp_archive_file, "rb") as fd:
+        fs = TarFileSystem(fd)
+        assert fs.find("/", withdirs=True) == [
+            "a.pdf",
+            "b/",
+            "b/c.pdf",
+            "d/",
+            "d/e/",
+            "d/e/f.pdf",
+            "d/g.pdf",
+        ]
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/filesystem_spec-2022.02.0/fsspec/mapping.py new/filesystem_spec-2022.3.0/fsspec/mapping.py
--- old/filesystem_spec-2022.02.0/fsspec/mapping.py	2022-02-22 18:44:54.000000000 +0100
+++ new/filesystem_spec-2022.3.0/fsspec/mapping.py	2022-03-31 19:47:04.000000000 +0200
@@ -187,7 +187,7 @@
 
 
 def get_mapper(
-    url,
+    url="",
     check=False,
     create=False,
     missing_exceptions=None,
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/filesystem_spec-2022.02.0/fsspec/parquet.py new/filesystem_spec-2022.3.0/fsspec/parquet.py
--- old/filesystem_spec-2022.02.0/fsspec/parquet.py	2022-02-22 18:44:54.000000000 +0100
+++ new/filesystem_spec-2022.3.0/fsspec/parquet.py	2022-03-31 19:47:04.000000000 +0200
@@ -96,7 +96,7 @@
     # Make sure we have an `AbstractFileSystem` object
     # to work with
     if fs is None:
-        fs = url_to_fs(path, storage_options=(storage_options or {}))[0]
+        fs = url_to_fs(path, **(storage_options or {}))[0]
 
     # For now, `columns == []` not supported. Just use
     # default `open` command with `path` input
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/filesystem_spec-2022.02.0/fsspec/registry.py new/filesystem_spec-2022.3.0/fsspec/registry.py
--- old/filesystem_spec-2022.02.0/fsspec/registry.py	2022-02-22 18:44:54.000000000 +0100
+++ new/filesystem_spec-2022.3.0/fsspec/registry.py	2022-03-31 19:47:04.000000000 +0200
@@ -251,3 +251,11 @@
     """
     cls = get_filesystem_class(protocol)
     return cls(**storage_options)
+
+
+def available_protocols():
+    """Return a list of the implemented protocols.
+
+    Note that any given protocol may require extra packages to be importable.
+    """
+    return list(known_implementations)
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/filesystem_spec-2022.02.0/fsspec/spec.py new/filesystem_spec-2022.3.0/fsspec/spec.py
--- old/filesystem_spec-2022.02.0/fsspec/spec.py	2022-02-22 18:44:54.000000000 +0100
+++ new/filesystem_spec-2022.3.0/fsspec/spec.py	2022-03-31 19:47:04.000000000 +0200
@@ -1153,7 +1153,7 @@
         # all instances already also derive from pyarrow
         return self
 
-    def get_mapper(self, root, check=False, create=False):
+    def get_mapper(self, root="", check=False, create=False):
         """Create key/value store based on this file-system
 
         Makes a MutableMapping interface to the FS at the given root path.
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/filesystem_spec-2022.02.0/fsspec/tests/test_mapping.py new/filesystem_spec-2022.3.0/fsspec/tests/test_mapping.py
--- old/filesystem_spec-2022.02.0/fsspec/tests/test_mapping.py	2022-02-22 18:44:54.000000000 +0100
+++ new/filesystem_spec-2022.3.0/fsspec/tests/test_mapping.py	2022-03-31 19:47:04.000000000 +0200
@@ -5,6 +5,7 @@
 import pytest
 
 import fsspec
+from fsspec.implementations.local import LocalFileSystem
 from fsspec.implementations.memory import MemoryFileSystem
 
 
@@ -143,3 +144,8 @@
         dtype="<m8[ns]",
     )  # timedelta64 scalar
     assert m["c"] == b',M"\x9e\xc6\x99A\x065\x1c\xf0Rn4\xcb+'
+
+
+def test_empty_url():
+    m = fsspec.get_mapper()
+    assert isinstance(m.fs, LocalFileSystem)
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/filesystem_spec-2022.02.0/setup.py new/filesystem_spec-2022.3.0/setup.py
--- old/filesystem_spec-2022.02.0/setup.py	2022-02-22 18:44:54.000000000 +0100
+++ new/filesystem_spec-2022.3.0/setup.py	2022-03-31 19:47:04.000000000 +0200
@@ -21,6 +21,7 @@
         "Programming Language :: Python :: 3.7",
         "Programming Language :: Python :: 3.8",
         "Programming Language :: Python :: 3.9",
+        "Programming Language :: Python :: 3.10",
     ],
     description="File-system specification",
     long_description=long_description,
@@ -54,6 +55,7 @@
         "fuse": ["fusepy"],
         "libarchive": ["libarchive-c"],
         "gui": ["panel"],
+        "tqdm": ["tqdm"],
     },
     zip_safe=False,
 )
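
Note on the new test_ls_with_folders case in test_tar.py: its docstring describes a tar archive that holds only file members, with no entries for the intermediate folders, and expects the reader to resolve those folders anyway. The following is a minimal standard-library sketch of that idea. The helper names build_tar_without_dirs and find_with_implied_dirs are hypothetical and invented for illustration; this is not how TarFileSystem is implemented, only one way the implied directories could be derived from member paths.

```python
import io
import tarfile
from pathlib import PurePosixPath


def build_tar_without_dirs(files):
    """Return tar bytes holding only file members (no directory entries).

    Hypothetical helper: mirrors the manual TarInfo/addfile approach used
    in the test, but writes to memory instead of a temporary file.
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tf:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tf.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def find_with_implied_dirs(tar_bytes):
    """List member names plus every parent folder implied by their paths.

    Hypothetical helper: folders get a trailing "/" to match the listing
    asserted in the test.
    """
    entries = set()
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r") as tf:
        for member in tf.getmembers():
            entries.add(member.name)
            for parent in PurePosixPath(member.name).parents:
                # The last parent of a relative path is ".", which is
                # the archive root and not a folder entry.
                if str(parent) != ".":
                    entries.add(str(parent) + "/")
    return sorted(entries)


if __name__ == "__main__":
    archive = build_tar_without_dirs(
        {
            "a.pdf": b"Hello A!",
            "b/c.pdf": b"Hello C!",
            "d/e/f.pdf": b"Hello F!",
            "d/g.pdf": b"Hello G!",
        }
    )
    for entry in find_with_implied_dirs(archive):
        print(entry)
```

The sketch walks PurePosixPath.parents for each member, so a deeply nested file contributes every ancestor folder even though the archive never recorded them.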
