Hello community,

here is the log from the commit of package python-s3fs for openSUSE:Factory checked in at 2019-11-22 10:27:05

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Comparing /work/SRC/openSUSE:Factory/python-s3fs (Old)
 and      /work/SRC/openSUSE:Factory/.python-s3fs.new.26869 (New)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Package is "python-s3fs"

Fri Nov 22 10:27:05 2019 rev:5 rq:749967 version:0.4.0

Changes:
--------
--- /work/SRC/openSUSE:Factory/python-s3fs/python-s3fs.changes	2019-10-30 14:47:58.202191945 +0100
+++ /work/SRC/openSUSE:Factory/.python-s3fs.new.26869/python-s3fs.changes	2019-11-22 10:27:19.165240020 +0100
@@ -1,0 +2,12 @@
+Wed Nov 20 14:10:34 UTC 2019 - Tomáš Chvátal <tchva...@suse.com>
+
+- Update to 0.4.0:
+  * New instances no longer need reconnect (:pr:`244`) by Martin Durant
+  * Always use multipart uploads when not autocommitting (:pr:`243`) by Marius van Niekerk
+  * Use autofunction for S3Map sphinx autosummary (:pr:`251`) by James Bourbeau
+  * Miscellaneous doc updates (:pr:`252`) by James Bourbeau
+  * Support for Python 3.8 (:pr:`264`) by Tom Augspurger
+  * Improved performance for isdir (:pr:`259`) by Nate Yoder
+  * Increased the minimum required version of fsspec to 0.6.0
+
+-------------------------------------------------------------------

Old:
----
  s3fs-0.3.5.tar.gz

New:
----
  s3fs-0.4.0.tar.gz

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Other differences:
------------------
++++++ python-s3fs.spec ++++++
--- /var/tmp/diff_new_pack.oyuOgq/_old	2019-11-22 10:27:20.221239708 +0100
+++ /var/tmp/diff_new_pack.oyuOgq/_new	2019-11-22 10:27:20.221239708 +0100
@@ -1,7 +1,7 @@
 #
 # spec file for package python-s3fs
 #
-# Copyright (c) 2019 SUSE LINUX GmbH, Nuernberg, Germany.
+# Copyright (c) 2019 SUSE LLC # # All modifications and additions to the file contributed by third parties # remain the property of their copyright owners, unless otherwise agreed @@ -19,7 +19,7 @@ %{?!python_module:%define python_module() python-%{**} python3-%{**}} %define skip_python2 1 Name: python-s3fs -Version: 0.3.5 +Version: 0.4.0 Release: 0 Summary: Python filesystem interface over S3 License: BSD-3-Clause @@ -27,7 +27,7 @@ Source: https://files.pythonhosted.org/packages/source/s/s3fs/s3fs-%{version}.tar.gz BuildRequires: %{python_module boto3 >= 1.9.91} BuildRequires: %{python_module botocore >= 1.12.91} -BuildRequires: %{python_module fsspec >= 0.2.2} +BuildRequires: %{python_module fsspec >= 0.6.0} BuildRequires: %{python_module moto >= 1.3.12} BuildRequires: %{python_module pytest >= 4.2.0} BuildRequires: %{python_module setuptools} @@ -35,7 +35,7 @@ BuildRequires: python-rpm-macros Requires: python-boto3 >= 1.9.91 Requires: python-botocore >= 1.12.91 -Requires: python-fsspec >= 0.2.2 +Requires: python-fsspec >= 0.6.0 BuildArch: noarch %python_subpackages ++++++ s3fs-0.3.5.tar.gz -> s3fs-0.4.0.tar.gz ++++++ diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/s3fs-0.3.5/PKG-INFO new/s3fs-0.4.0/PKG-INFO --- old/s3fs-0.3.5/PKG-INFO 2019-10-06 18:26:35.000000000 +0200 +++ new/s3fs-0.4.0/PKG-INFO 2019-11-13 17:59:48.000000000 +0100 @@ -1,6 +1,6 @@ Metadata-Version: 1.2 Name: s3fs -Version: 0.3.5 +Version: 0.4.0 Summary: Convenient Filesystem interface over S3 Home-page: http://github.com/dask/s3fs/ Maintainer: Martin Durant @@ -21,8 +21,8 @@ .. |Build Status| image:: https://travis-ci.org/dask/s3fs.svg?branch=master :target: https://travis-ci.org/dask/s3fs :alt: Build Status - .. |Doc Status| image:: http://readthedocs.io/projects/s3fs/badge/?version=latest - :target: http://s3fs.readthedocs.io/en/latest/?badge=latest + .. 
|Doc Status| image:: https://readthedocs.org/projects/s3fs/badge/?version=latest + :target: https://s3fs.readthedocs.io/en/latest/?badge=latest :alt: Documentation Status Keywords: s3,boto @@ -34,4 +34,5 @@ Classifier: Programming Language :: Python :: 3.5 Classifier: Programming Language :: Python :: 3.6 Classifier: Programming Language :: Python :: 3.7 +Classifier: Programming Language :: Python :: 3.8 Requires-Python: >= 3.5 diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/s3fs-0.3.5/README.rst new/s3fs-0.4.0/README.rst --- old/s3fs-0.3.5/README.rst 2016-12-19 23:19:53.000000000 +0100 +++ new/s3fs-0.4.0/README.rst 2019-11-11 22:18:24.000000000 +0100 @@ -13,6 +13,6 @@ .. |Build Status| image:: https://travis-ci.org/dask/s3fs.svg?branch=master :target: https://travis-ci.org/dask/s3fs :alt: Build Status -.. |Doc Status| image:: http://readthedocs.io/projects/s3fs/badge/?version=latest - :target: http://s3fs.readthedocs.io/en/latest/?badge=latest +.. |Doc Status| image:: https://readthedocs.org/projects/s3fs/badge/?version=latest + :target: https://s3fs.readthedocs.io/en/latest/?badge=latest :alt: Documentation Status diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/s3fs-0.3.5/docs/source/api.rst new/s3fs-0.4.0/docs/source/api.rst --- old/s3fs-0.3.5/docs/source/api.rst 2017-05-15 17:34:07.000000000 +0200 +++ new/s3fs-0.4.0/docs/source/api.rst 2019-11-11 22:18:24.000000000 +0100 @@ -46,7 +46,7 @@ .. currentmodule:: s3fs.mapping -.. autoclass:: S3Map +.. autofunction:: S3Map .. 
currentmodule:: s3fs.utils diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/s3fs-0.3.5/docs/source/changelog.rst new/s3fs-0.4.0/docs/source/changelog.rst --- old/s3fs-0.3.5/docs/source/changelog.rst 1970-01-01 01:00:00.000000000 +0100 +++ new/s3fs-0.4.0/docs/source/changelog.rst 2019-11-13 17:56:36.000000000 +0100 @@ -0,0 +1,21 @@ +Changelog +========= + +Version 0.4.0 +------------- + +- New instances no longer need reconnect (:pr:`244`) by `Martin Durant`_ +- Always use multipart uploads when not autocommitting (:pr:`243`) by `Marius van Niekerk`_ +- Create ``CONTRIBUTING.md`` (:pr:`248`) by `Jacob Tomlinson`_ +- Use autofunction for ``S3Map`` sphinx autosummary (:pr:`251`) by `James Bourbeau`_ +- Miscellaneous doc updates (:pr:`252`) by `James Bourbeau`_ +- Support for Python 3.8 (:pr:`264`) by `Tom Augspurger`_ +- Improved performance for ``isdir`` (:pr:`259`) by `Nate Yoder`_ +* Increased the minimum required version of fsspec to 0.6.0 + +.. _`Martin Durant`: https://github.com/martindurant +.. _`Marius van Niekerk`: https://github.com/mariusvniekerk +.. _`Jacob Tomlinson`: https://github.com/jacobtomlinson +.. _`James Bourbeau`: https://github.com/jrbourbeau +.. _`Tom Augspurger`: https://github.com/TomAugspurger +.. _`Nate Yoder`: https://github.com/nateyoder \ No newline at end of file diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/s3fs-0.3.5/docs/source/index.rst new/s3fs-0.4.0/docs/source/index.rst --- old/s3fs-0.3.5/docs/source/index.rst 2019-09-09 15:14:23.000000000 +0200 +++ new/s3fs-0.4.0/docs/source/index.rst 2019-11-11 22:18:24.000000000 +0100 @@ -17,7 +17,7 @@ ``seek``), such that functions expecting a file can access S3. Only binary read and write modes are implemented, with blocked caching. -This uses and is based upon `fsspec`_ +S3Fs uses and is based upon `fsspec`_. .. 
_fsspec: https://filesystem-spec.readthedocs.io/en/latest/ @@ -72,45 +72,45 @@ This project is meant for convenience, rather than feature completeness. The following are known current omissions: -- file access is always binary (although readline and iterating by line are -possible) +- file access is always binary (although ``readline`` and iterating by line + are possible) -- no permissions/access-control (i.e., no chmod/chown methods) +- no permissions/access-control (i.e., no ``chmod``/``chown`` methods) Logging ------- The logger ``s3fs.core.logger`` provides information about the operations of the -file system. To see messages, set its level to DEBUG. You can also achieve this via +file system. To see messages, set its level to ``DEBUG``. You can also achieve this via an environment variable ``S3FS_LOGGING_LEVEL=DEBUG``. Credentials ----------- -The AWS key and secret may be provided explicitly when creating an S3FileSystem. +The AWS key and secret may be provided explicitly when creating an ``S3FileSystem``. A more secure way, not including the credentials directly in code, is to allow boto to establish the credentials automatically. Boto will try the following methods, in order: -- aws_access_key_id, aws_secret_access_key, and aws_session_token environment -variables +- ``aws_access_key_id``, ``aws_secret_access_key``, and ``aws_session_token`` + environment variables -- configuration files such as `~/.aws/credentials` +- configuration files such as ``~/.aws/credentials`` - for nodes on EC2, the IAM metadata provider In a distributed environment, it is not expected that raw credentials should be passed between machines. In the explicitly provided credentials case, the -method `get_delegated_s3pars()` can be used to obtain temporary credentials. +method ``get_delegated_s3pars()`` can be used to obtain temporary credentials. 
When not using explicit credentials, it should be expected that every machine also has the appropriate environment variables, config files or IAM roles available. If none of the credential methods are available, only anonymous access will -work, and `anon=True` must be passed to the constructor. +work, and ``anon=True`` must be passed to the constructor. -Furthermore, `S3FileSystem.current()` will return the most-recently created +Furthermore, ``S3FileSystem.current()`` will return the most-recently created instance, so this method could be used in preference to the constructor in cases where the code must be agnostic of the credentials/config used. @@ -134,7 +134,7 @@ --------------------- For some buckets/files you may want to use some of s3's server side encryption -features. `s3fs` supports these in a few ways +features. ``s3fs`` supports these in a few ways .. code-block:: python @@ -145,21 +145,21 @@ This will create an s3 filesystem instance that will append the ServerSideEncryption argument to all s3 calls (where applicable). -The same applies for `s3.open`. Most of the methods on the filesystem object +The same applies for ``s3.open``. Most of the methods on the filesystem object will also accept and forward keyword arguments to the underlying calls. The most recently specified argument is applied last in the case where both -`s3_additional_kwargs` and a method's `**kwargs` are used. +``s3_additional_kwargs`` and a method's ``**kwargs`` are used. -The `s3.utils.SSEParams` provides some convenient helpers for the serverside +The ``s3.utils.SSEParams`` provides some convenient helpers for the serverside encryption parameters in particular. An instance can be passed instead of a -regular python dictionary as the `s3_additional_kwargs` parameter. +regular python dictionary as the ``s3_additional_kwargs`` parameter. Bucket Version Awareness ------------------------ If your bucket has object versioning enabled then you can add version-aware support -to s3fs. 
This ensures that if a file is opened at a particular point in time that +to ``s3fs``. This ensures that if a file is opened at a particular point in time that version will be used for reading. This mitigates the issue where more than one user is concurrently reading and writing @@ -167,16 +167,12 @@ .. code-block:: python - s3 = s3fs.S3FileSytem(version_aware=True) - + >>> s3 = s3fs.S3FileSytem(version_aware=True) # Open the file at the latest version - fo = s3.open('versioned_bucket/object') - - versions = s3.object_version_info('versioned_bucket/object') - - # open the file at a particular version - fo_old_version = s3.open('versioned_bucket/object', version_id='SOMEVERSIONID') - >>> + >>> fo = s3.open('versioned_bucket/object') + >>> versions = s3.object_version_info('versioned_bucket/object') + # Open the file at a particular version + >>> fo_old_version = s3.open('versioned_bucket/object', version_id='SOMEVERSIONID') In order for this to function the user must have the necessary IAM permissions to perform a GetObjectVersion @@ -188,6 +184,7 @@ .. 
toctree:: install api + changelog :maxdepth: 2 diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/s3fs-0.3.5/requirements.txt new/s3fs-0.4.0/requirements.txt --- old/s3fs-0.3.5/requirements.txt 2019-07-02 15:57:22.000000000 +0200 +++ new/s3fs-0.4.0/requirements.txt 2019-11-13 17:56:36.000000000 +0100 @@ -1,3 +1,3 @@ boto3>=1.9.91 botocore>=1.12.91 -fsspec>=0.2.2 +fsspec>=0.6.0 diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/s3fs-0.3.5/s3fs/_version.py new/s3fs-0.4.0/s3fs/_version.py --- old/s3fs-0.3.5/s3fs/_version.py 2019-10-06 18:26:35.000000000 +0200 +++ new/s3fs-0.4.0/s3fs/_version.py 2019-11-13 17:59:48.000000000 +0100 @@ -8,11 +8,11 @@ version_json = ''' { - "date": "2019-10-06T11:15:43-0400", + "date": "2019-11-13T10:59:17-0600", "dirty": false, "error": null, - "full-revisionid": "571a6463ac7aaaf1a6f80ee776e79e3b0d76a4f4", - "version": "0.3.5" + "full-revisionid": "85b863170f76063b270671442448b15e821f92b6", + "version": "0.4.0" } ''' # END VERSION_JSON diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/s3fs-0.3.5/s3fs/core.py new/s3fs-0.4.0/s3fs/core.py --- old/s3fs-0.3.5/s3fs/core.py 2019-10-06 17:15:47.000000000 +0200 +++ new/s3fs-0.4.0/s3fs/core.py 2019-11-13 17:40:01.000000000 +0100 @@ -3,7 +3,6 @@ import os import socket import time -from hashlib import md5 from fsspec import AbstractFileSystem from fsspec.spec import AbstractBufferedFile @@ -23,9 +22,6 @@ if "S3FS_LOGGING_LEVEL" in os.environ: logger.setLevel(os.environ["S3FS_LOGGING_LEVEL"]) -logging.getLogger('boto3').setLevel(logging.WARNING) -logging.getLogger('botocore').setLevel(logging.WARNING) - try: from boto3.s3.transfer import S3_RETRYABLE_ERRORS except ImportError: @@ -36,17 +32,6 @@ _VALID_FILE_MODES = {'r', 'w', 'a', 'rb', 'wb', 'ab'} -def tokenize(*args, **kwargs): - """ Deterministic token - - >>> tokenize('Hello') == tokenize('Hello') - True - """ - if 
kwargs: - args += (kwargs,) - return md5(str(tuple(args)).encode()).hexdigest() - - def split_path(path): """ Normalise S3 path string into bucket and key. @@ -143,6 +128,7 @@ read_timeout = 15 default_block_size = 5 * 2**20 protocol = 's3' + _extra_tokenize_attributes = ('default_block_size',) def __init__(self, anon=False, key=None, secret=None, token=None, use_ssl=True, client_kwargs=None, requester_pays=False, @@ -159,7 +145,6 @@ if password: secret = password - super().__init__() self.anon = anon self.session = None self.passed_in_session = session @@ -185,6 +170,7 @@ self.use_ssl = use_ssl self.s3 = self.connect() self._kwargs_helper = ParamKwargsHelper(self.s3) + super().__init__() def _filter_kwargs(self, s3_method, kwargs): return self._kwargs_helper.filter_dict(s3_method.__name__, kwargs) @@ -270,7 +256,7 @@ 'token': cred['SessionToken'], 'anon': False} def _open(self, path, mode='rb', block_size=None, acl='', version_id=None, - fill_cache=None, cache_type=None, autocommit=True, **kwargs): + fill_cache=None, cache_type=None, autocommit=True, requester_pays=None, **kwargs): """ Open a file for reading or writing Parameters @@ -299,6 +285,9 @@ cache_type : str See fsspec's documentation for available cache_type values. Set to "none" if no caching is desired. If None, defaults to ``self.default_cache_type``. + requester_pays : bool (optional) + If RequesterPays buckets are supported. If None, defaults to the + value used when creating the S3FileSystem (which defaults to False.) kwargs: dict-like Additional parameters used for s3 methods. Typically used for ServerSideEncryption. 
@@ -307,6 +296,8 @@ block_size = self.default_block_size if fill_cache is None: fill_cache = self.default_fill_cache + if requester_pays is None: + requester_pays = bool(self.req_kw) acl = acl or self.s3_additional_kwargs.get('ACL', '') kw = self.s3_additional_kwargs.copy() @@ -321,7 +312,7 @@ return S3File(self, path, mode, block_size=block_size, acl=acl, version_id=version_id, fill_cache=fill_cache, s3_additional_kwargs=kw, cache_type=cache_type, - autocommit=autocommit) + autocommit=autocommit, requester_pays=requester_pays) def _lsdir(self, path, refresh=False, max_items=None): if path.startswith('s3://'): @@ -412,23 +403,6 @@ self.dircache[''] = files return self.dircache[''] - def __getstate__(self): - if self.passed_in_session: - raise NotImplementedError - d = self.__dict__.copy() - del d['s3'] - del d['session'] - del d['_kwargs_helper'] - del d['dircache'] - logger.debug("Serialize with state: %s", d) - return d - - def __setstate__(self, state): - self.__dict__.update(state) - self.s3 = self.connect() - self.dircache = {} - self._kwargs_helper = ParamKwargsHelper(self.s3) - def _ls(self, path, refresh=False): """ List files in given bucket, or list of buckets. @@ -511,6 +485,31 @@ raise ValueError('Failed to head path %r: %s' % (path, e)) return super().info(path) + def isdir(self, path): + path = self._strip_protocol(path).strip("/") + # Send buckets to super + if "/" not in path: + return super(S3FileSystem, self).isdir(path) + + if path in self.dircache: + for fp in self.dircache[path]: + # For files the dircache can contain itself. + # If it contains anything other than itself it is a directory. 
+ if fp["name"] != path: + return True + return False + + parent = self._parent(path) + if parent in self.dircache: + for f in self.dircache[parent]: + if f["name"] == path: + # If we find ourselves return whether we are a directory + return f["type"] == "directory" + return False + + # This only returns things within the path and NOT the path object itself + return bool(self._lsdir(path)) + def ls(self, path, detail=False, refresh=False, **kwargs): """ List single "directory" with or without details @@ -647,7 +646,7 @@ `Metadata Reference`_. Parameters - --------- + ---------- kw_args : key-value pairs like field="value", where the values must be strings. Does not alter existing fields, unless the field appears here - if the value is None, delete the @@ -919,6 +918,8 @@ Optional version to read the file at. If not specified this will default to the current version of the object. This is only used for reading. + requester_pays : bool (False) + If RequesterPays buckets are supported. Examples -------- @@ -937,7 +938,7 @@ def __init__(self, s3, path, mode='rb', block_size=5 * 2 ** 20, acl="", version_id=None, fill_cache=True, s3_additional_kwargs=None, - autocommit=True, cache_type='bytes'): + autocommit=True, cache_type='bytes', requester_pays=False): bucket, key = split_path(path) if not key: raise ValueError('Attempt to open non key-like path: %s' % path) @@ -951,6 +952,7 @@ self.parts = None self.fill_cache = fill_cache self.s3_additional_kwargs = s3_additional_kwargs or {} + self.req_kw = {'RequestPayer': 'requester'} if requester_pays else {} super().__init__(s3, path, mode, block_size, autocommit=autocommit, cache_type=cache_type) self.s3 = self.fs # compatibility @@ -972,7 +974,9 @@ self.details = self.fs.info(self.path) self.version_id = self.details.get('VersionId') + # when not using autocommit we want to have transactional state to manage self.append_block = False + if 'a' in mode and s3.exists(path): loc = s3.info(path)['size'] if loc < 5 * 2 ** 20: @@ 
-987,7 +991,7 @@ **kwargs) def _initiate_upload(self): - if not self.append_block and self.tell() < self.blocksize: + if not self.autocommit and not self.append_block and self.tell() < self.blocksize: # only happens when closing small file, use on-shot PUT return logger.debug("Initiate upload for %s" % self) @@ -1053,14 +1057,14 @@ return self.fs.url(self.path, **kwargs) def _fetch_range(self, start, end): - return _fetch_range(self.fs.s3, self.bucket, self.key, self.version_id, start, end) + return _fetch_range(self.fs.s3, self.bucket, self.key, self.version_id, start, end, req_kw=self.req_kw) def _upload_chunk(self, final=False): bucket, key = split_path(self.path) logger.debug("Upload for %s, final=%s, loc=%s, buffer loc=%s" % ( self, final, self.loc, self.buffer.tell() )) - if not self.append_block and final and self.tell() < self.blocksize: + if not self.autocommit and not self.append_block and final and self.tell() < self.blocksize: # only happens when closing small file, use on-shot PUT data1 = False else: diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/s3fs-0.3.5/s3fs/tests/test_s3fs.py new/s3fs-0.4.0/s3fs/tests/test_s3fs.py --- old/s3fs-0.3.5/s3fs/tests/test_s3fs.py 2019-08-27 20:52:18.000000000 +0200 +++ new/s3fs-0.4.0/s3fs/tests/test_s3fs.py 2019-11-13 17:40:01.000000000 +0100 @@ -9,7 +9,7 @@ from itertools import chain import fsspec.core from s3fs.core import S3FileSystem -from s3fs.utils import seek_delimiter, ignoring, SSEParams +from s3fs.utils import ignoring, SSEParams import moto import boto3 from unittest import mock @@ -94,6 +94,7 @@ for flist in [files, csv_files, text_files, glob_files]: for f, data in flist.items(): client.put_object(Bucket=test_bucket_name, Key=f, Body=data) + S3FileSystem.clear_instance_cache() s3 = S3FileSystem(anon=False) s3.invalidate_cache() yield s3 @@ -164,13 +165,6 @@ assert s3.connect(refresh=True).meta.config.signature_version == 's3v4' -def test_tokenize(): - from 
s3fs.core import tokenize - a = (1, 2, 3) - assert isinstance(tokenize(a), (str, bytes)) - assert tokenize(a) != tokenize(a, other=1) - - def test_idempotent_connect(s3): con1 = s3.connect() con2 = s3.connect(refresh=False) @@ -677,23 +671,6 @@ assert len(f.cache) < len(out) -def test_seek_delimiter(s3): - fn = 'test/accounts.1.json' - data = files[fn] - with s3.open('/'.join([test_bucket_name, fn])) as f: - seek_delimiter(f, b'}', 0) - assert f.tell() == 0 - f.seek(1) - seek_delimiter(f, b'}', 5) - assert f.tell() == data.index(b'}') + 1 - seek_delimiter(f, b'\n', 5) - assert f.tell() == data.index(b'\n') + 1 - f.seek(1, 1) - ind = data.index(b'\n') + data[data.index(b'\n') + 1:].index(b'\n') + 1 - seek_delimiter(f, b'\n', 5) - assert f.tell() == ind + 1 - - def test_read_s3_block(s3): data = files['test/accounts.1.json'] lines = io.BytesIO(data).readlines() @@ -1244,18 +1221,17 @@ def test_pickle_without_passed_in_session(s3): + import pickle s3 = S3FileSystem() - try: - s3.__getstate__() - except NotImplementedError: - pytest.fail("Unexpected NotImplementedError") + pickle.dumps(s3) def test_pickle_with_passed_in_session(s3): + import pickle session = boto3.session.Session() s3 = S3FileSystem(session=session) - with pytest.raises(NotImplementedError): - s3.__getstate__() + with pytest.raises((AttributeError, NotImplementedError, TypeError)): + pickle.dumps(s3) def test_cache_after_copy(s3): @@ -1297,6 +1273,16 @@ fo.commit() +def test_autocommit_mpu(s3): + """When not autocommitting we always want to use multipart uploads""" + path = test_bucket_name + '/auto_commit_with_mpu' + with s3.open(path, 'wb', autocommit=True) as fo: + fo.write(b'1') + # fo.flush() + assert fo.mpu is not None + assert len(fo.parts) == 1 + + def test_touch(s3): # create fn = test_bucket_name + "/touched" @@ -1338,3 +1324,28 @@ size = 17562187 d3 = f.read(size) assert len(d3) == size + + +def test_connect_many(): + from multiprocessing.pool import ThreadPool + + def task(i): + 
S3FileSystem(anon=False).ls("") + return True + + pool = ThreadPool(processes=20) + out = pool.map(task, range(40)) + assert all(out) + pool.close() + pool.join() + + +def test_requester_pays(): + fn = test_bucket_name + "/myfile" + with moto.mock_s3(): + s3 = S3FileSystem(requester_pays=True) + assert s3.req_kw["RequestPayer"] == "requester" + s3.mkdir(test_bucket_name) + s3.touch(fn) + with s3.open(fn, "rb") as f: + assert f.req_kw["RequestPayer"] == "requester" diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/s3fs-0.3.5/s3fs/tests/test_utils.py new/s3fs-0.4.0/s3fs/tests/test_utils.py --- old/s3fs-0.3.5/s3fs/tests/test_utils.py 2018-07-27 18:11:16.000000000 +0200 +++ new/s3fs-0.4.0/s3fs/tests/test_utils.py 2019-11-13 16:22:36.000000000 +0100 @@ -1,51 +0,0 @@ -from s3fs.utils import read_block, seek_delimiter -import io - - -def test_read_block(): - delimiter = b'\n' - data = delimiter.join([b'123', b'456', b'789']) - f = io.BytesIO(data) - - assert read_block(f, 1, 2) == b'23' - assert read_block(f, 0, 1, delimiter=b'\n') == b'123\n' - assert read_block(f, 0, 2, delimiter=b'\n') == b'123\n' - assert read_block(f, 0, 3, delimiter=b'\n') == b'123\n' - assert read_block(f, 0, 5, delimiter=b'\n') == b'123\n456\n' - assert read_block(f, 0, 8, delimiter=b'\n') == b'123\n456\n789' - assert read_block(f, 0, 100, delimiter=b'\n') == b'123\n456\n789' - assert read_block(f, 1, 1, delimiter=b'\n') == b'' - assert read_block(f, 1, 5, delimiter=b'\n') == b'456\n' - assert read_block(f, 1, 8, delimiter=b'\n') == b'456\n789' - - for ols in [[(0, 3), (3, 3), (6, 3), (9, 2)], - [(0, 4), (4, 4), (8, 4)]]: - out = [read_block(f, o, l, b'\n') for o, l in ols] - assert b"".join(filter(None, out)) == data - - -def test_seek_delimiter_endline(): - f = io.BytesIO(b'123\n456\n789') - - # if at zero, stay at zero - seek_delimiter(f, b'\n', 5) - assert f.tell() == 0 - - # choose the first block - for bs in [1, 5, 100]: - f.seek(1) - 
seek_delimiter(f, b'\n', blocksize=bs) - assert f.tell() == 4 - - # handle long delimiters well, even with short blocksizes - f = io.BytesIO(b'123abc456abc789') - for bs in [1, 2, 3, 4, 5, 6, 10]: - f.seek(1) - seek_delimiter(f, b'abc', blocksize=bs) - assert f.tell() == 6 - - # End at the end - f = io.BytesIO(b'123\n456') - f.seek(5) - seek_delimiter(f, b'\n', 5) - assert f.tell() == 7 diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/s3fs-0.3.5/s3fs/utils.py new/s3fs-0.4.0/s3fs/utils.py --- old/s3fs-0.3.5/s3fs/utils.py 2019-05-31 01:25:48.000000000 +0200 +++ new/s3fs-0.4.0/s3fs/utils.py 2019-11-13 16:22:36.000000000 +0100 @@ -1,10 +1,6 @@ -import array from contextlib import contextmanager import sys -PY2 = sys.version_info[0] == 2 -PY3 = sys.version_info[0] == 3 - @contextmanager def ignoring(*exceptions): @@ -14,104 +10,6 @@ pass -def seek_delimiter(file, delimiter, blocksize): - """ Seek current file to next byte after a delimiter bytestring - - This seeks the file to the next byte following the delimiter. It does - not return anything. Use ``file.tell()`` to see location afterwards. - - Parameters - ---------- - file: a file - delimiter: bytes - a delimiter like ``b'\n'`` or message sentinel - blocksize: int - Number of bytes to read from the file at once. 
- """ - - if file.tell() == 0: - return - - last = b'' - while True: - current = file.read(blocksize) - if not current: - return - full = last + current - try: - i = full.index(delimiter) - file.seek(file.tell() - (len(full) - i) + len(delimiter)) - return - except ValueError: - pass - last = full[-len(delimiter):] - - -def read_block(f, offset, length, delimiter=None): - """ Read a block of bytes from a file - - Parameters - ---------- - f: file - offset: int - Byte offset to start read - length: int - Number of bytes to read - delimiter: bytes (optional) - Ensure reading starts and stops at delimiter bytestring - - If using the ``delimiter=`` keyword argument we ensure that the read - starts and stops at delimiter boundaries that follow the locations - ``offset`` and ``offset + length``. If ``offset`` is zero then we - start at zero. The bytestring returned WILL include the - terminating delimiter string. - - Examples - -------- - - >>> from io import BytesIO # doctest: +SKIP - >>> f = BytesIO(b'Alice, 100\\nBob, 200\\nCharlie, 300') # doctest: +SKIP - >>> read_block(f, 0, 13) # doctest: +SKIP - b'Alice, 100\\nBo' - - >>> read_block(f, 0, 13, delimiter=b'\\n') # doctest: +SKIP - b'Alice, 100\\nBob, 200\\n' - - >>> read_block(f, 10, 10, delimiter=b'\\n') # doctest: +SKIP - b'Bob, 200\\nCharlie, 300' - """ - if delimiter: - f.seek(offset) - seek_delimiter(f, delimiter, 2 ** 16) - start = f.tell() - length -= start - offset - - f.seek(start + length) - seek_delimiter(f, delimiter, 2 ** 16) - end = f.tell() - - offset = start - length = end - start - - f.seek(offset) - b = f.read(length) - return b - - -def raises(exc, lamda): - try: - lamda() - return False - except exc: - return True - - -def ensure_writable(b): - if PY2 and isinstance(b, array.array): - return b.tostring() - return b - - def title_case(string): """ TitleCases a given string. 
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/s3fs-0.3.5/s3fs.egg-info/PKG-INFO new/s3fs-0.4.0/s3fs.egg-info/PKG-INFO --- old/s3fs-0.3.5/s3fs.egg-info/PKG-INFO 2019-10-06 18:26:35.000000000 +0200 +++ new/s3fs-0.4.0/s3fs.egg-info/PKG-INFO 2019-11-13 17:59:48.000000000 +0100 @@ -1,6 +1,6 @@ Metadata-Version: 1.2 Name: s3fs -Version: 0.3.5 +Version: 0.4.0 Summary: Convenient Filesystem interface over S3 Home-page: http://github.com/dask/s3fs/ Maintainer: Martin Durant @@ -21,8 +21,8 @@ .. |Build Status| image:: https://travis-ci.org/dask/s3fs.svg?branch=master :target: https://travis-ci.org/dask/s3fs :alt: Build Status - .. |Doc Status| image:: http://readthedocs.io/projects/s3fs/badge/?version=latest - :target: http://s3fs.readthedocs.io/en/latest/?badge=latest + .. |Doc Status| image:: https://readthedocs.org/projects/s3fs/badge/?version=latest + :target: https://s3fs.readthedocs.io/en/latest/?badge=latest :alt: Documentation Status Keywords: s3,boto @@ -34,4 +34,5 @@ Classifier: Programming Language :: Python :: 3.5 Classifier: Programming Language :: Python :: 3.6 Classifier: Programming Language :: Python :: 3.7 +Classifier: Programming Language :: Python :: 3.8 Requires-Python: >= 3.5 diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/s3fs-0.3.5/s3fs.egg-info/SOURCES.txt new/s3fs-0.4.0/s3fs.egg-info/SOURCES.txt --- old/s3fs-0.3.5/s3fs.egg-info/SOURCES.txt 2019-10-06 18:26:35.000000000 +0200 +++ new/s3fs-0.4.0/s3fs.egg-info/SOURCES.txt 2019-11-13 17:59:48.000000000 +0100 @@ -6,6 +6,7 @@ setup.py versioneer.py docs/source/api.rst +docs/source/changelog.rst docs/source/index.rst docs/source/install.rst s3fs/__init__.py @@ -18,7 +19,6 @@ s3fs.egg-info/SOURCES.txt s3fs.egg-info/dependency_links.txt s3fs.egg-info/not-zip-safe -s3fs.egg-info/pbr.json s3fs.egg-info/requires.txt s3fs.egg-info/top_level.txt s3fs/tests/__init__.py diff -urN '--exclude=CVS' '--exclude=.cvsignore' 
'--exclude=.svn' '--exclude=.svnignore' old/s3fs-0.3.5/s3fs.egg-info/pbr.json new/s3fs-0.4.0/s3fs.egg-info/pbr.json --- old/s3fs-0.3.5/s3fs.egg-info/pbr.json 2019-04-18 15:27:24.000000000 +0200 +++ new/s3fs-0.4.0/s3fs.egg-info/pbr.json 1970-01-01 01:00:00.000000000 +0100 @@ -1 +0,0 @@ -{"git_version": "f8edb22", "is_release": true} \ No newline at end of file diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/s3fs-0.3.5/s3fs.egg-info/requires.txt new/s3fs-0.4.0/s3fs.egg-info/requires.txt --- old/s3fs-0.3.5/s3fs.egg-info/requires.txt 2019-10-06 18:26:35.000000000 +0200 +++ new/s3fs-0.4.0/s3fs.egg-info/requires.txt 2019-11-13 17:59:48.000000000 +0100 @@ -1,3 +1,3 @@ boto3>=1.9.91 botocore>=1.12.91 -fsspec>=0.2.2 +fsspec>=0.6.0 diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/s3fs-0.3.5/setup.py new/s3fs-0.4.0/setup.py --- old/s3fs-0.3.5/setup.py 2019-07-02 15:57:22.000000000 +0200 +++ new/s3fs-0.4.0/setup.py 2019-11-13 16:22:36.000000000 +0100 @@ -14,6 +14,7 @@ 'Programming Language :: Python :: 3.5', 'Programming Language :: Python :: 3.6', 'Programming Language :: Python :: 3.7', + 'Programming Language :: Python :: 3.8', ], description='Convenient Filesystem interface over S3', url='http://github.com/dask/s3fs/',
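
The ``isdir`` speed-up from :pr:`259` (visible in the ``s3fs/core.py`` hunk above) works by consulting the cached directory listings before issuing any S3 call: a cached listing of the path itself, or of its parent, is enough to answer the question. The following is a minimal standalone sketch of that cache logic only; the function and argument names here are illustrative, not the s3fs API, and plain dicts stand in for the real ``dircache``:

```python
def isdir_cached(path, dircache, parent_of):
    """Simplified sketch of the cached isdir fast path in s3fs 0.4.0.

    dircache maps a listed directory path to its entries, each a dict
    with at least 'name' and 'type'; parent_of returns the parent path.
    Returns True/False when the cache can decide, or None on a cache
    miss (where s3fs would fall back to actually listing the path).
    """
    path = path.strip("/")
    if path in dircache:
        # A listing of a plain file can contain the path itself;
        # anything listed under it other than itself proves a directory.
        return any(entry["name"] != path for entry in dircache[path])
    parent = parent_of(path)
    if parent in dircache:
        for entry in dircache[parent]:
            if entry["name"] == path:
                return entry["type"] == "directory"
        # Parent is fully listed and the path is not in it.
        return False
    # Cache miss: the real implementation would call _lsdir(path).
    return None
```

The real method additionally special-cases bucket names (no ``/`` in the path) by deferring to the fsspec base class, and ends with ``bool(self._lsdir(path))`` instead of returning ``None`` on a miss.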