perbu opened a new issue #1424: Documentation: Objectstore example is broken / libcloud does seek() on stream
URL: https://github.com/apache/libcloud/issues/1424

## Summary

I'm running the example on https://libcloud.readthedocs.io/en/stable/storage/examples.html that creates a tarfile and uploads it via stream to an object store. I'm using GCS. libcloud tries to call seek() on the supplied iterator; this fails and the program stops.

## Detailed Information

- apache-libcloud==2.8.0
- Python 3.7.6
- macOS Catalina

The example in the docs is written for Python 2, so I've rewritten it for Python 3.

```python
#!/usr/bin/env python
import os
import subprocess
from datetime import datetime

from libcloud.storage.types import Provider, ContainerDoesNotExistError
from libcloud.storage.providers import get_driver
from dotenv import load_dotenv

load_dotenv()

cls = get_driver(Provider.GOOGLE_STORAGE)
driver = cls(os.getenv('GOOGLE_ACCOUNT'), os.getenv('AUTH_TOKEN'), project='foo')

directory = os.getenv('FOLDER')
cmd = 'tar cvzpf - %s' % (directory)

object_name = 'backup-%s.tar.gz' % (datetime.now().strftime('%Y-%m-%d'))
container_name = os.getenv('WORKSPACE')

# Create a container if it doesn't already exist
try:
    container = driver.get_container(container_name=container_name)
except ContainerDoesNotExistError:
    container = driver.create_container(container_name=container_name)

pipe = subprocess.Popen(cmd, bufsize=0, shell=True, stdout=subprocess.PIPE)
return_code = pipe.poll()

print('Uploading object...')

while return_code is None:
    # Compress data in our directory and stream it directly to CF
    obj = container.upload_object_via_stream(iterator=pipe.stdout,
                                             object_name=object_name)
    return_code = pipe.poll()

print('Upload complete, transferred: %s KB' % ((obj.size / 1024)))
```

This raises the following exception:

```
Traceback (most recent call last):
  File "./bug.py", line 38, in <module>
    object_name=object_name)
  File "/Users/perbu/.virtualenvs/ar1/lib/python3.7/site-packages/libcloud/storage/base.py", line 159, in upload_object_via_stream
    iterator, self, object_name, extra=extra, **kwargs)
  File "/Users/perbu/.virtualenvs/ar1/lib/python3.7/site-packages/libcloud/storage/drivers/s3.py", line 698, in upload_object_via_stream
    storage_class=ex_storage_class)
  File "/Users/perbu/.virtualenvs/ar1/lib/python3.7/site-packages/libcloud/storage/drivers/s3.py", line 842, in _put_object
    headers=headers, file_path=file_path, stream=stream)
  File "/Users/perbu/.virtualenvs/ar1/lib/python3.7/site-packages/libcloud/storage/base.py", line 627, in _upload_object
    self._get_hash_function())
  File "/Users/perbu/.virtualenvs/ar1/lib/python3.7/site-packages/libcloud/storage/base.py", line 657, in _hash_buffered_stream
    stream.seek(0)
OSError: [Errno 29] Illegal seek
```

The offending code in libcloud/storage/base.py looks like this:

```python
if hasattr(stream, '__next__') or hasattr(stream, 'next'):
    # Ensure we start from the begining of a stream in case stream is
    # not at the beginning
    if hasattr(stream, 'seek'):
        stream.seek(0)
```

I'm not entirely sure why seek() is called on the iterator. `pipe.stdout` has a `seek` attribute, so the `hasattr` check passes, but a pipe is not seekable and the call fails.

I've been able to work around the issue by creating a SimpleIterator class that only supplies `__next__` and feeding the output from Popen.stdout through it. If I just comment out the seek(0), everything seems to work.

Thanks for an excellent project. Let me know if you need anything more from me.

Cheers,

Per.
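
For readers following along, the SimpleIterator workaround mentioned above might look roughly like the sketch below. Only the class name and the idea of supplying just `__next__` come from the report. The wrapping approach, the `chunk_size` parameter, and the `io.BytesIO` stand-in for `pipe.stdout` are assumptions made so that the example runs on its own. It is not the code actually used.

```python
import io


class SimpleIterator:
    """Hypothetical wrapper exposing only the iterator protocol.

    It deliberately has no seek() attribute, so a
    hasattr(stream, 'seek') check on it evaluates to False.
    """

    def __init__(self, fileobj, chunk_size=8192):
        # chunk_size is an assumed parameter, not taken from the report
        self._fileobj = fileobj
        self._chunk_size = chunk_size

    def __iter__(self):
        return self

    def __next__(self):
        # Read the next chunk; an empty read means end of stream
        chunk = self._fileobj.read(self._chunk_size)
        if not chunk:
            raise StopIteration
        return chunk


# Stand-in for pipe.stdout so the sketch runs without a subprocess
source = io.BytesIO(b"example tar bytes" * 1000)
wrapped = SimpleIterator(source, chunk_size=4096)

has_seek = hasattr(wrapped, "seek")
data = b"".join(wrapped)
```

In the script above, the wrapper would presumably be passed as `iterator=SimpleIterator(pipe.stdout)` in the `upload_object_via_stream` call.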
