anirudhacharya commented on a change in pull request #12572: Make Gluon
download function to be atomic
URL: https://github.com/apache/incubator-mxnet/pull/12572#discussion_r218228329
##########
File path: python/mxnet/gluon/utils.py
##########
@@ -242,23 +277,34 @@ def download(url, path=None, overwrite=False,
sha1_hash=None, retries=5, verify_
         dirname = os.path.dirname(os.path.abspath(os.path.expanduser(fname)))
         if not os.path.exists(dirname):
             os.makedirs(dirname)
-        while retries+1 > 0:
+        while retries + 1 > 0:
             # Disable pyling too broad Exception
             # pylint: disable=W0703
             try:
-                print('Downloading %s from %s...'%(fname, url))
+                print('Downloading {} from {}...'.format(fname, url))
                 r = requests.get(url, stream=True, verify=verify_ssl)
                 if r.status_code != 200:
-                    raise RuntimeError("Failed downloading url %s"%url)
-                with open(fname, 'wb') as f:
+                    raise RuntimeError('Failed downloading url {}'.format(url))
+                # create uuid for temporary files
+                random_uuid = str(uuid.uuid4())
+                with open('{}.{}'.format(fname, random_uuid), 'wb') as f:
+                    # create uuid for temporary files
                     for chunk in r.iter_content(chunk_size=1024):
                         if chunk: # filter out keep-alive new chunks
                             f.write(chunk)
+                # if the target file exists(created by other processes),
+                # delete the temporary file
+                if os.path.exists(fname):
+                    os.remove('{}.{}'.format(fname, random_uuid))
Review comment:
If the local path `fname` already exists at this point, the newly downloaded
file is silently deleted. We should probably raise a user warning saying that
a local file with the same name exists and that the downloaded file has not
been saved.
It might also be better to check for the local file first and only then make
the HTTP call on line 285. That way we avoid an unnecessary request, instead
of downloading the whole file and then deleting it.
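To make the suggestion concrete, here is a minimal sketch of the check-first flow with a warning. It is not the code in this PR: `download_if_missing` and its `fetch_chunks` parameter are hypothetical names, and `fetch_chunks` is an assumed callable that stands in for `requests.get(url, stream=True).iter_content(...)` so the example runs without network access. It also assumes Python 3 for `os.replace`.

```python
import os
import uuid
import warnings


def download_if_missing(url, fname, fetch_chunks):
    """Hypothetical sketch: warn and skip when `fname` already exists.

    `fetch_chunks` is an assumed callable that takes a URL and yields
    bytes chunks; it is a stand-in for the streaming HTTP response.
    """
    # Check before any network traffic, so an existing file costs nothing.
    if os.path.exists(fname):
        warnings.warn('{} already exists; {} was not downloaded'.format(fname, url))
        return fname

    # Write to a uniquely named temporary file next to the target.
    tmp_name = '{}.{}'.format(fname, uuid.uuid4())
    try:
        with open(tmp_name, 'wb') as f:
            for chunk in fetch_chunks(url):
                if chunk:  # filter out empty keep-alive chunks
                    f.write(chunk)
        # Another process may have created `fname` while we were downloading.
        if os.path.exists(fname):
            warnings.warn('{} was created by another process; '
                          'the downloaded copy was discarded'.format(fname))
        else:
            os.replace(tmp_name, fname)  # rename within the same directory
    finally:
        # Remove the temporary file if it was not renamed into place.
        if os.path.exists(tmp_name):
            os.remove(tmp_name)
    return fname
```

The second existence check is still needed because another process can create `fname` between the first check and the end of the download. A small window remains between that check and `os.replace`, so this narrows the race rather than eliminating it.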