[issue17436] pass a file object to hashlib.update

2013-03-17 Thread STINNER Victor

STINNER Victor added the comment:

> obj.update(buffer[:size])

This code makes a useless memory copy, because slicing a bytearray creates a 
new object: obj.update(memoryview(buffer)[:size]) can be used instead.
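
A minimal sketch of that suggestion, assuming the same readinto() loop as in the earlier proposal. The name update_from_file and the in-memory test data are made up for illustration; only memoryview, readinto() and update() come from the thread.

```python
import hashlib
import io

def update_from_file(obj, fp, buffersize=64 * 1024):
    """Feed the content of fp to the hash object obj (hypothetical helper)."""
    buffer = bytearray(buffersize)
    view = memoryview(buffer)
    while True:
        size = fp.readinto(buffer)
        if not size:
            break
        # Slicing a memoryview gives another view on the same bytearray,
        # so a short read is hashed without copying the bytes first.
        obj.update(view[:size])
    return obj

# Illustrative data only: 100000 bytes forces one full and one short read.
data = b"x" * 100000
sha = update_from_file(hashlib.sha256(), io.BytesIO(data))
print(sha.hexdigest() == hashlib.sha256(data).hexdigest())
```

Since view[:size] with size equal to buffersize is simply the whole buffer, the separate "if size == buffersize" branch is no longer needed in this variant.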

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17436] pass a file object to hashlib.update

2013-03-16 Thread anatoly techtonik

anatoly techtonik added the comment:

I don't get that. I thought that buffered reading should be faster, although I 
agree that the OS should handle this better. Why is buffering turned on by 
default then? (I miss the ability to fork discussions from the tracker, but 
there is no choice.)




[issue17436] pass a file object to hashlib.update

2013-03-16 Thread STINNER Victor

STINNER Victor added the comment:

> Why would unbuffered be faster?

Well, I'm not sure that it is faster. But I would prefer to avoid
buffering if it is not needed.
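
Since nobody in the thread has measured this, here is a rough sketch of how the two modes could be compared. It only shows which file object open() returns in each mode and checks that both give the same digest; timing is left to the reader. The helper name sha256_of and the scratch file are assumptions made for the example.

```python
import hashlib
import io
import os
import tempfile

def sha256_of(path, buffering=-1, blocksize=64 * 1024):
    """Hash a file with one fixed loop, varying only open()'s buffering."""
    sha = hashlib.sha256()
    with open(path, 'rb', buffering=buffering) as fp:
        while True:
            data = fp.read(blocksize)
            if not data:
                break
            sha.update(data)
    return sha.hexdigest()

# Scratch file with random content so the example is self-contained.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(os.urandom(200000))
path = tmp.name

try:
    with open(path, 'rb') as fp:
        buffered_type = type(fp)   # default buffering
    with open(path, 'rb', buffering=0) as fp:
        raw_type = type(fp)        # unbuffered
    buffered_digest = sha256_of(path)
    raw_digest = sha256_of(path, buffering=0)
finally:
    os.remove(path)

print(buffered_type.__name__, raw_type.__name__, buffered_digest == raw_digest)
```

Wrapping the two sha256_of() calls in timeit on a large file would give an actual answer for a given platform.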




[issue17436] pass a file object to hashlib.update

2013-03-16 Thread anatoly techtonik

anatoly techtonik added the comment:

Even though I mentioned passing a file object in the title of this bug report, 
what I really need is the following API:

  hexhash = hashlib.sha256().readfile(filename).hexdigest()
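
hashlib hash objects have no readfile() method, so the line above is a wish, not working code. A rough sketch of how the same chained call could be emulated today with a small wrapper; the class name FileHasher and the scratch file are hypothetical and exist only for this example.

```python
import hashlib
import os
import tempfile

class FileHasher:
    """Hypothetical wrapper that adds a chainable readfile() to a hash."""

    def __init__(self, name="sha256"):
        self._hash = hashlib.new(name)

    def readfile(self, filename, blocksize=64 * 1024):
        with open(filename, 'rb') as fp:
            # iter() with a sentinel stops at the first empty read (EOF).
            for block in iter(lambda: fp.read(blocksize), b""):
                self._hash.update(block)
        return self  # returning self is what allows the chained call

    def hexdigest(self):
        return self._hash.hexdigest()

# Scratch file so the example runs on its own.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello world\n")
filename = tmp.name

try:
    hexhash = FileHasher("sha256").readfile(filename).hexdigest()
finally:
    os.remove(filename)

print(hexhash)
```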




[issue17436] pass a file object to hashlib.update

2013-03-16 Thread anatoly techtonik

anatoly techtonik added the comment:

Why would unbuffered be faster?




[issue17436] pass a file object to hashlib.update

2013-03-16 Thread Jesús Cea Avión

Changes by Jesús Cea Avión :


--
nosy: +jcea




[issue17436] pass a file object to hashlib.update

2013-03-16 Thread STINNER Victor

STINNER Victor added the comment:

> It makes sense to allow hashlib.update accept file like object
> to read from.

Not update directly, but I agree that a helper would be convenient.

Here is another proposal using an unbuffered file and readinto() with a 
bytearray. It should be faster, but I haven't benchmarked it. I also wrote 
two functions, because sometimes you have a file object, not a file path.

---
import hashlib, sys

def hash_readfile_obj(obj, fp, buffersize=64 * 1024):
    # Reuse one preallocated buffer for every read.
    buffer = bytearray(buffersize)
    while True:
        size = fp.readinto(buffer)
        if not size:
            break
        if size == buffersize:
            obj.update(buffer)
        else:
            # Short read: hash only the bytes that were actually read.
            obj.update(buffer[:size])

def hash_readfile(obj, filepath, buffersize=64 * 1024):
    # buffering=0 opens the file unbuffered.
    with open(filepath, 'rb', buffering=0) as fp:
        hash_readfile_obj(obj, fp, buffersize)

def file_sha256(filepath):
    sha = hashlib.sha256()
    hash_readfile(sha, filepath)
    return sha.hexdigest()

for name in sys.argv[1:]:
    print("%s %s" % (file_sha256(name), name))
---

readfile() and readfile_obj() should be methods of a hash object.

--
nosy: +haypo




[issue17436] pass a file object to hashlib.update

2013-03-16 Thread anatoly techtonik

anatoly techtonik added the comment:

Otherwise you need to repeat this code.

import hashlib

def filehash(filepath):
    blocksize = 64*1024
    sha = hashlib.sha256()
    with open(filepath, 'rb') as fp:
        while True:
            data = fp.read(blocksize)
            if not data:
                break
            sha.update(data)
    return sha.hexdigest()
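
For readers on newer interpreters: Python 3.11 and later include hashlib.file_digest(), which takes a binary file object and an algorithm name. The sketch below uses it when available and otherwise falls back to the same read loop. The wrapper name file_hexdigest and the in-memory payload are assumptions made for this example.

```python
import hashlib
import io

def file_hexdigest(fp, algorithm="sha256", blocksize=64 * 1024):
    """Hex digest of a binary file object (hypothetical helper)."""
    if hasattr(hashlib, "file_digest"):
        # Available in Python 3.11 and later.
        return hashlib.file_digest(fp, algorithm).hexdigest()
    # Older interpreters: read and hash block by block until EOF.
    h = hashlib.new(algorithm)
    for block in iter(lambda: fp.read(blocksize), b""):
        h.update(block)
    return h.hexdigest()

# Illustrative in-memory "file"; a file opened with 'rb' works the same way.
payload = b"abc" * 50000
digest = file_hexdigest(io.BytesIO(payload))
print(digest == hashlib.sha256(payload).hexdigest())
```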




[issue17436] pass a file object to hashlib.update

2013-03-16 Thread anatoly techtonik

Changes by anatoly techtonik :


--
title: pass a string to hashlib.update -> pass a file object to hashlib.update
