[issue13645] import machinery vulnerable to timestamp collisions

2020-04-24 Thread Ammar Askar


Change by Ammar Askar:


--
pull_requests:  -19023

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13645] import machinery vulnerable to timestamp collisions

2020-04-24 Thread Ammar Askar


Change by Ammar Askar:


--
nosy: +ammar2
nosy_count: 8.0 -> 9.0
pull_requests: +19023
pull_request: https://github.com/python/cpython/pull/19651




[issue13645] import machinery vulnerable to timestamp collisions

2012-01-13 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

> One is possibly deprecating path_mtime() so people don't waste time
> implementing it (we actually never need to remove it thanks to the
> ABC; otherwise we need to make sure the docs strongly state to only
> bother with path_stats()).

Ok, I saw I also forgot to update some importlib docs.

> The other is to say the mtime key should contain a value that is a
> real number (ie. float and any other numeric type that can cast to an
> integer).

Ok.

> And is there any efficient way to get the stat info on a file AND its
> contents in a single call?

I don't think so.  os.fstat() on an open fd looks minimally faster than
os.stat() on the filename (0.5µs faster here on Linux), but opening the
file has its own cost.
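
A minimal sketch of the os.fstat() approach mentioned above, assuming an ordinary file on disk. It is still several system calls (open, fstat, read), not a single one; it only avoids a second path lookup. The helper name read_with_stats is hypothetical and not part of importlib.

```python
import os
import tempfile


def read_with_stats(path):
    """Return (stats, data) using one open() plus fstat() on its descriptor."""
    with open(path, "rb") as f:
        st = os.fstat(f.fileno())  # stat the already-open file descriptor
        data = f.read()
    return {"mtime": st.st_mtime, "size": st.st_size}, data


# Small demo on a temporary file.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "mod.py")
    with open(path, "wb") as f:
        f.write(b"x = 1\n")
    stats, data = read_with_stats(path)

print(stats["size"], len(data))
```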

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13645
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13645] import machinery vulnerable to timestamp collisions

2012-01-13 Thread Roundup Robot

Roundup Robot devn...@psf.upfronthosting.co.za added the comment:

New changeset 87331661042b by Antoine Pitrou in branch 'default':
Issue #13645: pyc files now contain the size of the corresponding source
http://hg.python.org/cpython/rev/87331661042b
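
For illustration, the source size can be read back from a pyc header. This sketch assumes the 16-byte header used by Python 3.7 and later (magic, flags, mtime, size, as introduced by PEP 552); the changeset above targeted Python 3.3, whose header had no flags field and was 12 bytes long. The file names are arbitrary.

```python
import importlib.util
import os
import py_compile
import struct
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "demo.py")
    with open(src, "w") as f:
        f.write("answer = 42\n")

    # Force a timestamp-based pyc so the mtime and size fields are populated.
    pyc = py_compile.compile(
        src,
        cfile=os.path.join(tmp, "demo.pyc"),
        invalidation_mode=py_compile.PycInvalidationMode.TIMESTAMP,
    )
    with open(pyc, "rb") as f:
        header = f.read(16)
    st = os.stat(src)

magic = header[:4]
# Three little-endian unsigned 32-bit fields follow the magic number.
flags, mtime, size = struct.unpack("<III", header[4:16])

print(magic == importlib.util.MAGIC_NUMBER, flags, mtime, size)
```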

--




[issue13645] import machinery vulnerable to timestamp collisions

2012-01-13 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

Now pushed in. Thanks for the reviews!

--
resolution:  -> fixed
stage: patch review -> committed/rejected
status: open -> closed




[issue13645] import machinery vulnerable to timestamp collisions

2012-01-12 Thread Brett Cannon

Brett Cannon br...@python.org added the comment:

LGTM (although I didn't run the unit tests and focused mainly on the 
importlib._bootstrap and abc changes). Only two things I would change. One is 
possibly deprecating path_mtime() so people don't waste time implementing it 
(we actually never need to remove it thanks to the ABC; otherwise we need to 
make sure the docs strongly state to only bother with path_stats()). The other 
is to say the mtime key should contain a value that is a real number (ie. float 
and any other numeric type that can cast to an integer). That way you get your 
higher resolution in the dict while still being able to use the value in 
writing the bytecode.

And is there any efficient way to get the stat info on a file AND its contents 
in a single call? If so we might want a method to support that (eg. add a 
'source_bytes' key or something), but I can't think of any low-level call that 
supports that kind of thing.
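
A rough sketch of the "real number in the dict, integer in the bytecode" idea: the stats dict keeps the high-resolution value, and the writer truncates it only when packing the 32-bit header field. The helper name pack_uint32 and the sample values are hypothetical.

```python
def pack_uint32(value):
    """Truncate a real number to an unsigned 32-bit little-endian field."""
    return (int(value) & 0xFFFFFFFF).to_bytes(4, "little")


# Hypothetical stats as a loader might report them: mtime is a float.
stats = {"mtime": 1326412800.123456, "size": 2048}

mtime_field = pack_uint32(stats["mtime"])
size_field = pack_uint32(stats["size"])

# The dict still holds the fractional part; only the packed field loses it.
print(stats["mtime"], int.from_bytes(mtime_field, "little"))
```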

--
assignee:  -> pitrou




[issue13645] import machinery vulnerable to timestamp collisions

2012-01-11 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

> Then a method that returns some object representing all of the needed info
> on the source file is needed, but that does NOT blindly return the stat
> info

Then it's simpler for the object to be a dict, and backwards compatibility is 
straightforward (by ignoring absent as well as unknown keys). Patch attached.
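
One way to picture "ignoring absent as well as unknown keys" on the consumer side. This is a sketch and not the code from the attached patch; the function name source_is_unchanged is hypothetical, and it assumes 'mtime' is always present while 'size' is optional.

```python
def source_is_unchanged(stats, recorded_mtime, recorded_size):
    """Compare a loader's stats dict with the values recorded in a pyc."""
    # 'mtime' is assumed to be required; it may be a float.
    if (int(stats["mtime"]) & 0xFFFFFFFF) != recorded_mtime:
        return False
    # 'size' is optional: a loader that only knows the mtime still works.
    size = stats.get("size")
    if size is not None and (size & 0xFFFFFFFF) != recorded_size:
        return False
    # Any other keys (e.g. a hypothetical 'owner') are simply never looked at.
    return True


print(source_is_unchanged({"mtime": 100.9}, 100, 5))
print(source_is_unchanged({"mtime": 100, "size": 6, "owner": "x"}, 100, 5))
```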

--
Added file: http://bugs.python.org/file24209/impsize2.patch




[issue13645] import machinery vulnerable to timestamp collisions

2012-01-10 Thread Brett Cannon

Brett Cannon br...@python.org added the comment:

On Mon, Jan 9, 2012 at 18:05, Antoine Pitrou rep...@bugs.python.org wrote:


> Antoine Pitrou pit...@free.fr added the comment:
>
> > I'm not suggesting two stat calls (in the general case); you would
> > call one or the other depending on the magic number of the pyc file.
>
> The proposal is to store both mtime and size, actually, to make false
> positives less likely.

Then a method that returns some object representing all of the needed info
on the source file is needed, but that does NOT blindly return the stat
info (eg. modification date, file size, and even source as bytes can be
in this object).  It could even have methods to generate the bytecode, etc.

--




[issue13645] import machinery vulnerable to timestamp collisions

2012-01-09 Thread Brett Cannon

Brett Cannon br...@python.org added the comment:

I'm not suggesting two stat calls (in the general case); you would call one or 
the other depending on the magic number of the pyc file.

Anyway, it would probably be best to have some method that is expected to 
return a specific object which embodies all the desired information for 
bytecode generation (and if you encompass source code with this object then you 
can get rid of get_source() as well). But it shouldn't be a raw stat object 
since not all bits of information will come from a stat call (eg. storing 
bytecode in a sqlite3 database) and thus require bogus data. If you want to 
move towards that kind of API I can support that.
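
To illustrate why a raw stat object is a poor fit for non-file back-ends, here is a sketch of a loader whose information comes from a database row. It keeps source (not bytecode) in an in-memory SQLite table as a simplified stand-in, and uses the path_stats() method of importlib.abc.SourceLoader as it exists in current Python. The class name, table layout, and pseudo-path are all hypothetical.

```python
import importlib.abc
import importlib.util
import sqlite3


class SqliteSourceLoader(importlib.abc.SourceLoader):
    """Hypothetical loader that keeps module source in a SQLite table."""

    def __init__(self, conn):
        self.conn = conn

    def get_filename(self, fullname):
        # A pseudo-path; nothing exists on disk under this name.
        return f"<sqlite>/{fullname}.py"

    def _row(self, path):
        row = self.conn.execute(
            "SELECT source, mtime FROM modules WHERE path = ?", (path,)
        ).fetchone()
        if row is None:
            # Also hit when importlib probes for cached bytecode.
            raise OSError(f"no such entry: {path}")
        return row

    def get_data(self, path):
        return self._row(path)[0]

    def path_stats(self, path):
        # No stat() call anywhere: both values come from the database row.
        source, mtime = self._row(path)
        return {"mtime": mtime, "size": len(source)}


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE modules (path TEXT PRIMARY KEY, source BLOB, mtime REAL)")
source = b"VALUE = 7\n"
conn.execute(
    "INSERT INTO modules VALUES (?, ?, ?)",
    ("<sqlite>/dbmod.py", source, 1326412800.5),
)

loader = SqliteSourceLoader(conn)
spec = importlib.util.spec_from_loader("dbmod", loader)
module = importlib.util.module_from_spec(spec)
loader.exec_module(module)
print(module.VALUE, loader.path_stats("<sqlite>/dbmod.py"))
```

Since set_data() is not overridden, this loader never writes bytecode back anywhere.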

As for path_mtime() returning an int instead of some number that can be
converted to an int, that's because I didn't plan for the "Antoine wants to
muck with the .pyc format" contingency (IOW I just didn't think about it). =)
It wouldn't be a big deal to change the API to take a keyword-only argument
specifying you want the highest resolution number instead of specifically an
int.

--




[issue13645] import machinery vulnerable to timestamp collisions

2012-01-09 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

> I'm not suggesting two stat calls (in the general case); you would
> call one or the other depending on the magic number of the pyc file.

The proposal is to store both mtime and size, actually, to make false
positives less likely.
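
A small sketch of the collision this addresses: a file is rewritten within the same whole second, so a check on the truncated mtime alone sees no change, while the mtime-and-size pair does. The same-second rewrite is simulated with os.utime(); a rewrite that keeps the size identical would still go unnoticed, which is why the wording is "less likely" and not "impossible".

```python
import os
import tempfile


def snapshot(path):
    """Return the (whole-second mtime, size) pair, as a pyc might record it."""
    st = os.stat(path)
    return int(st.st_mtime), st.st_size


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "mod.py")
    with open(path, "wb") as f:
        f.write(b"x = 1\n")
    old_mtime, old_size = snapshot(path)

    # Rewrite the source, then force the same whole-second mtime to
    # simulate two writes landing within one second.
    with open(path, "wb") as f:
        f.write(b"x = 1000\n")
    os.utime(path, (old_mtime, old_mtime))
    new_mtime, new_size = snapshot(path)

mtime_only_detects = new_mtime != old_mtime
mtime_and_size_detects = (new_mtime, new_size) != (old_mtime, old_size)
print(mtime_only_detects, mtime_and_size_detects)
```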

--




[issue13645] import machinery vulnerable to timestamp collisions

2012-01-01 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

> You could add the requisite path_size() method to get the value, and
> assume 0 means unsupported

I thought:
- calling two methods means two stat calls per file, this could be slightly 
inefficient
- if future extensions of the import mechanism require yet more stat 
information (for example owner or chmod), it will be yet another bunch of 
stat'ing methods to create

(besides, calling int() on the timestamp is a loss of information, I don't 
understand why this must be done in path_mtime() rather than let the consumer 
do whatever it wants with the higher-precision timestamp)
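
Both points can be sketched with a counting wrapper around os.stat(). The two-method variant below stands in for a path_mtime() plus path_size() pair (path_size() was only a suggestion in this thread); the single-method variant returns one dict from one stat call and leaves the timestamp untruncated. All function and class names here are hypothetical.

```python
import os
import tempfile


class CountingStat:
    """Wrap os.stat and count how often it is called (illustration only)."""

    def __init__(self):
        self.calls = 0

    def __call__(self, path):
        self.calls += 1
        return os.stat(path)


def via_two_methods(path, stat):
    # Each separate method has to stat the file on its own.
    mtime = int(stat(path).st_mtime)  # int() discards the fractional part
    size = stat(path).st_size
    return {"mtime": mtime, "size": size}


def via_path_stats(path, stat):
    # One stat call; the full-precision timestamp is passed through.
    st = stat(path)
    return {"mtime": st.st_mtime, "size": st.st_size}


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "mod.py")
    with open(path, "wb") as f:
        f.write(b"x = 1\n")
    two, one = CountingStat(), CountingStat()
    a = via_two_methods(path, two)
    b = via_path_stats(path, one)

print(two.calls, one.calls)
```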

--




[issue13645] import machinery vulnerable to timestamp collisions

2012-01-01 Thread Charles-François Natali

Charles-François Natali neolo...@free.fr added the comment:

The patch looks good to me.

--




[issue13645] import machinery vulnerable to timestamp collisions

2011-12-30 Thread Brett Cannon

Brett Cannon br...@python.org added the comment:

Why change importlib's API instead of adding to it? You could add the requisite
path_size() method to get the value, and assume 0 means unsupported (at least
until some future version where a deprecation warning about not requiring file
size comes about). But I don't see how you can avoid exposing these details
through some API, whether by method or by object attribute: you can't assume
stat results, since that would unduly burden non-file storage back-ends.

--




[issue13645] import machinery vulnerable to timestamp collisions

2011-12-29 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

Here is a patch adding the source code size to the pyc header.
The number of places where details of the pyc file format are hard coded is 
surprisingly high...
Unfortunately, I had to modify importlib's public API (path_mtime ->
path_stats). I find it unfortunate that importlib's API is vulnerable to pyc
format changes.

--
keywords: +patch
title: test_import fails after test_coding -> import machinery vulnerable to
timestamp collisions
versions:  -Python 3.2
Added file: http://bugs.python.org/file24107/impsize.patch




[issue13645] import machinery vulnerable to timestamp collisions

2011-12-29 Thread Arfrever Frehtes Taifersar Arahesis

Changes by Arfrever Frehtes Taifersar Arahesis arfrever@gmail.com:


--
nosy: +Arfrever
