[issue36534] tarfile: handling Windows (path) illegal characters in archive member names

2020-10-07 Thread Eryk Sun
Eryk Sun added the comment:
> extract the sanitizing function into a common module (could be *pathlib*?) to avoid duplicates
I would prefer something common, cross-platform, and function-based such as os.path.isreservedname and os.path.sanitizename. In posixpath, it would just have to res

[issue36534] tarfile: handling Windows (path) illegal characters in archive member names

2020-10-07 Thread Cristi Fati
Cristi Fati added the comment: As I see things now, there are multiple things (not necessarily related to this issue) to deal with:
1. Update *tarfile* and add *_sanitize_windows_name* (name can change), that uses *pathlib._WindowsFlavour.reserved_names* (or some public wrapper), and a

[issue36534] tarfile: handling Windows (path) illegal characters in archive member names

2020-10-07 Thread STINNER Victor
STINNER Victor added the comment:
> IIRC there's already an open issue for that.
Ah, I found bpo-27827 "pathlib is_reserved fails for some reserved paths on Windows", open since 2016 (by you ;-)).

[issue36534] tarfile: handling Windows (path) illegal characters in archive member names

2020-10-06 Thread Eryk Sun
Eryk Sun added the comment:
> This issue is about tarfile. Maybe create another issue to enhance the pathlib module?
IIRC there's already an open issue for that. But in case anyone were to look to pathlib as an example of what should be reserved, I wanted to highlight here how its reserve

[issue36534] tarfile: handling Windows (path) illegal characters in archive member names

2020-10-06 Thread STINNER Victor
STINNER Victor added the comment:
> pathlib._WindowsFlavour.is_reserved() fails to reserve names (...)
This issue is about tarfile. Maybe create another issue to enhance the pathlib module?

[issue36534] tarfile: handling Windows (path) illegal characters in archive member names

2020-10-06 Thread Eryk Sun
Eryk Sun added the comment:
> The pathlib module has _WindowsFlavour.reserved_names list of Windows reserved names:
pathlib._WindowsFlavour.reserved_names is missing "CONIN$" and "CONOUT$". Prior to Windows 8 these two are reserved as relative names. In Windows 8+, they're also reserved i
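The reserved-name check discussed in this comment can be sketched roughly as follows. This is only an illustration under stated assumptions: `is_reserved_component` and `_RESERVED` are hypothetical names, the set is written out by hand rather than taken from pathlib, and the rule "only the part before the first dot matters" reflects classic DOS device name behavior and may not match every Windows version.

```python
# Hypothetical, hand-written set of DOS device names, including the
# "CONIN$" and "CONOUT$" entries that the comment says pathlib is missing.
_RESERVED = (
    {"CON", "PRN", "AUX", "NUL", "CONIN$", "CONOUT$"}
    | {f"COM{i}" for i in range(1, 10)}
    | {f"LPT{i}" for i in range(1, 10)}
)


def is_reserved_component(component: str) -> bool:
    """Return True if a single path component looks like a reserved name.

    Assumption: the comparison ignores case, trailing spaces, and anything
    after the first dot (so "nul.txt" is treated like "NUL").
    """
    base = component.partition(".")[0].rstrip(" ").upper()
    return base in _RESERVED
```

The function works on one path component at a time, so a caller would split an archive member name on "/" first.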

[issue36534] tarfile: handling Windows (path) illegal characters in archive member names

2020-10-06 Thread STINNER Victor
STINNER Victor added the comment:
> Also, while _sanitize_windows_name() handles trailing dots, for some reason it overlooks trailing spaces. It also doesn't handle reserved DOS device names.
The pathlib module has _WindowsFlavour.reserved_names list of Windows reserved names: >>> pprin
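As a small illustration of the trailing-character point in the quoted text, the following sketch strips both trailing dots and trailing spaces from each component of a member name. The helper name `strip_trailing_dots_and_spaces` is hypothetical, and the choice to drop components that become empty is an assumption, not something stated in the thread.

```python
def strip_trailing_dots_and_spaces(member_name: str) -> str:
    """Strip trailing dots and spaces from each "/"-separated component.

    Assumption: components that end up empty are dropped from the result.
    """
    parts = (part.rstrip(". ") for part in member_name.split("/"))
    return "/".join(part for part in parts if part)
```

For example, a member named `"dir. /file.txt. "` would come out as `"dir/file.txt"`.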

[issue36534] tarfile: handling Windows (path) illegal characters in archive member names

2019-04-05 Thread Eryk Sun
Eryk Sun added the comment: _sanitize_windows_name() fails to translate the reserved control characters (0x01-0x1F) and backslash in names. What I've seen done in some cases (e.g. Unix network shares mapped to SMB) is to translate names using the private use area block, e.g. 0xF001 - 0xF07F
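The private-use-area translation mentioned above could look something like the sketch below. It is not taken from zipfile or tarfile: the function names are hypothetical, and the mapping (adding 0xF000 to the code point, which puts 0x01-0x7F into 0xF001-0xF07F) is an assumption chosen to match the range given in the comment.

```python
# Assumed mapping: illegal character -> code point + 0xF000.
_PUA_OFFSET = 0xF000

# Characters treated here as illegal in a Windows file name component:
# the control characters 0x01-0x1F, backslash, and < > : " | ? *
# "/" is left out because the input is a single, already-split component.
_ILLEGAL = set('<>:"|?*\\') | {chr(c) for c in range(0x01, 0x20)}


def to_private_use(component: str) -> str:
    """Replace each illegal character with its private-use counterpart."""
    return "".join(
        chr(ord(ch) + _PUA_OFFSET) if ch in _ILLEGAL else ch
        for ch in component
    )


def from_private_use(component: str) -> str:
    """Reverse to_private_use() for characters in the mapped range."""
    out = []
    for ch in component:
        code = ord(ch)
        if 0xF001 <= code <= 0xF07F and chr(code - _PUA_OFFSET) in _ILLEGAL:
            out.append(chr(code - _PUA_OFFSET))
        else:
            out.append(ch)
    return "".join(out)
```

Unlike replacing every illegal character with a single placeholder, a mapping of this kind is reversible, so distinct member names stay distinct after extraction.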

[issue36534] tarfile: handling Windows (path) illegal characters in archive member names

2019-04-05 Thread Karthikeyan Singaravelan
Change by Karthikeyan Singaravelan: components: +Windows; nosy: +lars.gustaebel, paul.moore, steve.dower, tim.golden, zach.ware

[issue36534] tarfile: handling Windows (path) illegal characters in archive member names

2019-04-05 Thread Cristi Fati
New submission from Cristi Fati: Although tar is a *nix-based format (and mostly used there), it is gaining popularity on Windows too. Since tarfile runs on Windows, I think it should handle (work around) path incompatibilities, as zipfile (`ZipFile._sanitize_windows_name`) does. Applies to all branches. M