[issue2128] sys.argv is wrong for unicode strings

2013-01-14 Thread STINNER Victor

STINNER Victor added the comment:

> is it correct that this bug no longer appears in Python 2.7.3?

Martin wrote that it cannot be fixed in Python 2: "For 2.6, I don't think 
fixing it is feasible."

The "fix" is to upgrade your application to Python 3.
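For applications that must keep running on both major versions, the statement above implies a simple guard around any workaround. The following is only a sketch: `needs_argv_workaround` is a hypothetical helper name, not something from the tracker or the standard library, and it assumes the problem is limited to Python 2 on Windows as described in this thread.

```python
import sys


def needs_argv_workaround(version_info=sys.version_info, platform=sys.platform):
    """Return True when sys.argv may have lost non-ANSI characters.

    Assumption (from this thread): only Python 2 on Windows is affected;
    Python 3 builds sys.argv from the wide-character command line.
    """
    return version_info[0] == 2 and platform == "win32"


if needs_argv_workaround():
    # Hypothetical hook: apply a ctypes-based workaround here.
    pass
```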

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue2128] sys.argv is wrong for unicode strings

2013-01-13 Thread Michael Herrmann

Michael Herrmann added the comment:

Hi,

is it correct that this bug no longer appears in Python 2.7.3? I checked the 
changelogs of 2.7, but couldn't find anything.

Thanks!
Michael

--
nosy: +mherrmann.at




[issue2128] sys.argv is wrong for unicode strings

2011-01-14 Thread STINNER Victor

Changes by STINNER Victor:


--
nosy: +haypo




[issue2128] sys.argv is wrong for unicode strings

2011-01-08 Thread David-Sarah Hopwood

David-Sarah Hopwood added the comment:

Sorry, missed out the imports:

import sys
from ctypes import WINFUNCTYPE, windll, POINTER, byref, c_int
from ctypes.wintypes import LPWSTR, LPCWSTR

--




[issue2128] sys.argv is wrong for unicode strings

2011-01-08 Thread David-Sarah Hopwood

David-Sarah Hopwood added the comment:

The following code is being used to work around this issue for Python 2.x in 
Tahoe-LAFS:

# This works around .
GetCommandLineW = WINFUNCTYPE(LPWSTR)(("GetCommandLineW", windll.kernel32))
CommandLineToArgvW = WINFUNCTYPE(POINTER(LPWSTR), LPCWSTR, POINTER(c_int)) \
                        (("CommandLineToArgvW", windll.shell32))

argc = c_int(0)
argv_unicode = CommandLineToArgvW(GetCommandLineW(), byref(argc))

# Re-encode the wide-character arguments as UTF-8 byte strings.
argv = [argv_unicode[i].encode('utf-8') for i in range(0, argc.value)]

if not hasattr(sys, 'frozen'):
    # If this is an executable produced by py2exe or bbfreeze, then it will
    # have been invoked directly. Otherwise, argv_unicode[0] is the Python
    # interpreter, so skip that.
    argv = argv[1:]

    # Also skip option arguments to the Python interpreter.
    while len(argv) > 0:
        arg = argv[0]
        if not arg.startswith("-") or arg == "-":
            break
        argv = argv[1:]
        if arg == '-m':
            # sys.argv[0] should really be the absolute path of the module
            # source, but never mind
            break
        if arg == '-c':
            argv[0] = '-c'
            break
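The option-skipping part of the workaround can be exercised without Windows or ctypes. The sketch below restates that loop as a standalone function so it can be tested on plain lists; `strip_interpreter_args` and its `frozen` parameter are hypothetical names introduced here for illustration, and the empty-list guard after `-c` is an addition that is not in the code above.

```python
def strip_interpreter_args(argv, frozen=False):
    """Drop the interpreter path and its options from a raw command line.

    `argv` is the full argument list as the OS saw it; `frozen` stands in
    for hasattr(sys, 'frozen') in the original workaround.
    """
    argv = list(argv)
    if frozen:
        # A frozen executable was invoked directly; nothing to strip.
        return argv

    # argv[0] is the Python interpreter itself.
    argv = argv[1:]

    while argv:
        arg = argv[0]
        if not arg.startswith("-") or arg == "-":
            break  # first non-option argument: the script (or "-" for stdin)
        argv = argv[1:]
        if arg == "-m":
            break  # the next argument is the module name
        if arg == "-c":
            if argv:
                argv[0] = "-c"  # mirror sys.argv[0] == '-c'
            break
    return argv


print(strip_interpreter_args(["python.exe", "-u", "script.py", "x"]))
```

Like the original loop, this treats every option as taking no value, so an interpreter option followed by a separate value (for example `-W ignore`) would be misread as a script name.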

--
nosy: +davidsarah
versions: +Python 2.5, Python 2.6, Python 2.7




[issue2128] sys.argv is wrong for unicode strings

2008-04-07 Thread Benjamin Peterson

Benjamin Peterson added the comment:

Martin, you are right that they do not have the same cause as that issue.

gcc -c -arch ppc -arch i386 -isysroot /Developer/SDKs/MacOSX10.4u.sdk/ 
-fno-strict-aliasing -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes 
-I. -IInclude -I./Include   -DPy_BUILD_CORE -o Modules/main.o Modules/main.c
Modules/main.c: In function 'Py_Main':
Modules/main.c:478: warning: passing argument 1 of 'Py_SetProgramName'
from incompatible pointer type
Modules/main.c: In function 'Py_Main':
Modules/main.c:478: warning: passing argument 1 of 'Py_SetProgramName'
from incompatible pointer type




[issue2128] sys.argv is wrong for unicode strings

2008-04-06 Thread Martin v. Löwis

Martin v. Löwis added the comment:

What warnings precisely are you seeing? I didn't see anything in the 3k
branch (not even for #2388, as PyErr_Format doesn't have the GCC format
attribute in 3k, unlike 2.x).




[issue2128] sys.argv is wrong for unicode strings

2008-04-06 Thread Benjamin Peterson

Benjamin Peterson added the comment:

MvL's recent commit causes compiler warnings in UCS4 Unicode builds for the
same reason as #2388.

--
nosy: +benjamin.peterson




[issue2128] sys.argv is wrong for unicode strings

2008-04-05 Thread Martin v. Löwis

Martin v. Löwis added the comment:

This is now fixed in r62178 for Py3k. For 2.6, I don't think fixing it
is feasible.

--
resolution:  -> fixed
status: open -> closed
versions:  -Python 2.6




[issue2128] sys.argv is wrong for unicode strings

2008-03-10 Thread Martin v. Löwis

Martin v. Löwis added the comment:

Here is a patch that redoes the entire argv handling, in terms of
wchar_t. As a side effect, it also changes the sys.path handling to use
wchar_t.

--
keywords: +patch
Added file: http://bugs.python.org/file9647/wchar.diff




[issue2128] sys.argv is wrong for unicode strings

2008-02-21 Thread Martin v. Löwis

Martin v. Löwis added the comment:

> mbstowcs uses LC_CTYPE. Is that correct and consistent with the way
> default encoding under UNIX is handled by Py3k?

It's correct, but it's not consistent with the default encoding - there
isn't really any default encoding in Py3k. More specifically,
PyUnicode_FromString uses UTF-8, but not as a (changeable) default,
but as part of its API specification.
Command line arguments are in the locale's charset, so the LC_CTYPE
must be used to convert them.

> Would a Py_MainW or similar wrapper be easier on the UNIX guys? I'm just
> asking, I don't have a definite idea.

See above. The current POSIX implementation is incorrect also. It should
use the locale's encoding, but doesn't.
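The point about LC_CTYPE can be sketched in Python 3 terms: byte-string arguments are decoded with the locale's encoding rather than a fixed one. `decode_args` is a hypothetical helper, and the use of the `surrogateescape` error handler (so undecodable bytes survive a round trip) is an assumption of this sketch, not something proposed in the thread.

```python
import locale


def decode_args(raw_args, encoding=None):
    """Decode byte-string arguments using the locale's charset.

    If `encoding` is None, the locale's preferred encoding is used,
    which is roughly what mbstowcs() consults via LC_CTYPE.
    """
    if encoding is None:
        encoding = locale.getpreferredencoding(False)
    # surrogateescape keeps undecodable bytes recoverable instead of failing.
    return [arg.decode(encoding, "surrogateescape") for arg in raw_args]


# The same character arrives as different bytes under different locales.
print(decode_args([b"\xc3\xa0"], "utf-8"))
print(decode_args([b"\xe0"], "latin-1"))
```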




[issue2128] sys.argv is wrong for unicode strings

2008-02-21 Thread Giovanni Bajo

Giovanni Bajo added the comment:

mbstowcs uses LC_CTYPE. Is that correct and consistent with the way
default encoding under UNIX is handled by Py3k?

Would a Py_MainW or similar wrapper be easier on the UNIX guys? I'm just
asking, I don't have a definite idea.




[issue2128] sys.argv is wrong for unicode strings

2008-02-21 Thread Martin v. Löwis

Martin v. Löwis added the comment:

I dislike the double decoding, and would prefer that sys.argv be
created directly from the wide command line.

In addition, I think the patch is incorrect: it ignores the arguments to
Py_Main, which is a documented API function.

One solution might be to declare all these functions (Py_Main,
SetProgramName, GetArgcArgv) to operate on Py_UNICODE*, and then
convert the POSIX callers of Py_Main to use mbstowcs when going
from the command line to Py_Main. WinMain could then be
recompiled for Unicode directly, and likewise Modules/python.c.

--
nosy: +loewis




[issue2128] sys.argv is wrong for unicode strings

2008-02-17 Thread Giovanni Bajo

Giovanni Bajo added the comment:

I'm attaching a simple patch that seems to work under Py3k. The trick is
that Py3k already attempts (not sure how or why) to decode argv using
UTF-8. So it's sufficient to set up argv as UTF-8-encoded strings.

Notice that this changes the output of "python à" from this:

Fatal Python error: no mem for sys.argv
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 0-2:
invalid data

to this:

TypeError: zipimporter() argument 1 must be string without null bytes,
not str

which is expected since zipimporter_init() doesn't even know to ignore
unicode strings (let alone handle them correctly...).
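The general failure mode behind the first error can be reproduced in a few lines: bytes produced by an ANSI code page are usually not valid UTF-8, while UTF-8-encoded arguments decode cleanly. This is only an illustration. The code page `cp1252` and the helper `try_decode_utf8` are assumptions of the sketch, and the bytes here are not claimed to be the exact ones from the report.

```python
def try_decode_utf8(raw):
    """Return the decoded string, or None if `raw` is not valid UTF-8."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return None


ansi_bytes = "\u00e0".encode("cp1252")  # "à" as an ANSI code page would store it
utf8_bytes = "\u00e0".encode("utf-8")   # "à" as the patch sets argv up

print(try_decode_utf8(ansi_bytes))  # undecodable: a lone lead byte
print(try_decode_utf8(utf8_bytes))
```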

Added file: http://bugs.python.org/file9449/argv_unicode.patch




[issue2128] sys.argv is wrong for unicode strings

2008-02-16 Thread Christian Heimes

Christian Heimes added the comment:

The issue is related to #1342.

Since we have dropped support for older versions of Windows (9x, ME,
NT4), I would like to get the Python interface to argv, env and files fixed.

--
components: +Windows
nosy: +tiran
priority:  -> high
versions: +Python 2.6




[issue2128] sys.argv is wrong for unicode strings

2008-02-16 Thread Giovanni Bajo

New submission from Giovanni Bajo:

Under Windows, sys.argv is created through the Windows ANSI API.

When you have a file/directory which can't be represented in the
system encoding (e.g. a Japanese-named file or directory on a Western
Windows), Windows will encode the filename to the system encoding using
what we call the "replace" policy, and thus sys.argv[] will contain an
entry like "c:\\foo\\??.dat".
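The data loss can be imitated with Python's own "replace" error handler. This is an analogy rather than the exact Windows conversion: the code page `cp1252` (a typical Western system encoding) and the helper name `ansi_roundtrip` are assumptions made for the sketch.

```python
def ansi_roundtrip(path, codepage="cp1252"):
    """Push a path through a narrow code page, as the ANSI API effectively does.

    Characters the code page cannot represent are replaced with "?".
    """
    return path.encode(codepage, "replace").decode(codepage)


# A two-character Japanese file name on a Western system.
lossy = ansi_roundtrip("c:\\foo\\\u65e5\u672c.dat")
print(lossy)
```

Once the name has been replaced with question marks, the original file can no longer be located from the argument alone, which is why the conversion has to be avoided rather than repaired afterwards.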

My suggestion is that:

* At the Python level, we still expose a single sys.argv[], which will 
contain unicode strings. I think this exactly matches what Py3k does now. 

* At the C level, I believe it involves using GetCommandLineW() and 
CommandLineToArgvW() in WinMain.c, but should Py_Main/PySys_SetArgv() be 
changed to also accept wchar_t** arguments? Or is it better to allow for 
NULL to be passed (under Windows at least), so that the Windows
code-path in there can use GetCommandLineW()/CommandLineToArgvW() to get
the current process' arguments?

--
components: Interpreter Core
messages: 62458
nosy: giovannibajo
severity: normal
status: open
title: sys.argv is wrong for unicode strings
type: behavior
versions: Python 3.0
