Terry J. Reedy added the comment:
IDLE avoids the problem of calculating a location for a '^' below the bad line
by instead asking tk to give the marked character (and maybe more) an 'ERROR'
tag, which shows as a red background. So it marks the '$' of 'A_I_U_E_O$' and
the 'alid' slice of
STINNER Victor added the comment:
Issue #10384 has been marked as a duplicate of this issue: it is a similar
problem, an identifier that contains an invisible character.
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue2382
Alexander Belopolsky added the comment:
The original problem is still present
Python 3.5.0a0 (default:5313b4c0bb6c, Sep 30 2014, 18:55:45)
A_I_U_E_O$ = None
File "<stdin>", line 1
A_I_U_E_O$ = None
^
SyntaxError: invalid syntax
Replace A_I_U_E_O above with the Japanese script. I
Roundup Robot added the comment:
New changeset eb7565c212f1 by Serhiy Storchaka in branch '3.3':
Issue #2382: SyntaxError cursor ^ now is written at correct position in most
http://hg.python.org/cpython/rev/eb7565c212f1
New changeset ea34b2b0b8ae by Serhiy Storchaka in branch 'default':
Issue
Changes by Serhiy Storchaka storch...@gmail.com:
--
assignee: serhiy.storchaka ->
stage: patch review -> needs patch
Serhiy Storchaka added the comment:
If no one complains, I'll commit the last patch tomorrow.
--
assignee:  -> serhiy.storchaka
stage:  -> patch review
type:  -> behavior
versions: +Python 3.4 -Python 3.2
Serhiy Storchaka added the comment:
Added tests. I think it is worth applying this patch, which fixes the issue
for most Europeans, and then continuing to work on the issue of wide characters.
--
Added file: http://bugs.python.org/file31874/adjust_offset_2.patch
Changes by Serhiy Storchaka storch...@gmail.com:
Removed file: http://bugs.python.org/file27506/adjust_offset-3.3.patch
Alexander Belopolsky added the comment:
haypo> The purpose of this issue is to handle CJK characters taking 2
haypo> columns instead of 1 in a terminal, or did I misunderstand it?
That's the other half of the problem, but the more common issue is a misplaced
caret when non-ASCII characters are
Alexander Belopolsky added the comment:
Serhiy's patch is lacking tests, but it passes the test I proposed at #10382,
which I am attaching here.
--
Added file: http://bugs.python.org/file30534/test.py
Changes by Arfrever Frehtes Taifersar Arahesis arfrever@gmail.com:
--
nosy: +Arfrever
Changes by Serhiy Storchaka storch...@gmail.com:
--
nosy: +serhiy.storchaka
Serhiy Storchaka added the comment:
Here is a patch upgraded to Python 3.3. It uses a slightly different approach
and works with invalidly encoded data. unicode_utf8size.patch is not needed.
This patch fixes half of the issue: handling non-ASCII, non-wide
characters. It's enough for many
STINNER Victor added the comment:
> This patch fixes a half of the issue - working with non-ascii
> non-wide characters.
The purpose of this issue is to handle CJK characters taking 2 columns instead
of 1 in a terminal, or did I misunderstand it?
--
Petri Lehtinen pe...@digip.org added the comment:
What's the status of this issue?
FWIW, this is not only a problem with East Asian characters:
ä äää
File "<stdin>", line 1
ä äää
^
SyntaxError: invalid syntax
--
nosy: +petri.lehtinen
versions: +Python 3.2, Python 3.3
STINNER Victor victor.stin...@haypocalc.com added the comment:
I just created the issue #12568 for unicode_width.patch.
Changes by Ezio Melotti ezio.melo...@gmail.com:
--
nosy: +ezio.melotti
STINNER Victor victor.stin...@haypocalc.com added the comment:
Proof of concept of patch fixing this issue:
- parse_syntax_error() reads the text line into a PyUnicodeObject*
instead of a const char**
- create utf8_to_unicode_offset(): convert byte offset to a number of
characters. The
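As an illustration of the byte-to-character conversion described above, here is a minimal Python sketch. It is not the C code from the patch: the function name utf8_to_char_offset and the 0-based offset convention are assumptions made for this example only.

```python
def utf8_to_char_offset(line_bytes: bytes, byte_offset: int) -> int:
    """Return how many characters precede byte_offset in a UTF-8 line.

    Hypothetical helper for illustration; offsets are 0-based here.
    """
    prefix = line_bytes[:byte_offset]
    # errors="replace" keeps the helper usable on invalidly encoded input
    # instead of raising UnicodeDecodeError.
    return len(prefix.decode("utf-8", errors="replace"))


line = "äb = 1$".encode("utf-8")
# "ä" occupies 2 bytes, so "$" sits at byte 7 but at character 6.
print(utf8_to_char_offset(line, line.index(b"$")))
```

Decoding only the prefix keeps the sketch short; a C implementation would more likely walk the buffer and count lead bytes.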
STINNER Victor victor.stin...@haypocalc.com added the comment:
For an easier review, I split my patch into multiple small patches:
- unicode_utf8size.patch: create _PyUnicode_UTF8Size() function:
Number of bytes needed to encode the unicode character as UTF-8
- unicode_width.patch: create
Changes by STINNER Victor victor.stin...@haypocalc.com:
Added file: http://bugs.python.org/file13357/unicode_width.patch
Changes by STINNER Victor victor.stin...@haypocalc.com:
Added file: http://bugs.python.org/file13358/adjust_offset.patch
Changes by STINNER Victor victor.stin...@haypocalc.com:
Added file: http://bugs.python.org/file13359/print_exception.patch
STINNER Victor victor.stin...@haypocalc.com added the comment:
Comments about my own patches.
unicode_width.patch:
* error messages should be improved:
ValueError("Unable to compute string width") for Windows
IOError(strerror(errno)) otherwise
adjust_offset.patch:
*
STINNER Victor victor.stin...@haypocalc.com added the comment:
This issue is a problem of units. The error text is a UTF-8 *byte*
string and the offset is a number of *bytes*. The goal is to get the text
*width* of a *character* string. We have to:
1- convert offset from bytes number to character
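The unit conversion described in this comment (bytes to characters, then characters to terminal columns) can be sketched in Python as follows. This is only an approximation under stated assumptions: display_width and caret_column are hypothetical names, wide and fullwidth characters are counted as 2 columns, combining marks as 0, and everything else (including East Asian Ambiguous characters) as 1. A real terminal or font may render differently.

```python
import unicodedata


def display_width(text: str) -> int:
    """Estimate the number of terminal columns text occupies."""
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            # Combining marks are drawn on top of the previous character.
            continue
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            width += 2  # wide or fullwidth, e.g. most CJK characters
        else:
            width += 1  # narrow, neutral, or ambiguous: assumed 1 column
    return width


def caret_column(line_bytes: bytes, byte_offset: int) -> int:
    """Map a byte offset in a UTF-8 line to a column for the caret."""
    prefix = line_bytes[:byte_offset].decode("utf-8", errors="replace")
    return display_width(prefix)


line = "あいう$ = None".encode("utf-8")
print(caret_column(line, line.index(b"$")))
```

Here the three hiragana characters take 9 bytes, 3 characters, and an estimated 6 columns, which is why neither the byte count nor the character count alone places the caret correctly.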
David W. Lambert lamber...@corning.com added the comment:
Resolution of this may be applicable to Issue3446 as well.
center, ljust and rjust are inconsistent with unicode parameters
--
nosy: +LambertDW
Hirokazu Yamamoto [EMAIL PROTECTED] added the comment:
At least my "one unicode char is one space" suggestion corrects the case
of Western languages, and all messages with single-width characters.
I'm not happy with this solution. ;-(
Doesn't the exact width depend on
the terminal
Hirokazu Yamamoto [EMAIL PROTECTED] added the comment:
The experimental patch was only experimental: wcswidth(3) returns 1 for East
Asian Ambiguous characters.
debian:~/python-dev/py3k# ./python /mnt/windows/a.py
File "/mnt/windows/a.py", line 3
♪xÅx abc
^
should point at 'c'. And another
Amaury Forgeot d'Arc [EMAIL PROTECTED] added the comment:
For the moment, I'd suggest that one unicode character has the same
width as the space character, assuming that stdout.encoding correctly
matches the terminal.
Then the C implementation could do something similar to the statements I
Amaury Forgeot d'Arc [EMAIL PROTECTED] added the comment:
This seems to be a difficult problem. Doesn't the exact width depend on
the terminal capabilities? and fonts, and combining diacritics...
An easy way to put the caret at the same exact position is to repeat the
beginning of the line up
STINNER Victor [EMAIL PROTECTED] added the comment:
See also a related issue: issue3975.
--
nosy: +haypo
Amaury Forgeot d'Arc [EMAIL PROTECTED] added the comment:
I think that your patch works only for terminals where one byte of the
encoded text is displayed as one character on the terminal. This is not
true for utf-8 terminals, for example.
In the attached patch, I tried to write some unit
Changes by Hirokazu Yamamoto [EMAIL PROTECTED]:
Removed file: http://bugs.python.org/file9786/fix.patch
Hirokazu Yamamoto [EMAIL PROTECTED] added the comment:
Patch revised.
--
components: +Interpreter Core -None
Added file:
http://bugs.python.org/file11548/py3k_adjust_cursor_at_syntax_error.patch
Hirokazu Yamamoto [EMAIL PROTECTED] added the comment:
(I assumed get_length_in_bytes(f, , 1) == 1 but I'm not sure
this is always true on other platforms. A nicer and more
general solution probably exists)
This assumption still stands, but I cannot find a better solution.
I'm thinking now
Changes by Hirokazu Yamamoto [EMAIL PROTECTED]:
Removed file: http://bugs.python.org/file9723/experimental.patch
Hirokazu Yamamoto [EMAIL PROTECTED] added the comment:
I tried to fix this problem, but I'm not sure how to fix this.
Quick observation...
///
// Possible Solution
1. Convert err-text to console compatible encoding (not to source
encoding like in python2.x)
New submission from Hirokazu Yamamoto [EMAIL PROTECTED]:
Hello. I found another problem related to issue2301.
The SyntaxError cursor "^" is shifted when multibyte
characters are in the line (before the "^").
I think this is because err-text is stored as UTF-8,
which requires 3 bytes for a multibyte character,
but
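The shift reported here can be reproduced with a few lines of Python. The sample line is an assumption (three hiragana characters standing in for the Japanese text of the report); the sketch only compares the two ways of counting and does not reproduce the interpreter's own code.

```python
line = "あいう$ = None"  # assumed sample; "$" is the invalid character

char_offset = line.index("$")                          # characters before "$"
byte_offset = len(line[:char_offset].encode("utf-8"))  # UTF-8 bytes before "$"

print(char_offset)
print(byte_offset)

print(line)
print(" " * byte_offset + "^")  # caret positioned by byte count
print(line)
print(" " * char_offset + "^")  # caret positioned by character count
```

Each hiragana character takes 3 bytes in UTF-8, so the byte-based caret lands 6 positions too far to the right. The character-based caret is still only correct where every character occupies one column, which is not the case for wide characters in most terminals.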