[issue28749] Fixed the documentation of the mapping codec APIs

2017-03-24 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:


New changeset c85a26628ceb9624c96c3064e8b99033c026d8a3 by Serhiy Storchaka in 
branch 'master':
bpo-28749: Fixed the documentation of the mapping codec APIs. (#487)
https://github.com/python/cpython/commit/c85a26628ceb9624c96c3064e8b99033c026d8a3


--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28749] Fixed the documentation of the mapping codec APIs

2017-03-24 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:


New changeset 69eab3123ed1de4bed4b7dedecabe415f6139bb6 by Serhiy Storchaka in 
branch '3.6':
bpo-28749: Fixed the documentation of the mapping codec APIs. (#487) (#714)
https://github.com/python/cpython/commit/69eab3123ed1de4bed4b7dedecabe415f6139bb6


--




[issue28749] Fixed the documentation of the mapping codec APIs

2017-03-24 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:


New changeset 88b32eb7b317dd7c7943433f980e17e34e50f8f8 by Serhiy Storchaka in 
branch '3.5':
bpo-28749: Fixed the documentation of the mapping codec APIs. (#487) (#715)
https://github.com/python/cpython/commit/88b32eb7b317dd7c7943433f980e17e34e50f8f8


--




[issue28749] Fixed the documentation of the mapping codec APIs

2017-03-21 Thread Serhiy Storchaka

Changes by Serhiy Storchaka :


--
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed




[issue28749] Fixed the documentation of the mapping codec APIs

2017-03-19 Thread Serhiy Storchaka

Changes by Serhiy Storchaka :


--
pull_requests: +635




[issue28749] Fixed the documentation of the mapping codec APIs

2017-03-19 Thread Serhiy Storchaka

Changes by Serhiy Storchaka :


--
pull_requests: +634




[issue28749] Fixed the documentation of the mapping codec APIs

2017-03-12 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Do you still have objections, Marc-Andre?

--




[issue28749] Fixed the documentation of the mapping codec APIs

2017-03-05 Thread Serhiy Storchaka

Changes by Serhiy Storchaka :


--
pull_requests: +400




[issue28749] Fixed the documentation of the mapping codec APIs

2017-02-02 Thread Serhiy Storchaka

Changes by Serhiy Storchaka :


--
assignee: docs@python -> serhiy.storchaka




[issue28749] Fixed the documentation of the mapping codec APIs

2017-01-23 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

"bytes ordinals" is a good term. Thank you. Here is an updated patch.

> No, I'd prefer this deprecation to be undone as long as we
> don't have a proper alternative for the API.

This is a different issue.

--
Added file: http://bugs.python.org/file46389/docs-PyUnicode_Translate-4.patch




[issue28749] Fixed the documentation of the mapping codec APIs

2017-01-23 Thread Marc-Andre Lemburg

Marc-Andre Lemburg added the comment:

>> The only part that is not correct is "single string characters".
>> This should read "single bytes" or "bytes strings of length 1".
>
> This is not correct. Decoding mappings map not bytes strings, but
> integers.

Looking at the implementation, you're right. AFAIR, the first
incarnation of the charmap codec used single chars in Python 2.
I guess the documentation was never updated when the change was
made to use integers instead.

> And this is not the only incorrect part. Decoding mappings can map to
> multicharacter Unicode strings, not to single Unicode characters. Not
> just None, but the integer 0xfffe and Unicode string '\ufffe' mean
> "undefined mapping".

Yes, this was added later on as well. Apparently the docs
were never updated.

> There are similar incorrectnesses about encoding mappings.

Ok, fair enough, let's remove the two paragraphs.

>> I also don't see where you copied the description. Without some
>> description of what "mappings" are in the context of the charmap
>> codec, it's not easy to understand what the purpose of these
>> APIs is. Please just fix the bytes wording instead of removing the
>> whole intro.
>
> Decoding mappings were described in the introduction and in the
> description of PyUnicode_DecodeCharmap() (both are outdated and
> incomplete). I merged and corrected descriptions and left it only in
> one place, since PyUnicode_DecodeCharmap() is the only function that
> needs this. Same for encoding mappings. Both decoding and encoding
> mappings do not have a relation to PyUnicode_Translate(). The paragraph
> about a LookupError in the introduction was totally wrong. I left in
> the introduction only common part. Other details are too different in
> decoding, encoding and translation mappings.
>
>> >> Also, this wording needs to be corrected: "bytes (integers in the
>> >> range from 0 to 255)". Bytes are not integers. I'd suggest to use
>> >> the more correct wording "bytes strings of length 1".
>> >
>> > The word "bytes" means here not Python bytes object, but is used in
>> > more common meaning: an integer in the range from 0 to 255.
>>
>> That's confusing, since we use the term "bytes" as referring
>> to the bytes object in Python. Please use "integers in the range
>> 0-255".
>
> Okay, I'll remove the word "bytes" here. But how would you formulate the
> following sentence: "Unmapped bytes (ones which cause a
> :exc:`LookupError`) as well as mapped to ``None``, ``0xFFFE`` or
> ``'\ufffe'`` are treated as "undefined mapping" and cause an error."?

Better:

"""
If *mapping* is *NULL*, Latin-1 decoding will be applied.  Else
*mapping* must map bytes ordinals (integers in the range from 0 to 255)
to Unicode strings, integers (which are then interpreted as Unicode
ordinals) or ``None``. Unmapped data bytes - ones which cause a
:exc:`LookupError`, as well as ones which get mapped to ``None``,
``0xFFFE`` or ``'\ufffe'``, are treated as undefined mappings and cause
an error.
"""

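As an illustration only (not part of the patch), the rules in this wording can be exercised from Python instead of C. The sketch below assumes that codecs.charmap_decode() applies the same mapping rules as PyUnicode_DecodeCharmap() and that the mapping is a plain dict; DECODING_MAP, decode() and is_undefined() are hypothetical names made up for the example.

```python
import codecs

# Hypothetical decoding mapping: keys are bytes ordinals (integers 0-255).
DECODING_MAP = {
    0x41: 'a',       # maps to a one-character Unicode string
    0x42: 'bc',      # maps to a multi-character Unicode string
    0x43: 0x263A,    # maps to an integer, read as a Unicode ordinal
    0x44: None,      # treated as an undefined mapping
    0x45: 0xFFFE,    # treated as an undefined mapping
    0x46: '\ufffe',  # treated as an undefined mapping
}


def decode(data, errors='strict'):
    """Decode bytes through DECODING_MAP and return only the text."""
    text, _consumed = codecs.charmap_decode(data, errors, DECODING_MAP)
    return text


def is_undefined(byte_value):
    """Return True if decoding this single byte fails in strict mode."""
    try:
        decode(bytes([byte_value]))
    except UnicodeDecodeError:
        return True
    return False


print(decode(b'ABC'))
# 0x47 is not a key at all, so the lookup raises a LookupError internally.
print([is_undefined(b) for b in (0x44, 0x45, 0x46, 0x47)])
# Passing None as the mapping falls back to Latin-1 decoding.
print(codecs.charmap_decode(b'\xe9', 'strict', None)[0])
```

With the 'replace' error handler the undefined bytes are expected to become U+FFFD instead of raising.
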
>> Aside: The deprecation of PyUnicode_EncodeCharmap() also seems misplaced
>> in this context, since only the Py_UNICODE version of the API is
>> deprecated. The functionality still exists and is useful. An API
>> similar to the _PyUnicode_EncodeCharmap() API should be made publicly
>> available to accommodate for the deprecation, since the mentioned
>> PyUnicode_AsCharmapString() and PyUnicode_AsEncodedString()
>> APIs are not suitable as replacement. PyUnicode_AsCharmapString()
>> doesn't support error handling (strange, BTW) and
>> PyUnicode_AsEncodedString() has a completely unrelated meaning (no
>> idea why it's mentioned here at all).
>
> Only PyUnicode_EncodeCharmap() is deprecated,
> PyUnicode_AsCharmapString() is not deprecated. I placed the deprecated
> function just after its non-deprecated counterpart following the
> pattern for other deprecated functions. If you prefer I'll move both
> deprecated functions (PyUnicode_EncodeCharmap and
> PyUnicode_TranslateCharmap) together at the end of this section.

No, I'd prefer this deprecation to be undone as long as we
don't have a proper alternative for the API.

Looking at the various deprecations for the Py_UNICODE APIs,
I find that the Unicode API symmetry was severely broken.
In the Python 2 API, we always had a PyUnicode_Encode...() and
a corresponding PyUnicode_Decode...() API for every codec.

In Python 3, the encode APIs were apparently all deprecated
due to their use of Py_UNICODE and only the much less useful
PyUnicode_As...String() APIs were left, which intentionally do not
have an error argument, because they were intended as a quick
replacement for PyString_AsString() uses in Python 2.

> I don't know why PyUnicode_AsCharmapString() doesn't support the errors
> argument. I added PyUnicode_AsEncodedString() as a replacement
> (issue19569) because this is the only public non-deprecated way to do a
> charmap encoding with error handling. There is no exact equivalent, but
> PyUnicode_AsCharmapString() and 

[issue28749] Fixed the documentation of the mapping codec APIs

2017-01-22 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

How can we move this issue forward? Marc-Andre, have I answered your
objections?

--




[issue28749] Fixed the documentation of the mapping codec APIs

2016-11-28 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

> The only part that is not correct is "single string characters".
> This should read "single bytes" or "bytes strings of length 1".

This is not correct. Decoding mappings map integers, not bytes strings.
And this is not the only incorrect part. Decoding mappings can map to
multicharacter Unicode strings, not only to single Unicode characters. Not just
None, but also the integer 0xfffe and the Unicode string '\ufffe' mean
"undefined mapping".

There are similar inaccuracies in the description of encoding mappings.

> I also don't see where you copied the description. Without some
> description of what "mappings" are in the context of the charmap
> codec, it's not easy to understand what the purpose of these
> APIs is. Please just fix the bytes wording instead of removing the
> whole intro.

Decoding mappings were described in the introduction and in the description of
PyUnicode_DecodeCharmap() (both are outdated and incomplete). I merged and
corrected the descriptions and left them in only one place, since
PyUnicode_DecodeCharmap() is the only function that needs this. The same goes
for encoding mappings. Neither decoding nor encoding mappings are related
to PyUnicode_Translate(). The paragraph about a LookupError in the
introduction was totally wrong. I left only the common part in the introduction.
Other details differ too much between decoding, encoding and translation
mappings.
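
One such difference can be sketched in Python. This is only an illustration, assuming that str.translate() follows the same mapping rules as PyUnicode_Translate() and that codecs.charmap_decode() follows those of PyUnicode_DecodeCharmap(); TRANSLATION_MAP, translate() and decoding_none_fails() are hypothetical names.

```python
import codecs

# Hypothetical translation table: keys are Unicode ordinals.
TRANSLATION_MAP = {
    ord('a'): 'xy',      # replaced by a string
    ord('b'): ord('B'),  # replaced by the character with this ordinal
    ord('c'): None,      # deleted from the result
}


def translate(text):
    """Apply TRANSLATION_MAP; unmapped characters are copied unchanged."""
    return text.translate(TRANSLATION_MAP)


def decoding_none_fails():
    """In a decoding mapping, None means "undefined" and is an error."""
    try:
        codecs.charmap_decode(b'c', 'strict', {ord('c'): None})
    except UnicodeDecodeError:
        return True
    return False


print(translate('abcd'))
print(decoding_none_fails())
```

In translation, None deletes the character and a missing key leaves it untouched, while in decoding both cases are undefined mappings, which is why a single shared description does not fit.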

> >> Also, this wording needs to be corrected: "bytes (integers in the range
> >> from 0 to 255)". Bytes are not integers. I'd suggest to use the more
> >> correct wording "bytes strings of length 1".
> >
> > The word "bytes" means here not Python bytes object, but is used in more
> > common meaning: an integer in the range from 0 to 255.
>
> That's confusing, since we use the term "bytes" as referring
> to the bytes object in Python. Please use "integers in the range
> 0-255".

Okay, I'll remove the word "bytes" here. But how would you formulate the
following sentence: "Unmapped bytes (ones which cause a :exc:`LookupError`) as
well as mapped to ``None``, ``0xFFFE`` or ``'\ufffe'`` are treated as
"undefined mapping" and cause an error."?

> Aside: The deprecation of PyUnicode_EncodeCharmap() also seems misplaced
> in this context, since only the Py_UNICODE version of the API is
> deprecated. The functionality still exists and is useful. An API
> similar to the _PyUnicode_EncodeCharmap() API should be made publicly
> available to accommodate for the deprecation, since the mentioned
> PyUnicode_AsCharmapString() and PyUnicode_AsEncodedString()
> APIs are not suitable as replacement. PyUnicode_AsCharmapString()
> doesn't support error handling (strange, BTW) and
> PyUnicode_AsEncodedString() has a completely unrelated meaning (no
> idea why it's mentioned here at all).

Only PyUnicode_EncodeCharmap() is deprecated; PyUnicode_AsCharmapString() is
not deprecated. I placed the deprecated function just after its non-deprecated
counterpart, following the pattern for other deprecated functions. If you prefer,
I'll move both deprecated functions (PyUnicode_EncodeCharmap and
PyUnicode_TranslateCharmap) together to the end of this section.

I don't know why PyUnicode_AsCharmapString() doesn't support the errors
argument. I added PyUnicode_AsEncodedString() as a replacement (issue19569)
because this is the only public non-deprecated way to do a charmap encoding
with error handling. There is no exact equivalent, but
PyUnicode_AsCharmapString() and PyUnicode_AsEncodedString() cover different
use cases of PyUnicode_EncodeCharmap().
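
For readers who want to see what charmap encoding with an errors argument looks like, here is a rough Python-level sketch. It uses codecs.charmap_encode(), an undocumented helper used by the standard library's own charmap codecs, and is not the C API discussed here. The encoding-mapping rules shown in the comments (integers 0-255, bytes objects or None as values) are my reading of CPython's behaviour rather than something stated in this thread; ENCODING_MAP, encode() and fails_strict() are hypothetical names.

```python
import codecs

# Hypothetical encoding mapping: keys are Unicode ordinals.
ENCODING_MAP = {
    ord('a'): 0x41,   # maps to a single byte given as an integer 0-255
    ord('b'): b'BC',  # maps to a bytes object
    ord('c'): None,   # treated as an undefined mapping
}


def encode(text, errors='strict'):
    """Encode text through ENCODING_MAP and return only the bytes."""
    data, _consumed = codecs.charmap_encode(text, errors, ENCODING_MAP)
    return data


def fails_strict(text):
    """Return True if strict encoding of text raises UnicodeEncodeError."""
    try:
        encode(text)
    except UnicodeEncodeError:
        return True
    return False


print(encode('ab'))
# 'c' maps to None and 'z' is not mapped at all; both fail in strict mode.
print(fails_strict('c'), fails_strict('z'))
# The errors argument changes the outcome: 'ignore' skips the character.
print(encode('acb', 'ignore'))
```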

--




[issue28749] Fixed the documentation of the mapping codec APIs

2016-11-28 Thread Marc-Andre Lemburg

Marc-Andre Lemburg added the comment:

On 28.11.2016 12:24, Serhiy Storchaka wrote:
>> Why are you removing the introductory section on how mappings work ?
> 
> Because it is not correct. I copied it to the descriptions of the concrete
> functions, correcting it according to the peculiarities of each function.

The only part that is not correct is "single string characters".
This should read "single bytes" or "bytes strings of length 1".

I also don't see where you copied the description. Without some
description of what "mappings" are in the context of the charmap
codec, it's not easy to understand what the purpose of these
APIs is. Please just fix the bytes wording instead of removing the
whole intro.

>> Also, this wording needs to be corrected: "bytes (integers in the range from 
>> 0 to 255)". Bytes are not integers. I'd suggest to use the more correct 
>> wording "bytes strings of length 1".
> 
> The word "bytes" means here not Python bytes object, but is used in more 
> common meaning: an integer in the range from 0 to 255.

That's confusing, since we use the term "bytes" as referring
to the bytes object in Python. Please use "integers in the range
0-255".

Aside: The deprecation of PyUnicode_EncodeCharmap() also seems misplaced
in this context, since only the Py_UNICODE version of the API is
deprecated. The functionality still exists and is useful. An API
similar to the _PyUnicode_EncodeCharmap() API should be made publicly
available to accommodate for the deprecation, since the mentioned
PyUnicode_AsCharmapString() and PyUnicode_AsEncodedString()
APIs are not suitable as replacement. PyUnicode_AsCharmapString()
doesn't support error handling (strange, BTW) and
PyUnicode_AsEncodedString() has a completely unrelated meaning (no
idea why it's mentioned here at all).

--




[issue28749] Fixed the documentation of the mapping codec APIs

2016-11-28 Thread Serhiy Storchaka

Changes by Serhiy Storchaka :


Added file: http://bugs.python.org/file45669/docs-PyUnicode_Translate-3.patch




[issue28749] Fixed the documentation of the mapping codec APIs

2016-11-28 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Thanks, Xiang. I forgot about the comments in the headers; the updated patch
updates them too.

> Why are you removing the introductory section on how mappings work ?

Because it is not correct. I copied it to the descriptions of the concrete
functions, correcting it according to the peculiarities of each function.

> Also, this wording needs to be corrected: "bytes (integers in the range from 
> 0 to 255)". Bytes are not integers. I'd suggest to use the more correct 
> wording "bytes strings of length 1".

The word "bytes" here does not mean the Python bytes object, but is used in its
more common meaning: an integer in the range from 0 to 255.

--




[issue28749] Fixed the documentation of the mapping codec APIs

2016-11-28 Thread Marc-Andre Lemburg

Marc-Andre Lemburg added the comment:

Why are you removing the introductory section on how mappings work?

Also, this wording needs to be corrected: "bytes (integers in the range from 0
to 255)". Bytes are not integers. I'd suggest using the more correct wording
"bytes strings of length 1".

--
nosy: +lemburg




[issue28749] Fixed the documentation of the mapping codec APIs

2016-11-24 Thread Xiang Zhang

Xiang Zhang added the comment:

v2 LGTM.

--
nosy: +xiang.zhang




[issue28749] Fixed the documentation of the mapping codec APIs

2016-11-24 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Updated patch addresses Xiang's and Victor's comments.

--
Added file: http://bugs.python.org/file45623/docs-PyUnicode_Translate-2.patch




[issue28749] Fixed the documentation of the mapping codec APIs

2016-11-20 Thread Serhiy Storchaka

New submission from Serhiy Storchaka:

The proposed patch adds the documentation of PyUnicode_Translate() and fixes the
documentation of the other mapping codec APIs (it is incorrect for Python 3):
PyUnicode_DecodeCharmap(), PyUnicode_AsCharmapString(),
PyUnicode_EncodeCharmap(), and PyUnicode_TranslateCharmap().

--
assignee: docs@python
components: Documentation, Unicode
files: docs-PyUnicode_Translate.patch
keywords: patch
messages: 281259
nosy: docs@python, ezio.melotti, haypo, serhiy.storchaka
priority: normal
severity: normal
stage: patch review
status: open
title: Fixed the documentation of the mapping codec APIs
type: behavior
versions: Python 3.5, Python 3.6, Python 3.7
Added file: http://bugs.python.org/file45557/docs-PyUnicode_Translate.patch
