[issue21279] str.translate documentation incomplete

2015-08-05 Thread Zachary Ware

Zachary Ware added the comment:

Very minor grammatical fixes, reflowed the .rst docs, and re-added the codecs 
module mention in a less obtrusive manner, but the patch is committed.  Thank 
you Kinga, Martin, and John!

--
nosy: +zach.ware

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue21279
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21279] str.translate documentation incomplete

2015-08-05 Thread Roundup Robot

Roundup Robot added the comment:

New changeset ae53bd5decae by Zachary Ware in branch '3.4':
Issue #21279: Flesh out str.translate docs
https://hg.python.org/cpython/rev/ae53bd5decae

New changeset 064b569e38fe by Zachary Ware in branch '3.5':
Issue #21279: Merge with 3.4
https://hg.python.org/cpython/rev/064b569e38fe

New changeset 967c9a9fe724 by Zachary Ware in branch 'default':
Closes #21279: Merge with 3.5
https://hg.python.org/cpython/rev/967c9a9fe724

--
nosy: +python-dev
resolution:  -> fixed
stage: commit review -> resolved
status: open -> closed




[issue21279] str.translate documentation incomplete

2015-06-20 Thread Martin Panter

Martin Panter added the comment:

Patch v6 looks okay, so I think it is ready to commit.

--
stage: patch review -> commit review
versions: +Python 3.6




[issue21279] str.translate documentation incomplete

2015-01-25 Thread John Posner

John Posner added the comment:

Per Martin's suggestion, deltas from issue21279.v5.patch:

* no change to patch for Doc/library/stdtypes.rst

* docstring reflowed in patch for Objects/unicodeobject.c

--
Added file: http://bugs.python.org/file37855/issue21279.v6.patch




[issue21279] str.translate documentation incomplete

2015-01-25 Thread Berker Peksag

Changes by Berker Peksag berker.pek...@gmail.com:


--
nosy: +berker.peksag




[issue21279] str.translate documentation incomplete

2015-01-24 Thread Martin Panter

Martin Panter added the comment:

I’m happy with the new wording in v5. Maybe the docstring in the C module could 
be reflowed though.

--




[issue21279] str.translate documentation incomplete

2014-12-23 Thread John Posner

John Posner added the comment:

issue21279.v5.patch tries to apply the comments in msg233013, msg233014, and 
msg233025 to the Doc/library/stdtypes.rst writeup. Then it applies some of the 
same language to the docstring in Objects/unicodeobject.c.

--
Added file: http://bugs.python.org/file37536/issue21279.v5.patch




[issue21279] str.translate documentation incomplete

2014-12-22 Thread Terry J. Reedy

Terry J. Reedy added the comment:

I agree with Serhiy: no bullet points, links to glossary (at least in doc), 
without repeating.

--




[issue21279] str.translate documentation incomplete

2014-12-22 Thread Martin Panter

Martin Panter added the comment:

The problem with mappings and sequences is that they both require len() and 
iter() implementations, but str.translate() only requires __getitem__(). 
Perhaps a qualifier could work, like:

The table must implement the __getitem__() method of mappings and sequences.
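
A minimal sketch of the point above, assuming a hypothetical class named
GetItemOnlyTable (it is not part of any patch in this issue). It defines
neither __len__() nor __iter__(), yet str.translate() accepts it:

```python
class GetItemOnlyTable:
    """Hypothetical lookup table that defines only __getitem__()."""

    def __getitem__(self, ordinal):
        # Map ASCII digits to '#'; report every other character as unmapped.
        if ord('0') <= ordinal <= ord('9'):
            return '#'
        raise LookupError(ordinal)


masked = 'call 555-0199'.translate(GetItemOnlyTable())
print(masked)
```

The object is neither a full mapping nor a full sequence, which is why a
qualifier about __getitem__() may describe the requirement more precisely.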

--




[issue21279] str.translate documentation incomplete

2014-12-21 Thread Martin Panter

Martin Panter added the comment:

Patch v4 with John’s doc string wording

--
Added file: http://bugs.python.org/file37522/issue21279.v4.patch




[issue21279] str.translate documentation incomplete

2014-12-21 Thread John Posner

John Posner added the comment:

Patch of 12-21 looks good, Martin.

--




[issue21279] str.translate documentation incomplete

2014-12-21 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Proposed wording looks superfluously verbose to me.

Look also at description in Include/unicodeobject.h:

/* Translate a string by applying a character mapping table to it and
   return the resulting Unicode object.

   The mapping table must map Unicode ordinal integers to Unicode
   ordinal integers or None (causing deletion of the character).

   Mapping tables may be dictionaries or sequences. Unmapped character
   ordinals (ones which cause a LookupError) are left untouched and
   are copied as-is.

*/

It is repeated (more detailed) in Doc/c-api/unicode.rst. Isn't it pretty clear?
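
For reference, a small sketch of the behaviour that comment describes, seen
from the Python side. The table contents and variable names are illustrative
assumptions, not taken from the C sources:

```python
# A dictionary table: keys are Unicode ordinals.
dict_table = {ord('a'): ord('A'), ord('b'): None}

# A sequence table: the index is the ordinal. This list only covers
# ordinals 0..127, so anything higher raises IndexError when looked up.
list_table = list(range(128))
list_table[ord('a')] = ord('A')
list_table[ord('b')] = None

text = 'abc\u00e9'  # the last character has ordinal 233
dict_result = text.translate(dict_table)
list_result = text.translate(list_table)
print(dict_result, list_result)
```

In both cases 'a' is replaced, 'b' is deleted, and characters whose lookup
raises a LookupError (KeyError for the dict, IndexError for the list) are
copied as-is.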

--
components: +Unicode
nosy: +ezio.melotti, georg.brandl, haypo, serhiy.storchaka




[issue21279] str.translate documentation incomplete

2014-12-21 Thread Martin Panter

Martin Panter added the comment:

Serhiy, can you point out which bits are too verbose? Perhaps you prefer it 
without the bullet list, like in the earlier 2014-12-13 version of the patch.

Looking at the C API, I see a couple of problems there:
* It omits mentioning that an ordinal can map to a replacement string
* It looks like the documented None behaviour applies when errors="ignore"; 
otherwise it invokes a codec error handler

--




[issue21279] str.translate documentation incomplete

2014-12-21 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

> Serhiy can you point out which bits are too verbose? Perhaps you prefer it
> without the bullet list like in the earlier 2014-12-13 version of the
> patch.

I prefer it without the bullet list and without the LookupError expansion 
(there is a link to the LookupError definition, where IndexError and KeyError 
should be mentioned). Instead of the new term "subscriptable objects", use 
"mappings or sequences" with links to the glossary.

> Looking at the C API, I see a couple problems there:

Yes, it is slightly outdated and needs updates.

--




[issue21279] str.translate documentation incomplete

2014-12-19 Thread John Posner

John Posner added the comment:

Regarding Martin's patch of 12-18:

stdtypes.rst -- looks good to me

unicodeobject.c -- I suggest changing this sentence:

If a character is not in the table, the subscript operation should raise 
LookupError, and the character is left untouched.

 ... to:

If the subscript operation raises a LookupError, the character is left 
untouched.

--




[issue21279] str.translate documentation incomplete

2014-12-17 Thread Martin Panter

Martin Panter added the comment:

Here is a new patch based on John’s suggestion

--
Added file: http://bugs.python.org/file37487/issue21279.patch




[issue21279] str.translate documentation incomplete

2014-12-15 Thread John Posner

John Posner added the comment:

Kindly ignore message #2 on the Rietveld page (sorry for the channel noise). 
Here's my suggested revision:

Return a copy of the string *str* in which each character has been mapped 
through the given translation *table*. The table must be a subscriptable 
object, for instance a list or dictionary; when subscripted (indexed) by a 
Unicode ordinal (an integer in range(1048576)), the table object can:

* return a Unicode ordinal or a string, to map the character to one or more 
other characters.

* return None, to delete the character from the return string.

* raise a LookupError (possibly an instance of subclass IndexError or 
KeyError), to map the character to itself.
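
The outcomes listed above can be sketched with a plain dictionary. The
sample string and table entries are made up for illustration:

```python
table = {
    ord('e'): ord('3'),  # an ordinal: replaced by one other character
    ord('&'): ' and ',   # a string: replaced by one or more characters
    ord('!'): None,      # None: deleted from the result
}
# Every other ordinal raises KeyError (a LookupError subclass) when the
# dictionary is subscripted, so those characters map to themselves.

result = 'bread&cheese!'.translate(table)
print(result)
```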

--
nosy: +jjposner




[issue21279] str.translate documentation incomplete

2014-12-15 Thread Martin Panter

Martin Panter added the comment:

I’m largely happy with any of these revisions. If I end up doing another patch 
I would omit the *str* (it is a class name, not a parameter). Also I would omit 
the range(2^20) claim, unless people think it is important; why is it different 
to sys.maxunicode + 1 = 0x110000?
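
The two bounds in question can be compared directly. This is a quick check
that assumes Python 3.3 or later, where sys.maxunicode is always 0x10FFFF:

```python
import sys

# Valid code points are range(sys.maxunicode + 1).
upper_bound = sys.maxunicode + 1
print(hex(upper_bound))  # 0x110000
print(2 ** 20)           # 1048576, the figure used in the proposed wording

# A valid character whose ordinal lies above 2**20 can still be translated.
high = chr(0x10FFFF)
translated = high.translate({0x10FFFF: 'x'})
```

So range(1048576) would leave out the code points from 0x100000 through
0x10FFFF.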

--




[issue21279] str.translate documentation incomplete

2014-12-13 Thread Terry J. Reedy

Terry J. Reedy added the comment:

Many people may not know that IndexError and KeyError are subclasses of 
LookupError. I have not decided what to add yet, but I think we are close.
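
The subclass relationship is easy to confirm. In this sketch lookup_outcome
is a hypothetical helper, not something from the patch:

```python
def lookup_outcome(table, ordinal):
    """Return table[ordinal], or the name of the LookupError subclass raised."""
    try:
        return table[ordinal]
    except LookupError as exc:
        return type(exc).__name__


print(issubclass(IndexError, LookupError), issubclass(KeyError, LookupError))
print(lookup_outcome([], 97))  # a sequence that is too short
print(lookup_outcome({}, 97))  # a mapping without the key
```

Both failures count as "unmapped" for str.translate(), so the character is
left untouched either way.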

--




[issue21279] str.translate documentation incomplete

2014-12-12 Thread Martin Panter

Martin Panter added the comment:

Update patch with typo fixed, removed note about the “codecs” module (which I 
never found useful either), and updated the doc string with similar wording.

Terry, do you think the wording in the patch is good enough, or do you think 
some of your proposed wording should be included?

--
Added file: http://bugs.python.org/file37436/issue21279.patch




[issue21279] str.translate documentation incomplete

2014-04-18 Thread Kinga Farkas

Kinga Farkas added the comment:

I have created a patch based on Martin Panter's suggestions.  Please let me 
know if it is off or there should be additional changes included.

--
keywords: +patch
nosy: +lilbludot
Added file: http://bugs.python.org/file34966/issue21279.patch




[issue21279] str.translate documentation incomplete

2014-04-18 Thread Terry J. Reedy

Terry J. Reedy added the comment:

The docstring is more accurate.
>>> str.translate.__doc__
'S.translate(table) -> str\n\nReturn a copy of the string S, where all 
characters have been mapped\nthrough the given translation table, which must be 
a mapping of\nUnicode ordinals to Unicode ordinals, strings, or None.\nUnmapped 
characters are left untouched. Characters mapped to None\nare deleted.'

To me, even this is a bit unclear on exceptions and 'unmapped'. Based on 
experiments and then reading the C source, I determined that LookupErrors mean 
'unmapped' while other exceptions are passed on and terminate the translation.

Return a copy of the string S, where all characters have been mapped through 
the given translation table. When subscripted by a Unicode ordinal (integer in 
range(1048576)), the table must return a Unicode ordinal, string, or None, or 
else raise a LookupError. A LookupError, which includes instances of subclasses 
IndexError and KeyError, indicates that the character is unmapped and should be 
left untouched. Characters mapped to None are deleted.

class Table:
    def __getitem__(self, key):
        if key == 99: raise LookupError()  # 'c'
        elif key == 100: return None  # 'd'
        elif key == 101: return 'xyz'  # 'e'
        else: return key + 1

print('abcdef'.translate(Table()))
# bccxyzg
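
The Table example above covers the LookupError case. As a complementary
sketch of the other half of the observation, this hypothetical StrictTable
class raises a ValueError for one character; that exception is not
swallowed and ends the translation:

```python
class StrictTable:
    """Hypothetical table that rejects 'x' with a non-LookupError exception."""

    def __getitem__(self, ordinal):
        if ordinal == ord('x'):
            raise ValueError('x is not allowed')
        # Everything else is reported as unmapped.
        raise LookupError(ordinal)


unchanged = 'abc'.translate(StrictTable())  # only LookupErrors: a plain copy

try:
    'abxc'.translate(StrictTable())
except ValueError as exc:
    propagated = str(exc)  # the ValueError reaches the caller
else:
    propagated = None
```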

The current doc ends with this note:

An even more flexible approach is to create a custom character mapping codec 
using the codecs module (see encodings.cp1251 for an example).

I don't see how this is supposed to help. encodings.cp1251 uses a string of 256 
chars as a lookup table.

--
nosy: +terry.reedy




[issue21279] str.translate documentation incomplete

2014-04-18 Thread Terry J. Reedy

Terry J. Reedy added the comment:

I see that we mostly added the same info.

--




[issue21279] str.translate documentation incomplete

2014-04-17 Thread Josh Rosenberg

Josh Rosenberg added the comment:

For the record, I have intentionally used bytes.maketrans to make a translation 
table for str.translate for precisely this reason; it's much faster to look up 
an ordinal in a bytes object than in a dictionary. Before the recent (partial) 
patch for str.translate performance (#21118), this was a huge improvement if 
you only needed to worry about latin-1 characters (though encoding to latin-1, 
using bytes.translate, then decoding again was still faster). It's still faster 
than using a dictionary even with the patch from #21118, but it's not nearly as 
significant.
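
A minimal sketch of the technique described above; the sample text is made
up, and no timing claims are implied. A bytes object from bytes.maketrans
has 256 entries, and subscripting it with an ordinal returns an integer, so
it can serve as a str.translate table:

```python
byte_table = bytes.maketrans(b'abc', b'xyz')  # 256-entry bytes object
dict_table = str.maketrans('abc', 'xyz')      # the usual dictionary table

text = 'aabbcc \u0394'  # ends with a character outside latin-1
via_bytes = text.translate(byte_table)
via_dict = text.translate(dict_table)
print(via_bytes == via_dict)
```

Ordinals of 256 and above raise IndexError when the bytes object is
subscripted, so those characters are left untouched.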

--
nosy: +josh.rosenberg




[issue21279] str.translate documentation incomplete

2014-04-16 Thread bob gailer

New submission from bob gailer:

Documentation for str.translate only mentions a dictionary for the translation 
table. Actually any iterable can be used, as long as its elements are integer, 
None or str.

Recommend wording:

str.translate(translation_table)

Return a copy of the string s where all characters have been "mapped" through 
the translation_table - which must be either a dictionary mapping Unicode 
ordinals (integers) to Unicode ordinals, strings or None,
or an iterable. In this case the ord() of each character in s is used as an 
index into the iterable; the corresponding element of the iterable replaces the 
character. If ord() of the character exceeds the index range of the iterable, 
no substitution is made.

Example: to shift any of the first 255 ASCII characters to the next:

>>> 'Now is the time for all good men'.translate(range(1, 256))
'Opx!jt!uif!ujnf!gps!bmm!hppe!nfo'

COMMENT: I placed "mapped" in quotes as technically this only applies to 
dictionaries. Not sure what the best word is.

--
assignee: docs@python
components: Documentation
messages: 216630
nosy: bgailer, docs@python
priority: normal
severity: normal
status: open
title: str.translate documentation incomplete
type: enhancement
versions: Python 3.3




[issue21279] str.translate documentation incomplete

2014-04-16 Thread Raymond Hettinger

Changes by Raymond Hettinger raymond.hettin...@gmail.com:


--
keywords: +easy
stage:  -> patch review
versions: +Python 3.4, Python 3.5 -Python 3.3




[issue21279] str.translate documentation incomplete

2014-04-16 Thread Martin Panter

Martin Panter added the comment:

I suspect “iterable” is the wrong term.

>>> from collections.abc import Iterable
>>> isinstance(set(), Iterable)
True
>>> 'abc'.translate(set())
TypeError: 'set' object does not support indexing
>>> 'abc'.translate(object())
TypeError: 'object' object is not subscriptable

Maybe “indexable” or “subscriptable” would be more correct? If this behaviour 
is part of the API, it would be nice to document, because it would have saved 
me a few times from implementing the __len__() and __iter__() methods of the 
mapping interface in my custom lookup tables.

Here is my suggestion:

str.translate(table):

Return a copy of the string where all characters have been mapped through 
“table”, a lookup table. The lookup table must be a subscriptable object, for 
instance a dictionary or list, mapping Unicode ordinals (integers) to Unicode 
ordinals, strings or None. If a character is not in the table, the subscript 
operation should raise LookupError, and the character is left untouched. 
Characters mapped to None are deleted.

--
nosy: +vadmium
