[issue12014] str.format parses replacement field incorrectly

Ben Wolfson Fri, 03 Jun 2011 17:47:49 -0700

Ben Wolfson <wolf...@gmail.com> added the comment:

"""
From the PEP: "Format strings consist of intermingled character data and
markup."
"""


I know. Here is an example of a format string:

"hello, {0}"

Here is the character data from that format string:

"hello, "

Here is the markup:

"{0}"

This follows *directly* from the definition of "character data", which I've 
quoted several times now. In the following expression:

"{0}".format(1)

there is NO character data, because there is NOTHING "which is 
transferred unchanged from the format string to the output string".

The "{0}" doesn't appear in the output string at all. And the 1 isn't 
transferred unchanged: it has str() called on it. Since there is nothing which 
meets the definition of character data, there is nothing which *is* character 
data in the string, regarded as a format string. It is pure markup---it 
consists solely of a replacement field delimited by curly braces. I really 
don't see why this matters at all, but, nevertheless, I apologize if I'm 
explaining it poorly.
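
For illustration, here is a minimal sketch of that split. It assumes that the standard library's string.Formatter().parse() reports literal text and field names in the same way the PEP divides character data from markup; the helper name split_format is hypothetical and is not part of the PEP or of CPython.

```python
from string import Formatter


def split_format(fmt):
    """Return (character_data, markup_fields) for a format string.

    Hypothetical helper: it only regroups what Formatter().parse() yields.
    """
    character_data, markup_fields = [], []
    for literal, field, spec, conversion in Formatter().parse(fmt):
        if literal:
            # Text copied unchanged into the output string.
            character_data.append(literal)
        if field is not None:
            # The field name found between the curly braces.
            markup_fields.append(field)
    return character_data, markup_fields


print(split_format("hello, {0}"))
print(split_format("{0}"))
```

For "{0}" the first list comes back empty: under this reading the string contains no character data, only a replacement field.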

"""
Again, I'm not sure what you're getting at. The inner "{0}" is not interpreted 
(per the PEP). So the entire string is replaced by d['{0}'], or 'spam'.

Let me try to explain it again. str.format() parses the string, looking for 
matched sets of braces. In your last example above, the very first character 
'{' is matched to the very last character '}'. They match, in the sense that all of 
the nested ones inside match. Once the markup is separated from the character 
data, the interpretation of what's inside the markup is then done. In this 
example, there is no character data.
"""

Yes, there is no character data. And I understand perfectly what is happening. 
Here's the problem: your description of what the implementation does is 
incorrect. You say that 

"""
The current implementation of str.format() finds matched pairs of braces and 
calls what's inside "markup", then parses that markup.
"""

Now, the only reason for thinking that this:

"{0[}]}"

should be treated differently from this:

"{0[a]}"

is that inside square brackets, curly brackets indicate replacement fields. If 
you want to justify what the current implementation does as an implementation 
of the PEP and an interpretation of what the PEP says, you *have* to think 
that. But if you think that, then the current implementation should *not* treat 
this:

"{0[{0}]}"

the way it does, because it does *not* treat the interior curly braces as 
indications of a replacement field---or rather, it does at one point in the 
source (in MarkupIterator_next) and it doesn't at another (in 
FieldNameIterator). I agree that what the current implementation does in the 
last example is in fact correct. But if it's correct in the one case, it's 
incorrect in the other, and vice versa. There is no justification, in terms of 
the PEP, for the present behavior.
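
To make the inconsistency concrete, here is a rough model of the brace-counting scan as it is described in the quoted text. The function outer_field is hypothetical and is not CPython's actual code (MarkupIterator_next is written in C); it only counts nested curly braces to find where the first replacement field ends. The last lines call the real str.format() on the nested example with a dictionary whose key is the literal string "{0}".

```python
def outer_field(fmt):
    """Return the text between the first '{' and its matching '}'.

    Hypothetical model of the brace-counting scan described above;
    square brackets are deliberately ignored, as in that description.
    """
    start = fmt.index("{")
    depth = 0
    for i in range(start, len(fmt)):
        if fmt[i] == "{":
            depth += 1
        elif fmt[i] == "}":
            depth -= 1
            if depth == 0:
                return fmt[start + 1:i]
    raise ValueError("unmatched '{' in format string")


# What the counting scan would hand over as "markup" in each case.
for fmt in ("{0[a]}", "{0[}]}", "{0[{0}]}"):
    print(repr(fmt), "->", repr(outer_field(fmt)))

# The nested case through the real str.format(): the inner "{0}" is used
# as a literal dictionary key, not expanded as a replacement field.
d = {"{0}": "spam"}
print("{0[{0}]}".format(d))
```

Under this model the scan cuts "{0[}]}" off at the first closing brace, leaving "0[" as the field, yet it steps over the inner braces of "{0[{0}]}" as if they were a nested field, while the field-name parsing then treats those same braces as ordinary key characters.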

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue12014>
_______________________________________