[issue12014] str.format parses replacement field incorrectly

2014-04-06 Thread Benjamin Peterson

Changes by Benjamin Peterson benja...@python.org:


--
resolution:  -> fixed
status: open -> closed

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12014
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue12014] str.format parses replacement field incorrectly

2013-11-26 Thread Benjamin Peterson

Benjamin Peterson added the comment:

Should be generally patched up in 3.4. Try it out.

--
nosy: +benjamin.peterson




[issue12014] str.format parses replacement field incorrectly

2013-10-20 Thread Ben Wolfson

Changes by Ben Wolfson wolf...@gmail.com:


--
versions: +Python 3.4




[issue12014] str.format parses replacement field incorrectly

2013-07-15 Thread Ben Wolfson

Ben Wolfson added the comment:

Ping.

--




[issue12014] str.format parses replacement field incorrectly

2013-02-19 Thread Ben Wolfson

Ben Wolfson added the comment:

> My own preference is to let this quote from PEP 3101 dominate the behaviour: 
> "The rules for parsing an item key are very simple. If it starts with a digit, 
> then it is treated as a number, otherwise it is used as a string."
> 
> That means Petri's suggested solution (allowing any character except a closing 
> square bracket and braces in the item key) sounds good to me.

But ... that isn't what the quotation from the PEP says, since it doesn't 
exclude braces. I also don't really see why the PEP should be given much 
authority in this issue, since it pays extremely cursory attention to this part 
of the format.

In any case, judging by the filename and description (god knows I can't 
remember, having written it nine months ago), strformat-no-braces.diff 
implements that behavior. (Oh, now I see from an earlier comment of mine that 
that is, in fact, what it does.)

Meanwhile, it was five months ago that Eric Smith said "It's on my list of 
things to look at. I have a project due next week, then I'll have some time."

I understand that this is not the biggest deal, but the patch is also pretty 
compact and (I think) easily understood. Petri seemed to think it was mostly ok 
in May 2012, when, IIRC, several people on python-dev agreed that the current 
behavior should be changed. God only knows how unicode_format.h has changed in 
the interim. Peer review for academic papers moves substantially faster than 
this.

--




[issue12014] str.format parses replacement field incorrectly

2013-02-18 Thread Nick Coghlan

Nick Coghlan added the comment:

This actually came up on the core-mentorship list (someone was trying to 
translate old mod-formatting code that used a colon in the lookup names and 
discovered this odd behaviour)

My own preference is to let this quote from PEP 3101 dominate the behaviour: 
"The rules for parsing an item key are very simple. If it starts with a digit, 
then it is treated as a number, otherwise it is used as a string."

That means Petri's suggested solution (allowing any character except a closing 
square bracket and braces in the item key) sounds good to me.
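
A minimal sketch of the quoted item-key rule as CPython applies it: a key made up only of digits is looked up as an integer, anything else as a string. The sample dictionaries are illustrative only.

```python
# Item keys inside [] are never quoted: an all-digit key becomes an int,
# any other key is used as a string.
int_keyed = {10: "int key"}
str_keyed = {"10": "str key", "name": "Guido"}

print("{0[10]}".format(int_keyed))    # looked up as the integer 10
print("{0[name]}".format(str_keyed))  # looked up as the string "name"

try:
    "{0[10]}".format(str_keyed)       # still looked up as the integer 10
except KeyError as exc:
    missing_key = exc.args[0]
    print("KeyError for", repr(missing_key))
```

This is why a string key such as "10" cannot be reached from a format string.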

--
nosy: +ncoghlan




[issue12014] str.format parses replacement field incorrectly

2012-09-29 Thread Barry A. Warsaw

Changes by Barry A. Warsaw ba...@python.org:


--
nosy: +barry




[issue12014] str.format parses replacement field incorrectly

2012-09-06 Thread Alexander Belopolsky

Changes by Alexander Belopolsky alexander.belopol...@gmail.com:


--
nosy: +belopolsky




[issue12014] str.format parses replacement field incorrectly

2012-08-30 Thread Éric Araujo

Éric Araujo added the comment:

You can bring this up to python-dev to get other developers’ opinion.

--




[issue12014] str.format parses replacement field incorrectly

2012-07-21 Thread Ben Wolfson

Ben Wolfson wolf...@gmail.com added the comment:

Ping!

--




[issue12014] str.format parses replacement field incorrectly

2012-06-17 Thread Ben Wolfson

Ben Wolfson wolf...@gmail.com added the comment:

I can certainly address those issues---I'll hold off on doing so, though, until 
it's clearer whether more substantive things come up, so I can just do it in 
one swoop.

--




[issue12014] str.format parses replacement field incorrectly

2012-06-17 Thread Florent Xicluna

Changes by Florent Xicluna florent.xicl...@gmail.com:


--
nosy: +flox




[issue12014] str.format parses replacement field incorrectly

2012-05-25 Thread Petri Lehtinen

Petri Lehtinen pe...@digip.org added the comment:

I added some comments on rietveld. These are only nit-picking about style and 
mostly reflect my personal taste, not show stoppers in any case.

--




[issue12014] str.format parses replacement field incorrectly

2012-05-24 Thread Ben Wolfson

Ben Wolfson wolf...@gmail.com added the comment:

Here's a patch that works against the current unicode_format.h and implements 
what Petri suggested.

--
Added file: http://bugs.python.org/file25699/strformat-no-braces.diff




[issue12014] str.format parses replacement field incorrectly

2012-05-23 Thread Petri Lehtinen

Petri Lehtinen pe...@digip.org added the comment:

Ben Wolfson wrote:
> Maybe, but the last time it went to python-dev (in December) there
> was little discussion at all, and the patches that exist now worked
> on the codebase as it existed then.

Maybe it's pointless to bring it up on python-dev then. I just thought
that people might feel strongly about this.

> Anyway, it seems as if progress is being made on PEP 420, so perhaps
> better to let Eric take a look before bringing it up again?

Let's wait for Eric's comments, as he implemented format() in the
first place.

--




[issue12014] str.format parses replacement field incorrectly

2012-05-22 Thread Petri Lehtinen

Petri Lehtinen pe...@digip.org added the comment:

Ben,

As I've said, I think that we should go for the documented behavior with the 
addition of not allowing braces inside the format string (with the exception of 
format_spec).

So AFAICS, index_string would become

index_string  ::=  <any source character except "]" or "{" or "}"> +
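
The proposed production can be approximated with a regular expression. This is only an illustrative sketch: the names INDEX_STRING and is_valid_index_string are hypothetical and do not come from any of the patches.

```python
import re

# One or more characters, none of which is ']', '{' or '}'.
INDEX_STRING = re.compile(r"[^\]{}]+")

def is_valid_index_string(text):
    """Return True if text matches the proposed index_string production."""
    return INDEX_STRING.fullmatch(text) is not None

for candidate in ["a.b", "a:b", "!", "a}b", "foo ] bar", ""]:
    print(repr(candidate), is_valid_index_string(candidate))
```

Under this rule '.', ':' and '!' are ordinary key characters, while braces and the closing bracket are rejected.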

> Anyway, as far as I can tell the patches would have to be reworked in
> the light of recent changes anyway. I am willing to do this if there's
> actually interest. 

Are you still willing to rework the patches?

And as I said already earlier, it wouldn't hurt if this was taken to python-dev 
once more. If there's a good, working patch ready, it might make it easier to 
gain consensus.

--
versions:  -Python 3.1




[issue12014] str.format parses replacement field incorrectly

2012-05-22 Thread Ben Wolfson

Ben Wolfson wolf...@gmail.com added the comment:

> Are you still willing to rework the patches?

Sure. Now that I've actually looked at unicode_format.h it looks like the 
biggest (relevant) difference might just be that the file isn't named 
string_format.h, so I suspect it will be pretty straightforward.

> And as I said already earlier, it wouldn't hurt if this was taken to 
> python-dev once more. If there's a good, working patch ready, it might 
> make it easier to gain consensus.

Maybe, but the last time it went to python-dev (in December) there was little 
discussion at all, and the patches that exist now worked on the codebase as it 
existed then. Anyway, it seems as if progress is being made on PEP 420, so 
perhaps better to let Eric take a look before bringing it up again?

--




[issue12014] str.format parses replacement field incorrectly

2012-05-19 Thread Ben Wolfson

Ben Wolfson wolf...@gmail.com added the comment:

Ping!

--




[issue12014] str.format parses replacement field incorrectly

2012-05-19 Thread Eric V. Smith

Eric V. Smith e...@trueblade.com added the comment:

I'll look at it when I'm done with PEP 420.

--




[issue12014] str.format parses replacement field incorrectly

2012-03-13 Thread Ben Wolfson

Ben Wolfson wolf...@gmail.com added the comment:

Just curious whether there have been any developments here, now that the first 
3.3 alpha has been released.

--




[issue12014] str.format parses replacement field incorrectly

2011-12-15 Thread Eric V. Smith

Changes by Eric V. Smith e...@trueblade.com:


--
assignee:  -> eric.smith




[issue12014] str.format parses replacement field incorrectly

2011-11-30 Thread Ben Wolfson

Ben Wolfson wolf...@gmail.com added the comment:

> All three patches look different to me.

Yeah, I verified that later; I'm not sure what made me think otherwise except 
that I eyeballed them sloppily. (It's still true that they'd need to target a 
different file for 3.3 now.)

--




[issue12014] str.format parses replacement field incorrectly

2011-11-29 Thread Petri Lehtinen

Petri Lehtinen pe...@digip.org added the comment:

> I just noticed that the patch labelled strformat-as-document is
> actually the same as the other one, owing to my incompetence.

All three patches look different to me.

> Anyway, as far as I can tell the patches would have to be reworked
> in the light of recent changes anyway. I am willing to do this if
> there's actually interest. Otherwise, is there anything else I can
> do here? Is it necessary to write a PEP or take this to python-ideas
> or something?

There's still interest, at least from me :)

In my opinion we should have the documented behavior (integer or identifier as 
field_name), AND braces should be disallowed inside the format string, with the 
exception of one level of nesting in the format_spec part.
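
The one level of nesting mentioned here is what lets a format_spec be built from other arguments. A brief sketch with made-up values:

```python
# The inner field is expanded first and its value becomes the format_spec.
rounded = "{0:{1}}".format(3.14159, ".2f")
padded = "{0:>{width}}".format("x", width=5)

print(rounded)
print(repr(padded))
```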

This should probably be taken to python-dev once more, as the previous 
discussion didn't reach consensus, except that the current approach is bad and 
something needs to be done.

--




[issue12014] str.format parses replacement field incorrectly

2011-11-28 Thread Ben Wolfson

Ben Wolfson wolf...@gmail.com added the comment:

I just noticed that the patch labelled strformat-as-document is actually the 
same as the other one, owing to my incompetence. Anyway, as far as I can tell 
the patches would have to be reworked in the light of recent changes anyway. I 
am willing to do this if there's actually interest. Otherwise, is there 
anything else I can do here? Is it necessary to write a PEP or take this to 
python-ideas or something?

--




[issue12014] str.format parses replacement field incorrectly

2011-07-06 Thread Ben Wolfson

Ben Wolfson wolf...@gmail.com added the comment:

This patch differs from the previous one; its goal is to bring the actual 
behavior of the interpreter into line with the documentation (with the 
exception of using only decimal integers, rather than any integers, wherever 
the documentation for str.format currently has "integer": this does, however, 
conform with current behavior).

--
Added file: http://bugs.python.org/file22598/strformat-as-documented.diff




[issue12014] str.format parses replacement field incorrectly

2011-07-06 Thread Ben Wolfson

Ben Wolfson wolf...@gmail.com added the comment:

And here is a patch for Greg Ewing's proposal: 
http://mail.python.org/pipermail/python-dev/2011-June/111934.html

Again, decimal integers rather than any kind of integers are used.

Both patches alter the exceptions expected in various places in test_unicode's 
test_format:

"{0.}".format() raises a ValueError (because the format string is invalid) 
rather than an IndexError (because there is no argument)

"{0[}".format(), likewise.

"{0]}".format() raises a ValueError (because the format string is invalid) 
rather than a KeyError (because "0]" is taken to be the name of a keyword 
argument---meaning that the test suite was testing the actual behavior of the 
implementation rather than the documented behavior).

"{c]}".format(), likewise.

In this patch, "{0[{1}]}".format('abcdef', 4) raises a ValueError rather than a 
TypeError, because "{1}", being neither a decimalinteger nor an identifier, 
invalidates the replacement field.

Both patches also add tests for constructions like this:

"{[0]}".format([3]) -> '3'
"{.__class__}".format(3) -> "<type 'int'>"

This conforms with the documentation (and current behavior), since in it 
arg_name is defined to be optional, but it is not currently covered in 
test_format, as far as I could tell, anyway.
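
The two constructions with an empty arg_name can be tried directly. A brief sketch, assuming Python 3, where the class is rendered with the word "class" rather than the Python 2 spelling "type":

```python
# An empty arg_name means "next positional argument" (auto-numbering).
first_item = "{[0]}".format([3])
class_repr = "{.__class__}".format(3)

print(first_item)   # index 0 of the first positional argument
print(class_repr)   # the __class__ attribute of the first positional argument
```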

--
Added file: 
http://bugs.python.org/file22599/strformat-just-identifiers-please.diff




[issue12014] str.format parses replacement field incorrectly

2011-07-06 Thread Raymond Hettinger

Raymond Hettinger raymond.hettin...@gmail.com added the comment:

Please stick with "integer" instead of "decimalinteger".   In an effort to make 
the docs more precise, there is an unintended effect of making them harder to 
understand.

--
nosy: +rhettinger




[issue12014] str.format parses replacement field incorrectly

2011-07-06 Thread Ben Wolfson

Ben Wolfson wolf...@gmail.com added the comment:

undo "integer" -> "decimalinteger" in docs

--
Added file: http://bugs.python.org/file22601/strformat-as-documented.diff




[issue12014] str.format parses replacement field incorrectly

2011-07-06 Thread Ben Wolfson

Changes by Ben Wolfson wolf...@gmail.com:


Removed file: http://bugs.python.org/file22598/strformat-as-documented.diff




[issue12014] str.format parses replacement field incorrectly

2011-07-06 Thread Ben Wolfson

Ben Wolfson wolf...@gmail.com added the comment:

(same as previous)

--
Added file: 
http://bugs.python.org/file22602/strformat-just-identifiers-please.diff




[issue12014] str.format parses replacement field incorrectly

2011-07-06 Thread Ben Wolfson

Changes by Ben Wolfson wolf...@gmail.com:


Removed file: 
http://bugs.python.org/file22599/strformat-just-identifiers-please.diff




[issue12014] str.format parses replacement field incorrectly

2011-06-04 Thread Petri Lehtinen

Petri Lehtinen pe...@digip.org added the comment:

> PEP 3101 defines format strings as intermingled character data and markup. 
> Markup defines replacement fields and is delimited by braces. Only after 
> markup is extracted does the PEP talk about interpreting the contents of the 
> markup.
> 
> So, given "{0[a}b]}" the parser first parses out the character data and the 
> markup. The first piece of markup is "{0[a}". That gives a syntax error 
> because it's missing a right bracket.
> 
> I realize you'd like the parser to find the markup as the entire string, but 
> that's not how I read the PEP.

This is a good point, although the support of further replacement
fields inside format_specifiers requires the parser to count matching
braces, if the markup is to be extracted before it's interpreted.

But disallowing unmatched '}' inside the replacement field still
doesn't explain why this shouldn't work:

'{0[!]!r}'.format({'!': 'foo'})

I'm completely fine with disallowing '}', but it seems to me that
there's absolutely no reason to not parse the element_index and later
fields correctly with respect to '!' and ':'.
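
For reference, a short sketch of that case. It assumes CPython 3.4 or later, where this issue was marked fixed; on earlier versions the calls are expected to raise ValueError instead.

```python
# '!' and ':' inside the brackets are part of the key; the conversion and
# format_spec only start after the closing ']'.
bang = "{0[!]!r}".format({"!": "foo"})
colon = "{0[a:b]:>5}".format({"a:b": "x"})

print(bang)         # repr() of the value stored under "!"
print(repr(colon))  # value stored under "a:b", right-aligned in 5 columns
```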

--




[issue12014] str.format parses replacement field incorrectly

2011-06-03 Thread R. David Murray

R. David Murray rdmur...@bitdance.com added the comment:

The documentation is, in principle, wrong.  The actual authority for the 
correct implementation is PEP3101, which says the following:

The str.format() function will have
a minimalist parser which only attempts to figure out when it is
done with an identifier (by finding a '.' or a ']', or '}',
etc.).

Changing that specification would require a discussion on python-dev.

--
nosy: +r.david.murray




[issue12014] str.format parses replacement field incorrectly

2011-06-03 Thread R. David Murray

R. David Murray rdmur...@bitdance.com added the comment:

Note that the PEP also explicitly addresses your concern about getattr, as well 
(validation of the name is delegated to the object's __getattr__).

--




[issue12014] str.format parses replacement field incorrectly

2011-06-03 Thread Ben Wolfson

Ben Wolfson wolf...@gmail.com added the comment:

Hm. As I interpret this:

The str.format() function will have
a minimalist parser which only attempts to figure out when it is
done with an identifier (by finding a '.' or a ']', or '}',
etc.).

The present implementation is at variance with both the documentation *and* the 
PEP, since the present implementation does not in fact figure out when it's 
done with an identifier that way. However, this statement is actually a very 
thin reed on which to make any decisions: a real authority shouldn't say "etc." 
like that! And, of course, we have to add an implicit "depending on what it's 
currently looking at" to the parenthetical, because the two strings "{0[a.b]}" 
and "{0[a].b}" are, and should be, treated differently. In particular, although 
one could find a '.' in the element_index in the former string, the 
minimalist parser should not (and does not) conclude that it's done with the 
identifier *there*:

>>> "{0[a.b]}".format({"a.b":1})
'1'

Instead it treats the '.' as just another character with no particular 
syntactic significance, the same way it does 'a' and 'b'. It's a shame that the 
PEP doesn't go into more detail than it does about this sort of thing.
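
The two strings contrasted above can be compared side by side. A minimal sketch; the SimpleNamespace object is just a convenient stand-in for any object with an attribute named b.

```python
from types import SimpleNamespace

# Inside [] the dot is an ordinary character of the key ...
by_key = "{0[a.b]}".format({"a.b": 1})

# ... while after the closing ']' it introduces an attribute lookup.
by_attr = "{0[a].b}".format({"a": SimpleNamespace(b=2)})

print(by_key, by_attr)
```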

The same should go for '}', when we're looking at an element_index field. It 
should be treated as just another character with no particular syntactic 
significance. At present that is not the case:

>>> "{0[a}b]}".format({"a}b":1})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: Missing ']' in format string

If the attached patch were used, the above expression would evaluate to '1' 
just as did the first one. Now, given the fact that the PEP actually says quite 
little about how this sort of thing is to be handled, and given (as 
demonstrated above with the case of the '.' character) that we can't take the 
little list it gives as indicating when it's done with an identifier regardless 
of context, I don't think this change would constitute a change *to the 
specification*; it does, admittedly, constitute an interpretation of the 
specification, but then, so does the present implementation, and the present 
implementation is at variance with the PEP *anyway*, as regards the characters 
':' and '!'.

The paragraph prior to the one quoted by R. David Murray reads:

Because keys are not quote-delimited, it is not possible to
specify arbitrary dictionary keys (e.g., the strings "10" or
":-]") from within a format string.

I take it that this means (in the first place) that, because a sequence of 
digits is interpreted as a number, the following will fail:

'{0[10]}'.format({"10":4})

And indeed it does. The second example is rather unfortunate, though: is the 
reason one can't use that key because it contains a colon? Or because it 
contains a right square bracket? Even if the present patch is accepted one 
couldn't use a right square bracket, since a parser that could figure out where 
to draw the lines in something like this:

'{0[foo ] bar]}'

would not be very minimalist. However, as I have noted previously, there is no 
reason to rule out colons and exclamation points in the element_index field. 
The PEP doesn't actually take up this question in detail. (It hardly does so at 
all.) However, according to what I think the most reasonable interpretation of 
the PEP is, the present implementation is at variance with the PEP. The present 
implementation is certainly at variance with the documentation, which 
represents to some extent an interpretation and specification of the PEP. 

Consequently, to the extent that changing a specification requires discussion 
on python-dev, it seems to me that the present implementation is already a de 
facto change to the specification, while accepting the attached patch would 
bring the implementation into *greater* accord with the specification---so that 
(to conclude cheekily) *not* accepting the patch is what should require 
discussion on python-dev. However, if it is thought necessary, I'll be happy to 
start the discussion.

--




[issue12014] str.format parses replacement field incorrectly

2011-06-03 Thread R. David Murray

R. David Murray rdmur...@bitdance.com added the comment:

I agree that the current situation is a bit murky and ought to be clarified, 
but I'm going to leave it to Eric to point the way forward, as he is far more 
knowledgeable about this area than I.

--




[issue12014] str.format parses replacement field incorrectly

2011-06-03 Thread Petri Lehtinen

Petri Lehtinen pe...@digip.org added the comment:

I've played around with the str.format() code for a few weeks now, to
investigate its poor performance compared to the % operator.

Having written a few parsers before, I would change it to parse each
part separately:

1. field_name
2a. if followed by '[': element_index (anything until ']')
2b. elif followed by '.': attribute_name
3. if followed by '!': conversion
4. if followed by ':': format_spec (anything until '}')

It seems to me that the documentation also suggests this behavior, and
that this bug report is correct.

As for parsing identifiers, it seems to me that stopping at
'.', ']', and '}' is not enough. In field_name, '[', ':' and '!' would
also be needed, and ':' and '!' in attribute_name. It's a shame that
PEP3101 is so vague on this subject.
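
The standard library exposes how a format string is actually split, which makes it easy to compare against the steps above. A small sketch using string.Formatter.parse; the sample format string is made up for illustration.

```python
from string import Formatter

# parse() yields (literal_text, field_name, format_spec, conversion) tuples.
pieces = list(Formatter().parse("x={0[a].b!r:>10}"))

for literal_text, field_name, format_spec, conversion in pieces:
    print(repr(literal_text), repr(field_name), repr(format_spec), repr(conversion))
```

Note that parse() leaves the field_name in one piece; the split into element_index and attribute_name happens in a later step.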

--




[issue12014] str.format parses replacement field incorrectly

2011-06-03 Thread Eric V. Smith

Eric V. Smith e...@trueblade.com added the comment:

PEP 3101 defines format strings as intermingled character data and markup. 
Markup defines replacement fields and is delimited by braces. Only after markup 
is extracted does the PEP talk about interpreting the contents of the markup.

So, given "{0[a}b]}" the parser first parses out the character data and the 
markup. The first piece of markup is "{0[a}". That gives a syntax error because 
it's missing a right bracket.

I realize you'd like the parser to find the markup as the entire string, but 
that's not how I read the PEP.
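
That reading amounts to a two-pass scan: cut the string at the first closing brace, then interpret what was cut out. A deliberately naive sketch follows; extract_first_markup is a hypothetical helper, not the code in CPython, and it ignores escaped braces and nested format specs.

```python
def extract_first_markup(format_string):
    """Return the text from the first '{' through the first following '}'."""
    start = format_string.index("{")
    end = format_string.index("}", start)
    return format_string[start:end + 1]

markup = extract_first_markup("{0[a}b]}")
print(markup)          # the first piece of markup under this reading
print("]" in markup)   # whether a closing bracket made it into the markup
```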

--




[issue12014] str.format parses replacement field incorrectly

2011-06-03 Thread Ben Wolfson

Ben Wolfson wolf...@gmail.com added the comment:


> PEP 3101 defines format strings as intermingled character data and markup. 
> Markup defines replacement fields and is delimited by braces. Only after markup 
> is extracted does the PEP talk about interpreting the contents of the markup.
> 
> So, given "{0[a}b]}" the parser first parses out the character data and the 
> markup. The first piece of markup is "{0[a}". That gives a syntax error because 
> it's missing a right bracket.


The intermingling of character data and markup is irrelevant; character data is 
defined as data which is transferred unchanged from the format string to the 
output string, and nothing in "{0[a]}" is transferred unchanged.

Two parts of the PEP suggest that the markup in the above should be "{0[a}" 
rather than "{0[a}]}":

Brace characters ('curly braces') are used to indicate a
replacement field within the string:

[...]

Braces can be escaped by doubling:

and

Note that the doubled '}' at the end, which would normally be
escaped, is not escaped in this case.  The reason is because
the '{{' and '}}' syntax for escapes is only applied when used
*outside* of a format field.  Within a format field, the brace
characters always have their normal meaning.

The first statement obviously doesn't mean that the exclusive use of braces in 
a format string is to indicate replacement fields, since it's immediately 
acknowledged that sometimes braces can occur without indicating a replacement 
field, when they're escaped. The second occurs specifically in the context of 
talking about escaping braces, so the following interpretation remains 
available: within a format field, a brace is a brace is a brace---that is, a 
pair of braces is a pair of braces, not an escape for a single brace.

In fact, though the following argument may appear Jesuitical, it does, I think, 
hold water: The second quotation above mentions braces within a *format field*. 
What is a format field? Well, we know that "The element with the braces is 
called a 'field'", but "format field" is more specific; the whole thing between 
braces isn't (necessarily!) the format field. And we know that

Fields consist
of a 'field name', which can either be simple or compound, and an
optional 'format specifier'.

So, perhaps a "format field" is the part of the broader field where the format 
specifier lives. And lo, it's in the part of the PEP talking about "Format 
Specifiers" that we get the second quotation above.

    Each field can also specify an optional set of 'format
    specifiers' which can be used to adjust the format of that field.
    Format specifiers follow the field name, with a colon (':')
    character separating the two:

So even if you think that the claim that "within a format field, the brace
characters always have their normal meaning" means not "the brace characters
aren't escaped" but "the brace characters indicate a replacement field", that
statement could just mean that they only have this significance in *part* of
the *replacement* field---the part having to do with the formatting of the
replacement text---and not the whole replacement field. So that, for instance,
the following does what you'd expect:


>>> "{0[{4}]}".format({"{4}":3})
'3'

And it *does* do what you'd expect, in the *current* implementation---that is,
the braces here don't have the meaning of introducing a replacement field
[they're kinda-sorta parsed as if they introduced a replacement field, but
that is obviously not their semantics], but are instead just treated as braces.
They also aren't escaped:

>>> "{0[{{4}}]}".format({"{{4}}":3})
'3'
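
For comparison, a minimal sketch of the contrast drawn here: doubled braces collapse to single braces outside a replacement field, while the same characters inside an index are handed to __getitem__ untouched. The dictionary and its keys are made up for illustration.

```python
# Outside a replacement field, '{{' and '}}' are escapes for single braces.
escaped = "{{4}}".format()

# Inside the square brackets of a field name nothing is unescaped:
# the lookup key is the literal text between '[' and ']'.
table = {"{4}": "single", "{{4}}": "double"}  # hypothetical data
single = "{0[{4}]}".format(table)
double = "{0[{{4}}]}".format(table)

print(escaped, single, double)
```

If the doubled braces were treated as escapes inside the index, both lookups would hit the same key; here they are distinct keys.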

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12014
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue12014] str.format parses replacement field incorrectly

2011-06-03 Thread Eric V. Smith

Eric V. Smith e...@trueblade.com added the comment:

The intermingling of character data and markup is far from irrelevant: that's 
exactly what str.format() does! I don't see how it can be irrelevant to a 
discussion of how the string is parsed.

Note that there are no restrictions, in general, on what's in a format 
specifier. Braces can be in format specifiers, if they make sense for that 
type. For example:

>>> from datetime import datetime
>>> format(datetime.now(), '{}%Y-%m-%d}{')
'{}2011-06-03}{'

It's definitely true that you can have valid format specifiers that cannot be 
represented in strings parsed by str.format(). The PEP talks about both format 
specifiers in the abstract (stand alone) and format specifiers contained in 
str.format() strings.

The current implementation of str.format() finds matched pairs of braces,
calls what's inside "markup", then parses that markup. This indeed restricts
what's inside the markup. I believe the implementation is compliant with the
PEP.

It's also true that other interpretations of the PEP are possible. I'm just not 
sure the benefit to be gained justifies changing all of the extant str.format() 
implementations, in addition to explaining the different behavior.

Many useful features for str.format() were rejected in order to keep the 
implementation and documentation simple.

I'm not saying change and improvement is impossible. I'm just not convinced 
it's worthwhile.
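
As an illustration of the stand-alone versus embedded distinction, here is a small sketch. It assumes a fixed date so the output is reproducible, and it uses the documented nested-field feature of the format_spec to pass in a specifier that could not be written literally inside a str.format() string.

```python
from datetime import datetime

moment = datetime(2011, 6, 3)  # fixed date, chosen only for the example
spec = "{}%Y-%m-%d}{"          # braces are ordinary characters to strftime

# Stand-alone use: the spec goes straight to datetime.__format__.
direct = format(moment, spec)

# Inside str.format(), the same spec can be supplied through a nested
# replacement field rather than spelled out in the format string.
nested = "{0:{1}}".format(moment, spec)

print(direct)
print(nested)
```

Both calls hand the same specifier to the datetime object; only the route by which it arrives differs.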

--




[issue12014] str.format parses replacement field incorrectly

2011-06-03 Thread Ben Wolfson

Ben Wolfson wolf...@gmail.com added the comment:

str.format doesn't intermingle character data and markup. The PEP is quite
clear about the terms in this case, at least: the *argument* to str.format
consists of character data (passed through unchanged) and markup (processed).
That's what it means to say that "Character data is data which is transferred
unchanged from the format string to the output string". In "My name is {0}",
"My name is " is transferred unchanged from the format string to the output
string when the string is formatted. We're talking about how the *markup* is
defined.


> The current implementation of str.format() finds matched pairs of braces,
> calls what's inside "markup", then parses that markup.


This is false, as I demonstrated.

>>> d = {"{0}": "spam"}
>>> # a matched pair of braces. What's inside is considered markup.
...
>>> "{0}".format(d)
"{'{0}': 'spam'}"
>>> # a matched pair of braces. Inside is a matched pair of braces, and what's
>>> # inside of that is not considered markup.
...
>>> "{0[{0}]}".format(d)
'spam'
>>>


> It's also true that other interpretations of the PEP are possible. I'm just not
> sure the benefit to be gained justifies changing all of the extant str.format()
> implementations, in addition to explaining the different behavior.


Well, the beauty of it is, you wouldn't have to explain the different behavior, 
because the patch makes it the case that the explanation already in the 
documentation is correct. It is currently not correct. That's why I found out 
about this current state of affairs: I read the documentation's explanation and 
believed it, and only after digging into the code did I understand the actual 
behavior.

It is also not a difficult change to make, would be backwards-compatible
(anyway, I rather doubt anyone was relying on "{0[:]}".format(whatever)
raising an exception [1]), and relaxes a restriction that is not well motivated
by the text of the PEP, is not consistently applied in the implementation (see
above), is confusing, and limits the usefulness of the format method. It is
true that I don't know where else, beyond the implementation in
string_format.h, modifications would need to be made, but I'd be willing to
undertake the task.

[1] And given that the present implementation does that, it's already
noncompliant with the PEP, regardless of what one makes of curly braces.

--




[issue12014] str.format parses replacement field incorrectly

2011-06-03 Thread Eric V. Smith

Eric V. Smith e...@trueblade.com added the comment:

From the PEP: "Format strings consist of intermingled character data and
markup."

--




[issue12014] str.format parses replacement field incorrectly

2011-06-03 Thread Eric V. Smith

Eric V. Smith e...@trueblade.com added the comment:


> >>> d = {"{0}": "spam"}
> >>> # a matched pair of braces. What's inside is considered markup.
> ...
> >>> "{0}".format(d)
> "{'{0}': 'spam'}"
> >>> # a matched pair of braces. Inside is a matched pair of braces, and what's
> >>> # inside of that is not considered markup.

I'm not sure what you're getting at. "{0}" (which is indeed markup) is
replaced by str(d), which is "{'{0}': 'spam'}".

> ...
> >>> "{0[{0}]}".format(d)
> 'spam'
> >>>


Again, I'm not sure what you're getting at. The inner "{0}" is not interpreted
(per the PEP). So the entire string is replaced by d['{0}'], or 'spam'.

Let me try to explain it again. str.format() parses the string, looking for
matched sets of braces. In your last example above, the very first character
'{' is matched to the very last character '}'. They match, in the sense that all
of the nested ones inside match. Once the markup is separated from the character
data, the interpretation of what's inside the markup is then done. In this
example, there is no character data.
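
The scan described above can be sketched in Python roughly as follows. This only illustrates the brace-counting idea as described in this message; it is not the C code in string_format.h, and the helper name first_markup is made up.

```python
def first_markup(fmt):
    """Return (markup, rest) using plain brace counting (hypothetical helper)."""
    if not fmt.startswith("{"):
        raise ValueError("expected the string to start with '{'")
    depth = 0
    for i, ch in enumerate(fmt):
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                # The brace that brings the count back to zero closes the markup.
                return fmt[: i + 1], fmt[i + 1:]
    raise ValueError("unmatched '{' in format")


print(first_markup("{0[{0}]}"))
print(first_markup("{0[}]}"))
```

Under this reading the nested, balanced braces in the first string stay inside one piece of markup, while the lone '}' in the second string ends the markup early and leaves "]}" behind as character data.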

I apologize if I'm explaining this poorly.

--




[issue12014] str.format parses replacement field incorrectly

2011-06-03 Thread Ben Wolfson

Ben Wolfson wolf...@gmail.com added the comment:


> From the PEP: "Format strings consist of intermingled character data and
> markup."


I know. Here is an example of a format string:

hello, {0}

Here is the character data from that format string:

hello, 

Here is the markup:

{0}

This follows *directly* from the definition of character data, which I've
quoted several times now. In the following expression:

"{0}".format(1)

there is NO character data, because there is NOTHING which is
transferred unchanged from the format string to the output string.

The "{0}" doesn't appear in the output string at all. And the 1 isn't
transferred unchanged: it has str() called on it. Since there is nothing which
meets the definition of character data, there is nothing which *is* character
data in the string, regarded as a format string. It is pure markup---it
consists solely of a replacement field delimited by curly braces. I really
don't see why this matters at all, but, nevertheless, I apologize if I'm
explaining it poorly.
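
The split between character data and markup can be inspected with string.Formatter().parse(), which yields the literal text and the field name for each chunk. The wrapper name split_format_string is made up for this sketch.

```python
import string


def split_format_string(fmt):
    """Return (literal_text, field_name) pairs as reported by string.Formatter."""
    return [(literal, field)
            for literal, field, _spec, _conversion in string.Formatter().parse(fmt)]


print(split_format_string("hello, {0}"))
print(split_format_string("{0}"))
```

In the second case the literal part is the empty string, which matches the claim that the format string is pure markup.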


> Again, I'm not sure what you're getting at. The inner "{0}" is not interpreted
> (per the PEP). So the entire string is replaced by d['{0}'], or 'spam'.
>
> Let me try to explain it again. str.format() parses the string, looking for
> matched sets of braces. In your last example above, the very first character
> '{' is matched to the very last character '}'. They match, in the sense that all
> of the nested ones inside match. Once the markup is separated from the character
> data, the interpretation of what's inside the markup is then done. In this
> example, there is no character data.


Yes, there is no character data. And I understand perfectly what is happening. 
Here's the problem: your description of what the implementation does is 
incorrect. You say that 


> The current implementation of str.format() finds matched pairs of braces,
> calls what's inside "markup", then parses that markup.


Now, the only reason for thinking that this:

{0[}]}

should be treated differently from this:

{0[a]}

is that inside square brackets curly brackets indicate replacement fields. If 
you want to justify what the current implementation does as an implementation 
of the PEP and an interpretation of what the PEP says, you *have* to think 
that. But if you think that, then the current implementation should *not* treat 
this:

{0[{0}]}

the way it does, because it does *not* treat the interior curly braces as 
indications of a replacement field---or rather, it does at one point in the 
source (in MarkupIterator_next) and it doesn't at another (in 
FieldNameIterator). I agree that what the current implementation does in the 
last example is in fact correct. But if it's correct in the one case, it's 
incorrect in the other, and vice versa. There is no justification, in terms of 
the PEP, for the present behavior.
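
One way to picture the alternative reading argued for here is a scanner that treats everything between '[' and ']' in the field name as opaque. The following is a hypothetical Python sketch of that idea, not the attached patch and not the C implementation; the function name and its flags are invented for illustration.

```python
def first_markup_bracket_aware(fmt):
    """Return (markup, rest), ignoring braces inside a field-name index."""
    if not fmt.startswith("{"):
        raise ValueError("expected the string to start with '{'")
    depth = 0
    in_index = False  # currently between '[' and ']' of the field name
    in_spec = False   # already past the first '!' or ':' of the outer field
    for i, ch in enumerate(fmt):
        if in_index:
            # Everything up to the closing ']' belongs to the index.
            if ch == "]":
                in_index = False
            continue
        if ch == "[" and depth == 1 and not in_spec:
            in_index = True
        elif ch in "!:" and depth == 1:
            in_spec = True
        elif ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return fmt[: i + 1], fmt[i + 1:]
    raise ValueError("unmatched '{' in format")


print(first_markup_bracket_aware("{0[}]}"))
print(first_markup_bracket_aware("{0[{0}]}"))
print(first_markup_bracket_aware("{0:{1}}"))
```

With this scanner "{0[}]}" and "{0[{0}]}" are both treated as a single replacement field, and nested fields in the format_spec part still balance as before.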

--




[issue12014] str.format parses replacement field incorrectly

2011-06-03 Thread Eric V. Smith

Eric V. Smith e...@trueblade.com added the comment:

We're going to have to agree to disagree. I believe that "{0[}]}" is the markup
"{0[}" followed by the character data "]}".

--




[issue12014] str.format parses replacement field incorrectly

2011-05-30 Thread Petri Lehtinen

Changes by Petri Lehtinen pe...@digip.org:


--
nosy: +petri.lehtinen




[issue12014] str.format parses replacement field incorrectly

2011-05-11 Thread Éric Araujo

Changes by Éric Araujo mer...@netwok.org:


--
keywords: +needs review
stage:  -> patch review
versions:  -Python 2.6, Python 3.4




[issue12014] str.format parses replacement field incorrectly

2011-05-10 Thread Ben Wolfson

Ben Wolfson wolf...@gmail.com added the comment:

Actually, that's the wrong place in MarkupIterator_next to include that loop.
The attached diff has it in the right place. The results of "make test" here
are:

328 tests OK.
1 test failed:
test_unicode
25 tests skipped:
test_codecmaps_cn test_codecmaps_hk test_codecmaps_jp
test_codecmaps_kr test_codecmaps_tw test_curses test_dbm_gnu
test_epoll test_gdb test_largefile test_msilib test_ossaudiodev
test_readline test_smtpnet test_socketserver test_startfile
test_timeout test_tk test_ttk_guionly test_urllib2net
test_urllibnet test_winreg test_winsound test_xmlrpc_net
test_zipfile64
1 skip unexpected on darwin:
test_readline
make: [test] Error 1 (ignored)

test_unicode fails because it expects "{0[}".format() to raise an IndexError;
instead, it raises a ValueError ("unmatched '{' in format") because it
interprets the "}" as an *index*.

This can be avoided by changing the line

while (self->str.ptr < self->str.end && *self->str.ptr != ']') {

to

while (self->str.ptr < self->str.end-1 && *self->str.ptr != ']') {

in which case the test passes as is, or, obviously, by changing the expected
exception in test_unicode.py.

--
keywords: +patch
versions: +Python 2.6, Python 3.4
Added file: http://bugs.python.org/file21963/strformat.diff




[issue12014] str.format parses replacement field incorrectly

2011-05-06 Thread Mark Dickinson

Changes by Mark Dickinson dicki...@gmail.com:


--
nosy: +eric.smith, mark.dickinson




[issue12014] str.format parses replacement field incorrectly

2011-05-06 Thread Eric V. Smith

Eric V. Smith e...@trueblade.com added the comment:

I haven't had time to completely review this, I will do so later today.

But let me just say that the string is first parsed for replacement strings 
inside curly braces. There's no issue with that, here.

Next, the string is parsed for conversion and format_spec, looking for "!" and
":" respectively. In your first example that gives:

field_name: '0['
conversion : ']'

It then tries to parse the field_name and gives you the first error.
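
The second stage described above can be sketched as a split at the first "!" or ":" that takes no account of square brackets. This is a simplified illustration of the description in this message, with a made-up function name, not the actual parse_field code.

```python
def naive_field_split(contents):
    """Split the text between the braces at the first '!' or ':'."""
    for i, ch in enumerate(contents):
        if ch in "!:":
            # Everything before the delimiter is taken as the field name.
            return contents[:i], ch, contents[i + 1:]
    return contents, "", ""


print(naive_field_split("0[!]"))
print(naive_field_split("0[a]"))
```

For "0[!]" the field name comes out as "0[", which has no closing bracket, hence the error about a missing ']'.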

--




[issue12014] str.format parses replacement field incorrectly

2011-05-06 Thread Ben Wolfson

Ben Wolfson wolf...@gmail.com added the comment:

The semantics the docs suggest for index fields (namely that whatever is in the
index field is just passed to getitem) do seem to be right; no other processing
is done here. For instance:

>>> d = {"{0}":"hi"}
>>> "{0[{0}]}".format(d)
'hi'
>>> import string
>>> list(string.Formatter().parse("{0[{0}]}"))
[('', '0[{0}]', '', None)]
>>>

Which is what you'd expect, but makes me think that treating "!" and ":" in the
index field separately is definitely wrong.
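
To see exactly what reaches __getitem__, a throwaway object that reports its key can be formatted. The class name ShowKey is made up for this sketch.

```python
class ShowKey:
    """Hypothetical helper: report the key that __getitem__ receives."""

    def __getitem__(self, key):
        return "%s:%r" % (type(key).__name__, key)


print("{0[spam]}".format(ShowKey()))
print("{0[12]}".format(ShowKey()))
print("{0[{0}]}".format(ShowKey()))
```

An all-digit index arrives as an int and anything else arrives as the literal string, braces included.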

--




[issue12014] str.format parses replacement field incorrectly

2011-05-06 Thread Éric Araujo

Changes by Éric Araujo mer...@netwok.org:


--
nosy: +eric.araujo
versions: +Python 2.7, Python 3.2, Python 3.3 -Python 2.6




[issue12014] str.format parses replacement field incorrectly

2011-05-06 Thread Eric V. Smith

Eric V. Smith e...@trueblade.com added the comment:

> but makes me think that treating "!" and ":" in the index field separately is
> definitely wrong.

But it doesn't know they're in an index field when it's doing the parsing for 
':' or '!'.

It might be possible to change this so that the field name is fully parsed 
first, but I'm not sure the benefit would justify the effort. What's your use 
case where you need this feature?

--




[issue12014] str.format parses replacement field incorrectly

2011-05-06 Thread Eric V. Smith

Eric V. Smith e...@trueblade.com added the comment:

Note also that the nested expansion is only allowed in the format_spec part, 
per the documentation. Your last examples are attempting to do it in the 
field_name, which leads to the errors you see. Your very last example doesn't 
look right to me. I'll have to investigate why it's giving you that error 
message.

--




[issue12014] str.format parses replacement field incorrectly

2011-05-06 Thread Ben Wolfson

Ben Wolfson wolf...@gmail.com added the comment:

My last examples were actually just attempting to figure out what triggered the 
unexpected behavior. I don't want to do expansion inside the field_name part!

(I'll have a reply to your previous comment about use-cases shortly.)

--




[issue12014] str.format parses replacement field incorrectly

2011-05-06 Thread Ben Wolfson

Ben Wolfson wolf...@gmail.com added the comment:

Here's my use case.

I'm writing a python version of the ruby library HighLine for CLI interaction, 
to be called, uncreatively, PyLine. One of the moderately neat things about the 
library is that it allows for color information to be embedded in the strings 
one passes to its methods, so, if h is a HighLine object, you could say:

h.say "<%= color('this will be red', :red) %> but this won't be"

So I wanted to be able to provide some kind of similar facility and realized 
that the __getitem__ method supported by format(), along with some 
__getattribute__ trickery, would work: so if p is a PyLine object, you could 
say:

p.say("{colors.red.bold.on_black[this will be bold with red text on a black background]} but this will just be regular text")

Thus:

>>> effectize_string("{colors.red.bold.on_black[this will be bold with red text on a black background]} but this will just be regular text")
'\x1b[31m\x1b[1m\x1b[40mthis will be bold with red text on a black background\x1b[0m but this will just be regular text\x1b[0m'
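
A minimal sketch of the kind of object that could sit behind such a format string, assuming a made-up Style class and a small table of ANSI escape codes. This is not PyLine's actual code, and unlike effectize_string above it does not append a final reset to the whole string.

```python
class Style:
    """Hypothetical stand-in for the 'colors' object described above."""

    CODES = {"red": "\x1b[31m", "bold": "\x1b[1m", "on_black": "\x1b[40m"}
    RESET = "\x1b[0m"

    def __init__(self, prefix=""):
        self._prefix = prefix

    def __getattr__(self, name):
        # Only called for names not found normally; each known effect
        # returns a new Style carrying one more escape code.
        try:
            return Style(self._prefix + self.CODES[name])
        except KeyError:
            raise AttributeError(name) from None

    def __getitem__(self, text):
        # str.format() passes the text between '[' and ']' as the key.
        return self._prefix + str(text) + self.RESET


styled = "{colors.red.bold.on_black[warning]} plain".format(colors=Style())
print(repr(styled))
```

The attribute chain builds up the escape prefix and the index supplies the text to wrap, which is why the characters allowed inside the index matter for this use case.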

Obviously, I'll already have to watch out for stray "]"s in the string passed
to the object's __getitem__, so you might think, well, it's not much more work
to also have to watch out for stray ":", "!", "}", and "{" (but, oddly, I won't
need to watch out for *matched* "{" and "}"!).

But it's obvious that something here should change. For one thing, as it
stands, the documentation is wrong; it is not the case that an index_string can
contain any character except ']'. But the documentation describes the way
things rationally ought to be; there's a good reason not to allow a ']' in the
index_string (and one can see why simplicity suggests not allowing for
*escapes*, though I think that ideally there would be an escaping mechanism).
But there's no reason not to allow stray "{", "}", ":", and "!" in the
index_string. The only reason it's true at this point that "it doesn't know
they're in an index field when it's doing the parsing for ':' or '!'" is that
(assuming one takes the grammar in the documentation to be accurate) the parser
is written incorrectly.

It contains, for instance, incorrect comments (in string_format.h:parse_field):
<code>
    /* Search for the field name.  it's terminated by the end of
       the string, or a ':' or '!' */
    field_name->ptr = str->ptr;
    while (str->ptr < str->end) {
        switch (c = *(str->ptr++)) {
        case ':':
        case '!':
            break;
        default:
            continue;
        }
        break;
    }
</code>

(hopefully <code> does the right thing here...)

That's the culprit for the mishandling of ":" and "!", but it is simply not the
case---again, according to the grammar given in the documentation---that the
field name can be delimited this way, in two ways.*

And, given that no nested expansion is done in the field_name part of the
replacement, there's no real reason to retain the present parsing strategy;
none of "!", ":", "{", or "}" has any semantic significance in this part of the
replacement string, so why should the parsing code treat them specially?
Surely, even if you think my use case is not so great, there's value in doing
it right.

The ":" and "!" problem is not super hard to get around. Witness the following
dirty hack:

<code>
void
advance_beyond_field(SubString *str)
{
    if (str->ptr >= str->end) return;
    switch (*++str->ptr) {
    case '[':
        while (str->ptr < str->end && *(str->ptr) != ']')
            str->ptr++;
        advance_beyond_field(str);
        break;
    case '.':
        while (str->ptr < str->end)
            switch (*++str->ptr) {
            case ':':
            case '!':
                str->ptr--;
                return;
            case '[':
                advance_beyond_field(str);
                str->ptr--;
                break;
            default:
                continue;
            }
        break;
    default:
        return;
    }
}
</code>
Followed by replacing the switch statement as above thus:
<code>
        switch (c = *(str->ptr++)) {
        case '.':
        case '[':
            str->ptr -= 2;
            advance_beyond_field(str);
            continue;
        case ':':
        case '!':
            break;
        default:
            continue;
        }
</code>
Of course, there is already in the FieldNameIterator plumbing a more certain 
mechanism for actually getting the fields out.

Then one can do this:

>>> "{0[:]}".format({":":4})
'4'
>>> "{0[{ : ! }]}".format({"{ : ! }":4})
'4'

(One can also pass such formatting-exercising test suites as test_nntplib,
test_string, and test_collections.)

Though still not this:

>>> "{0[{]}".format({"{":4})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: unmatched '{' in format

Even though the stray "{" in the square brackets has no semantic significance,
it still gets picked up; the culprit is apparently in MarkupIterator_next,
whose initial bracket-detecting while loop is not square-bracket aware.

[issue12014] str.format parses replacement field incorrectly

2011-05-05 Thread Ben Wolfson

New submission from Ben Wolfson wolf...@gmail.com:

As near as I can make out from
http://docs.python.org/library/string.html#formatstrings, the following
should return the string "hi":

"{0[!]}".format({"!":"hi"})

We have a "{", followed by a field name, followed by a "}", the field name
consisting of an arg_name, which is 0, a "[", an element index, and a "]". The
element index, which the docs say may be any source character except "]", is
here "!". And, according to the docs, "An expression of the form '.name'
selects the named attribute using getattr(), while an expression of the form
'[index]' does an index lookup using __getitem__()".

However, it doesn't work:
>>> "{0[!]}".format({"!":"hi"})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: Missing ']' in format string

The same thing happens with other strings that are significant in other places
in the string-formatting DSL:

>>> "{0[:]}".format({":":"hi"})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: Missing ']' in format string

If there are more characters the error message changes:

>>> class spam:
...     def __getitem__(self, k): return "hi"
...
>>> "{0[this works as expected]}".format(spam())
'hi'
>>> "{0[I love spam! it is very tasty.]}".format(spam())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: expected ':' after format specifier
>>> "{0[.]}".format(spam()) # periods are ok
'hi'
>>> "{0[although curly braces, }, are not square brackets, they also don't work here]}".format(spam())

Left square brackets work fine, though:

>>> "{0[[]}".format(spam())
'hi'

The failure of the expected result with curly braces presumably indicates at
least part of the cause of the other failures: namely, that they stem from
supporting providing flags to one replacement field using another, as in
"{1:{0}}". Which is quite useful. But it obviously isn't universally supported
in the case of index fields anyway:

>>> "{0[recursive {1[spam]}]}".format(spam(), spam())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: Only '.' or '[' may follow ']' in format field specifier

(Note that this is a very strange error message itself, as is the following, but
since one isn't, according to the grammar, allowed to include a "]" where I've
got one *anyway*, perhaps that's to be expected:

>>> "{0[recursive {1[spam].lower} ]}".format(spam(), spam())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'str' object has no attribute 'lower} ]'

)

But, even if that would explain why one can't use a "{" in the index field, it
wouldn't explain why one can't use a "!" or ":", since if those aren't already
part of a replacement field, as indicated by some initial "{", they couldn't
have the significance that they do when they *are* part of that field.
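
To check how a given interpreter treats these cases, a small probe along the following lines can be used. The helper name try_format is made up, and no particular outcome is assumed for the disputed strings, since the result depends on the Python version in use.

```python
def try_format(fmt, *args):
    """Return repr of the formatted string, or a description of the error."""
    try:
        return repr(fmt.format(*args))
    except (ValueError, IndexError, KeyError, AttributeError) as exc:
        return "%s: %s" % (type(exc).__name__, exc)


for fmt, arg in [("{0[a]}", {"a": "hi"}),
                 ("{0[!]}", {"!": "hi"}),
                 ("{0[:]}", {":": "hi"})]:
    print(fmt, "->", try_format(fmt, arg))
```

The first entry is an uncontroversial lookup and serves as a baseline; the other two are the cases reported above.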

--
components: Interpreter Core
messages: 135258
nosy: Ben.Wolfson
priority: normal
severity: normal
status: open
title: str.format parses replacement field incorrectly
versions: Python 2.6, Python 3.1




[issue12014] str.format parses replacement field incorrectly

2011-05-05 Thread Ben Wolfson

Changes by Ben Wolfson wolf...@gmail.com:


--
type:  -> behavior
