The culprit is this:

http://svn.apache.org/viewcvs.cgi/httpd/mod_python/trunk/src/psp_parser.l?rev=104353&r1=102649&r2=104353

Before the patch, all the text would be enclosed in triple double-quotes (""") and all double-quotes within would be escaped. I guess Brendan O'Connor (who submitted the patch) thought putting an 'r' in front of the triple quotes would eliminate the need for escaping anything inside, but it ain't so. (In fact, I seem to recall having gone down that erroneous path myself originally.)

To demonstrate:

print """blah\""""
blah"                   <-- OK

print r"""blah""""
  File "<stdin>", line 1
    print r"""blah""""
                     ^
SyntaxError: EOL while scanning single-quoted string

... and if we try to escape the quote:

print r"""blah\""""
blah\"                  <-- BAD

... we don't get the original content. Therefore the "triple double-quote with double-quote escaped" is the only way to get consistency.

Thus the fix is to roll that patch entirely back because it's wrong.
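
The same three cases can be checked mechanically. What follows is only a sketch in Python 3 syntax (the session above is Python 2). quote_text and quote_text_raw are hypothetical helpers written for illustration, not the code in psp_parser.l, and escaping backslashes in quote_text is an assumption made here so that arbitrary text round-trips:

```python
import ast

def quote_text(text):
    # Hypothetical helper: escape backslashes first, then double quotes,
    # and wrap the result in plain triple double-quotes.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"""' + escaped + '"""'

def quote_text_raw(text):
    # The idea behind the patch: a raw literal with nothing escaped.
    return 'r"""' + text + '"""'

def round_trips(literal, expected):
    # True only if the literal parses and evaluates back to the original text.
    try:
        return ast.literal_eval(literal) == expected
    except SyntaxError:
        return False

samples = ['blah"', 'say "hi"', "test '", 'back\\slash', 'tab\\t']
escaped_ok = [round_trips(quote_text(s), s) for s in samples]
raw_ok = [round_trips(quote_text_raw(s), s) for s in samples]

# Escaping the quote inside a raw literal parses, but keeps the backslash.
raw_escaped_value = ast.literal_eval('r"""blah\\""""')

print(escaped_ok)
print(raw_ok)
print(repr(raw_escaped_value))
```

Under these assumptions the escaped form round-trips every sample, while the raw form fails whenever the text ends in a double quote, which is exactly the href="<%= ... %>" case.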

This does NOT, however, address the issue which got this thread started!!! I'm pretty sure we still need the addition (somewhere below) to the psp_parser.l file.

Grisha


On Thu, 10 Nov 2005, Jim Gallacher wrote:

Gregory (Grisha) Trubetskoy wrote:

On Wed, 9 Nov 2005, Jim Gallacher wrote:

This just gets stranger and stranger. Regenerating psp_parser.c from the current psp_parser.l has caused my psp pages to go completely pear-shaped. Things that rendered correctly before now puke up hairballs.

For example the psp code (where my_link = 'some_url'):
   <a href="<%= my_link %>">My Link</a>
used to render as:
   <a href="some_url">My Link</a>
but now renders as:
   <a href=,0); req.write(str( my_link ),0); req.write(r>My Link</a>


You may find it useful to use the _psp module from the command line, since what you really want to see is not what it renders as, but the Python code it generates:

from mod_python import _psp
s = _psp.parse("/path/to/your/file")
print s

See below for test results.


Changing the double quote to a single quote fixes the problem.
   <a href='<%= my_link %>'>My Link</a>


This doesn't make a lot of sense, because PSP does not concern itself with quotes: it scans for "<%=" and, once it has seen one, for "%>". The quotes would remain untouched, so the problem is elsewhere.

I don't want to refactor *all* of my psp pages, so I guess we'll need to fix psp_parser. ;)


Just be careful, you may be trying to fix what is not broken in the first place. I use the 3.1.4 PSP very heavily and there is not a single glitch with it that I know of, and I can certainly use any kind of quote I want.

And this was my experience as well up to and including 3.2.4b, until I deleted psp_parser.c and regenerated it. Then everything went wrong with the site that I'm developing. I've been testing all the betas against this code since I figured I'd be more likely to spot strange problems. I did. :(

I'd start out with confirming your theory that psp_parser.c is stale somehow - that should be pretty easy - just generate a new one and diff it with what's in SVN.

$ svn co $MP_TRUNK /tmp/mod_python
$ cd /tmp/mod_python
$ ./configure
$ make
$ make install
$ echo "run parser test from command line"
$ mv src/psp_parser.c psp_parser.c.orig
$ make clean
$ make
$ make install
$ diff -u psp_parser.c.orig src/psp_parser.c > psp_parser.diff
$ echo "re-run parser test from command line"

See attached diff. The 2 files are not the same.

Test results using mod_python._psp.parse('test.psp') from the command line interpreter:

test.psp
--------
<%
x = 'XXXX'
%>
test '<%= x %>'
test "<%= x %>"


Code generated from current psp_parser.c
----------------------------------------
req.write("""""",0);
x = 'XXXX'
req.write("""
test '""",0); req.write(str( x ),0); req.write("""'
test \"""",0); req.write(str( x ),0); req.write("""\"
""",0)

Output from generated code (GOOD!)
----------------------------------

test '
XXXX
'
test "
XXXX
"

Code generated with recreated psp_parser.c
------------------------------------------

req.write(r"""""",0);
x = 'XXXX'
req.write(r"""
test '""",0); req.write(str( x ),0); req.write(r"""'
test """",0); req.write(str( x ),0); req.write(r""""
""",0)

Output from generated code (BAD!)
---------------------------------

test '
XXXX
'
test ,0); req.write(str( x ),0); req.write(r


So it's not my imagination. :)
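
For anyone who wants to reproduce the comparison without an Apache setup, here is a rough sketch in Python 3 syntax. It runs the two generated snippets quoted above against a stub; FakeRequest and run_generated are hypothetical names made up for this illustration and are not part of mod_python:

```python
class FakeRequest:
    """Hypothetical stand-in for the request object; it only collects writes."""
    def __init__(self):
        self.chunks = []

    def write(self, text, flush=1):
        self.chunks.append(text)

# Code generated from the psp_parser.c currently in SVN (quotes escaped).
GOOD = r'''req.write("""""",0);
x = 'XXXX'
req.write("""
test '""",0); req.write(str( x ),0); req.write("""'
test \"""",0); req.write(str( x ),0); req.write("""\"
""",0)
'''

# Code generated with the recreated psp_parser.c (raw strings, nothing escaped).
BAD = r'''req.write(r"""""",0);
x = 'XXXX'
req.write(r"""
test '""",0); req.write(str( x ),0); req.write(r"""'
test """",0); req.write(str( x ),0); req.write(r""""
""",0)
'''

def run_generated(source):
    # Execute the generated code with the stub bound to the name "req".
    req = FakeRequest()
    exec(source, {"req": req})
    return "".join(req.chunks)

good_output = run_generated(GOOD)
bad_output = run_generated(BAD)
print(repr(good_output))
print(repr(bad_output))
```

Both snippets are valid Python, which is why nothing raises an error. In the second one the fourth double quote opens an ordinary string that swallows the following req.write calls, and adjacent string literals are then concatenated into a single argument.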

I'll dig through the svn logs and check the history of psp_parser.l and psp_parser.c. Maybe there will be some clues in there. Won't get to it until Sunday though.

The most recent change in SVN seems to have been adding an 'r' before the triple quote for the <TEXT> portion (r""" instead of just """), which should have solved some backslash problems.

Again, I haven't tested anything, but looking at the code, it seems to me that indeed there should be a problem exactly as Anton reported it and that my fix would be necessary, _and_ it may also apply to other special sequences such as tab \t. I may be missing something, but I just wanted to warn you that you may be missing something :-)

I'm pretty sure I'm missing something! :)

Jim
