The culprit is this:
http://svn.apache.org/viewcvs.cgi/httpd/mod_python/trunk/src/psp_parser.l?rev=104353&r1=102649&r2=104353
Before the patch all the text would be enclosed in triple double-quotes
(""") and all double-quotes within would be escaped. I guess Brendan
O'Connor (who submitted the patch) thought putting an 'r' in front of the
triple quotes would eliminate the need for escaping anything inside, but
it ain't so. (In fact I seem to recall having gone down that erroneous
path myself originally).
To demonstrate:
print """blah\""""
blah" <-- OK
print r"""blah""""
File "<stdin>", line 1
print r"""blah""""
^
SyntaxError: EOL while scanning single-quoted string
... and if we try to escape the quote:
print r"""blah\""""
blah\" <-- BAD
... we don't get the original content. Therefore the "triple double-quote
with double-quote escaped" is the only way to get consistency.
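The same comparison can be written as a small script, which may be handy on interpreters where the print statement no longer parses. This is only a sketch; the names plain and raw are chosen for illustration and do not come from psp_parser.l.

```python
# Non-raw triple-quoted literal: the \" escape collapses to one double quote.
plain = """blah\""""

# Raw triple-quoted literal: the backslash still stops the quote from
# ending the string, but the backslash itself stays in the value.
raw = r"""blah\""""

print(repr(plain))
print(repr(raw))
```

The raw literal comes out one character longer than the plain one, which is the extra backslash that would end up in the rendered page.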
Thus the fix is to roll that patch entirely back because it's wrong.
This does NOT, however, address the issue which got this thread started!!!
I'm pretty sure we still need the addition (somewhere below) to the
psp_parser.l file.
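As a rough sketch of what consistent escaping looks like, the helper below wraps a text segment in a non-raw triple-quoted literal and checks that evaluating the literal gives the original text back. The function emit_text is hypothetical and is not code from psp_parser.l. Escaping the backslash as well as the double quote is an assumption about what the further addition would need to do; the pre-patch behaviour described above escaped only the double quotes.

```python
def emit_text(text):
    """Hypothetical helper: return Python source for a string literal.

    Backslashes are doubled first (an assumption, see the note above),
    then every double quote is escaped so that it can never close the
    triple-quoted literal early.
    """
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"""' + escaped + '"""'


samples = [
    'test "',                   # text ending in a double quote
    'tab \\t stays literal',    # a literal backslash followed by t
    'ends with backslash \\',   # trailing backslash
    '"""',                      # three double quotes in a row
]

for s in samples:
    literal = emit_text(s)
    # Evaluating the generated literal must reproduce the text exactly.
    assert eval(literal) == s
```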
Grisha
On Thu, 10 Nov 2005, Jim Gallacher wrote:
Gregory (Grisha) Trubetskoy wrote:
On Wed, 9 Nov 2005, Jim Gallacher wrote:
This just gets stranger and stranger. Regenerating psp_parser.c from
the current psp_parser.l has caused my psp pages to go completely
pear-shaped. Things that rendered correctly before now puke up
hairballs.
For example the psp code (where my_link = 'some_url'):
<a href="<%= my_link %>">My Link</a>
used to render as:
<a href="some_url">My Link</a>
but now renders as:
<a href=,0); req.write(str( my_link ),0); req.write(r>My Link</a>
You may find it useful to use the _psp module from the command line, since
what you really want to see is not what it renders as, but the Python code
it generates:
from mod_python import _psp
s = _psp.parse("/path/to/your/file")
print s
See below for test results.
Changing the double quote to a single quote fixes the problem.
<a href='<%= my_link %>'>My Link</a>
This doesn't make a lot of sense, because PSP does not concern itself with
quotes - it scans for the "<%=" and once it has seen one then "%>", the
quotes would remain untouched, so the problem is elsewhere.
I don't want to refactor *all* of my psp pages, so I guess we'll need to
fix psp_parser. ;)
Just be careful, you may be trying to fix what is not broken in the first
place. I use the 3.1.4 PSP very heavily and there is not a single glitch
with it that I know of, and I can certainly use any kind of quote I want.
And this was my experience as well up to and including 3.2.4b, until I
deleted psp_parser.c and regenerated it. Then everything went wrong with the
site that I'm developing. I've been testing all the betas against this code
since I figured I'd be more likely to spot strange problems. I did. :(
I'd start out with confirming your theory that psp_parser.c is stale
somehow - that should be pretty easy - just generate a new one and diff it
with what's in SVN.
$ svn co $MP_TRUNK /tmp/mod_python
$ cd /tmp/mod_python
$ ./configure
$ make
$ make install
$ echo "run parser test from command line"
$ mv src/psp_parser.c psp_parser.c.orig
$ make clean
$ make
$ make install
$ diff -u psp_parser.c.orig src/psp_parser.c > psp_parser.diff
$ echo "re-run parser test from command line"
See attached diff. The 2 files are not the same.
Test results using mod_python._psp.parse('test.psp') from the command line
interpreter:
test.psp
--------
<%
x = 'XXXX'
%>
test '<%= x %>'
test "<%= x %>"
Code generated from current psp_parser.c
----------------------------------------
req.write("""""",0);
x = 'XXXX'
req.write("""
test '""",0); req.write(str( x ),0); req.write("""'
test \"""",0); req.write(str( x ),0); req.write("""\"
""",0)
Output from generated code (GOOD!)
----------------------------------
test '
XXXX
'
test "
XXXX
"
Code generated with recreated psp_parser.c
------------------------------------------
req.write(r"""""",0);
x = 'XXXX'
req.write(r"""
test '""",0); req.write(str( x ),0); req.write(r"""'
test """",0); req.write(str( x ),0); req.write(r""""
""",0)
Output from generated code (BAD!)
---------------------------------
test '
XXXX
'
test ,0); req.write(str( x ),0); req.write(r
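One way to see why the second line goes wrong is to evaluate the regenerated string expression on its own. The sketch below copies the literal sequence from the regenerated code above into a variable named generated (a name chosen for illustration) and lets Python evaluate it. The raw string ends at the first three quotes, the fourth quote opens an ordinary string that swallows the code up to the next quote, and the adjacent literals are concatenated.

```python
# The argument of the third req.write() call through to the argument of the
# last one, exactly as the regenerated parser emits them.
generated = 'r"""\ntest """",0); req.write(str( x ),0); req.write(r""""\n"""'

# Python reads this as three adjacent string literals and joins them:
#   r"""\ntest """   +   ",0); ... req.write(r"   +   """\n"""
value = eval(generated)
print(repr(value))
```

So the generated module still parses, but the call to req.write(str( x ),0) has become part of the page text instead of being executed.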
So it's not my imagination. :)
I'll dig through the svn logs and check the history of psp_parser.l and
psp_parser.c. Maybe there will be some clues in there. Won't get to it until
Sunday though.
The most recent change in SVN seems to have been adding an 'r' before the
triple quote for the <TEXT> portion (r""" instead of just """), which
should have solved some backslash problems.
Again, I haven't tested anything, but looking at the code, it seems to me
that indeed there should be a problem exactly as Anton reported it and
that my fix would be necessary, _and_ it may also apply to other special
sequences such as tab \t. I may be missing something, but I just wanted to
warn you that you may be missing something :-)
I'm pretty sure I'm missing something! :)
Jim