#12578: multipartparser.Parser does not accept non-canonical bare CR and bare LF
------------------------------------+---------------------------------------
          Reporter:  jfenwick       |         Owner:  nobody
            Status:  closed         |     Milestone:        
         Component:  HTTP handling  |       Version:  1.1   
        Resolution:  invalid        |      Keywords:  jython
             Stage:  Unreviewed     |     Has_patch:  0     
        Needs_docs:  0              |   Needs_tests:  0     
Needs_better_patch:  0              |  
------------------------------------+---------------------------------------
Comment (by kmtracey):

 To answer your first question, the extra CRs in the debug data are coming
 from the way it was collected:

 This code:

 {{{
 #!python
 f = open('c:/chunk.txt', 'a')
 f.write(chunk)
 f.close()
 }}}

 on Windows, will transform any LFs in `chunk` to CRLF.  So where chunk
 originally had CRLF, what gets written to the file is CRCRLF.  What was
 actually fed to the Django parsing code (`chunk`) has the correct CRLF so
 you don't see a POST/parse failure.

 The problem with the debug data collection code is that it does not open
 the file in binary mode.  See:
 http://docs.python.org/tutorial/inputoutput.html#reading-and-writing-files
 To record exactly what is in `chunk`, the file needs to be opened in
 binary mode.

 You can see this behavior in action in a Python shell:

 {{{
 D:\tmp>python
 Python 2.5.2 (r252:60911, Feb 21 2008, 13:11:45) [MSC v.1310 32 bit
 (Intel)] on win32
 Type "help", "copyright", "credits" or "license" for more information.
 >>> txtfile = open('xyz.txt','a')
 >>> txtfile.write('This is a line terminated by CRLF already\r\n')
 >>> txtfile.close()
 >>> binfile = open('xyz.txt','rb')
 >>> data = binfile.read()
 >>> data
 'This is a line terminated by CRLF already\r\r\n'
 >>>
 }}}
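
 To avoid the doubled CR shown above when collecting debug data, something
 along these lines should record `chunk` byte for byte.  This is only a
 sketch: `dump_chunk`, the sample data, and the temporary file path are
 hypothetical names made up for illustration, and it assumes a Python
 recent enough to have `with` and bytes literals (2.6 or later).

```python
import os
import tempfile


def dump_chunk(chunk, path):
    """Append raw bytes to path with no newline translation (hypothetical helper)."""
    # 'ab' = append + binary.  Plain 'a' is text mode, which on Windows
    # rewrites every LF as CRLF while writing.
    with open(path, 'ab') as f:
        f.write(chunk)


# Made-up sample standing in for one multipart chunk.
sample = b'--boundary\r\nContent-Disposition: form-data; name="x"\r\n\r\nvalue\r\n'

path = os.path.join(tempfile.mkdtemp(), 'chunk.bin')
dump_chunk(sample, path)

# Read it back in binary mode as well, so nothing is translated on the way in.
with open(path, 'rb') as f:
    recorded = f.read()
```

 With the file opened this way, `recorded` should compare equal to `sample`,
 so whatever line endings show up in the file are the ones that were really
 in the data.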

 I believe the reverse of this problem is what is causing the failure with
 Jython/Windows.  Some code somewhere is reading the data as a text file,
 instead of a binary file, and the existing CRLFs are being turned into
 just plain LFs:

 {{{
 >>> binfile = open('abc.txt','ab')
 >>> binfile.write('This is a line terminated by CRLF already\r\n')
 >>> binfile.close()
 >>> txtfile_wrong = open('abc.txt','r')
 >>> txtfile_wrong.read()
 'This is a line terminated by CRLF already\n'
 >>>
 }}}
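
 Both translations can also be reproduced off Windows, which may help
 anyone trying to follow along on another platform.  The sketch below
 assumes Python 3 (or `io.open` on 2.6+), where the `newline` argument lets
 you ask for the Windows text-mode behavior explicitly; the file names and
 sample strings are made up for illustration.

```python
import io
import os
import tempfile

tmpdir = tempfile.mkdtemp()

# Write direction: newline='\r\n' mimics Windows text mode, turning each
# LF written into CRLF, so data that already ends in CRLF gains an extra CR.
path_w = os.path.join(tmpdir, 'xyz.txt')
with io.open(path_w, 'w', newline='\r\n') as f:
    f.write(u'already CRLF\r\n')
with io.open(path_w, 'rb') as f:
    written = f.read()

# Read direction: newline=None (the text-mode default) enables universal
# newlines, so CRLF in the file comes back as a bare LF.
path_r = os.path.join(tmpdir, 'abc.txt')
with io.open(path_r, 'wb') as f:
    f.write(b'already CRLF\r\n')
with io.open(path_r, 'r', newline=None) as f:
    read_back = f.read()
```

 Here `written` ends in CR CR LF and `read_back` ends in a bare LF, matching
 the two shell sessions above.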

 I don't know where this is happening because I've got only the haziest
 idea of all the pieces involved in running Django under Jython/tomcat.  It
 doesn't necessarily have to be happening in Python code either -- if I
 remember right Java file I/O can do similar things.  But my Java memories
 are pretty dim since I haven't had to use it in a while.

 For the case that looks like it should work but is failing -- bare LFs in
 the `chunk` data have been turned into CRLFs in the debug data due to
 writing the file as text.  If you write the file as binary, then I expect
 you'll see the `chunk` data contains just LFs, which is why the parsing is
 failing.
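
 One way to check this without going through a file at all is to count the
 line endings in the data directly.  The helper below is a hypothetical
 sketch, not part of Django; it assumes `chunk` is a byte string and a
 Python with bytes literals (2.6 or later).

```python
import re


def count_line_endings(data):
    """Count CRLF pairs, bare CRs and bare LFs in a byte string."""
    return {
        'crlf': len(re.findall(b'\r\n', data)),
        # A CR that is not immediately followed by LF.
        'bare_cr': len(re.findall(b'\r(?!\n)', data)),
        # An LF that is not immediately preceded by CR.
        'bare_lf': len(re.findall(b'(?<!\r)\n', data)),
    }


# Made-up examples: canonical CRLF data versus data whose CRs were stripped.
good = count_line_endings(b'--boundary\r\nvalue\r\n')
bad = count_line_endings(b'--boundary\nvalue\n')
```

 If calling something like this on `chunk` reports only bare LFs and no
 CRLFs, that would confirm the data was already altered before it reached
 the parser.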

-- 
Ticket URL: <http://code.djangoproject.com/ticket/12578#comment:9>
Django <http://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
