Alexis,

Do you have a small file which shows this behaviour and could be used for testing? Even better would be a function which generates a test file. This could be included in the mod_python unit tests.

Jim


Alexis Marrero wrote:
All,

The current mod_python 3.1 implementation of mod_python.util.FieldStorage.read_to_boundary reads as follows:

    def read_to_boundary(self, req, boundary, file):
        delim = ""
        line = req.readline()
        sline = line.strip()
        last_bound = boundary + "--"
        while line and sline != boundary and sline != last_bound:
            odelim = delim
            if line[-2:] == "\r\n":
                delim = "\r\n"
                line = line[:-2]
            elif line[-1:] == "\n":
                delim = "\n"
                line = line[:-1]
            file.write(odelim + line)
            line = req.readline()
            sline = line.strip()

As we have discussed previously:
http://www.modpython.org/pipermail/mod_python/2005-March/017754.html
http://www.modpython.org/pipermail/mod_python/2005-March/017756.html
http://www.modpython.org/pipermail/mod_python/2005-November/019460.html

This triggered a couple of changes in mod_python 3.2 Beta, which now reads as follows:

# Fixes memory error when upload large files such as 700+MB ISOs.
readBlockSize = 65368

...
    def read_to_boundary(self, req, boundary, file):
...
        delim = ''
        lastCharCarried = False
        last_bound = boundary + '--'
        roughBoundaryLength = len(last_bound) + 128
        line = req.readline(readBlockSize)
        lineLength = len(line)
        if lineLength < roughBoundaryLength:
            sline = line.strip()
        else:
            sline = ''
        while lineLength > 0 and sline != boundary and sline != last_bound:
            if not lastCharCarried:
                file.write(delim)
                delim = ''
            else:
                lastCharCarried = False
            cutLength = 0
            if lineLength == readBlockSize:
                if line[-1:] == '\r':
                    delim = '\r'
                    cutLength = -1
                    lastCharCarried = True
            if line[-2:] == '\r\n':
                delim += '\r\n'
                cutLength = -2
            elif line[-1:] == '\n':
                delim += '\n'
                cutLength = -1
            if cutLength != 0:
                file.write(line[:cutLength])
            else:
                file.write(line)
            line = req.readline(readBlockSize)
            lineLength = len(line)
            if lineLength < roughBoundaryLength:
                sline = line.strip()
            else:
                sline = ''

This function has a mysterious bug in it... For some files which I could disclose (one of them being the PDF file for Apple's Pages User Manual in Italian), the file uploaded to the server ends up with the same length but a different SHA-512 digest (the only digest that I'm using). The problem is a '\r' in the middle of a chunk of data that is much larger than readBlockSize.
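
One input that appears to reproduce this symptom can be sketched as follows. This is a Python 3 bytes port of only the lines quoted above, not the shipped code; FakeRequest, read_to_boundary_32beta, make_test_body and the boundary string are hypothetical names introduced for illustration. The assumption is that the trouble starts when a full readBlockSize read ends in a bare '\r': the carried '\r' is then written one line too late, so the length is preserved but the content is not.

```python
import io

READ_BLOCK_SIZE = 65368  # same value as readBlockSize in the 3.2 beta listing


class FakeRequest:
    """Hypothetical stand-in for the request object; only readline() is provided."""

    def __init__(self, data):
        self._stream = io.BytesIO(data)

    def readline(self, size=-1):
        return self._stream.readline(size)


def read_to_boundary_32beta(req, boundary, out):
    """Python 3 bytes port of the quoted 3.2 beta loop (shown lines only)."""
    delim = b''
    last_char_carried = False
    last_bound = boundary + b'--'
    rough_boundary_length = len(last_bound) + 128
    line = req.readline(READ_BLOCK_SIZE)
    line_length = len(line)
    sline = line.strip() if line_length < rough_boundary_length else b''
    while line_length > 0 and sline != boundary and sline != last_bound:
        if not last_char_carried:
            out.write(delim)
            delim = b''
        else:
            # The carried '\r' stays in delim and is not written here.
            last_char_carried = False
        cut_length = 0
        if line_length == READ_BLOCK_SIZE:
            if line[-1:] == b'\r':
                delim = b'\r'
                cut_length = -1
                last_char_carried = True
        if line[-2:] == b'\r\n':
            delim += b'\r\n'
            cut_length = -2
        elif line[-1:] == b'\n':
            delim += b'\n'
            cut_length = -1
        if cut_length != 0:
            out.write(line[:cut_length])
        else:
            out.write(line)
        line = req.readline(READ_BLOCK_SIZE)
        line_length = len(line)
        sline = line.strip() if line_length < rough_boundary_length else b''


def make_test_body(block_size=READ_BLOCK_SIZE):
    """Build a body whose first full-size read ends in a bare CR."""
    return b'A' * (block_size - 1) + b'\r' + b'B' * 100 + b'\r\n' + b'C' * 10


boundary = b'--hypothetical-boundary'
body = make_test_body()
received = io.BytesIO()
read_to_boundary_32beta(
    FakeRequest(body + b'\r\n' + boundary + b'--\r\n'), boundary, received)
# Same length, different content.
print(len(received.getvalue()) == len(body), received.getvalue() == body)
```

Writing make_test_body() to disk would give a small file of roughly 64 KB that could serve as a unit-test fixture.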

Anyhow, I wrote a new function, which I believe is much simpler, and tested it with thousands and thousands of different files; so far it seems to work fine. It reads as follows:

def read_to_boundary(self, req, boundary, file):
    ''' read from the request object line by line with a maximum size,
        until the new line starts with boundary
    '''
    previous_delimiter = ''
    while 1:
        line = req.readline(1<<16)
        if line.startswith(boundary):
            break

        if line.endswith('\r\n'):
            file.write(previous_delimiter + line[:-2])
            previous_delimiter = '\r\n'

        elif line.endswith('\r') or line.endswith('\n'):
            file.write(previous_delimiter + line[:-1])
            previous_delimiter = line[-1:]

        else:
            file.write(previous_delimiter + line)
            previous_delimiter = ''
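
A round-trip check for this function could look roughly like the sketch below. It is a Python 3 bytes port rather than the code as posted; FakeRequest, roundtrip, CHUNK and the boundary string are hypothetical names, and the end-of-input guard is an addition that the posted version does not have (without it, an empty read would loop forever).

```python
import io

CHUNK = 1 << 16  # same maximum read size as the proposed function


class FakeRequest:
    """Hypothetical stand-in for the request object; only readline() is provided."""

    def __init__(self, data):
        self._stream = io.BytesIO(data)

    def readline(self, size=-1):
        return self._stream.readline(size)


def read_to_boundary(req, boundary, out):
    """Python 3 bytes port of the proposed loop, plus an end-of-input guard."""
    previous_delimiter = b''
    while True:
        line = req.readline(CHUNK)
        # Assumption: also stop on an empty read, so a truncated request
        # cannot spin forever (the posted version has no such check).
        if not line or line.startswith(boundary):
            break
        if line.endswith(b'\r\n'):
            out.write(previous_delimiter + line[:-2])
            previous_delimiter = b'\r\n'
        elif line.endswith(b'\r') or line.endswith(b'\n'):
            out.write(previous_delimiter + line[:-1])
            previous_delimiter = line[-1:]
        else:
            out.write(previous_delimiter + line)
            previous_delimiter = b''


def roundtrip(body, boundary=b'--hypothetical-boundary'):
    """Wrap body as one multipart part and return what the loop writes out."""
    out = io.BytesIO()
    read_to_boundary(
        FakeRequest(body + b'\r\n' + boundary + b'--\r\n'), boundary, out)
    return out.getvalue()


# A bare CR at the edge of a full read, then a CRLF line and a bare LF line.
body = (b'A' * (CHUNK - 1) + b'\r' + b'B' * 100 + b'\r\n'
        + b'C' * 10 + b'\n' + b'D' * 5)
print(roundtrip(body) == body)
```

One case this sketch does not get right: if the final '\r\n' before the boundary is split across two reads (for example, a body of exactly CHUNK - 1 bytes with no newline in it), the output keeps a trailing '\r' that belongs to the delimiter. That might be worth a dedicated unit test.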

Let me know if you have any comments on it, and if you test it and it fails, please let me know as well. I don't have a Subversion account, nor do I know how to use it, hence this email.

/amn

_______________________________________________
Mod_python mailing list
[EMAIL PROTECTED] <mailto:[EMAIL PROTECTED]>
http://mailman.modpython.org/mailman/listinfo/mod_python


