Alexis,
Do you have a small file which shows this behaviour and could be used
for testing? Even better would be a function which would generate a test
file. This could be included in the mod_python unit tests.
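A generator along these lines might be a starting point. It is only a sketch: the names make_test_payload and write_test_file are made up for illustration, and it assumes the interesting case is a lone '\r' landing on the last byte of a read block.

```python
def make_test_payload(block_size=1 << 16):
    """Return bytes whose first read block ends in a lone carriage return."""
    # Assumption: the reader pulls data with readline(block_size), so a run of
    # block_size bytes without '\n' is returned as one full block.
    head = b"x" * (block_size - 1) + b"\r"   # '\r' is the last byte of block 1
    middle = b"y" * 100 + b"\r\n"            # an ordinary CRLF-terminated line
    tail = b"z" * 10                         # final line, no trailing newline
    return head + middle + tail


def write_test_file(path, block_size=1 << 16):
    """Write the payload to `path` and return its length in bytes."""
    payload = make_test_payload(block_size)
    with open(path, "wb") as handle:
        handle.write(payload)
    return len(payload)
```

The block size is a parameter so the same generator can be matched to whatever read size the code under test uses.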
Jim
Alexis Marrero wrote:
All,
The current 3.1 mod_python implementation of
mod_python.util.FieldStorage.read_to_boundary reads as follows:
    def read_to_boundary(self, req, boundary, file):
        delim = ""
        line = req.readline()
        sline = line.strip()
        last_bound = boundary + "--"
        while line and sline != boundary and sline != last_bound:
            odelim = delim
            if line[-2:] == "\r\n":
                delim = "\r\n"
                line = line[:-2]
            elif line[-1:] == "\n":
                delim = "\n"
                line = line[:-1]
            file.write(odelim + line)
            line = req.readline()
            sline = line.strip()
As we have discussed previously:
http://www.modpython.org/pipermail/mod_python/2005-March/017754.html
http://www.modpython.org/pipermail/mod_python/2005-March/017756.html
http://www.modpython.org/pipermail/mod_python/2005-November/019460.html
This triggered a couple of changes in mod_python 3.2 Beta, which now
reads as follows:
    # Fixes memory error when upload large files such as 700+MB ISOs.
    readBlockSize = 65368

    ...

    def read_to_boundary(self, req, boundary, file):
        ...
        delim = ''
        lastCharCarried = False
        last_bound = boundary + '--'
        roughBoundaryLength = len(last_bound) + 128
        line = req.readline(readBlockSize)
        lineLength = len(line)
        if lineLength < roughBoundaryLength:
            sline = line.strip()
        else:
            sline = ''
        while lineLength > 0 and sline != boundary and sline != last_bound:
            if not lastCharCarried:
                file.write(delim)
                delim = ''
            else:
                lastCharCarried = False
            cutLength = 0
            if lineLength == readBlockSize:
                if line[-1:] == '\r':
                    delim = '\r'
                    cutLength = -1
                    lastCharCarried = True
            if line[-2:] == '\r\n':
                delim += '\r\n'
                cutLength = -2
            elif line[-1:] == '\n':
                delim += '\n'
                cutLength = -1
            if cutLength != 0:
                file.write(line[:cutLength])
            else:
                file.write(line)
            line = req.readline(readBlockSize)
            lineLength = len(line)
            if lineLength < roughBoundaryLength:
                sline = line.strip()
            else:
                sline = ''
This function has a mysterious bug in it. For some files, which I could
disclose (one of them being the PDF of Apple's Pages User Manual in
Italian), the file uploaded to the server ends up with the same length
but a different SHA-512 digest (the only digest I'm using). The problem
is a '\r' in the middle of a chunk of data that is much larger
than readBlockSize.
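One way this symptom can arise is sketched below. It is not the mod_python source itself but a standalone port of the 3.2 logic to bytes, fed from an io.BytesIO in place of the request object. The names read_to_boundary_32, upload_with_32_logic and the "--testboundary" boundary are made up for illustration. It assumes the '\r' falls on the last byte of a full read block and that the next byte is not '\n'.

```python
import io

READ_BLOCK_SIZE = 65368  # same value as readBlockSize in the 3.2 beta code


def read_to_boundary_32(req, boundary, out, read_block_size=READ_BLOCK_SIZE):
    """Standalone port of the 3.2 beta loop, operating on bytes."""
    delim = b""
    last_char_carried = False
    last_bound = boundary + b"--"
    rough_boundary_length = len(last_bound) + 128
    line = req.readline(read_block_size)
    line_length = len(line)
    sline = line.strip() if line_length < rough_boundary_length else b""
    while line_length > 0 and sline != boundary and sline != last_bound:
        if not last_char_carried:
            out.write(delim)
            delim = b""
        else:
            # The carried '\r' stays in delim and is NOT written here.
            last_char_carried = False
        cut_length = 0
        if line_length == read_block_size:
            if line[-1:] == b"\r":
                delim = b"\r"
                cut_length = -1
                last_char_carried = True
        if line[-2:] == b"\r\n":
            delim += b"\r\n"
            cut_length = -2
        elif line[-1:] == b"\n":
            delim += b"\n"
            cut_length = -1
        if cut_length != 0:
            out.write(line[:cut_length])
        else:
            out.write(line)
        line = req.readline(read_block_size)
        line_length = len(line)
        sline = line.strip() if line_length < rough_boundary_length else b""


def upload_with_32_logic(payload, boundary=b"--testboundary"):
    """Append a closing boundary to payload and return what gets stored."""
    body = payload + b"\r\n" + boundary + b"--\r\n"
    out = io.BytesIO()
    read_to_boundary_32(io.BytesIO(body), boundary, out)
    return out.getvalue()


# A lone '\r' as the last byte of the first full read block.
payload = (b"x" * (READ_BLOCK_SIZE - 1) + b"\r"
           + b"y" * 100 + b"\r\n" + b"z" * 10)
stored = upload_with_32_logic(payload)
print(len(stored) == len(payload), stored == payload)
```

With this input the carried '\r' is held back and only written after the following line, so the stored data has the same length as the upload but the bytes are in a different order, which would give a different digest.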
Anyhow, I wrote a new function, which I believe is much simpler, and
tested it with thousands and thousands of different files. So far it
seems to work fine. It reads as follows:
    def read_to_boundary(self, req, boundary, file):
        ''' read from the request object line by line with a maximum size,
            until the new line starts with boundary
        '''
        previous_delimiter = ''
        while 1:
            line = req.readline(1<<16)
            if line.startswith(boundary):
                break
            if line.endswith('\r\n'):
                file.write(previous_delimiter + line[:-2])
                previous_delimiter = '\r\n'
            elif line.endswith('\r') or line.endswith('\n'):
                file.write(previous_delimiter + line[:-1])
                previous_delimiter = line[-1:]
            else:
                file.write(previous_delimiter + line)
                previous_delimiter = ''
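For anyone who wants to try this outside of Apache, here is a rough round-trip harness. It is a sketch under assumptions, not the function as posted: it is a standalone port to bytes, it uses io.BytesIO as a stand-in for the request object, the round_trip helper and the "--testboundary" boundary are hypothetical, and it adds an end-of-input guard that the function as posted does not have.

```python
import io


def read_to_boundary(req, boundary, out, block_size=1 << 16):
    """Copy data from req to out until a line starts with boundary."""
    previous_delimiter = b""
    while True:
        line = req.readline(block_size)
        # Added guard (an assumption, not in the posted code): stop at end
        # of input so a missing boundary cannot loop forever.
        if not line or line.startswith(boundary):
            break
        if line.endswith(b"\r\n"):
            out.write(previous_delimiter + line[:-2])
            previous_delimiter = b"\r\n"
        elif line.endswith(b"\r") or line.endswith(b"\n"):
            out.write(previous_delimiter + line[:-1])
            previous_delimiter = line[-1:]
        else:
            out.write(previous_delimiter + line)
            previous_delimiter = b""


def round_trip(payload, boundary=b"--testboundary", block_size=1 << 16):
    """Wrap payload with a closing boundary and return the bytes stored."""
    body = payload + b"\r\n" + boundary + b"--\r\n"
    out = io.BytesIO()
    read_to_boundary(io.BytesIO(body), boundary, out, block_size)
    return out.getvalue()


# Mixed line endings inside a single read block.
small = b"abc\rdef\nghi\r\njkl"
# A lone '\r' as the last byte of a full 64 KiB read block.
big = b"x" * 65535 + b"\r" + b"y" * 100 + b"\r\n" + b"z" * 10
print(round_trip(small) == small, round_trip(big) == big)
```

These two inputs only exercise a couple of placements of '\r' relative to the block edge, so a passing run does not show the function is correct for every placement.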
Let me know if you have any comments on it, and if you test it and it
fails, please let me know as well. I don't have a Subversion account,
nor do I know how to use it, hence this email.
/amn