mod_python.util.FieldStorage.read_to_boundary has problems in 3.1 and 3.2

Alexis Marrero Sun, 06 Nov 2005 04:06:00 -0800

All,

The current mod_python 3.1 implementation of mod_python.util.FieldStorage.read_to_boundary reads as follows:

    def read_to_boundary(self, req, boundary, file):
        delim = ""
        line = req.readline()
        sline = line.strip()
        last_bound = boundary + "--"
        while line and sline != boundary and sline != last_bound:
            odelim = delim
            if line[-2:] == "\r\n":
                delim = "\r\n"
                line = line[:-2]
            elif line[-1:] == "\n":
                delim = "\n"
                line = line[:-1]
            file.write(odelim + line)
            line = req.readline()
            sline = line.strip()

As we have discussed previously:

http://www.modpython.org/pipermail/mod_python/2005-March/017754.html

http://www.modpython.org/pipermail/mod_python/2005-March/017756.html

http://www.modpython.org/pipermail/mod_python/2005-November/019460.html

This triggered a couple of changes in mod_python 3.2 Beta, which now reads as follows:

    # Fixes memory error when upload large files such as 700+MB ISOs.
    readBlockSize = 65368

    ...

    def read_to_boundary(self, req, boundary, file):
        ...
        delim = ''
        lastCharCarried = False
        last_bound = boundary + '--'
        roughBoundaryLength = len(last_bound) + 128
        line = req.readline(readBlockSize)
        lineLength = len(line)
        if lineLength < roughBoundaryLength:
            sline = line.strip()
        else:
            sline = ''
        while lineLength > 0 and sline != boundary and sline != last_bound:
            if not lastCharCarried:
                file.write(delim)
                delim = ''
            else:
                lastCharCarried = False
            cutLength = 0
            if lineLength == readBlockSize:
                if line[-1:] == '\r':
                    delim = '\r'
                    cutLength = -1
                    lastCharCarried = True
            if line[-2:] == '\r\n':
                delim += '\r\n'
                cutLength = -2
            elif line[-1:] == '\n':
                delim += '\n'
                cutLength = -1
            if cutLength != 0:
                file.write(line[:cutLength])
            else:
                file.write(line)
            line = req.readline(readBlockSize)
            lineLength = len(line)
            if lineLength < roughBoundaryLength:
                sline = line.strip()
            else:
                sline = ''

This function has a mysterious bug in it. For some files, which I can disclose (one of them being the PDF of Apple's Pages User Manual in Italian), the file uploaded to the server ends up with the same length but a different sha512 (the only digest that I'm using). The problem is a '\r' in the middle of a chunk of data that is much larger than readBlockSize.
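To make the symptom concrete, here is a small reproduction sketch in Python 3. It is not the mod_python code itself: FakeRequest and Sink are hypothetical stand-ins for the request object and the output file, the function is a standalone transcription of the 3.2 beta logic with readBlockSize turned into a parameter, and the block size of 8 is chosen only to keep the test data short. It assumes that req.readline(n) returns at most n characters and stops after a '\n'.

```python
class FakeRequest:
    """Hypothetical stand-in for the request object (an assumption)."""

    def __init__(self, data):
        self.data = data
        self.pos = 0

    def readline(self, size=-1):
        # Return at most `size` characters, stopping after the first '\n'.
        end = self.data.find('\n', self.pos)
        end = len(self.data) if end == -1 else end + 1
        if size >= 0:
            end = min(end, self.pos + size)
        line = self.data[self.pos:end]
        self.pos = end
        return line


class Sink:
    """Hypothetical stand-in for the output file; collects written pieces."""

    def __init__(self):
        self.parts = []

    def write(self, text):
        self.parts.append(text)

    def value(self):
        return ''.join(self.parts)


def read_to_boundary_3_2(req, boundary, file, readBlockSize):
    # Transcription of the 3.2 beta logic; only the block size is a parameter.
    delim = ''
    lastCharCarried = False
    last_bound = boundary + '--'
    roughBoundaryLength = len(last_bound) + 128
    line = req.readline(readBlockSize)
    lineLength = len(line)
    sline = line.strip() if lineLength < roughBoundaryLength else ''
    while lineLength > 0 and sline != boundary and sline != last_bound:
        if not lastCharCarried:
            file.write(delim)
            delim = ''
        else:
            lastCharCarried = False
        cutLength = 0
        if lineLength == readBlockSize and line[-1:] == '\r':
            # A full block ending in '\r': the '\r' is held back in delim.
            delim = '\r'
            cutLength = -1
            lastCharCarried = True
        if line[-2:] == '\r\n':
            delim += '\r\n'
            cutLength = -2
        elif line[-1:] == '\n':
            delim += '\n'
            cutLength = -1
        file.write(line[:cutLength] if cutLength else line)
        line = req.readline(readBlockSize)
        lineLength = len(line)
        sline = line.strip() if lineLength < roughBoundaryLength else ''


# The first 8-character read ends exactly on the lone '\r'.
body = 'AAAAAAA\rBBB\nCCC'
sink = Sink()
read_to_boundary_3_2(FakeRequest(body + '\r\n--B--\r\n'), '--B', sink, 8)
result = sink.value()
print(repr(body))
print(repr(result))
```

With this input the output has the same length as the body, but the '\r' that ended the first full block is written after 'BBB' instead of before it. That would be consistent with an upload whose size matches and whose digest does not.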

Anyhow, I wrote a new function, which I believe is much simpler, and tested it with thousands of different files; so far it seems to work fine. It reads as follows:

    def read_to_boundary(self, req, boundary, file):
        ''' read from the request object line by line with a maximum size,
            until the new line starts with boundary
        '''
        previous_delimiter = ''
        while 1:
            line = req.readline(1<<16)
            if line.startswith(boundary):
                break
            if line.endswith('\r\n'):
                file.write(previous_delimiter + line[:-2])
                previous_delimiter = '\r\n'
            elif line.endswith('\r') or line.endswith('\n'):
                file.write(previous_delimiter + line[:-1])
                previous_delimiter = line[-1:]
            else:
                file.write(previous_delimiter + line)
                previous_delimiter = ''
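For anyone who wants to try the function outside Apache, the following Python 3 sketch runs it against in-memory data. FakeRequest, Sink and upload are hypothetical helpers, not part of mod_python, and the block size is a parameter only so that short test strings can cross a block edge. The `if not line` check is an addition that is not in the function as posted; without it the loop would not end if the input stops before a boundary is seen.

```python
class FakeRequest:
    """Hypothetical stand-in for the request object (an assumption)."""

    def __init__(self, data):
        self.data = data
        self.pos = 0

    def readline(self, size=-1):
        # Return at most `size` characters, stopping after the first '\n'.
        end = self.data.find('\n', self.pos)
        end = len(self.data) if end == -1 else end + 1
        if size >= 0:
            end = min(end, self.pos + size)
        line = self.data[self.pos:end]
        self.pos = end
        return line


class Sink:
    """Hypothetical stand-in for the output file; collects written pieces."""

    def __init__(self):
        self.parts = []

    def write(self, text):
        self.parts.append(text)

    def value(self):
        return ''.join(self.parts)


def read_to_boundary(req, boundary, file, block_size=1 << 16):
    previous_delimiter = ''
    while True:
        line = req.readline(block_size)
        # Added guard (not in the posted function): stop at end of input.
        if not line or line.startswith(boundary):
            break
        if line.endswith('\r\n'):
            file.write(previous_delimiter + line[:-2])
            previous_delimiter = '\r\n'
        elif line.endswith('\r') or line.endswith('\n'):
            file.write(previous_delimiter + line[:-1])
            previous_delimiter = line[-1:]
        else:
            file.write(previous_delimiter + line)
            previous_delimiter = ''


def upload(body, block_size):
    """Send body plus a closing boundary through read_to_boundary."""
    sink = Sink()
    stream = FakeRequest(body + '\r\n--B--\r\n')
    read_to_boundary(stream, '--B', sink, block_size)
    return sink.value()


# A lone '\r' that lands on the end of a full block survives unchanged.
body = 'AAAAAAA\rBBB\nCCC'
round_trip = upload(body, block_size=8)

# Input that ends before any boundary: the added guard ends the loop.
sink = Sink()
read_to_boundary(FakeRequest('abc'), '--B', sink, 8)
truncated = sink.value()

# The '\r\n' before the boundary is split across two reads here.
split_case = upload('AAAAAAA', block_size=8)

print(round_trip == body, repr(truncated), repr(split_case))
```

In this harness the lone '\r' case round-trips correctly. The last case is worth a look, though: when the '\r\n' that precedes the boundary is split across two reads, the sketch returns the body with a trailing '\r' attached, so that situation may need separate handling.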

Please send me any comments, and if you test it and it fails, let me know as well. I don't have a Subversion account, nor do I know how to use it, hence this email.

/amn

