Re: mod_python.util.StorageField.read_to_boundary has problems in 3.1 and 3.2

Jim Gallacher Mon, 07 Nov 2005 09:49:15 -0800

Nicolas Lehuen wrote:

Well, I've re-read the previous code and it looks like it does almostthe same thing except it is bugged :). CherryPy's implementation isalmost the same except it ought to work.
Jim, I've integrated your tricky file into the unit test. Alexis'version passes all tests, whereas the current version fails on your fileand ugh.pdf. So I guess we can call it a day and integrate Alexis' fix.
One thing which is different is the use of 1<<16 = 65536 as read blocksize instead of 65368. It was Barry Pearce who contributed the latestversion of read_to_boundary, he must have known why this block size wasgood (or was it his lucky number ?). Anyway, I've changed Alexis' codeto use the readBlockSize variable and changed its value to 1<<16. I'vedone tests with both 1<<16 and 65638 on the server side and on theclient side (changing the tricky file generation), and the new code works.

There is a failure mode in Alexis' code. If the final line in the filehas length == readBlockSize - 1, the boundary (eg. '\r\n--myboundary')gets split after the '\r'. The read terminates properly, but a stray'\r' gets appended to the file. Interestingly, the original codementions that this is unlikely to happen, but seems to handle it anyway.

This bit of code will generate a file which will cause a problem forAlexis' read_to_boundary.


def generate_split_file(offset=-1,
                          readBlockSize=65368,
                          fname='testfile'):
    f = open(fname, 'w')
    f.write('a'*50)
    f.write('\r\n')
    block_size =  readBlockSize + offset
    f.write('b'*block_size)
    f.close()

I'll write a new unit test to cover this situation.

I'm also thinking that the unit test should not hard code thereadBlockSize right in the test. It would be better to have a globalvariable at the top of test.py, with a note in both test.py and util.pythat it's value must be the same in both files. Otherwise, if the valueof readBlockSize changes in util.py, we may end up with regressions thatthe unit test will not catch.

As far as Nicolas' test_fileupload unit test is concerned I'd beinclined to remove the test for ugh.pdf. I'm pretty confident that thesimulated 'tricky' file is an adequate test, plus we can't be sure thatugh.pdf will not change in the future.

Alexis mentioned that he had 5 pdf files that were corrupted. If theother files are publicly accessible I could take a look and see if theyare failing the same way, with a '\r' character right at a readBlockSizeboundary point. Alternatively, I could polish my test script and he canrun it against his files. If we do that then I'm confident we cangenerate a synthetic file for testing and not depend on ugh.pdf

Jim

Re: mod_python.util.StorageField.read_to_boundary has problems in 3.1 and 3.2

Reply via email to