Re: mod_python.util.StorageField.read_to_boundary has problems in 3.1 and 3.2

2005-11-12 Thread Nicolas Lehuen
OK, this time I think the file upload problem is solved for good. I've checked-in Alexis's code, with comments. Then I've done a quick rewrite of the multipart/form-data parser found in FieldStorage.__init__ and read_to_boundary so that it uses a regexp for the boundary checks, with the hope that i
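A regexp-based boundary check of the kind mentioned above might look like the following minimal sketch. It is not the code that was checked in: the helper name make_boundary_matcher and the exact pattern are assumptions for illustration, written for current Python with bytes input.

```python
import re


def make_boundary_matcher(boundary):
    """Compile a pattern for a multipart delimiter line.

    `boundary` is assumed to be the raw boundary value from the
    Content-Type header, as bytes, without the leading dashes.
    """
    # "--" + boundary, optional closing "--", optional line terminator.
    return re.compile(b"^--" + re.escape(boundary) + b"(--)?\r?\n?$")


matcher = make_boundary_matcher(b"abc")

middle = matcher.match(b"--abc\r\n")    # delimiter between parts
final = matcher.match(b"--abc--\r\n")   # closing delimiter
other = matcher.match(b"--abcd\r\n")    # ordinary data line

print(middle is not None and middle.group(1) is None)
print(final is not None and final.group(1) == b"--")
print(other is None)
```

Group 1 tells the caller whether the line is the closing delimiter, so one match call covers both checks.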

Re: mod_python.util.StorageField.read_to_boundary has problems in 3.1 and 3.2

2005-11-12 Thread Nicolas Lehuen
Guys, next time please use the JIRA bug tracking application available at: http://nagoya.apache.org/jira/browse/MODPYTHON Especially this bug report: http://issues.apache.org/jira/browse/MODPYTHON-40 I'm currently re-reading the whole thread and trying to make sense of all the test files and

Re: mod_python.util.StorageField.read_to_boundary has problems in 3.1 and 3.2

2005-11-08 Thread Alexis Marrero
Inspired by Mike's changes, I made some changes to the "new" version to improve performance while keeping readability:
    def read_to_boundary_new(self, req, boundary, file, readBlockSize):
        previous_delimiter = ''
        bound_length = len(boundary)
        while 1:
            line = req.readline(readBlockS

Re: mod_python.util.StorageField.read_to_boundary has problems in 3.1 and 3.2

2005-11-08 Thread Jim Gallacher
Mike Looijmans wrote: I've attached a modified upload_test_harness.py that includes the new and current, also the 'org' version (as in 3.1 release) and the 'mike' version. Nice changes, Mike. I started to get confused by the names of the various read_to_boundary_* functions, so I've made a s

Re: mod_python.util.StorageField.read_to_boundary has problems in 3.1 and 3.2

2005-11-08 Thread Mike Looijmans
Alexis Marrero wrote: The next test that I will run this against will be with an obscene amount of data for which this improvement helps a lot! The dumb thing is the checking for boundaries. I'm using http "chunked" encoding to access a raw TAPE device through HTTP with python (it GETs or PO

Re: mod_python.util.StorageField.read_to_boundary has problems in 3.1 and 3.2

2005-11-08 Thread Alexis Marrero
Thanks for that improvement, though I don't like its complexity. I'm testing "mike's" version with my set of files; I will let you all know how it goes. BTW, the line that reads "last_bound = boundary + '--'" so we save 4 CPU cycles there :) The next test that I will run this against will be

Re: mod_python.util.StorageField.read_to_boundary has problems in 3.1 and 3.2

2005-11-08 Thread Mike Looijmans
Here's one that passes all the tests, and is 2x as fast as the 'current' and 'new' implementations on random binary data. I haven't been able to generate data where the 'mike' version is slower:
    def read_to_boundary(self, req, boundary, file, readBlockSize=65536):
        prevline = ""
        last_bou

Re: mod_python.util.StorageField.read_to_boundary has problems in 3.1 and 3.2

2005-11-07 Thread Mike Looijmans
What I don't like at all in this implementation is the large number of memcpy operations:
1. line.strip()
2. line[:-x]
3. previous_delimiter + ...
The average pass will perform between two and three memcpy operations on the read block. Suggestion: Lose the strip() call - it serves no purpose

Re: mod_python.util.StorageField.read_to_boundary has problems in 3.1 and 3.2

2005-11-07 Thread Alexis Marrero
New version of read_to_boundary(...):
    readBlockSize = 1 << 16
    def read_to_boundary(self, req, boundary, file):
        previous_delimiter = ''
        while 1:
            line = req.readline(readBlockSize)
            if line.strip().startswith(boundary):
                break
            if line.endswith('\r\n'):
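The archived snippet is cut off, so the following is not Alexis's function. It is a minimal sketch of the same hold-back idea: withhold each trailing line terminator until the next read shows whether it belongs to the data or precedes the boundary. It assumes a file-like stream with readline(size) in place of req, bytes input, and a boundary argument that already includes the leading "--"; the variable names are illustrative.

```python
import io


def read_to_boundary(stream, boundary, out, block_size=1 << 16):
    """Copy data from stream to out until a line starts with boundary.

    Returns True if the boundary was found, False at end of input.
    """
    held = b""            # terminator bytes withheld from the last chunk
    at_line_start = True  # True when the next chunk begins a new line
    while True:
        chunk = stream.readline(block_size)
        if not chunk:
            out.write(held)  # no boundary followed, so held bytes are data
            return False
        if at_line_start and chunk.startswith(boundary):
            return True      # held bytes were the delimiter; drop them
        if held == b"\r" and chunk == b"\n":
            # A CRLF split across two reads: keep withholding it whole.
            held, at_line_start = b"\r\n", True
            continue
        out.write(held)
        if chunk.endswith(b"\r\n"):
            body, held = chunk[:-2], b"\r\n"
        elif chunk.endswith(b"\n"):
            body, held = chunk[:-1], b"\n"
        elif chunk.endswith(b"\r"):
            body, held = chunk[:-1], b"\r"
        else:
            body, held = chunk, b""
        out.write(body)
        at_line_start = held.endswith(b"\n")


# Payload ending in '\r' exactly at the block boundary (block_size=8).
payload = b"a" * 7 + b"\r"
source = io.BytesIO(payload + b"\r\n--B--\r\n")
sink = io.BytesIO()
found = read_to_boundary(source, b"--B", sink, block_size=8)
print(found, sink.getvalue() == payload)
```

The boundary is only tested at the start of a line, so boundary-like bytes in the middle of a long binary run are copied through as data.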

Re: mod_python.util.StorageField.read_to_boundary has problems in 3.1 and 3.2

2005-11-07 Thread Jim Gallacher
Alexis Marrero wrote: Ok. Now I'm confused. So am I! I've created a test harness so we can bypass mod_python completely. It includes a slightly modified version of read_to_boundary which adds a new parameter, readBlockSize. In the output from the test harness, your version is 'new' and the

Re: mod_python.util.StorageField.read_to_boundary has problems in 3.1 and 3.2

2005-11-07 Thread Jim Gallacher
Alexis Marrero wrote: Sorry for all these emails, No worries. It's a bug that needs to be fixed, so your work will benefit everyone. :) Jim

Re: mod_python.util.StorageField.read_to_boundary has problems in 3.1 and 3.2

2005-11-07 Thread Alexis Marrero
Sorry for all these emails, but my system depends 100% on mod_python, especially file uploading. :) On Nov 7, 2005, at 2:04 PM, Jim Gallacher wrote: Alexis Marrero wrote: Jim, Nicolas, Thanks for sending the function that creates the test file. However I ran it to create the test file, and a

Re: mod_python.util.StorageField.read_to_boundary has problems in 3.1 and 3.2

2005-11-07 Thread Jim Gallacher
Jim Gallacher wrote: Alexis Marrero wrote: Jim, Thanks for sending the function that creates the test file. However I ran it to create the test file, and after uploading the file the MD5 is still the same. Just to clarify, is this for your new read_to_boundary or the one in 3.2? If it's for y

Re: mod_python.util.StorageField.read_to_boundary has problems in 3.1 and 3.2

2005-11-07 Thread Jim Gallacher
Alexis Marrero wrote: Jim, Nicolas, Thanks for sending the function that creates the test file. However I ran it to create the test file, and after uploading the file the MD5 is still the same. Did you call it with the same block size as you are using in your code? The '\r' character must app

Re: mod_python.util.StorageField.read_to_boundary has problems in 3.1 and 3.2

2005-11-07 Thread Jim Gallacher
Nicolas Lehuen wrote: Well, I've re-read the previous code and it looks like it does almost the same thing except it is bugged :). CherryPy's implementation is almost the same except it ought to work. Jim, I've integrated your tricky file into the unit test. Alexis' version passes all tests,

Re: mod_python.util.StorageField.read_to_boundary has problems in 3.1 and 3.2

2005-11-06 Thread Nicolas Lehuen
Well, I've re-read the previous code and it looks like it does almost the same thing except it is bugged :). CherryPy's implementation is almost the same except it ought to work. Jim, I've integrated your tricky file into the unit test. Alexis' version passes all tests, whereas the current version

Re: mod_python.util.StorageField.read_to_boundary has problems in 3.1 and 3.2

2005-11-06 Thread Jim Gallacher
Gregory (Grisha) Trubetskoy wrote: So I guess this means we roll and vote on a 3.2.5b? As much as it pains me to say it, yes, this is a must-fix, so it's on to 3.2.5b. I think we need to do some more extensive testing on Alexis's fix before we roll 3.2.5b. His read_to_boundary is much

Re: mod_python.util.StorageField.read_to_boundary has problems in 3.1 and 3.2

2005-11-06 Thread Gregory (Grisha) Trubetskoy
So I guess this means we roll and vote on a 3.2.5b? Grisha On Sun, 6 Nov 2005, Nicolas Lehuen wrote: OK, it looks like Alexis' fix solves the problem with ugh.pdf without breaking the other unit tests. So I think we can safely integrate his patch. Shall I do it? Regards, Nicolas

Re: mod_python.util.StorageField.read_to_boundary has problems in 3.1 and 3.2

2005-11-06 Thread Jim Gallacher
I've been spending some quality time with hexedit, vim and a little bit of python. I can now generate a file which can be used in the unit test. The problem seems to occur when a '\r' character is right at the readBlockSize boundary, which is 65368 in the current mod_python.util. I have not yet t
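A generator for such a test payload could be sketched as follows. This is not Jim's script: the function name and filler byte are hypothetical, and it assumes the troublesome case is a lone '\r' as the last byte of the first readBlockSize-sized block.

```python
def make_boundary_test_data(block_size=65368, total_blocks=2, filler=b"x"):
    """Return bytes whose first block ends in a lone carriage return.

    block_size defaults to the readBlockSize value quoted in the thread;
    filler is an arbitrary single byte that is neither CR nor LF.
    """
    first_block = filler * (block_size - 1) + b"\r"
    remaining = filler * (block_size * (total_blocks - 1))
    return first_block + remaining


data = make_boundary_test_data()
# The carriage return sits at index block_size - 1.
print(len(data), data[65367:65368])
```

Writing the returned bytes to a file in binary mode gives an upload fixture whose size and '\r' position are known in advance.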

Re: mod_python.util.StorageField.read_to_boundary has problems in 3.1 and 3.2

2005-11-06 Thread Alexis Marrero
Nicolas, Not that I'm the one to give permission whether to integrate things or not, but just to let you know I don't even have svn installed so I won't do it. At least not for a while... BTW, if there are some cherrypy developers in this mailing list, the CherryPy function that handles file uploads

Re: mod_python.util.StorageField.read_to_boundary has problems in 3.1 and 3.2

2005-11-06 Thread Nicolas Lehuen
OK, it looks like Alexis' fix solves the problem with ugh.pdf without breaking the other unit tests. So I think we can safely integrate his patch. Shall I do it? Regards, Nicolas 2005/11/6, Nicolas Lehuen <[EMAIL PROTECTED]>: Hi guys, In the pure "if it ain't tested, it ain't fixed" fashion, I've

Re: mod_python.util.StorageField.read_to_boundary has problems in 3.1 and 3.2

2005-11-06 Thread Nicolas Lehuen
Hi guys, In the pure "if it ain't tested, it ain't fixed" fashion, I've added a unit test for file upload to the test suite. It uploads a randomly generated 1 MB file to the server, and checks that the MD5 digest returned by the server is correct. I could not reproduce Alexis' bug report this way,
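The digest comparison described here can be sketched without a server. This is not the test that was added to the mod_python suite: the handler argument is a hypothetical stand-in for the whole upload path, and the helper names are illustrative.

```python
import hashlib
import os


def md5_hex(data):
    """Return the hexadecimal MD5 digest of a bytes object."""
    return hashlib.md5(data).hexdigest()


def roundtrip_matches(payload, handler):
    """Check that handler returns the payload unchanged, by MD5 digest.

    handler is a hypothetical callable standing in for "upload the file
    and return what the server stored".
    """
    received = handler(payload)
    return md5_hex(received) == md5_hex(payload)


# A randomly generated 1 MB payload, as in the test described above.
payload = os.urandom(1024 * 1024)
print(roundtrip_matches(payload, lambda data: data))
```

Random data rarely places a '\r' exactly at a block boundary, which may be why a random file alone did not reproduce the report; a deliberately constructed payload is a useful second fixture.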

Re: mod_python.util.StorageField.read_to_boundary has problems in 3.1 and 3.2

2005-11-06 Thread Alexis Marrero
I don't have a function that creates the files, but I can point you to a file that has the problem; ironically, it is the "Unix Haters Handbook" :) Well, at least it is not the Python HH http://research.microsoft.com/~daniel/uhh-download.html Its MD5 is 9e8c42be55aac825e7a34d448044d0fe. I don't

Re: mod_python.util.StorageField.read_to_boundary has problems in 3.1 and 3.2

2005-11-06 Thread Jim Gallacher
Alexis, I wanted to add that I'm testing your code. Alexis Marrero wrote: Let me know any comments on it, and if you test it and it fails please also let me know. I don't have a subversion account, nor do I know how to use it, thus this email. You don't need an account to use subversion ano

Re: mod_python.util.StorageField.read_to_boundary has problems in 3.1 and 3.2

2005-11-06 Thread Jim Gallacher
Alexis, Do you have a small file which shows this behaviour and could be used for testing? Even better would be a function which would generate a test file. This could be included in the mod_python unit tests. Jim Alexis Marrero wrote: All, The current 3.1 mod_python implementation of