OK, this time I think the file upload problem is solved for good. I've
checked-in Alexis's code, with comments. Then I've done a quick
rewrite of the multipart/form-data parser found in
FieldStorage.__init__ and read_to_boundary so that it uses a regexp
for the boundary checks, with the hope that it simplify the code a
little bit (and remove thos nasty strip() calls). I've re-ran all
tests and everything seems OK. Now, Alexis, if you can run your test
with the 60,000+ files on this version of util.py, that would be great
:

http://svn.apache.org/viewcvs.cgi/httpd/mod_python/trunk/lib/python/mod_python/util.py?rev=332779&view=markup

Ironically, the JIRA server seems non responsive right now, so I can't
comment or close MODPYTHON-40...

Regards,
Nicolas

2005/11/12, Nicolas Lehuen <[EMAIL PROTECTED]>:
> Guys, next time please use the JIRA bug tracking application available at :
>
> http://nagoya.apache.org/jira/browse/MODPYTHON
>
> Especially this bug report :
>
> http://issues.apache.org/jira/browse/MODPYTHON-40
>
> I'm currently re-reading the whole thread and trying to make sense of
> all the test files and patches that were submitted, and it's quite
> unpleasant, especially since I'm using Google mail which tries to be
> too clever about attachments.
>
> Using JIRA, we have a permanent tracking of what's happening AND we
> can store attachments in a nice way.
>
> Regards,
> Nicolas
>
> 2005/11/8, Alexis Marrero <[EMAIL PROTECTED]>:
> > Inspired by Mike's changes I made some changes the "new" version to
> > improve performance while keeping readability:
> >
> > def read_to_boundary_new(self, req, boundary, file, readBlockSize):
> >      previous_delimiter = ''
> >      bound_length = len(boundary)
> >      while 1:
> >          line = req.readline(readBlockSize)
> >          if line[:bound_length] == boundary:
> >              break
> >
> >          if line[-2:] == '\r\n':
> >              file.write(previous_delimiter + line[:-2])
> >              previous_delimiter = '\r\n'
> >
> >          elif line[-1:] == '\r':
> >              file.write(previous_delimiter + line[:-1])
> >              previous_delimiter = '\r'
> >
> >          elif line[-1:] == '\n':
> >              if len(line[:-1]) > 0:
> >                  file.write(previous_delimiter + line[:-1])
> >                  previous_delimiter = '\n'
> >
> >              else:
> >                  previous_delimiter += '\n'
> >
> >          else:
> >              file.write(previous_delimiter + line)
> >              previous_delimiter = ''
> >
> > Results are very comparable to read_to_boundary_mike
> >
> > /amn
> > On Nov 8, 2005, at 9:07 AM, Jim Gallacher wrote:
> >
> > > Mike Looijmans wrote:
> > >
> > >> I've attached a modified upload_test_harness.py that includes the
> > >> new and current, also the 'org' version (as in 3.1 release) and
> > >> the 'mike' version.
> > >>
> > >
> > > Nice changes, Mike.
> > >
> > > I started to get confused by the names of the various
> > > read_to_boundary_* functions, so I've made a slight modification to
> > > your code and renamed these functions.
> > >
> > > Note also that there are some failures for the split_boundary file
> > > using offset -2 chunk [CR], so I've change the default offset to
> > > include this condition.
> > >
> > > I've also changed the generate_*file functions to embed an
> > > additional '\r' in a short string. Just being paranoid I guess.
> > >
> > > Diff is attached.
> > >
> > > If it's useful I can stick the upload_test_harness code in the
> > > mod_python svn repository.
> > >
> > > Jim
> > >
> > > --- upload_test_harness_mike.py.orig    2005-11-08
> > > 08:02:13.000000000 -0500
> > > +++ upload_test_harness_mike.py    2005-11-08 08:54:15.000000000 -0500
> > > @@ -14,8 +14,8 @@
> > >  ##    f.write('b'*block_size)
> > >  ##    f.close()
> > >
> > > -def read_to_boundary_current(self, req, boundary, file,
> > > readBlockSize):
> > > -    ''' currrent version '''
> > > +def read_to_boundary_324(self, req, boundary, file, readBlockSize):
> > > +    ''' currrent version  in 3.2.4b'''
> > >      #
> > >      # Although technically possible for the boundary to be split
> > > by the read, this will
> > >      # not happen because the readBlockSize is set quite high - far
> > > longer than any boundary line
> > > @@ -66,7 +66,7 @@
> > >          else:
> > >              sline = ''
> > >
> > > -def read_to_boundary_new(self, req, boundary, file, readBlockSize):
> > > +def read_to_boundary_alexis(self, req, boundary, file,
> > > readBlockSize):
> > >      ''' Alexis' version
> > >          read from the request object line by line with a maximum
> > > size,
> > >          until the new line starts with boundary
> > > @@ -89,7 +89,7 @@
> > >              file.write(previous_delimiter + line)
> > >              previous_delimiter = ''
> > >
> > > -def read_to_boundary_org(self, req, boundary, file, readBlockSize):
> > > +def read_to_boundary_314(self, req, boundary, file, readBlockSize):
> > >      delim = ""
> > >      line = req.readline(readBlockSize)
> > >      while line and not line.startswith(boundary):
> > > @@ -145,6 +145,8 @@
> > >
> > >      f = open(fname, 'wb')
> > >      f.write('a'*50)
> > > +    f.write('\r')
> > > +    f.write('a'*50)
> > >      f.write('\r\n')
> > >
> > >      block_size =  readBlockSize + offset
> > > @@ -163,6 +165,8 @@
> > >      """
> > >      f = open(fname, 'wb')
> > >      f.write('a'*50)
> > > +    f.write('\r')
> > > +    f.write('a'*50)
> > >      f.write('\r\n')
> > >
> > >      block_size =  readBlockSize + offset
> > > @@ -171,7 +175,7 @@
> > >
> > >      f.close()
> > >
> > > -read_boundaries = [read_to_boundary_current, read_to_boundary_new,
> > > read_to_boundary_org, read_to_boundary_mike]
> > > +read_boundaries = [read_to_boundary_324, read_to_boundary_alexis,
> > > read_to_boundary_314, read_to_boundary_mike]
> > >
> > >  def main(file_generator, offset, chunk, block_size=1<<16):
> > >      fname_in = 'testfile.in'
> > > @@ -218,6 +222,9 @@
> > >          pass
> > >
> > >  if __name__ == '__main__':
> > > +    # offsets which show problems are in [-2, -1]
> > > +    first_offset = -2
> > > +    last_offset = 0
> > >
> > >      #test_chunks =  ['', '\r', '\n', '\r\n']
> > >
> > > @@ -229,7 +236,7 @@
> > >          print '='*40
> > >          print file_gen_obj.__name__
> > >          for chunk in test_chunks:
> > > -            for i in range(-1, 0):
> > > +            for i in range(first_offset, last_offset):
> > >                  print '-'*40
> > >                  print 'test offset', i, 'chunk',[ cname(c) for c
> > > in chunk ]
> > >                  print '-'*40
> > >
> >
> >
>

Reply via email to