What i don't like at all in this implementation is the large amount of
memcpy operations.
1. line.strip()
2. line[:-x]
3. previous_delimiter + ...
The average pass will perform between two and three memcopy operations
on the read block.
Suggestion: Loose the strip() call - it serves no purpose (the boundary
will be the only thing on that line, otherwise it's not a boundary)
I've built one without any memcpy whatsoever, I've tried it with the
test harnass and it appears equal to the Alexis version (pass all except
the generate_split_boundary_file test):
def read_to_boundary(self, req, boundary, file, readBlockSize=65536):
prevline = ""
while 1:
line = req.readline(readBlockSize)
if not line or line.startswith(boundary):
if prevline.endswith('\r\n'):
file.write(prevline[:-2])
elif prevline.endswith('\n'):
file.write(prevline[:-1])
break
file.write(prevline)
prevline = line
Alexis Marrero wrote:
> New version of read_to_boundary(...)
>
> readBlockSize = 1 << 16
> def read_to_boundary(self, req, boundary, file):
> previous_delimiter = ''
> while 1:
> line = req.readline(readBlockSize)
> if line.strip().startswith(boundary):
> break
>
> if line.endswith('\r\n'):
> file.write(previous_delimiter + line[:-2])
> previous_delimiter = '\r\n'
>
> elif line.endswith('\r'):
> file.write(previous_delimiter + line[:-1])
> previous_delimiter = '\r'
>
> elif line.endswith('\n'):
> if len(line[:-1]) > 0:
> file.write(previous_delimiter + line[:-1])
> previous_delimiter = '\n'
>
> else:
> previous_delimiter += '\n'
>
> else:
> file.write(previous_delimiter + line)
> previous_delimiter = ''
>
> This new functions passes the test for Jim's filetest generator.
--
Mike Looijmans
Philips Natlab / Topic Automation