Generally, your post is filled with good thoughts :)
Below are a few comments:
Colin Blake wrote:
> There are several other problems with the current code in addition to
> the fundamental problem stated above. First, everything is done twice;
> we make two passes through the form controls, the first to compute the
> Content-Length and the second to actually write the content to a file.
> This is inefficient
Agreed.
> , bloats the code
How? Is there redundant code for the second pass?
> , and makes any future code changes
> error prone.
Certainly.
> Second, if a file is modified while this code is running
> then it is possible (in fact likely) that we will send an incorrect
> Content-Length header.
Grabbing a file handle with write permission for the duration of the
process would help solve this, but it has its own set of problems (e.g.,
what if the file is on read-only media?).
> Third, the error handling is virtually
> nonexistent (errors are detected but we don't respond correctly to
> them).
>
> I propose fixing all of these problems. Instead of making two passes
> through the form controls we'll just make one, and write out all the
> HTML to a temporary file.
File access is slow, and you may not always have a disk to write to.
It'd be nice if you wrote to memory when possible and only went to
disk when necessary. I think this general problem has been solved and an
abstraction for a temp file exists, so you don't have to sweat the
details. But I don't recall the class name for temp files, or even
whether it's in the tree yet.
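
To make the "memory first, disk only if needed" idea concrete, here is a
rough sketch in Python rather than in our own code. It leans on the
standard library's tempfile.SpooledTemporaryFile, which buffers in memory
up to max_size and then rolls over to a real temporary file. The names
build_body and send, the 64 KB threshold, and the idea that each form
control yields a sequence of byte chunks are all assumptions made for
illustration, not anything that exists in the tree.

```python
import tempfile


def build_body(parts, max_in_memory=64 * 1024):
    """Write every chunk exactly once; return (length, readable buffer).

    `parts` is assumed to be an iterable of bytes objects, one or more
    per form control. The buffer stays in memory until it grows past
    `max_in_memory` bytes, then spills to a temporary file on disk.
    """
    buf = tempfile.SpooledTemporaryFile(max_size=max_in_memory)
    for chunk in parts:
        buf.write(chunk)
    length = buf.tell()  # number of bytes actually written
    buf.seek(0)          # rewind so the caller can read it back
    return length, buf


def send(parts, out, block_size=8192):
    """Single pass over `parts`, then header, then the buffered body."""
    length, buf = build_body(parts)
    with buf:
        out.write(b"Content-Length: %d\r\n\r\n" % length)
        while True:
            block = buf.read(block_size)
            if not block:
                break
            out.write(block)
    return length
```

With something along these lines, small forms never touch the disk, and
the Content-Length always matches the bytes that were actually buffered.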
> This means that we don't need to rely upon
> stat to get any input file size
What is the issue with "stat"?
> , and neither do we have a window between
> getting the file size and actually reading the file where the contents
> (and therefore size) could change. At the end of the first (and only)
> pass we'll then know the correct Content-Length and so we can output the
> Content-Length header and copy the contents of the temporary file.
>
> The down side to this approach is that we have the overhead of writing
> and then reading an extra temporary file.
If the main issue really is locking the file, maybe we should pursue
that instead. Copying a large data set could be very expensive.
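
If real locking turns out to be impractical, a cheaper middle ground might
be to at least detect a change instead of copying everything. The sketch
below (Python again, purely illustrative; copy_checked is a made-up name)
opens the file once, records its size and modification time from the open
handle, streams it, and then checks that nothing moved. It does not
prevent a concurrent write, it only lets us fail cleanly instead of
sending a wrong Content-Length.

```python
import os


def copy_checked(path, out, block_size=8192):
    """Stream `path` to `out`; raise OSError if the file changed meanwhile."""
    with open(path, "rb") as f:
        before = os.fstat(f.fileno())  # size we would put in the header
        sent = 0
        while True:
            block = f.read(block_size)
            if not block:
                break
            out.write(block)
            sent += len(block)
        after = os.fstat(f.fileno())   # same handle, so same file

    changed = (
        sent != before.st_size
        or after.st_size != before.st_size
        or after.st_mtime_ns != before.st_mtime_ns
    )
    if changed:
        raise OSError("file changed while it was being read: %s" % path)
    return sent
```

The check is only as good as the file system's timestamp resolution, so a
same-size rewrite within one tick could still slip through.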
> But the benefits are that we
> only make one pass through the form controls (which should make future
> maintenance easier), we'll always get any file sizes correct, and the
> resultant code should be smaller. I also hope to fix the error handling.
> I think the benefits outweigh the drawbacks.
>
> Thoughts? Comments?
>