Evgeny Kotkov <kot...@apache.org> writes:

> URL: http://svn.apache.org/viewvc?rev=1886490&view=rev
> Log:
> In the update editor, stream data to both pristine and working files.
...
> Several quick benchmarks:
>
>   - Checking out subversion/trunk over https://
>
>     Total time:  3.861 s  →  2.812 s
>        Read IO:  57322 KB  →  316 KB
>       Write IO:  455013 KB  →  359977 KB
>
>   - Checking out 4 large binary files (7.4 GB) over https://
>
>     Total time:   91.594 s   →  70.214 s
>        Read IO:   7798883 KB  →  19 KB
>       Write IO:   15598167 KB  →  15598005 KB

Hey everyone,

Here's an improvement I have been working on recently.

Apparently, the client has an (implicit) limit on the size of directories that
can be safely checked out over HTTP without hitting a timeout.  The problem is
that when the client installs the new working files, it does so in a separate
step.  This step happens per-directory and involves copying and possibly
translating the pristine contents into new working files.  While that happens,
nothing is read from the connection.  So the amount of work that can be done
without hitting a timeout is limited.

Assuming the default 60-second HTTP timeout of httpd 2.4.x and a relatively
fast disk, that puts the limit at around 6 GB for any single directory.
Not cool.
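
As a rough sanity check of that figure: the limit is simply the timeout
multiplied by how fast the install step can copy and translate files.  The
sketch below assumes an install throughput of about 100 MB/s, which is only
what "6 GB in 60 seconds" implies and not a measured value; the function and
constant names are made up for illustration.

```python
# Back-of-the-envelope estimate. The throughput is an assumption inferred
# from "around 6 GB within a 60 second timeout", not a measurement.
HTTP_TIMEOUT_S = 60             # default httpd 2.4.x timeout, per the text
INSTALL_THROUGHPUT_MB_S = 100   # assumed copy/translate rate on a fast disk


def max_directory_size_gb(timeout_s, throughput_mb_s):
    """Largest directory that can be installed before the idle
    connection times out, in GB (1 GB = 1000 MB)."""
    return timeout_s * throughput_mb_s / 1000


limit_gb = max_directory_size_gb(HTTP_TIMEOUT_S, INSTALL_THROUGHPUT_MB_S)
print(f"approximate per-directory limit: {limit_gb} GB")
```

A slower disk or a shorter server-side timeout lowers the limit
proportionally.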

My fix makes checkout stream data to both the pristine file and the
(projected) working file, so that the actual install then happens as
just an atomic rename.  Since we now never stop reading from the connection,
the timeouts should no longer be an issue.  The new approach also has several
nice properties: it does not have to re-read the pristine files, and it does
not interfere with network-level buffering, TCP slow start, and so on.
I see that it reduces the amount of both read and write I/O during all
checkouts, which should make checkouts somewhat faster overall.
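
To illustrate the idea (this is not the actual update editor code, which is
written in C), here is a minimal Python sketch of "write each incoming chunk
to two temporary files, then install by rename".  The function name, the
`translate` callback, and the flat SHA-1-named pristine layout are all
simplifying assumptions made for this example.

```python
import hashlib
import os
import tempfile


def stream_and_install(chunks, pristine_dir, working_path,
                       translate=lambda data: data):
    """Stream chunks to a pristine temp file and a working temp file in one
    pass, then install both with renames. Returns the pristine path."""
    os.makedirs(pristine_dir, exist_ok=True)
    working_dir = os.path.dirname(os.path.abspath(working_path))
    sha1 = hashlib.sha1()
    # Temp files live next to their final locations, so each rename stays
    # on one filesystem.
    p_fd, p_tmp = tempfile.mkstemp(dir=pristine_dir)
    w_fd, w_tmp = tempfile.mkstemp(dir=working_dir)
    try:
        with os.fdopen(p_fd, "wb") as pristine, \
                os.fdopen(w_fd, "wb") as working:
            for chunk in chunks:
                # The network data is consumed as it arrives; nothing is
                # re-read from disk later.
                sha1.update(chunk)
                pristine.write(chunk)
                # Hypothetical per-chunk translation. A real EOL or keyword
                # translation would need state across chunk boundaries.
                working.write(translate(chunk))
        pristine_path = os.path.join(pristine_dir, sha1.hexdigest())
        os.replace(p_tmp, pristine_path)   # install the pristine copy
        os.replace(w_tmp, working_path)    # install the working file
        return pristine_path
    except BaseException:
        # Do not leave half-written temp files behind on failure.
        for tmp in (p_tmp, w_tmp):
            if os.path.exists(tmp):
                os.remove(tmp)
        raise
```

The point of the sketch is that the loop over `chunks` is the only place
where file contents are written, so the install step reduces to two renames
and the connection is never left idle while files are copied.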

Note that this change only fixes "svn checkout", not "svn export".
Export uses a separate implementation of the delta editor.  It should
be possible to update it in a similar way, but I'm leaving that for
future work.


Thanks,
Evgeny Kotkov
