All,

Subversion contains some pretty high level code that's pretty inscrutable.  I 
keep asking myself how we would do <whatever> if we were writing in a higher 
level or more object-oriented language.

In the merge code we work with <url,rev> coordinates a lot of the time.  
Passing these as two separate variables works OK as long as there are only two 
and not used too many times.  In this code, however, we end up passing them 
around very many times.  We end up with code like this:
    SVN_ERR(svn_client__get_youngest_common_ancestor(
              NULL /* yc_relpath */, &yc_url, &yc_rev,
              source1_url,
              source1_rev,
              source2_url,
              source2_rev,
              ctx, scratch_pool));


Sometimes we have more than two variables because we also want the repository 
root URL, either to check it's the same repo as some other location or to 
derive the repo-root-relative path (or fspath).  It's reached the point where 
it's definitely worth having a struct that contains <url, rev, repo_root>.  Now 
we can write things like:

    SVN_ERR(svn_client__get_youngest_common_ancestor(
              &yca, source1_loc, source2_loc,
              ctx, scratch_pool));


It's not just a case of one parameter 
instead of two.  Packaging the two related components of the location 
together also helps us write code that correctly pairs them up, by making it 
obvious whenever we deliberately separate them.  In a lot of existing code we 
find the the two bits of data being separated and passed them 
through different code paths.  Often we end up implicitly or explicitly 
joining them up again in the end, and often we do this correctly, but often it 
is hard to follow and very hard to verify.  And every now and then of course we 
do it wrongly, and may not notice for a long time.


So, I introduced this structure in libsvn_client/merge.c recently:

/* A location in a repository. */
typedef struct repo_location_t
{
  url_uuid_t *repo;
  svn_revnum_t rev;
  const char *url;
} repo_location_t;


It's not ideal as it is.  The pointer to a url_uuid_t sub-structure is 
unnecessary at this stage 
(that is, unless and until we might make the url_uuid_t structure into 
some bigger "repository" abstraction); it would be simpler and therefore better 
right now to embed the repo root URL and UUID directly.  Also, suggestions for 
a better name are welcome -- preferably shorter, as it will have a 
'svn_client__' prefix.

For small bits of code, use of a structure such as this -- with its own 
'create' and 'dup' functions -- would be over-complex.  However, for the large 
amounts of code in relatively high level client-library functions, use of it 
makes a significant contribution to writability and readability.

I'm planning to:

  * Replace the 'repo' field with a simple 'const char *repo_root_url' (and 
maybe a uuid as well), thus simplifying initial allocation and the 'dup' 
function.

  * Extend the visibility to make it an "svn_client__" prefixed module-scope 
type in libsvn_client, and use it to simplify the interfaces of other private 
svn_client__ functions.

If it works well there, we can consider making it more global.


- Julian

Reply via email to