Re: svn commit: r983766 - /subversion/branches/performance/subversion/libsvn_client/export.c
On Mon, Aug 09, 2010 at 09:45:18PM +0300, Daniel Shahaf wrote: Also, this isn't really related to performance; it belongs on /trunk. Next time, you could send this with a [PATCH] marker in the subject line, and a full committer could +1 you to commit that to directly to /trunk. Yes, please send patches if you have a change that isn't direclty related to your performance improvements work. The scope of the branch is not stefan2 makes all of his commits there, it's this branch is for stefan2's performance-related work. Thanks, Stefan
Re: svn commit: r983766 - /subversion/branches/performance/subversion/libsvn_client/export.c
On Tue, Aug 10, 2010 at 12:56:19PM +0200, Stefan Sperling wrote: On Mon, Aug 09, 2010 at 09:45:18PM +0300, Daniel Shahaf wrote: Also, this isn't really related to performance; it belongs on /trunk. Next time, you could send this with a [PATCH] marker in the subject line, and a full committer could +1 you to commit that to directly to /trunk. Yes, please send patches if you have a change that isn't direclty related to your performance improvements work. The scope of the branch is not stefan2 makes all of his commits there, it's this branch is for stefan2's performance-related work. Oh, and there is a technical reason for this: Subversion is bad at dealing with cyclic merges (a.k.a. reflective merges). Because of this, it really matters where a change enters the tree. See the ASCII graph at the end of this post for an example where things go wrong because a change is made on a branch first and not on trunk: http://mail-archives.apache.org/mod_mbox/subversion-dev/201004.mbox/%3c20100406152724.gm19...@noel.stsp.name%3e (This example may not apply to our situation, but other examples can be contrived.) So, ideally, changes we may want to cherry-pick to other branches such as 1.6.x should enter our tree via trunk, and be merged to other branches from there -- it will enter your performance branch during a sync with trunk, and the 1.6.x branch via a cherry picking merge. Stefan
Re: svn commit: r983807 - in /subversion/branches/atomic-revprop/subversion: include/svn_error.h libsvn_subr/error.c
On Mon, 2010-08-09, danie...@apache.org wrote: +/** Return TRUE if @a err's chain contains the error code @a apr_err. + * + * @since New in 1.7. + */ +svn_boolean_t +svn_error_has_cause(svn_error_t *err, apr_status_t apr_err); This looks like it could be a useful API. I would expect to be able to call such a function on a NULL error pointer, and the result should be FALSE since no error doesn't have any cause that can be expressed in an apr_status_t. +svn_boolean_t +svn_error_has_cause(svn_error_t *err, apr_status_t apr_err) +{ + svn_error_t *child; + + if (! err ! apr_err) +/* The API doesn't specify the behaviour when ERR is NULL. */ +return TRUE; What's this block for? The behaviour I mentioned above would fall out from just removing this block. It looks like this is so that you can write svn_error_has_cause(err, APR_SUCCESS) and get TRUE if ERR is NULL or contains APR_SUCCESS anywhere in its chain, but is that really useful? - Julian + for (child = err; child; child = child-child) +if (child-apr_err == apr_err) + return TRUE; + + return FALSE; +}
Re: svn commit: r983807 - in /subversion/branches/atomic-revprop/subversion: include/svn_error.h libsvn_subr/error.c
(I intended to commit that to trunk) Julian Foad wrote on Tue, Aug 10, 2010 at 12:17:24 +0100: On Mon, 2010-08-09, danie...@apache.org wrote: +/** Return TRUE if @a err's chain contains the error code @a apr_err. + * + * @since New in 1.7. + */ +svn_boolean_t +svn_error_has_cause(svn_error_t *err, apr_status_t apr_err); This looks like it could be a useful API. I'll need it to dig beneath ra_svn's wrapping of server-generated errors by another error code. I would expect to be able to call such a function on a NULL error pointer, and the result should be FALSE since no error doesn't have any cause that can be expressed in an apr_status_t. Ahem, doesn't (apr_status_t)APR_SUCCESS mean no error? +svn_boolean_t +svn_error_has_cause(svn_error_t *err, apr_status_t apr_err) +{ + svn_error_t *child; + + if (! err ! apr_err) +/* The API doesn't specify the behaviour when ERR is NULL. */ +return TRUE; What's this block for? To make svn_error_has_cause(SVN_NO_ERROR, apr_err) return TRUE if and only if apr_err == APR_SUCCESS. The behaviour I mentioned above would fall out from just removing this block. It looks like this is so that you can write svn_error_has_cause(err, APR_SUCCESS) and get TRUE if ERR is NULL or contains APR_SUCCESS anywhere in its chain, but is that really useful? I didn't consider the option that APR_SUCCESS could occur as part of the chain. Good point. - Julian I can see several options: * forbid passing SVN_NO_ERROR * return FALSE on SVN_NO_ERROR * reture (apr_err == APR_SUCCESS ? TRUE : FALSE) on SVN_NO_ERROR Right now, the API does the first and the code the third. I suppose the question is (1) whether we expect the caller to test for SVN_NO_ERROR before calling this function, and (2) whether we expect people to pass APR_SUCCESS to this function. Personally my answers (while writing this) were: (1) yes (2) no. + for (child = err; child; child = child-child) +if (child-apr_err == apr_err) + return TRUE; + + return FALSE; +}
Re: svn commit: r983720 - in /subversion/trunk/subversion: libsvn_client/repos_diff.c tests/cmdline/diff_tests.py
On Mon, 2010-08-09, sbut...@apache.org wrote: Fix issue 2333 diff URL1 URL2 not reverse of diff URL2 URL1. When the repository reports a deleted directory, recursively walk the directory (in the repository) and report files as deleted. TODO: Handle non-infinite depth correctly. Review by: rhuijben stsp * subversion/libsvn_client/repos_diff.c (edit_baton): Add a boolean field to control whether this workaround should be used. Add a func and baton for cancellation. (diff_deleted_dir): New function. (delete_entry): Call diff_deleted_dir() if needed. (svn_client__get_diff_editor): Set the new edit_baton fields. * subversion/tests/cmdline/diff_tests.py (diff_multiple_reverse): Remove a comment that made this test the moral equivalent of an XFAIL. (diff_renamed_dir): Add more test cases. Correct the expectations for diffs within a moved directory. (test_list): Remove XFail from diff_renamed_dir. [...] +/* Recursively walk tree rooted at DIR (at REVISION) in the repository, + * reporting all files as deleted. Part of a workaround for issue 2333. + * + * DIR is a repository path relative to the URL in RA_SESSION. REVISION + * may be NULL, in which case it defaults to HEAD. EDIT_BATON is the Does REVISION really need to be able to default to HEAD? I wouldn't have thought so. (If so, that would be SVN_INVALID_REVNUM not NULL.) + * overall crawler editor baton. If CANCEL_FUNC is not NULL, then it + * should refer to a cancellation function (along with CANCEL_BATON). + */ +/* ### TODO: Handle depth. */ +static svn_error_t * +diff_deleted_dir(const char *dir, + svn_revnum_t revision, + svn_ra_session_t *ra_session, + void *edit_baton, + svn_cancel_func_t cancel_func, + void *cancel_baton, + apr_pool_t *pool) - Julian
Re: svn commit: r983807 - in /subversion/branches/atomic-revprop/subversion: include/svn_error.h libsvn_subr/error.c
On Tue, 2010-08-10 at 15:04 +0300, Daniel Shahaf wrote: (I intended to commit that to trunk) Julian Foad wrote on Tue, Aug 10, 2010 at 12:17:24 +0100: On Mon, 2010-08-09, danie...@apache.org wrote: +/** Return TRUE if @a err's chain contains the error code @a apr_err. + * + * @since New in 1.7. + */ +svn_boolean_t +svn_error_has_cause(svn_error_t *err, apr_status_t apr_err); This looks like it could be a useful API. I'll need it to dig beneath ra_svn's wrapping of server-generated errors by another error code. I would expect to be able to call such a function on a NULL error pointer, and the result should be FALSE since no error doesn't have any cause that can be expressed in an apr_status_t. Ahem, doesn't (apr_status_t)APR_SUCCESS mean no error? Yes but it's not really a cause of the error. +svn_boolean_t +svn_error_has_cause(svn_error_t *err, apr_status_t apr_err) +{ + svn_error_t *child; + + if (! err ! apr_err) +/* The API doesn't specify the behaviour when ERR is NULL. */ +return TRUE; What's this block for? To make svn_error_has_cause(SVN_NO_ERROR, apr_err) return TRUE if and only if apr_err == APR_SUCCESS. The behaviour I mentioned above would fall out from just removing this block. It looks like this is so that you can write svn_error_has_cause(err, APR_SUCCESS) and get TRUE if ERR is NULL or contains APR_SUCCESS anywhere in its chain, but is that really useful? I didn't consider the option that APR_SUCCESS could occur as part of the chain. Good point. - Julian I can see several options: * forbid passing SVN_NO_ERROR * return FALSE on SVN_NO_ERROR * reture (apr_err == APR_SUCCESS ? TRUE : FALSE) on SVN_NO_ERROR Right now, the API does the first and the code the third. I suppose the question is (1) whether we expect the caller to test for SVN_NO_ERROR before calling this function, and (2) whether we expect people to pass APR_SUCCESS to this function. Personally my answers (while writing this) were: (1) yes (2) no. Generally we have found that with error testing it's convenient to be able to write err = foo(); do_something_with(err); without having to check err is non-null first. I agree with (2) - I don't think it makes much sense to pass APR_SUCCESS to this function. - Julian + for (child = err; child; child = child-child) +if (child-apr_err == apr_err) + return TRUE; + + return FALSE; +}
Re: svn commit: r983807 - in /subversion/branches/atomic-revprop/subversion: include/svn_error.h libsvn_subr/error.c
Julian Foad wrote on Tue, Aug 10, 2010 at 13:48:20 +0100: On Tue, 2010-08-10 at 15:04 +0300, Daniel Shahaf wrote: I can see several options: * forbid passing SVN_NO_ERROR * return FALSE on SVN_NO_ERROR * reture (apr_err == APR_SUCCESS ? TRUE : FALSE) on SVN_NO_ERROR Right now, the API does the first and the code the third. I suppose the question is (1) whether we expect the caller to test for SVN_NO_ERROR before calling this function, and (2) whether we expect people to pass APR_SUCCESS to this function. Personally my answers (while writing this) were: (1) yes (2) no. Generally we have found that with error testing it's convenient to be able to write err = foo(); do_something_with(err); without having to check err is non-null first. Yep, and I just realized that one of my pending patches also missed the if (err) check. How about: [[[ Index: subversion/libsvn_subr/error.c === --- subversion/libsvn_subr/error.c (revision 983929) +++ subversion/libsvn_subr/error.c (working copy) @@ -274,9 +274,8 @@ { svn_error_t *child; - if (! err ! apr_err) -/* The API doesn't specify the behaviour when ERR is NULL. */ -return TRUE; + if (err == SVN_NO_ERROR) +return FALSE; for (child = err; child; child = child-child) if (child-apr_err == apr_err) Index: subversion/include/svn_error.h === --- subversion/include/svn_error.h (revision 983929) +++ subversion/include/svn_error.h (working copy) @@ -195,6 +195,8 @@ /** Return TRUE if @a err's chain contains the error code @a apr_err. * + * If @a err is #SVN_NO_ERROR, return FALSE. + * * @since New in 1.7. */ svn_boolean_t ]]] I agree with (2) - I don't think it makes much sense to pass APR_SUCCESS to this function. Of course, but I don't think we should bother to special-case/check for that.
Re: svn commit: r983766 - /subversion/branches/performance/subversion/libsvn_client/export.c
Stefan Sperling wrote on Tue, Aug 10, 2010 at 13:06:43 +0200: On Tue, Aug 10, 2010 at 12:56:19PM +0200, Stefan Sperling wrote: On Mon, Aug 09, 2010 at 09:45:18PM +0300, Daniel Shahaf wrote: Also, this isn't really related to performance; it belongs on /trunk. Next time, you could send this with a [PATCH] marker in the subject line, and a full committer could +1 you to commit that to directly to /trunk. Yes, please send patches if you have a change that isn't direclty related to your performance improvements work. The scope of the branch is not stefan2 makes all of his commits there, it's this branch is for stefan2's performance-related work. Oh, and there is a technical reason for this: Subversion is bad at dealing with cyclic merges (a.k.a. reflective merges). And as a work-around, you could apply the patches to trunk (without creating mergeinfo) and then merge --record-only them to the branch. Daniel (who just asked on IRC after realizing he accidentally committed to the atomic-revprops branch something that should have gone to trunk) Because of this, it really matters where a change enters the tree. See the ASCII graph at the end of this post for an example where things go wrong because a change is made on a branch first and not on trunk: http://mail-archives.apache.org/mod_mbox/subversion-dev/201004.mbox/%3c20100406152724.gm19...@noel.stsp.name%3e (This example may not apply to our situation, but other examples can be contrived.) So, ideally, changes we may want to cherry-pick to other branches such as 1.6.x should enter our tree via trunk, and be merged to other branches from there -- it will enter your performance branch during a sync with trunk, and the 1.6.x branch via a cherry picking merge. Stefan
Re: svn commit: r983766 - /subversion/branches/performance/subversion/libsvn_client/export.c
On Tue, Aug 10, 2010 at 5:56 AM, Stefan Sperling s...@elego.de wrote: On Mon, Aug 09, 2010 at 09:45:18PM +0300, Daniel Shahaf wrote: Also, this isn't really related to performance; it belongs on /trunk. Next time, you could send this with a [PATCH] marker in the subject line, and a full committer could +1 you to commit that to directly to /trunk. Yes, please send patches if you have a change that isn't direclty related to your performance improvements work. The scope of the branch is not stefan2 makes all of his commits there, it's this branch is for stefan2's performance-related work. By the way, please don't take all this as everybody is jumping on Stefan F. We are really grateful for the performance improvements going on on your branch and look forward to seeing them in Subversion. Your changes are just the first time this has happened in a while, and we're using them as an opportunity to do a bit of group re-education. So please don't feel singled out. :) (We will shortly be chastising Daniel S. for a similar transgression. :P ) -Hyrum
Re: svn commit: r983807 - in /subversion/branches/atomic-revprop/subversion: include/svn_error.h libsvn_subr/error.c
On Tue, 2010-08-10, Daniel Shahaf wrote: Julian Foad wrote on Tue, Aug 10, 2010 at 13:48:20 +0100: On Tue, 2010-08-10 at 15:04 +0300, Daniel Shahaf wrote: I can see several options: * forbid passing SVN_NO_ERROR * return FALSE on SVN_NO_ERROR * reture (apr_err == APR_SUCCESS ? TRUE : FALSE) on SVN_NO_ERROR Right now, the API does the first and the code the third. I suppose the question is (1) whether we expect the caller to test for SVN_NO_ERROR before calling this function, and (2) whether we expect people to pass APR_SUCCESS to this function. Personally my answers (while writing this) were: (1) yes (2) no. Generally we have found that with error testing it's convenient to be able to write err = foo(); do_something_with(err); without having to check err is non-null first. Yep, and I just realized that one of my pending patches also missed the if (err) check. How about: [[[ Index: subversion/libsvn_subr/error.c === --- subversion/libsvn_subr/error.c(revision 983929) +++ subversion/libsvn_subr/error.c(working copy) @@ -274,9 +274,8 @@ { svn_error_t *child; - if (! err ! apr_err) -/* The API doesn't specify the behaviour when ERR is NULL. */ -return TRUE; + if (err == SVN_NO_ERROR) +return FALSE; No need to implement this check as a special case: the general case loop below already does the same. for (child = err; child; child = child-child) if (child-apr_err == apr_err) Index: subversion/include/svn_error.h === --- subversion/include/svn_error.h(revision 983929) +++ subversion/include/svn_error.h(working copy) @@ -195,6 +195,8 @@ /** Return TRUE if @a err's chain contains the error code @a apr_err. * + * If @a err is #SVN_NO_ERROR, return FALSE. + * * @since New in 1.7. */ svn_boolean_t ]]] I agree with (2) - I don't think it makes much sense to pass APR_SUCCESS to this function. Of course, but I don't think we should bother to special-case/check for that. Agreed. - Julian
Upgrade to single-DB: problem locking an empty DB
Upgrading a WC to single-DB: upgrade_to_wcng() calls svn_wc__db_upgrade_begin() to create a new DB, and then svn_wc__db_wclock_obtain() and then svn_wc__write_upgraded_entries() The _wclock_obtain() fails because it checks that the node with relpath exists. Normally in libsvn_wc a new DB is created with svn_wc__db_init() which inserts a row for relpath , but svn_wc__db_upgrade_begin() doesn't. What's the best solution here? Not lock it? Have svn_wc__db_upgrade_begin() create an initial row? Have svn_wc__db_wclock_obtain() NOT check for existence of a row? The first and last options don't sound right. Creating an initial row does sound right, and requires (presumably) a modification of svn_wc__write_upgraded_entries(). - Julian
repository corruption by svnadmin upgrade
I've posted messages about the following problem in the users list, but here's other information. After a svnadmin upgrade with svn version 1.5.1 (r32289), which terminated without any error, and svn-populate-node-origins-index, the repository got corrupted: /svnroot/mpfr/db/format now contains: 3 layout linear without an ending \n, while the documentation[*] says: The format file is a single line of the form format number\n, followed by any number of lines specifying 'format options' - additional information about the filesystem's format. Each format option line is of the form option\n or option parameters\n. The current file hasn't been updated to the new format; it is still in the format 2 (with the old timestamp). According to the documentation: The format of the current file is: * Format 3 and above: a single line of the form youngest-revision\n giving the youngest revision for the repository. * Format 2 and below: a single line of the form youngest-revision next-node-id next-copy-id\n giving the youngest revision, the next unique node-ID, and the next unique copy-ID for the repository. [*] http://svn.apache.org/repos/asf/subversion/trunk/subversion/libsvn_fs_fs/structure The following files/directory have been created: -rw-r--r-- 1 www-data scm_mpfr 2 2010-08-06 16:01:43 txn-current -rw-r--r-- 1 www-data scm_mpfr 0 2010-08-06 16:01:43 txn-current-lock drwxr-sr-x 2 www-data scm_mpfr 4096 2010-08-06 16:01:43 txn-protorevs and they prevent any commit (due to the lock, I assume). The contents themselves are correct (tested with svnadmin verify and a dump comparison). More information about the repository: vlefe...@ff-scm-prod:~$ svnadmin lslocks /svnroot/mpfr vlefe...@ff-scm-prod:~$ svnadmin lstxns /svnroot/mpfr vlefe...@ff-scm-prod:~$ export TIME_STYLE=+%Y-%m-%d %H:%M:%S vlefe...@ff-scm-prod:~$ ls -l /svnroot/mpfr total 24 -rw-r--r-- 1 www-data scm_mpfr 379 2005-10-24 14:03:46 README.txt drwxrwsr-x 2 www-data scm_mpfr 4096 2005-10-24 14:03:46 conf drwxrwsr-x 2 www-data scm_mpfr 4096 2008-12-10 18:44:36 dav drwxrwsr-x 7 www-data scm_mpfr 4096 2010-08-06 16:01:43 db -r--r--r-- 1 www-data scm_mpfr2 2010-08-06 16:01:43 format drwxrwsr-x 2 www-data scm_mpfr 4096 2010-05-25 14:47:14 hooks drwxrwsr-x 2 www-data scm_mpfr 4096 2005-10-24 14:03:46 locks vlefe...@ff-scm-prod:~$ ls -l /svnroot/mpfr/db total 468 -rw-rw-r-- 1 www-data scm_mpfr 11 2010-08-05 02:26:05 current -r--r--r-- 1 www-data scm_mpfr 15 2010-08-06 16:01:43 format -rw-r--r-- 1 www-data scm_mpfr 5 2005-10-24 14:03:46 fs-type drwxrwsr-x 2 www-data scm_mpfr 4096 2010-08-06 16:02:20 node-origins drwxrwsr-x 2 www-data scm_mpfr 229376 2010-08-05 02:26:05 revprops drwxrwsr-x 2 www-data scm_mpfr 229376 2010-08-05 02:26:05 revs drwxrwsr-x 2 www-data scm_mpfr 4096 2010-08-05 02:26:05 transactions -rw-r--r-- 1 www-data scm_mpfr 2 2010-08-06 16:01:43 txn-current -rw-r--r-- 1 www-data scm_mpfr 0 2010-08-06 16:01:43 txn-current-lock drwxr-sr-x 2 www-data scm_mpfr 4096 2010-08-06 16:01:43 txn-protorevs -rw-r--r-- 1 www-data scm_mpfr 37 2005-10-24 14:03:46 uuid -rw-rw-r-- 1 www-data scm_mpfr 0 2005-10-24 14:03:46 write-lock vlefe...@ff-scm-prod:~$ ls -l /svnroot/mpfr/db/transactions total 0 vlefe...@ff-scm-prod:~$ ls -l /svnroot/mpfr/db/txn-protorevs total 0 vlefe...@ff-scm-prod:~$ ls -l /svnroot/mpfr/locks total 8 -rw-r--r-- 1 www-data scm_mpfr 295 2005-10-24 14:03:46 db-logs.lock -rw-r--r-- 1 www-data scm_mpfr 460 2005-10-24 14:03:46 db.lock vlefe...@ff-scm-prod:~$ cat /svnroot/mpfr/format 5 vlefe...@ff-scm-prod:~$ cat /svnroot/mpfr/db/current 7052 un bv vlefe...@ff-scm-prod:~$ cat /svnroot/mpfr/db/format; echo 3 layout linear vlefe...@ff-scm-prod:~$ cat /svnroot/mpfr/db/fs-type fsfs -- Vincent Lefèvre vinc...@vinc17.net - Web: http://www.vinc17.net/ 100% accessible validated (X)HTML - Blog: http://www.vinc17.net/blog/ Work: CR INRIA - computer arithmetic / Arénaire project (LIP, ENS-Lyon)
Re: Upgrade to single-DB: problem locking an empty DB
On Tue, 2010-08-10 at 15:31 +0100, Julian Foad wrote: Upgrading a WC to single-DB: upgrade_to_wcng() calls svn_wc__db_upgrade_begin() to create a new DB, and then svn_wc__db_wclock_obtain() and then svn_wc__write_upgraded_entries() The _wclock_obtain() fails because it checks that the node with relpath exists. Normally in libsvn_wc a new DB is created with svn_wc__db_init() which inserts a row for relpath , but svn_wc__db_upgrade_begin() doesn't. What's the best solution here? Not lock it? Have svn_wc__db_upgrade_begin() create an initial row? Have svn_wc__db_wclock_obtain() NOT check for existence of a row? The first and last options don't sound right. Creating an initial row does sound right, and requires (presumably) a modification of svn_wc__write_upgraded_entries(). The upgrade function is creating new DBs (or a new single-DB). Should it perhaps be asking for a lock on the DB as a whole, as distinct from a recursive lock on the WC root directory? In normal operation, those two mean the same, but maybe here we need to distinguish these as two different concepts. - Julian
Re: Upgrade to single-DB: problem locking an empty DB
On Tue, 2010-08-10 at 16:26 +0100, Julian Foad wrote: On Tue, 2010-08-10 at 15:31 +0100, Julian Foad wrote: Upgrading a WC to single-DB: upgrade_to_wcng() calls svn_wc__db_upgrade_begin() to create a new DB, and then svn_wc__db_wclock_obtain() and then svn_wc__write_upgraded_entries() The _wclock_obtain() fails because it checks that the node with relpath exists. Normally in libsvn_wc a new DB is created with svn_wc__db_init() which inserts a row for relpath , but svn_wc__db_upgrade_begin() doesn't. What's the best solution here? Not lock it? Have svn_wc__db_upgrade_begin() create an initial row? Have svn_wc__db_wclock_obtain() NOT check for existence of a row? The first and last options don't sound right. Creating an initial row does sound right, and requires (presumably) a modification of svn_wc__write_upgraded_entries(). The upgrade function is creating new DBs (or a new single-DB). Should it perhaps be asking for a lock on the DB as a whole, as distinct from a recursive lock on the WC root directory? In normal operation, those two mean the same, but maybe here we need to distinguish these as two different concepts. Or should it ... - read the fields of the 'this-dir' entry from 'entries', - create a DB with an initial row initialized from those fields, - lock that '' dir, - read+translate+write the rest of the 'entries' ? - Julian
RE: Bikeshed: configuration override order
Summary... There are two issues here... 1. The repo admin wants to enforce what is commited to their repo. This exists with scripts but common request can be made inherient repo settings (probably need to be path based). The issue with pre-commit hooks is that they are run after all data is transferred - so if I commit loads of data, and accidentally leave the banned buildlog.htm file in... I'll find out about it after half an hour's commit.. and have to repeat the process. It's not that major an issue but repo-set config would fix it. Auto-props is a more tricky problem that would just be nicely solved by telling the clients what settings to use. 2. The admins want to simplify config for clients committing to repos that have various enforcements made (see item 1) without needing to deal with distirbuting client config files or what have you. This is the biggie - one remote worker decided he didn't have to read all the email I sent him on how to set up svn, and ended up committing 2 Gb of crud to my lovely repository. We have a pre-commit hook for all file extensions now. Perhaps the ideal (and easiest) way to resolve all this is to have the repo inform the client of mandatory settings that it expects to be set, and perhaps optional ones, and either automatically download a new set of configs to the client, or have the client fail until its config is in compliance with the repo's configuration settings. A new option could be added to the client to download and use the config from the repo. This could be made automatic, every time the client runs a repo-contacting command, it gets a hash or guid from the server that describes the config, and if that doesn't match what the client has, the client can run the config-download in the background (and then re-try the command it was trying to do). Perhaps allow the client to store multiple config files and use the one that matches the repo, along with the default one that we currently have. The server config can then only set the bits it wants, missing config options are used from the default. There's scope in there for the client to download the repo's chosen settings, and then override them anyway. There's scope for the repo to mandate a few options, giving the client a choice of going to play elsewhere or accept the restrictions. Anyway, that's my thoughts on the subject. Repo-driven config, even if its just an easy way to download a new set to the client instead of the email 'use this file, or use this registry script' is a huge step forward. Andy
Re: buildbot failure in ASF Buildbot on svn-x64-centos gcc
What is up with these authz failures? It seems the buildbots have been failing miserably over the past couple weeks. Is nobody looking at and correcting the failures? On Tue, Aug 10, 2010 at 11:58, build...@apache.org wrote: The Buildbot has detected a new failure of svn-x64-centos gcc on ASF Buildbot. Full details are available at: http://ci.apache.org/builders/svn-x64-centos%20gcc/builds/898 Buildbot URL: http://ci.apache.org/ Buildslave for this Build: svn-x64-centos Build Reason: Build Source Stamp: [branch subversion/branches/1.6.x] 984005 Blamelist: hwright,rhuijben BUILD FAILED: failed Test fsfs+ra_neon sincerely, -The Buildbot
Re: Bikeshed: configuration override order
On 08/10/2010 12:15 PM, Julian Foad wrote: On Tue, 2010-08-10 at 17:12 +0100, Bolstridge, Andrew wrote: Summary... There are two issues here... 1. The repo admin wants to enforce what is commited to their repo. This exists with scripts but common request can be made inherient repo settings (probably need to be path based). The issue with pre-commit hooks is that they are run after all data is transferred - so if I commit loads of data, and accidentally leave the banned buildlog.htm file in... I'll find out about it after half an hour's commit.. and have to repeat the process. No - there are two different hooks, for precisely this reason. pre-commit runs before all the file contents are transferred, and start-commit after all the file contents are transferred. Er... no. 'start-commit' runs before any files are transferred. It can't be used to fail a commit based on any information that comes from the commit's tree delta content payload. 'pre-commit' runs just before the commit transaction is promoted into a revision, and can be used to block commits based on the content payload. But only after the whole payload is transmitted to the server. -- C. Michael Pilato cmpil...@collab.net CollabNet www.collab.net Distributed Development On Demand signature.asc Description: OpenPGP digital signature
Re: Bikeshed: configuration override order
On 08/10/2010 01:10 PM, C. Michael Pilato wrote: On 08/10/2010 12:15 PM, Julian Foad wrote: On Tue, 2010-08-10 at 17:12 +0100, Bolstridge, Andrew wrote: Summary... There are two issues here... 1. The repo admin wants to enforce what is commited to their repo. This exists with scripts but common request can be made inherient repo settings (probably need to be path based). The issue with pre-commit hooks is that they are run after all data is transferred - so if I commit loads of data, and accidentally leave the banned buildlog.htm file in... I'll find out about it after half an hour's commit.. and have to repeat the process. No - there are two different hooks, for precisely this reason. pre-commit runs before all the file contents are transferred, and start-commit after all the file contents are transferred. Er... no. 'start-commit' runs before any files are transferred. It can't be used to fail a commit based on any information that comes from the commit's tree delta content payload. 'pre-commit' runs just before the commit transaction is promoted into a revision, and can be used to block commits based on the content payload. But only after the whole payload is transmitted to the server. In other words, I think you were essentially correct about the two-hook system, Julian -- you just had the hooks switched. But regardless, the two-hook system still fails Andrew's particular enforcement requirements (which are quite common). -- C. Michael Pilato cmpil...@collab.net CollabNet www.collab.net Distributed Development On Demand signature.asc Description: OpenPGP digital signature
Re: buildbot failure in ASF Buildbot on svn-x64-centos gcc
The general ambivalence toward buildbot failures on the 1.6.x branch is due (at least for me) to the fact that some of the build slaves aren't configured to properly build the branch, so they *always* fail. However, I can reproduce the authz failure on the 1.6.x branch, which is a bit disconcerting. That will have to be fixed before any 1.6.13 release. The trunk failures have been cleaned up, and everything (bindings and all) currently passes there. -Hyrum On Tue, Aug 10, 2010 at 11:28 AM, Greg Stein gst...@gmail.com wrote: What is up with these authz failures? It seems the buildbots have been failing miserably over the past couple weeks. Is nobody looking at and correcting the failures? On Tue, Aug 10, 2010 at 11:58, build...@apache.org wrote: The Buildbot has detected a new failure of svn-x64-centos gcc on ASF Buildbot. Full details are available at: http://ci.apache.org/builders/svn-x64-centos%20gcc/builds/898 Buildbot URL: http://ci.apache.org/ Buildslave for this Build: svn-x64-centos Build Reason: Build Source Stamp: [branch subversion/branches/1.6.x] 984005 Blamelist: hwright,rhuijben BUILD FAILED: failed Test fsfs+ra_neon sincerely, -The Buildbot
RE: Bikeshed: configuration override order
Summary... There are two issues here... 1. The repo admin wants to enforce what is commited to their repo. This exists with scripts but common request can be made inherient repo settings (probably need to be path based). The issue with pre-commit hooks is that they are run after all data is transferred - so if I commit loads of data, and accidentally leave the banned buildlog.htm file in... I'll find out about it after half an hour's commit.. and have to repeat the process. It's not that major an issue but repo-set config would fix it. Auto-props is a more tricky problem that would just be nicely solved by telling the clients what settings to use. 2. The admins want to simplify config for clients committing to repos that have various enforcements made (see item 1) without needing to deal with distirbuting client config files or what have you. This is the biggie - one remote worker decided he didn't have to read all the email I sent him on how to set up svn, and ended up committing 2 Gb of crud to my lovely repository. We have a pre-commit hook for all file extensions now. Anyway, that's my thoughts on the subject. Repo-driven config, even if its just an easy way to download a new set to the client instead of the email 'use this file, or use this registry script' is a huge step forward. Andy This is why I am trying to say it needs to be a two layer configuration. The settings on the server are enforced before anything is committed to the repo. The settings are also sent to the client (in some way) yes, to avoid that 5 minute or whatever wait. However I think shooting for client enforcement and trusting the client isn't something svn should rely on. It is similar to writing a web app where you have client side validation/sanitation... but you still ALWAYS duplicate that validation/sanitation on the server... because you really have no way to trust that someone isn't posting bad data to your server/services using a client that isn't the one you provided. I think the same approach has to be taken here. The server side config is provided to the client as a convenience but it is always enforced on the server (without the need to write scripts) because you can't always trust the client to be using or respecting forced repository configuration. BOb
Re: Bikeshed: configuration override order
On 08/10/2010 01:25 PM, Julian Foad wrote: Oh? All I know about Andrew's particular requirements related to this query is what's quoted above - If I ... accidentally leave the banned buildlog.htm file in ... - which sounded vaguely like a requirement for a path-based rule. Maybe I missed something. I don't think you missed the requirement. You merely offered a non-solution. :-) Andrew said (essentially), I need path-based rule enforcement, but the one you offer -- the pre-commit hook -- has this nasty side effect of only kicking in after the entire commit payload is on the server. You replied, No - there are two different hooks, for precisely this reason. But Andrew can't benefit from that other hook, because it can't do path-based rule enforcement. -- C. Michael Pilato cmpil...@collab.net CollabNet www.collab.net Distributed Development On Demand signature.asc Description: OpenPGP digital signature
Re: Bikeshed: configuration override order
On Tue, Aug 10, 2010 at 06:25:15PM +0100, Julian Foad wrote: All I know about Andrew's particular requirements related to this query is what's quoted above - If I ... accidentally leave the banned buildlog.htm file in ... - which sounded vaguely like a requirement for a path-based rule. Maybe I missed something. The start-commit hook doesn't know what paths the client will touch during the commit. It only knows which user is making the commit, and the capabilities advertised by the client (like I can do merge tracking). It can be used to temporarily make an entire repository read-only, by rejecting any commit. But it can't make decisions based on the content of a commit. Stefan
Re: Bikeshed: configuration override order
On 08/07/2010 12:18 PM, Greg Hudson wrote: My thinking about repository configuration is that the uses cases should be divided into two categories: 1. Stuff that isn't really client configuration at all, like auto-props, should come from the repository instead, and should only come from client configuration for compatibility. 2. Stuff that is client configuration should only come from client configuration. Client control is not legitimate business for an open source product, though it could be the business of a proprietary value-added fork. Are you saying that client control isn't legitimate because of some fluffy open-source principle that checks that right at the door, or because of the practical impossibility due to the fact that the client source code is available for modification and recompilation? The foremost bit of client configuration that CollabNet's Subversion customers are demanding (besides auto-props, which I think we all agree on) is a way for the server to set a policy which dictates that clients may not use plaintext or other insecure password storage mechanisms. I claim it's ridiculous to propose that we need a custom fork of Subversion, Subclipse, TortoiseSVN, AnkhSVN, etc. -- all just to bang in this feature. What, then, do you propose? Do you use server-side configuration to demand, say, client certs (which you still can't guarantee are passphrase-protected, right?)? Do you rip insecure password storage out of Subversion altogether? -- C. Michael Pilato cmpil...@collab.net CollabNet www.collab.net Distributed Development On Demand signature.asc Description: OpenPGP digital signature
Re: RFC: How should revert handle copied/added items?
On Tue, Aug 10, 2010 at 12:18 AM, Greg Stein gst...@gmail.com wrote: Greg and Peter, Thanks for your views on this topic. A few comments/questions below. On Mon, Aug 9, 2010 at 12:54, Paul Burba ptbu...@gmail.com wrote: We have several issues related to the question of how revert should handle locally added or copied items: 'svn revert of svn move' http://subversion.tigris.org/issues/show_bug.cgi?id=876 'svn revert should provide a way to delete copied files' http://subversion.tigris.org/issues/show_bug.cgi?id=3101 'Case only renames resulting from merges don't work or break the WC on case-insensitive file systems' http://subversion.tigris.org/issues/show_bug.cgi?id=3115 With only one exception that I know of[1], revert leaves local additions of every stripe as unversioned. My first question is simple: Do we consider this the correct behavior? I haven't read the above bugs, but have come to a solid view of how 'svn revert' should behave. When stopping to think about the *representation* of changes within the working copy, it also (imo) leads to a better understanding of the operations performed and what a revert of those operations can mean. I'll try to provide a brief description of my thoughts, then answer your questions based on that, and then summarize any gaps. Please ask for more detail, or to argue interpretation! There are *two* semantic concepts behind revert. The first is revert my structural changes (e.g add/move/copy/delete). The second is revert my content changes (e.g. text edits and prop changes). These two concepts are conflated today; we make no distinction between them, and provide no UI or API mechanism to distinguish *what* you would like to revert. Content reverts are simple and straight-forward, and can be applied to any node [which has content changes]. Structural changes *cannot* be applied to any node. If you copy a (sub)tree to a new location, then you cannot revert a *child* of that copy. Without stretching the brain along some inhuman path, it just has no meaning. The child arrived as a result of a parent copy. You can *delete* or *replace* the child using further operations, but an implied-copy of a child can only be reverted by reverting the ancestor which is the root of the copied-here operation. For example, $ svn cp A/B C $ svn revert C/D/file That should error. $ svn revert C Should succeed, and undo the copy that was made. Since we can't revert the root of an added subtree by itself, I assume that if the root of your revert target is a copy, that '-R' is implied? $ edit C/D/file $ svn revert C/D/file Should succeed; reverting just local post-copy edits. What about this: $ svn revert -R C/D Should that revert C/D/file's post-copy edit or produce the same error as the attempt to revert C/D/file above? If the former then we have a situation where two consecutive reverts are needed to fully revert (i.e. one to revert the edit, one to revert the copy). Not that this is inherently bad, just a bit unusual. But if we implement --discard-local-edits there is always the option to do it in one fell swoop. I favor this option, assuming we implement --discard-local-edits or something similar. If the latter, then it gets a bit cumbersome to revert multiple text edits but keep the copy; you'd need to explicitly specify each path to be reverted. There are UI issues with allowing the second to succeed if local mods are present, yet fail if not. I relegate this to UI foo. Conceptually, the second revert should be able to succeed. Here is the more interesting scenario: $ svn cp file1 file2 $ edit file2 $ svn revert file2 If we implement the earlier behavior where 'svn cp A/B C, edit C/D/file, svn revert C/D/file' reverts just local post-copy edits to C/D/file, I think that is what we should do here too. This is where the two semantics come in. Do you want to revert the content edits? Or do you want to revert the copy? Questions and UI and API abound :-) A) Yes! svn revert only reverts the scheduled addition, but leaves the item behind as unversioned. In this case the one exception [1] becomes a bug which we can fix. For a schedule-add... yes, this is safest. This content may not exist anywhere else, so it should remain. If you're using the term addition to also mean add with history, then I'll smack you and say you should use the term copy. A revert of a copy has restrictions as noted above, and if it also has local changes, then leaving it behind may be important. Swing away, I was using addition to cover both copies and schedule-adds. But I did so knowingly, because today svn revert treats both the same way: The scheduled add or add with history is reverted and the unversioned item is left on disk. The only exception I know of to this is what prompted me to write this thread in the first place: merge of case-only remanes --
RE: Bikeshed: configuration override order
On 08/07/2010 12:18 PM, Greg Hudson wrote: My thinking about repository configuration is that the uses cases should be divided into two categories: 1. Stuff that isn't really client configuration at all, like auto-props, should come from the repository instead, and should only come from client configuration for compatibility. 2. Stuff that is client configuration should only come from client configuration. Client control is not legitimate business for an open source product, though it could be the business of a proprietary value-added fork. Are you saying that client control isn't legitimate because of some fluffy open-source principle that checks that right at the door, or because of the practical impossibility due to the fact that the client source code is available for modification and recompilation? The latter. I see no problem having it baked into the client. But, the fact that it is open source makes it easy for someone to get a client that doesn't respect repository settings so it isn't practical to list it as a [trustable/secure/enterprise level] feature unless perhaps svn was creating signed binaries and a repo could be set to only work with such a signed binary. Imagine the feature list: o Repository/Server enforced client settings assuming a client that supports this feature is used. I'm not sure that will work. But if you said something like: o Repository/Server declared client settings enforced at the server is possible without the need for script hooks. (It is not possible to enforce client side password storage, blah blah) I think the later is more practical. I am certainly no open source purist. The foremost bit of client configuration that CollabNet's Subversion customers are demanding (besides auto-props, which I think we all agree on) is a way for the server to set a policy which dictates that clients may not use plaintext or other insecure password storage mechanisms. I claim it's ridiculous to propose that we need a custom fork of Subversion, Subclipse, TortoiseSVN, AnkhSVN, etc. -- all just to bang in this feature. I agree with you. See above. What, then, do you propose? Do you use server-side configuration to demand, say, client certs (which you still can't guarantee are passphrase- protected, right?)? I think my point here is that you can't guarantee anything on the client side. Do you rip insecure password storage out of Subversion altogether? Perhaps, although I'm still not sure that solves the problem. There are several tools like multi-clip board apps and such that will still allow a user from doing stupid stuff. Also, hasn't the svn team taken the stance that it's a version control system, the best a malicious client can do it commit stuff... no data in the repo can be lost. Granted, that is a fall back. If we are strictly talking about security here, which we aren't, I think there are also ways to secure access to the repository outside of svn security like VPNs, network security, etc. BOb
Re: Bikeshed: configuration override order
On Tue, 2010-08-10 at 14:24 -0400, C. Michael Pilato wrote: The foremost bit of client configuration that CollabNet's Subversion customers are demanding (besides auto-props, which I think we all agree on) is a way for the server to set a policy which dictates that clients may not use plaintext or other insecure password storage mechanisms. I don't expect anyone to consider my opinion blocking, but I think this is a questionable area for any kind of software to delve into. I've only seen this kind of client control in one other context (a branded Jabber client), and never in an open source project. (*) Lots and lots of clients are able to remember passwords: web browsers, email clients, IM clients. Lots of central IT organizations (MIT's included) don't like this feature and recommend that users not use it. Lots of users do it anyway. I don't know of a single piece of widely-used client software which allows the server to turn off password memory. (*) Actually, on consideration, there was some flap about the okay to print flag in PDF documents, or something related to that. I can't remember how it turned out.
Re: RFC: How should revert handle copied/added items?
On Tue, Aug 10, 2010 at 14:41, Paul Burba ptbu...@gmail.com wrote: On Tue, Aug 10, 2010 at 12:18 AM, Greg Stein gst...@gmail.com wrote: ... For example, $ svn cp A/B C $ svn revert C/D/file That should error. $ svn revert C Should succeed, and undo the copy that was made. Since we can't revert the root of an added subtree by itself, I assume that if the root of your revert target is a copy, that '-R' is implied? I generally don't think at the cmdline level :-P Reverting the root of a copy should imply -R, I think. But we also defined revert as non-recursive by default (iirc). There is also great potential for further edits under there (structural and content) which the user might have forgotten. An implied -R could blast (say) some local-deletes, adds, and content changes. When the user puts a -R on the cmdline, then we assume they have checked the subtree and are willing to revert it all. Having the cmdline tool say To revert C (a copy of repos/p...@569), please specify -R. It could even warn about edits to children. Dunno. $ edit C/D/file $ svn revert C/D/file Should succeed; reverting just local post-copy edits. What about this: $ svn revert -R C/D Should that revert C/D/file's post-copy edit or produce the same error as the attempt to revert C/D/file above? You could go two ways on this, I think: 1) revert C/D/file, then error on C/D 2) error without doing anything If the former then we have a situation where two consecutive reverts are needed to fully revert (i.e. one to revert the edit, one to revert the copy). Not that this is inherently bad, just a bit unusual. But if we implement --discard-local-edits there is always the option to do it in one fell swoop. I favor this option, assuming we implement --discard-local-edits or something similar. If the latter, then it gets a bit cumbersome to revert multiple text edits but keep the copy; you'd need to explicitly specify each path to be reverted. Right. And --discard-local-edits was proffered as an example of content edits. Don't forget this scenario: $ svn cp A/B C $ svn rm C/D/file $ svn revert -R C This is just a different layer of operation over the copy. Another structural edit, instead of a content edit. I think it should succeed, and remove content from the disk. There is no potential data loss, since this was just a copy. If there was a local-add under there, then it gets more troublesome. Maybe remove all copied files, but leave behind the locally-added file and enough parent directories? There are UI issues with allowing the second to succeed if local mods are present, yet fail if not. I relegate this to UI foo. Conceptually, the second revert should be able to succeed. Here is the more interesting scenario: $ svn cp file1 file2 $ edit file2 $ svn revert file2 If we implement the earlier behavior where 'svn cp A/B C, edit C/D/file, svn revert C/D/file' reverts just local post-copy edits to C/D/file, I think that is what we should do here too. *shrug* ... seems reasonable. This is where the two semantics come in. Do you want to revert the content edits? Or do you want to revert the copy? Questions and UI and API abound :-) A) Yes! svn revert only reverts the scheduled addition, but leaves the item behind as unversioned. In this case the one exception [1] becomes a bug which we can fix. For a schedule-add... yes, this is safest. This content may not exist anywhere else, so it should remain. If you're using the term addition to also mean add with history, then I'll smack you and say you should use the term copy. A revert of a copy has restrictions as noted above, and if it also has local changes, then leaving it behind may be important. Swing away, I was using addition to cover both copies and schedule-adds. But I did so knowingly, because today svn revert treats both the same way: The scheduled add or add with history is reverted and the unversioned item is left on disk. The only exception Yeah. Well, we're trying to use better definitions because it is this exact imprecision which has caused us so many problems. Adds and copies are very different, and it helps us to reason much better about revert's behavior. ... A directory-copy with post-copy propchanges cannot be left behind. It becomes unversioned, meaning we have nowhere to leave the propchanges behind. It may be that users first revert the propchanges, *then* they revert the copy. Dunno. Again, I think the two revert approach is fine. When you make a copy then edit that copy before it is committed I think of this as layering changes atop one another. Yup. And further structural changes (add/delete, another copy, etc) are other types of layers. And that is also how we are going to store these things in wc.db: multiple layers of operations with one current view of the tree. svn revert will take the cautious approach and revert the top layer of changes first (i.e. the prop/text
Re: NODE_DATA (2nd iteration)
I'll work on reviewing this stuff. I believe there are quite a few details that need to be worked out (like exact presence values). Cheers, -g On Tue, Aug 10, 2010 at 12:18, Julian Foad julian.f...@wandisco.com wrote: Any responses would be greatly appreciated. - Julian On Tue, 2010-08-03, Julian Foad wrote: On Mon, 2010-07-12, Erik Huelsmann wrote: After lots of discussion regarding the way NODE_DATA/4th tree should be working, I'm now ready to post a summary of the progress. In my last e-mail (http://svn.haxx.se/dev/archive-2010-07/0262.shtml) I stated why we need this; this post is about the conclusion of what needs to happen. Also included are the first steps there. With the advent of NODE_DATA, we distinguish node values specifically related to BASE nodes, those specifically related to current WORKING nodes and those which are to be maintained for multiple levels of WORKING nodes (not only the current view) (the latter category is most often also shared with BASE). The respective tables will hold the columns shown below. - TABLE WORKING_NODE ( wc_id INTEGER NOT NULL REFERENCES WCROOT (id), local_relpath TEXT NOT NULL, parent_relpath TEXT, moved_here INTEGER, moved_to TEXT, original_repos_id INTEGER REFERENCES REPOSITORY (id), original_repos_path TEXT, original_revnum INTEGER, translated_size INTEGER, last_mod_time INTEGER, /* an APR date/time (usec since 1970) */ keep_local INTEGER, PRIMARY KEY (wc_id, local_relpath) ); CREATE INDEX I_WORKING_PARENT ON WORKING_NODE (wc_id, parent_relpath); The moved_* and original_* columns are typical examples of WORKING fields only maintained for the visible WORKING nodes: the original_* and moved_* fields are inherited from the operation root by all children part of the operation. The operation root will be the visible change on its own level, meaning it'll have rows both in the WORKING_NODE and NODE_DATA tables. The fact that these columns are not in the WORKING_NODE table means that tree changes are not preserved accros overlapping changes. This is fully compatible with what we do today: changes to higher levels destroy changes to lower levels. The translated_size and last_mod_time columns exist in WORKING_NODE and BASE_NODE; they explicitly don't exist in NODE_DATA. The fact that they exist in BASE_NODE is a bit of a hack: it's to prevent creation of WORKING_NODE data for every file which has keyword expansion or eol translation properties set: these columns serve only to optimize working copy scanning for changes and as such only relate to the visible WORKING_NODEs. Can we come up with an English description of what each table will now represent? The BASE_NODE table lists the existing node-revs in the repository that comprise the mixed-revision tree that was most recently updated/switched to or checked out. (The kind and content of these nodes is not here; see the NODE_DATA table.) TABLE BASE_NODE ( wc_id INTEGER NOT NULL REFERENCES WCROOT (id), local_relpath TEXT NOT NULL, repos_id INTEGER REFERENCES REPOSITORY (id), repos_relpath TEXT, We need a revision number column here to go along with repos_id and relpath to make a valid node-rev reference, don't we? parent_relpath TEXT, (While we're reorganising, can we move that parent_relpath column to adjacent to local_relpath?) translated_size INTEGER, last_mod_time INTEGER, /* an APR date/time (usec since 1970) */ dav_cache BLOB, incomplete_children INTEGER, file_external TEXT, PRIMARY KEY (wc_id, local_relpath) ); The NODE_DATA table records the kind and shallow content (props, text, link target) of each node in the WC. It includes both the nodes that comprise the currently 'visible' (or 'actual' or 'on-disk') state of the WC and also all nodes that are part of a copied or moved tree but currently shadowed by a replacement performed inside that tree. At least one row exists for each WC path, including paths with no change and all paths affected by a tree change (add, delete, etc.). If the same path is affected by multiple levels of tree change - a replacement inside a copied directory, for example - then multiple rows exist with different 'op_depth' values. TABLE NODE_DATA ( wc_id INTEGER NOT NULL REFERENCES WCROOT (id), local_relpath TEXT NOT NULL, op_depth INTEGER NOT NULL, presence TEXT NOT NULL, kind TEXT NOT NULL, checksum TEXT, changed_rev INTEGER, changed_date INTEGER, /* an APR date/time (usec since 1970) */ changed_author TEXT, The changed_* columns can only belong to a node-rev that exists in the repository. What node-rev do they belong to and why aren't they alongside the node-rev details? (The changed_* columns convey essentially a rev number
Bug in svn_fs_paths_changed2() Python bindings?
Hi all, Looks like the binding for svn_fs_paths_changed2() incorrectly specifies the type of structures contained in the hash it returns: the following code s = fs.paths_changed2(rev_root, pool) for i in s: sys.stderr.write(%s = %s\n % (i, repr(s))) indicates that bindings assume the hash to contain svn_fs_path_change_t structures, not svn_fs_path_change2_t as it should (it's the difference between fs.paths_changed() and fs.paths_changed2(), actually). The attached patch fixes this issue. Regards, Alexey. Index: subversion/include/svn_fs.h === --- subversion/include/svn_fs.h (revision 980930) +++ subversion/include/svn_fs.h (working copy) @@ -1169,7 +1169,7 @@ * @since New in 1.6. */ svn_error_t * -svn_fs_paths_changed2(apr_hash_t **changed_paths_p, +svn_fs_paths_changed2(apr_hash_t **changed_paths_p2, svn_fs_root_t *root, apr_pool_t *pool); Index: subversion/bindings/swig/svn_fs.i === --- subversion/bindings/swig/svn_fs.i (revision 980930) +++ subversion/bindings/swig/svn_fs.i (working copy) @@ -62,6 +62,7 @@ %hash_argout_typemap(entries_p, svn_fs_dirent_t *) %hash_argout_typemap(changed_paths_p, svn_fs_path_change_t *) +%hash_argout_typemap(changed_paths_p2, svn_fs_path_change2_t *) #ifndef SWIGPERL %callback_typemap(svn_fs_get_locks_callback_t get_locks_func,
Re: svn commit: r983807 - in /subversion/branches/atomic-revprop/subversion: include/svn_error.h libsvn_subr/error.c
Julian Foad wrote on Tue, Aug 10, 2010 at 14:33:24 +0100: On Tue, 2010-08-10, Daniel Shahaf wrote: +++ subversion/libsvn_subr/error.c (working copy) @@ -274,9 +274,8 @@ { svn_error_t *child; - if (! err ! apr_err) -/* The API doesn't specify the behaviour when ERR is NULL. */ -return TRUE; + if (err == SVN_NO_ERROR) +return FALSE; No need to implement this check as a special case: the general case loop below already does the same. Okay. Committed in r984264. Thanks, Daniel
Re: repository corruption by svnadmin upgrade
Vincent Lefevre wrote on Tue, Aug 10, 2010 at 17:21:50 +0200: I've posted messages about the following problem in the users list, but here's other information. After a svnadmin upgrade with svn version 1.5.1 (r32289), which terminated without any error, and svn-populate-node-origins-index, the repository got corrupted: /svnroot/mpfr/db/format now contains: 3 layout linear without an ending \n Fixed in https://svn.apache.org/viewvc/subversion/trunk/subversion/libsvn_fs_fs/fs_fs.c?r1=874539r2=874540; The fix is also in 1.6.x/.
Re: svnrdump: The BIG update
Ramkumar Ramachandra wrote on Tue, Aug 10, 2010 at 19:32:34 +0530: Hi, I've been putting this off for some time now- it's so much easier to write code than to write English :p Anyway, here it is- a massive status update. Thanks for the update. It's been a few weeks since I got partial committer access, and ~80 commits later, this is what we have: Firstly, thanks to Daniel for motivating me and driving me to submit the series to the list, and guiding me through everything. Without him, I'd probably not have finished svnrdump to begin with. The command line interface and argument parsing library is ready- thanks to Bert and lots of others for getting me started with this. The interface is solid and looks like the one used in the other SVN tools. The dump functionality is also complete- thanks to Stefan's review and MANY others for cleaning it up. It's however hit a brick wall now because of missing headers in the RA layer. Until I (or someone else) figures out how to fix the RA layer, we can't do better than the XFail copy-and-modify test I've committed. Part of the diff there is lack of SHA-1 headers --- which is unavoidable until editor is revved --- but part of it is a missing Text-copy-source-md5. Why don't you output that information --- doesn't the editor give it to you? Nitpick: svnrdump_tests 5 6 have the same textual description / docstring as each other, could you please change that? See other test files (e.g., ./commit_tests.py --list) for plenty of examples. It's quite mature and dumps surprisingly fast though. I'm tempted to run benchmarks, but I haven't done it yet because I fear I might be biased towards the tool :p Just write all the benchmarks before running them? The load functionality is also quite complete, thanks to Bert et al for helping me debug all the cryptic errors. The code is mostly unreviewed though- there might be plenty of bugs and code cleanup opportunities. Not to say that I've stopped working on it- just that the work has become less challenging, now that all the tests pass :) Okay, good. Some field testing probably needed here? TODO: - Write more tests and start using svnrdump for real! Advertise it, especially to developers of other versioning systems looking to communicate with SVN. Remember how this project started out? Don't forget to inform us...@subversion.apache.org :-) - More optimizations. Since svnrdump is already so fast compared to the other tools, I think we can squeeze some more speed out of it. - Huge documentation effort. svnrdump is a hack- I just did what I felt like and got it to work somehow. It's very unlike svnmucc, which does things by the book. - Build more infrastructure around svnrdump- I've mostly used existing SVN API. Although a lot of new functions were suggested, I never really got down to writing them. Yep. There was also talk of moving some of the logic into the libraries --- where does that stand? - Make dumpfile v3 the de-facto standard and improve it for optimized loading/ generation. The former part was suggested by Stefan. - Integrate it into svnadmin etc as appropriate. I think there's enough work here for a mini-GSoC project? How would it be integrated into svnadmin? Do you want to push the logic into the standard 'svnadmin dump' command? - GitHub support (?) -- I saw this discussed on IRC somewhere, but I didn't understand this myself. Can someone clarify? Joke. GitHub implemented a mod_dav_svn interface to their repositories [1], so it's now possible (if their implementation is sound) to generate an svn dump of a GitHub git repository. [1] http://github.com/blog/626-announcing-svn-support [1] `svn info http://svn.github.com/artagnon/svnrdump.git` -- Ram
Re: Looking to improve performance of svn annotate
Hi all, Other priorities have unfortunately kept me from focusing on the project of speeding up blame. But recently I've spent some time thinking about it, reading the other mail threads, studying the code and profiling a little bit. I hope I can still do something useful for blame, whether it be sooner or later. I will first give some updates, info and discussion, and end up with a couple of questions. Any input is greatly appreciated. A lot of this stuff is still very fuzzy to me :-). First, some profiling info (After sprinkling some printf's and some time-counters into blame.c): - Most of the client-side blame-time (~90%) is spent on producing the diffs (more specifically, the call to svn_diff_file_diff_2 in the add_file_blame function in blame.c). - Applying the deltas to produce full-texts in the first place accounts for ~5%. - Time spent on producing the blame info (inserting/deleting blame chunks) is negligible ( 0.5%). Coming back with this information to the threads that Mark Phippard suggested: On Mon, Mar 22, 2010 at 11:08 PM, Mark Phippard markp...@gmail.com wrote: If you haven't, you should review some of the old threads on speeding up blame. Dan Berlin made a lot of improvements in the SVN 1.2 timeframe and learned a lot about what does NOT speed it up. http://svn.haxx.se/dev/archive-2005-02/0275.shtml This thread mainly ended up focusing on the delta combiner, with the conclusion of replacing vdelta with xdelta. I think this was a very good optimization of which we now reap the benefits. If I understand my profiling numbers correctly, the delta combining stuff is no longer the biggest bottleneck (at least not on the client-side). There is however another interesting point/question from this thread, see below. You should really search around for all of the threads he commented on to get the complete picture. I think he also provided ideas on what more could potentially be done in the future. Such as this one that I do not recall was ever done. http://svn.haxx.se/dev/archive-2007-09/0459.shtml Maybe the thread will explain why. The thread didn't really explain why it wasn't implemented, but the poster said he had little time to work on it further. The suggestions were improvements in the processing of the blame info (optimizations in iterating through the linked list of blame chunks, and discussion about the most optimal data structure to represent these blame chunks). The thread sort of ended with the question whether this would be a significant optimization: On Tue, Jan 11, 2005, Mark Benedetto King wrote: I'd also be interested in profiling output from your 1750-rev file to see how much time is spent in the blame_* functions themselves vs the svn_diff_* functions, so that we can at least give a nod towards Amdahl's Law before we run off optimizing the wrong thing. From my profiling data, it seems optimizing those blame_* functions would indeed not be significant. Most of the time is spent in the svn_diff_* functions. Which brings us back to the idea of the binary blame, and Julian's description of the algorithm. Now, some questions: 1) Dan Berlin's thread prompted an interesting response from Branko Čibej (http://svn.haxx.se/dev/archive-2005-02/0270.shtml): If by diff format you mean the binary delta, then... There was a time when I thought this would be possible. Now I'm not so sure. The trouble is that the vdelta doesn't generate an edit stream, it generates a compressed block-copy stream. Which means that can never be 100% sure, just from looking at the delta, which bytes in the target are really new and which are just (offset) copies from the source. The only blocks you can really be sure about are those that are represented by new data in the delta (either NEW blocks or target copies that take data from NEW blocks). This problem is made worse by our use of skip deltas in (at least) the BDB back-end. I agree that it would be nice if the server could generate some sort of byte-range oriented info, but I don't think it can be done just by looking at the deltas. It's sad, I know... What? So this idea was considered before, but rejected? Is this argument still valid (e.g. with xdelta i.s.o. vdelta)? Is there no way around that, does it mean this simply cannot work? Maybe Branko can comment on that? An example or some more detailed explanation would be very welcome. 2) A concrete problem with the algorithm as described by Julian would be the delete of bytes within a line. E.g. suppose one deletes a single character in r60 (oops, a boolean test was inverted, let's delete that stray '!'). That deleted character wouldn't be represented in the binary blame structure (it's simply cut out), and it's obviously not present in the final text. But the line on which it occurred was now last edited in r60. Current blame (diff-based) has no problem with this situation. It correctly shows r60 as the last revision for that line. But the binary
Re: Bug in svn_fs_paths_changed2() Python bindings?
Small update to the patch: change doxygen comment to match new argument name. Regards, Alexey. On Tuesday, August 10, 2010 02:53:50 pm Alexey Neyman wrote: Hi all, Looks like the binding for svn_fs_paths_changed2() incorrectly specifies the type of structures contained in the hash it returns: the following code s = fs.paths_changed2(rev_root, pool) for i in s: sys.stderr.write(%s = %s\n % (i, repr(s))) indicates that bindings assume the hash to contain svn_fs_path_change_t structures, not svn_fs_path_change2_t as it should (it's the difference between fs.paths_changed() and fs.paths_changed2(), actually). The attached patch fixes this issue. Regards, Alexey. Index: subversion/include/svn_fs.h === --- subversion/include/svn_fs.h (revision 980930) +++ subversion/include/svn_fs.h (working copy) @@ -1154,7 +1154,7 @@ /** Determine what has changed under a @a root. * - * Allocate and return a hash @a *changed_paths_p containing descriptions + * Allocate and return a hash @a *changed_paths_p2 containing descriptions * of the paths changed under @a root. The hash is keyed with * ttconst char */tt paths, and has #svn_fs_path_change2_t * values. * @@ -1169,7 +1169,7 @@ * @since New in 1.6. */ svn_error_t * -svn_fs_paths_changed2(apr_hash_t **changed_paths_p, +svn_fs_paths_changed2(apr_hash_t **changed_paths_p2, svn_fs_root_t *root, apr_pool_t *pool); Index: subversion/bindings/swig/svn_fs.i === --- subversion/bindings/swig/svn_fs.i (revision 980930) +++ subversion/bindings/swig/svn_fs.i (working copy) @@ -62,6 +62,7 @@ %hash_argout_typemap(entries_p, svn_fs_dirent_t *) %hash_argout_typemap(changed_paths_p, svn_fs_path_change_t *) +%hash_argout_typemap(changed_paths_p2, svn_fs_path_change2_t *) #ifndef SWIGPERL %callback_typemap(svn_fs_get_locks_callback_t get_locks_func,
[atomic-revprops] status update
For the last few weeks I've been working on the atomic-revprop branch. (Its goal is port the new svn_fs_change_rev_prop2() to libsvn_ra, which should allow callers to specify both a value and an optional previous value and have the revprop change atomic; see [1].) Currently, the API is implemented over all RA layers, and the test (prop 34) passes. To extend svnsync to use the new API for acquiring locks, [1] http://thread.gmane.org/gmane.comp.version-control.subversion.devel/120330/focus=120416 Index: subversion/include/svn_ra.h === --- subversion/include/svn_ra.h (revision 983929) +++ subversion/include/svn_ra.h (working copy) @@ -730,6 +730,8 @@ svn_ra_get_dated_revision(svn_ra_session_t *sessio * Use @a pool for memory allocation. * * @since New in 1.7. + * + * ### need to guarantee specific error code */ svn_error_t * svn_ra_change_rev_prop2(svn_ra_session_t *session, Index: subversion/svnsync/main.c === --- subversion/svnsync/main.c (revision 983929) +++ subversion/svnsync/main.c (working copy) @@ -284,18 +284,41 @@ check_lib_versions(void) } +/* Does ERR mean the current value of the revprop isn't equal to + the *OLD_VALUE_P you gave me? + */ +static svn_boolean_t is_atomicity_error(svn_error_t *err) +{ + svn_error_t *child; + /* ### Hack until ra_dav learns to return the proper error code. */ + for (child = err; child; child = child-child) +if (!strstr(child-message, +revprop ' SVNSYNC_PROP_LOCK ' has unexpected value in +filesystem)) + return TRUE; + + return svn_error_has_cause(err, SVN_ERR_BAD_PROPERTY_VALUE /* ### rename */); +} + /* Acquire a lock (of sorts) on the repository associated with the * given RA SESSION. */ static svn_error_t * -get_lock(svn_ra_session_t *session, apr_pool_t *pool) +get_lock(const svn_string_t **lock_string_p, + svn_ra_session_t *session, + apr_pool_t *pool) { char hostname_str[APRMAXHOSTLEN + 1] = { 0 }; svn_string_t *mylocktoken, *reposlocktoken; apr_status_t apr_err; + svn_boolean_t be_atomic; apr_pool_t *subpool; int i; + SVN_ERR(svn_ra_has_capability(session, be_atomic, +SVN_RA_CAPABILITY_ATOMIC_REVPROPS, +pool)); + apr_err = apr_gethostname(hostname_str, sizeof(hostname_str), pool); if (apr_err) return svn_error_wrap_apr(apr_err, _(Can't get local hostname)); @@ -303,6 +326,9 @@ static svn_error_t * mylocktoken = svn_string_createf(pool, %s:%s, hostname_str, svn_uuid_generate(pool)); + /* If we succeed, this is what the property will be set to. */ + *lock_string_p = mylocktoken; + subpool = svn_pool_create(pool); #define SVNSYNC_LOCK_RETRIES 10 @@ -331,9 +357,23 @@ static svn_error_t * } else if (i SVNSYNC_LOCK_RETRIES - 1) { + const svn_string_t *unset = NULL; + svn_error_t *err; + /* Except in the very last iteration, try to set the lock. */ - SVN_ERR(svn_ra_change_rev_prop(session, 0, SVNSYNC_PROP_LOCK, - mylocktoken, subpool)); + err = svn_ra_change_rev_prop2(session, 0, SVNSYNC_PROP_LOCK, +be_atomic ? unset : NULL, +mylocktoken, subpool); + + if (be_atomic err is_atomicity_error(err)) +/* Someone else has the lock. Let's loop. */ +svn_error_clear(err); + else if (be_atomic err == SVN_NO_ERROR) +/* We have the lock. */ +return SVN_NO_ERROR; + else +/* Genuine error, or we aren't atomic and need to loop. */ +SVN_ERR(err); } } @@ -381,13 +421,23 @@ with_locked(svn_ra_session_t *session, apr_pool_t *pool) { svn_error_t *err, *err2; + const svn_string_t *lock_string; - SVN_ERR(get_lock(session, pool)); + SVN_ERR(get_lock(lock_string, session, pool)); err = func(session, baton, pool); - err2 = svn_ra_change_rev_prop(session, 0, SVNSYNC_PROP_LOCK, NULL, pool); + SVN_ERR(svn_ra_change_rev_prop2(session, 0, SVNSYNC_PROP_LOCK, + NULL, svn_string_create(bogus lock token, pool), + pool)); + err2 = svn_ra_change_rev_prop2(session, 0, SVNSYNC_PROP_LOCK, + lock_string, NULL, pool); + if (is_atomicity_error(err2)) +err2 = svn_error_quick_wrap(err2, +_(svnsync lock was stolen; can't remove it)); + + return svn_error_compose_create(err, svn_error_return(err2)); }
Re: [atomic-revprops] status update
(hitted send too soon) Daniel Shahaf wrote on Wed, Aug 11, 2010 at 03:40:51 +0300: For the last few weeks I've been working on the atomic-revprop branch. (Its goal is port the new svn_fs_change_rev_prop2() to libsvn_ra, which should allow callers to specify both a value and an optional previous value and have the revprop change atomic; see [1].) Currently, the API is implemented over all RA layers, and the test (prop 34) passes. To extend svnsync to use the new API for acquiring locks, To extend svnsync to use the new API for acquiring locks, I'd like to make the API guarantee which error code it returns in case the atomic expectation didn't match the actual value (if any) found. This requires some work on ra_dav client/server to make them pass the error code on the wire (like ra_svn already does) --- per earlier thread this week [2]. Until then, I have a working patch for svnsync, which --- until said ra_dav work is implemented --- relies on checking the error message (that's what I have available) rather than the error code. That patch is attached to the previous mail in this thread. (I'll commit that after the check the error message hack is removed.) That's all for now I suppose. A roadmap of the work can be found in BRANCH-README [3]. Daniel (who maintains an offline list of points he should test/review) [1] http://thread.gmane.org/gmane.comp.version-control.subversion.devel/120330/focus=120416 [2] http://svn.haxx.se/dev/archive-2010-08/0225.shtml [3] http://svn.apache.org/repos/asf/subversion/branches/atomic-revprop/BRANCH-README
Re: [atomic-revprops] status update
On Tue, Aug 10, 2010 at 7:51 PM, Daniel Shahaf d...@daniel.shahaf.name wrote: (hitted send too soon) Daniel Shahaf wrote on Wed, Aug 11, 2010 at 03:40:51 +0300: For the last few weeks I've been working on the atomic-revprop branch. (Its goal is port the new svn_fs_change_rev_prop2() to libsvn_ra, which should allow callers to specify both a value and an optional previous value and have the revprop change atomic; see [1].) Currently, the API is implemented over all RA layers, and the test (prop 34) passes. To extend svnsync to use the new API for acquiring locks, To extend svnsync to use the new API for acquiring locks, I'd like to make the API guarantee which error code it returns in case the atomic expectation didn't match the actual value (if any) found. This requires some work on ra_dav client/server to make them pass the error code on the wire (like ra_svn already does) --- per earlier thread this week [2]. Until then, I have a working patch for svnsync, which --- until said ra_dav work is implemented --- relies on checking the error message (that's what I have available) rather than the error code. That patch is attached to the previous mail in this thread. (I'll commit that after the check the error message hack is removed.) That's all for now I suppose. A roadmap of the work can be found in BRANCH-README [3]. Do you think the work is sufficiently non-disruptive to be moved back to trunk? -Hyrum
Re: [atomic-revprops] status update
Hyrum K. Wright wrote on Tue, Aug 10, 2010 at 20:10:54 -0500: Do you think the work is sufficiently non-disruptive to be moved back to trunk? Yes. Updated plan: I'll do the marshal error codes in ra_dav work on trunk, re-test/review the status of the ra_dav part of the branch, and merge the branch to trunk then. Thanks for the reminder. Daniel
NOTICE: Testing over ra-dav has changed in trunk!!
Per the commit's log message: On 08/10/2010 09:38 PM, cmpil...@apache.org wrote: Author: cmpilato Date: Wed Aug 11 01:38:40 2010 New Revision: 984280 [...] NOTE: The test suite changes brought in by this merge require modifications to developers' httpd.conf files. So if you see the redirect_tests.py all failing, make sure you've updated your Apache configuration per the instructions in the subversion/tests/cmdline/README file. Enjoy! -- C. Michael Pilato cmpil...@collab.net CollabNet www.collab.net Distributed Development On Demand
Re: svn commit: r984280 - in /subversion/trunk: ./ subversion/include/ subversion/libsvn_client/ subversion/libsvn_ra/ subversion/libsvn_ra_local/ subversion/libsvn_ra_neon/ subversion/libsvn_ra_ser
On Tue, Aug 10, 2010 at 8:38 PM, cmpil...@apache.org wrote: Author: cmpilato Date: Wed Aug 11 01:38:40 2010 New Revision: 984280 URL: http://svn.apache.org/viewvc?rev=984280view=rev Log: Reintegrate the issue-2779-dev branch. Permanent redirect responses from an HTTP server now cause the client -- in particular circumstances -- to automatically follow the redirect. Cool! NOTE: The test suite changes brought in by this merge require modifications to developers' httpd.conf files. So if you see the redirect_tests.py all failing, make sure you've updated your Apache configuration per the instructions in the subversion/tests/cmdline/README file. Added: subversion/trunk/BRANCH-README Did you mean to bring this over with the merge? ... -Hyrum
Re: [atomic-revprops] status update
On Tue, Aug 10, 2010 at 8:28 PM, Daniel Shahaf d...@daniel.shahaf.name wrote: Hyrum K. Wright wrote on Tue, Aug 10, 2010 at 20:10:54 -0500: Do you think the work is sufficiently non-disruptive to be moved back to trunk? Yes. Updated plan: I'll do the marshal error codes in ra_dav work on trunk, re-test/review the status of the ra_dav part of the branch, and merge the branch to trunk then. Thanks for the reminder. No problem. I was actually alluding to the work as a whole, rather than just the mashall-error-codes bits. If you think you're done monkeying with protocols and APIs to the point of causes impediments to others' work, just go ahead and merge the sucker, says I. -Hyrum