Re: svn commit: r997203 - in /subversion/trunk: ./ subversion/include/ subversion/libsvn_client/ subversion/libsvn_diff/ subversion/libsvn_fs_fs/ subversion/libsvn_ra_svn/ subversion/libsvn_repos/ sub
On Wed, Sep 15, 2010 at 06:52:07AM -, hwri...@apache.org wrote: == --- subversion/trunk/subversion/include/svn_string.h (original) +++ subversion/trunk/subversion/include/svn_string.h Wed Sep 15 06:52:06 2010 @@ -253,6 +253,15 @@ svn_stringbuf_chop(svn_stringbuf_t *str, void svn_stringbuf_fillchar(svn_stringbuf_t *str, unsigned char c); +/** Append a single character @a byte onto @a targetstr. + * + * reallocs if necessary. @a targetstr is affected, nothing else is. + * @since New in 1.7. + */ +void +svn_stringbuf_appendbyte(svn_stringbuf_t *targetstr, + char byte); + The docstring should list advantages svn_stringbuf_appendbyte(buf, c) has over svn_stringbuf_appendbytes(buf, c, 1). We need to understand where the performance benefits really come from. IIRC the benefit was dependent on the optimizer in the compiler to some extent, is this correct? Stefan
Re: svn commit: r995603 - /subversion/branches/performance/subversion/libsvn_subr/svn_string.c
On Fri, Sep 10, 2010 at 03:32:14PM +0200, Stefan Sperling wrote: On Thu, Sep 09, 2010 at 10:58:06PM -, stef...@apache.org wrote: Author: stefan2 Date: Thu Sep 9 22:58:06 2010 New Revision: 995603 URL: http://svn.apache.org/viewvc?rev=995603view=rev Log: Try really hard to help the compiler generate good code for this frequently called function. * subversion/libsvn_subr/svn_string.c (svn_stringbuf_appendbyte): reformulate Modified: subversion/branches/performance/subversion/libsvn_subr/svn_string.c Modified: subversion/branches/performance/subversion/libsvn_subr/svn_string.c URL: http://svn.apache.org/viewvc/subversion/branches/performance/subversion/libsvn_subr/svn_string.c?rev=995603r1=995602r2=995603view=diff == --- subversion/branches/performance/subversion/libsvn_subr/svn_string.c (original) +++ subversion/branches/performance/subversion/libsvn_subr/svn_string.c Thu Sep 9 22:58:06 2010 @@ -403,8 +403,11 @@ svn_stringbuf_appendbyte(svn_stringbuf_t apr_size_t old_len = str-len; if (str-blocksize old_len + 1) { - str-data[old_len] = byte; - str-data[old_len+1] = '\0'; + char *dest = str-data; + + dest[old_len] = byte; + dest[old_len+1] = '\0'; + str-len = old_len+1; } else @@ -412,7 +415,8 @@ svn_stringbuf_appendbyte(svn_stringbuf_t /* we need to re-allocate the string buffer * - let the more generic implementation take care of that part */ - svn_stringbuf_appendbytes(str, byte, 1); + char b = byte; + svn_stringbuf_appendbytes(str, b, 1); } } If you're expecting this function to stay written as it now, you need to add a comment explaining why it is laid out the way it is. Otherwise people might change it, and the resulting generated code will be less optimal again. That said, I have my doubts whether tweaking C code in order to tune the output of certain compilers in certain versions is a useful thing to do. Will this still work as expected when new compiler versions are released? It seems you'd rather want to fix the optimizer of the compiler. This has been merged into trunk now, so I'm bumping this thread to repeat my question above. Stefan
Re: svn commit: r997203 - in /subversion/trunk: ./ subversion/include/ subversion/libsvn_client/ subversion/libsvn_diff/ subversion/libsvn_fs_fs/ subversion/libsvn_ra_svn/ subversion/libsvn_repos/ sub
On Wed, Sep 15, 2010 at 9:45 AM, Stefan Sperling s...@elego.de wrote: On Wed, Sep 15, 2010 at 06:52:07AM -, hwri...@apache.org wrote: == --- subversion/trunk/subversion/include/svn_string.h (original) +++ subversion/trunk/subversion/include/svn_string.h Wed Sep 15 06:52:06 2010 @@ -253,6 +253,15 @@ svn_stringbuf_chop(svn_stringbuf_t *str, void svn_stringbuf_fillchar(svn_stringbuf_t *str, unsigned char c); +/** Append a single character @a byte onto @a targetstr. + * + * reallocs if necessary. @a targetstr is affected, nothing else is. + * @since New in 1.7. + */ +void +svn_stringbuf_appendbyte(svn_stringbuf_t *targetstr, + char byte); + The docstring should list advantages svn_stringbuf_appendbyte(buf, c) has over svn_stringbuf_appendbytes(buf, c, 1). We need to understand where the performance benefits really come from. IIRC the benefit was dependent on the optimizer in the compiler to some extent, is this correct? I don't think that's completely correct, but I'll leave it to Stefan F. to determine that. Since this function has now been merged to trunk, updates to the doc string should be done on trunk. But I feel comfortable allowing Stefan to make these docstring edits and commit without prior review, instead of requiring him to mail the patch to the list. -Hyrum
Re: svn commit: r995603 - /subversion/branches/performance/subversion/libsvn_subr/svn_string.c
On 15.09.2010 09:46, Stefan Sperling wrote: On Fri, Sep 10, 2010 at 03:32:14PM +0200, Stefan Sperling wrote: On Thu, Sep 09, 2010 at 10:58:06PM -, stef...@apache.org wrote: Author: stefan2 Date: Thu Sep 9 22:58:06 2010 New Revision: 995603 URL: http://svn.apache.org/viewvc?rev=995603view=rev Log: Try really hard to help the compiler generate good code for this frequently called function. * subversion/libsvn_subr/svn_string.c (svn_stringbuf_appendbyte): reformulate Modified: subversion/branches/performance/subversion/libsvn_subr/svn_string.c Modified: subversion/branches/performance/subversion/libsvn_subr/svn_string.c URL: http://svn.apache.org/viewvc/subversion/branches/performance/subversion/libsvn_subr/svn_string.c?rev=995603r1=995602r2=995603view=diff == --- subversion/branches/performance/subversion/libsvn_subr/svn_string.c (original) +++ subversion/branches/performance/subversion/libsvn_subr/svn_string.c Thu Sep 9 22:58:06 2010 @@ -403,8 +403,11 @@ svn_stringbuf_appendbyte(svn_stringbuf_t apr_size_t old_len = str-len; if (str-blocksize old_len + 1) { - str-data[old_len] = byte; - str-data[old_len+1] = '\0'; + char *dest = str-data; + + dest[old_len] = byte; + dest[old_len+1] = '\0'; + str-len = old_len+1; } else @@ -412,7 +415,8 @@ svn_stringbuf_appendbyte(svn_stringbuf_t /* we need to re-allocate the string buffer * - let the more generic implementation take care of that part */ - svn_stringbuf_appendbytes(str, byte, 1); + char b = byte; + svn_stringbuf_appendbytes(str, b, 1); } } If you're expecting this function to stay written as it now, you need to add a comment explaining why it is laid out the way it is. Otherwise people might change it, and the resulting generated code will be less optimal again. That said, I have my doubts whether tweaking C code in order to tune the output of certain compilers in certain versions is a useful thing to do. Will this still work as expected when new compiler versions are released? It seems you'd rather want to fix the optimizer of the compiler. This has been merged into trunk now, so I'm bumping this thread to repeat my question above. The two optimizations shown above are indeed superfluous in C, since there are no function calls with possible side effects between creating the temp var and using it. A good compiler will do them itself, a great compiler will look for possible global optimizations and might do who knows what -- might even create a specialized inline for the second case. It would be different if the results were used after the function call, since we cannot use restrict because of our standards compatibility. But as it is ... which compiler is this supposed to tickle to generate better code? Whether something gets passed in a register in a call depends on the platform ABI and inlining, not on the presence of an extra local variable (which may just get optimized out anyway). -- Brane
Public API documentation
I've been thinking about ways that our API documentation could be made a bit cleaner. Currently, we use a lot of prose to describe APIs, their parameters, and other nuances. I find this approach rather unique, and in some ways obfuscating. We have great documentation in that we cover a lot of ground, but I wonder if we can be more consistent and clear. For instance, look at the description of svn_client_checkout3(): http://people.apache.org/~hwright/svn/doc/api/trunk/group__clnt__wc__checkout.html#ga31e87d0db53bf1158eaceb6552c63bae It's is really difficult to determine what each parameter does or what possible errors are returned, without reading the entire docstring. Maybe that's intended, but perhaps a strategy which enumerates the parameters would be a bit better. I've mocked up with this may look like: http://people.apache.org/~hwright/svn/doc/api/temp/group__clnt__wc__checkout.html#ga31e87d0db53bf1158eaceb6552c63bae You'll note that there isn't a complex discussion of client contexts, depth, or peg revisions, but rather links to a common location where such concepts can be discussed in-depth. I've also tried to detect the various errors which may be returned, and enumerate them specifically, rather than buried in the prose description of the API. I think this strategy will lead to more consistent and maintainable documentation across the project (and easier identifiable deviations from standard behavior). Thoughts? -Hyrum
RE: svn commit: r997026 - /subversion/trunk/subversion/mod_dav_svn/repos.c
-Original Message- From: cmpil...@apache.org [mailto:cmpil...@apache.org] Sent: dinsdag 14 september 2010 20:09 To: comm...@subversion.apache.org Subject: svn commit: r997026 - /subversion/trunk/subversion/mod_dav_svn/repos.c Author: cmpilato Date: Tue Sep 14 18:08:59 2010 New Revision: 997026 URL: http://svn.apache.org/viewvc?rev=997026view=rev Log: For issue #3709 (Inconsistency between svn list and svn checkout), teach the mod_dav_svn tree walker logic to authorize access to the paths it touches. Is it possible to port this back to 1.6.x? Bert
RE: svn commit: r997237 - in /subversion/trunk/subversion/bindings/javahl/native: ClientContext.cpp ClientContext.h SVNClient.cpp
-Original Message- From: hwri...@apache.org [mailto:hwri...@apache.org] Sent: woensdag 15 september 2010 11:18 To: comm...@subversion.apache.org Subject: svn commit: r997237 - in /subversion/trunk/subversion/bindings/javahl/native: ClientContext.cpp ClientContext.h SVNClient.cpp Author: hwright Date: Wed Sep 15 09:17:34 2010 New Revision: 997237 URL: http://svn.apache.org/viewvc?rev=997237view=rev Log: Revert r997228. Apparently, some of the improved error handling added in that revision wasn't to friendly with the tests. It will require further digging, but I'm reverting in the intrim. Modified: subversion/trunk/subversion/bindings/javahl/native/ClientContext.cpp subversion/trunk/subversion/bindings/javahl/native/ClientContext.h subversion/trunk/subversion/bindings/javahl/native/SVNClient.cpp Modified: subversion/trunk/subversion/bindings/javahl/native/ClientContext.cpp URL: http://svn.apache.org/viewvc/subversion/trunk/subversion/bindings/javah l/native/ClientContext.cpp?rev=997237r1=997236r2=997237view=diff === === --- subversion/trunk/subversion/bindings/javahl/native/ClientContext.cpp (original) +++ subversion/trunk/subversion/bindings/javahl/native/ClientContext.cpp Wed Sep 15 09:17:34 2010 @@ -97,20 +97,20 @@ ClientContext::~ClientContext() env-DeleteGlobalRef(m_jctx); } -svn_error_t * -ClientContext::getContext(svn_client_ctx_t **ctx, CommitMessage *message) +svn_client_ctx_t * +ClientContext::getContext(CommitMessage *message) { SVN::Pool *requestPool = JNIUtil::getRequestPool(); apr_pool_t *pool = requestPool-pool(); svn_auth_baton_t *ab; - -*ctx = persistentCtx; +svn_client_ctx_t *ctx = persistentCtx; +//SVN_JNI_ERR(svn_client_create_context(ctx, pool), NULL); const char *configDir = m_configDir.c_str(); if (m_configDir.length() == 0) configDir = NULL; -SVN_ERR(svn_config_get_config(((*ctx)-config), configDir, pool)); -svn_config_t *config = (svn_config_t *) apr_hash_get((*ctx)- config, +SVN_JNI_ERR(svn_config_get_config((ctx-config), configDir, pool), NULL); +svn_config_t *config = (svn_config_t *) apr_hash_get(ctx-config, SVN_CONFIG_CATEGORY_CONFIG, APR_HASH_KEY_STRING); @@ -118,8 +118,10 @@ ClientContext::getContext(svn_client_ctx apr_array_header_t *providers; /* Populate the registered providers with the platform-specific providers */ - SVN_ERR(svn_auth_get_platform_specific_client_providers(providers, -config, pool)); + SVN_JNI_ERR(svn_auth_get_platform_specific_client_providers(providers, + config, +pool), +NULL); This doesn't look right (in the old and new code). If you use a client session multiple time, you shouldn't recreate the auth baton on every invocation, but only if there is a reason to recreate. The auth baton also handles some caching for the auth providers. (At least it stores the used passwords that weren't saved to disk, but I think it also stores some context state for a few providers) Bert
Re: Public API documentation
Hyrum K. Wright wrote: I've been thinking about ways that our API documentation could be made a bit cleaner. Currently, we use a lot of prose to describe APIs, their parameters, and other nuances. I find this approach rather unique, and in some ways obfuscating. We have great documentation in that we cover a lot of ground, but I wonder if we can be more consistent and clear. For instance, look at the description of svn_client_checkout3(): http://people.apache.org/~hwright/svn/doc/api/trunk/group__clnt__wc__checkout.html#ga31e87d0db53bf1158eaceb6552c63bae It's is really difficult to determine what each parameter does or what possible errors are returned, without reading the entire docstring. Maybe that's intended, but perhaps a strategy which enumerates the parameters would be a bit better. I've mocked up with this may look like: http://people.apache.org/~hwright/svn/doc/api/temp/group__clnt__wc__checkout.html#ga31e87d0db53bf1158eaceb6552c63bae You'll note that there isn't a complex discussion of client contexts, depth, or peg revisions, but rather links to a common location where such concepts can be discussed in-depth. I've also tried to detect the various errors which may be returned, and enumerate them specifically, rather than buried in the prose description of the API. I think this strategy will lead to more consistent and maintainable documentation across the project (and easier identifiable deviations from standard behavior). Thoughts? -Hyrum +1. You know what, I've been a bit skeptical of the list-of-params style of documentation, as sometimes I've seen documentation that says @param revision - the revision number; @param path - the path, without saying the path of what, what kind of path, relative to what. But your mock-up is very good. I particularly like that it refers to the canonical definitions of depth, client_ctx (saying which parts of that are relevant) and operative/peg revisions, instead of trying to explain them yet again in yet another way. I also like the way we can easily check whether the correct set of parameters is documented. In the past we have often changed function parameters (when revving the function, or just renaming existing params) and have failed to notice that the doc string no longer refers to the real params. This style won't fit all functions - for example, I don't think it would be useful to rewrite the doc string of svn_client_checkout2() which is currently: Similar to svn_client_checkout3() but with allow_unver_obstructions always set to FALSE, and depth set according to recurse: if recurse is TRUE, depth is svn_depth_infinity, if recurse is FALSE, depth is svn_depth_files. So +1 on starting to use this style and on committing your svn_client_checkout3() doc string to start the ball rolling. - Julian
Re: Public API documentation
On Wed, Sep 15, 2010 at 11:35 AM, Julian Foad julian.f...@wandisco.com wrote: Hyrum K. Wright wrote: I've been thinking about ways that our API documentation could be made a bit cleaner. Currently, we use a lot of prose to describe APIs, their parameters, and other nuances. I find this approach rather unique, and in some ways obfuscating. We have great documentation in that we cover a lot of ground, but I wonder if we can be more consistent and clear. For instance, look at the description of svn_client_checkout3(): http://people.apache.org/~hwright/svn/doc/api/trunk/group__clnt__wc__checkout.html#ga31e87d0db53bf1158eaceb6552c63bae It's is really difficult to determine what each parameter does or what possible errors are returned, without reading the entire docstring. Maybe that's intended, but perhaps a strategy which enumerates the parameters would be a bit better. I've mocked up with this may look like: http://people.apache.org/~hwright/svn/doc/api/temp/group__clnt__wc__checkout.html#ga31e87d0db53bf1158eaceb6552c63bae You'll note that there isn't a complex discussion of client contexts, depth, or peg revisions, but rather links to a common location where such concepts can be discussed in-depth. I've also tried to detect the various errors which may be returned, and enumerate them specifically, rather than buried in the prose description of the API. I think this strategy will lead to more consistent and maintainable documentation across the project (and easier identifiable deviations from standard behavior). Thoughts? -Hyrum +1. You know what, I've been a bit skeptical of the list-of-params style of documentation, as sometimes I've seen documentation that says @param revision - the revision number; @param path - the path, without saying the path of what, what kind of path, relative to what. But your mock-up is very good. I particularly like that it refers to the canonical definitions of depth, client_ctx (saying which parts of that are relevant) and operative/peg revisions, instead of trying to explain them yet again in yet another way. I also like the way we can easily check whether the correct set of parameters is documented. In the past we have often changed function parameters (when revving the function, or just renaming existing params) and have failed to notice that the doc string no longer refers to the real params. And we get the added benefit of allowing doxygen to help us by identifying undocumented or documented-but-not-existent parameters. This style won't fit all functions - for example, I don't think it would be useful to rewrite the doc string of svn_client_checkout2() which is currently: Similar to svn_client_checkout3() but with allow_unver_obstructions always set to FALSE, and depth set according to recurse: if recurse is TRUE, depth is svn_depth_infinity, if recurse is FALSE, depth is svn_depth_files. Agreed. That is essentially inherit the docstring for checkout3(), but with the following delta. So +1 on starting to use this style and on committing your svn_client_checkout3() doc string to start the ball rolling. Thanks. Though I'll probably wait a day or two, to let any dissenters make themselves known. :) -Hyrum
Re: svn commit: r997237 - in /subversion/trunk/subversion/bindings/javahl/native: ClientContext.cpp ClientContext.h SVNClient.cpp
On Wed, Sep 15, 2010 at 11:34 AM, Bert Huijben b...@qqmail.nl wrote: -Original Message- From: hwri...@apache.org [mailto:hwri...@apache.org] Sent: woensdag 15 september 2010 11:18 To: comm...@subversion.apache.org Subject: svn commit: r997237 - in /subversion/trunk/subversion/bindings/javahl/native: ClientContext.cpp ClientContext.h SVNClient.cpp Author: hwright Date: Wed Sep 15 09:17:34 2010 New Revision: 997237 URL: http://svn.apache.org/viewvc?rev=997237view=rev Log: Revert r997228. Apparently, some of the improved error handling added in that revision wasn't to friendly with the tests. It will require further digging, but I'm reverting in the intrim. Modified: subversion/trunk/subversion/bindings/javahl/native/ClientContext.cpp subversion/trunk/subversion/bindings/javahl/native/ClientContext.h subversion/trunk/subversion/bindings/javahl/native/SVNClient.cpp Modified: subversion/trunk/subversion/bindings/javahl/native/ClientContext.cpp URL: http://svn.apache.org/viewvc/subversion/trunk/subversion/bindings/javah l/native/ClientContext.cpp?rev=997237r1=997236r2=997237view=diff === === --- subversion/trunk/subversion/bindings/javahl/native/ClientContext.cpp (original) +++ subversion/trunk/subversion/bindings/javahl/native/ClientContext.cpp Wed Sep 15 09:17:34 2010 @@ -97,20 +97,20 @@ ClientContext::~ClientContext() env-DeleteGlobalRef(m_jctx); } -svn_error_t * -ClientContext::getContext(svn_client_ctx_t **ctx, CommitMessage *message) +svn_client_ctx_t * +ClientContext::getContext(CommitMessage *message) { SVN::Pool *requestPool = JNIUtil::getRequestPool(); apr_pool_t *pool = requestPool-pool(); svn_auth_baton_t *ab; - - *ctx = persistentCtx; + svn_client_ctx_t *ctx = persistentCtx; + //SVN_JNI_ERR(svn_client_create_context(ctx, pool), NULL); const char *configDir = m_configDir.c_str(); if (m_configDir.length() == 0) configDir = NULL; - SVN_ERR(svn_config_get_config(((*ctx)-config), configDir, pool)); - svn_config_t *config = (svn_config_t *) apr_hash_get((*ctx)- config, + SVN_JNI_ERR(svn_config_get_config((ctx-config), configDir, pool), NULL); + svn_config_t *config = (svn_config_t *) apr_hash_get(ctx-config, SVN_CONFIG_CATEGORY_CONFIG, APR_HASH_KEY_STRING); @@ -118,8 +118,10 @@ ClientContext::getContext(svn_client_ctx apr_array_header_t *providers; /* Populate the registered providers with the platform-specific providers */ - SVN_ERR(svn_auth_get_platform_specific_client_providers(providers, - config, pool)); + SVN_JNI_ERR(svn_auth_get_platform_specific_client_providers(providers, + config, + pool), + NULL); This doesn't look right (in the old and new code). If you use a client session multiple time, you shouldn't recreate the auth baton on every invocation, but only if there is a reason to recreate. The auth baton also handles some caching for the auth providers. (At least it stores the used passwords that weren't saved to disk, but I think it also stores some context state for a few providers) The reason we have to recreate the auth baton is because the user could change the various prompters in between invocations of client APIs. If the prompter was immutable, we wouldn't have to do this. (There may be a clever way of coding around it, I just haven't gotten there yet.) -Hyrum
Re: [Issue 3474] making a new subdir, moving files into it and then renaming the subdir, breaks history of the moved files
Julian Foad julian.f...@wandisco.com writes: On Tue, 2010-09-14, Daniel Shahaf wrote: Johan Corveleyn wrote on Sat, Sep 11, 2010 at 00:02:16 +0200: On Fri, Sep 10, 2010 at 11:45 PM, joha...@tigris.org wrote: http://subversion.tigris.org/issues/show_bug.cgi?id=3474 --- Additional comments from joha...@tigris.org Fri Sep 10 14:45:17 -0700 2010 --- This issue seems to be fixed on trunk. The described scenario now goes as follows: (assuming we're in a working copy with a versioned file a.txt) [[[ $ svn mkdir subdir A subdir # same as in 1.6 $ svn mv a.txt subdir A subdir\a.txt D a.txt # same as in 1.6 $ svn st A subdir A +subdir\a.txt D a.txt# same as in 1.6 $ svn mv subdir subdir2 A subdir2 D subdir\a.txt D subdir # different! will ask on dev list about this. Is the above output an expected change of behavior? I believe that's a bug. It should display the same as it did in 1.6, unless and until we decide to change it. The notification rules for 1.6 are a bit haphazard, I suspect some of the behaviour is not the result of deliberate decisions but simply a consequence of the way the code is structured. For example, in 1.6 move/delete notifies deletes for all nodes in a tree, apart from nodes within an added directory. There is no reason for that, beyond it being the way the code is written. For moves/copies the add notification is for all nodes in a tree apart from nodes within a copy source directory that is normal or copied. (I may not have got that exactly right!) These rules are not documented as far as I can see, so the question is whether we have to maintain this arbitrary behaviour. Perhaps we should simply notify for all nodes? Perhaps each individual client should be deciding which notifications to suppress? -- Philip
1.6.13 release
It's been 3 months since the last 1.6.x release, and we've got a few items in STATUS or CHANGES which could warrant a release. I'd like to roll in 1 or 2 weeks, so please go through STATUS and do some review. If folks have thoughts on the timeliness of this release, or other changes which ought to be included, please discuss below. (Maybe this can be an activity for those attending the apache retreat this weekend?) Thanks, -Hyrum
RE: [Issue 3474] making a new subdir, moving files into it and then renaming the subdir, breaks history of the moved files
-Original Message- From: Philip Martin [mailto:philip.mar...@wandisco.com] Sent: woensdag 15 september 2010 12:08 To: Julian Foad Cc: Daniel Shahaf; Johan Corveleyn; Subversion Development Subject: Re: [Issue 3474] making a new subdir, moving files into it and then renaming the subdir, breaks history of the moved files Julian Foad julian.f...@wandisco.com writes: On Tue, 2010-09-14, Daniel Shahaf wrote: Johan Corveleyn wrote on Sat, Sep 11, 2010 at 00:02:16 +0200: On Fri, Sep 10, 2010 at 11:45 PM, joha...@tigris.org wrote: http://subversion.tigris.org/issues/show_bug.cgi?id=3474 --- Additional comments from joha...@tigris.org Fri Sep 10 14:45:17 -0700 2010 --- This issue seems to be fixed on trunk. The described scenario now goes as follows: (assuming we're in a working copy with a versioned file a.txt) [[[ $ svn mkdir subdir A subdir # same as in 1.6 $ svn mv a.txt subdir A subdir\a.txt D a.txt # same as in 1.6 $ svn st A subdir A +subdir\a.txt D a.txt# same as in 1.6 $ svn mv subdir subdir2 A subdir2 D subdir\a.txt D subdir # different! will ask on dev list about this. Is the above output an expected change of behavior? I believe that's a bug. It should display the same as it did in 1.6, unless and until we decide to change it. The notification rules for 1.6 are a bit haphazard, I suspect some of the behaviour is not the result of deliberate decisions but simply a consequence of the way the code is structured. For example, in 1.6 move/delete notifies deletes for all nodes in a tree, apart from nodes within an added directory. There is no reason for that, beyond it being the way the code is written. For moves/copies the add notification is for all nodes in a tree apart from nodes within a copy source directory that is normal or copied. (I may not have got that exactly right!) These rules are not documented as far as I can see, so the question is whether we have to maintain this arbitrary behaviour. Perhaps we should simply notify for all nodes? Perhaps each individual client should be deciding which notifications to suppress? +1 on moving the choice to the clients. (svn has different requirements then clients like Subclipse). Personally (and for AnkhSVN) I don't have an issue with notifying adds and deletes for the root only. (The fact that it is an add or delete tells that it's operation changes the state of all descendants) Bert
Re: [Issue 3474] making a new subdir, moving files into it and then renaming the subdir, breaks history of the moved files
Bert Huijben b...@qqmail.nl writes: -Original Message- From: Philip Martin [mailto:philip.mar...@wandisco.com] Perhaps we should simply notify for all nodes? Perhaps each individual client should be deciding which notifications to suppress? +1 on moving the choice to the clients. (svn has different requirements then clients like Subclipse). Personally (and for AnkhSVN) I don't have an issue with notifying adds and deletes for the root only. (The fact that it is an add or delete tells that it's operation changes the state of all descendants) Root-only delete notification is much easier to implement in the library than in the client. That's because delete notifications for children tend to happen before the parent, and the application has to be organised to allow it to determine that a child will be followed by a parent. -- Philip
Re: sasl mechanisms order
On Tue, Sep 07, 2010 at 02:33:28PM +0700, Victor Sudakov wrote: This time it has compiled, but does not work. svn list svn://admin/test/ works OK (IMHO because the ANONYMOUS mechanism is sufficient for that) but svn co svn://admin/test/ dumps core immediately. Hi Victor, I hope I have found the problem. Does the patch below work better? Thanks, Stefan Index: subversion/libsvn_subr/config_file.c === --- subversion/libsvn_subr/config_file.c(revision 997080) +++ subversion/libsvn_subr/config_file.c(working copy) @@ -759,6 +759,11 @@ svn_config_ensure(const char *config_dir, apr_pool NL ### may be cached to disk.NL ### username Specifies the default username. NL +### preferred-sasl-mechanism Specifies which SASL mechanismNL +### among the ones offered by the NL +### server should be tried first. NL +### See the SASL documentation forNL +### a list of mechanisms available. NL ###NL ### Set store-passwords to 'no' to avoid storing passwords in the NL ### auth/ area of your config directory. It defaults to 'yes',NL Index: subversion/libsvn_ra_svn/cyrus_auth.c === --- subversion/libsvn_ra_svn/cyrus_auth.c (revision 997080) +++ subversion/libsvn_ra_svn/cyrus_auth.c (working copy) @@ -27,6 +27,7 @@ #include apr_thread_mutex.h #include apr_version.h +#include svn_config.h #include svn_types.h #include svn_string.h #include svn_error.h @@ -720,6 +721,67 @@ svn_error_t *svn_ra_svn__get_addresses(const char return SVN_NO_ERROR; } + +/* Return one or more SASL mechanisms from MECHLIST. + * SESS is the session baton. + * If a preferred SASL mechanism has been defined in the configuration, + * prefer it if it occurs within MECHLIST. Else, fall back to EXTERNAL, + * then ANONYMOUS, then let SASL decide. + * Potentially allocate the returned list of mechanisms in RESULT_POOL. + * Use SCRATCH_POOL for temporary allocations. */ +static const char * +get_sasl_mechanisms(svn_ra_svn__session_baton_t *sess, +apr_array_header_t *mechlist, +apr_pool_t *result_pool, +apr_pool_t *scratch_pool) +{ + const char *mechstring = ; + svn_config_t *cfg; + + cfg = sess-config ? apr_hash_get(sess-config, SVN_CONFIG_CATEGORY_SERVERS, +APR_HASH_KEY_STRING) : NULL; + if (cfg) +{ + const char *server_group; + const char *preferred_mech; + + server_group = svn_config_find_group(cfg, sess-hostname, + SVN_CONFIG_SECTION_GROUPS, + scratch_pool); + if (server_group) +svn_config_get(cfg, preferred_mech, server_group, + SVN_CONFIG_OPTION_PREFERRED_SASL_MECHANISM, NULL); + else +preferred_mech = NULL; + + if (preferred_mech svn_ra_svn__find_mech(mechlist, preferred_mech)) +return preferred_mech; +} + + if (svn_ra_svn__find_mech(mechlist, EXTERNAL)) +return EXTERNAL; + else if (svn_ra_svn__find_mech(mechlist, ANONYMOUS)) +return ANONYMOUS; + else +{ + int i; + + /* Create a string containing the list of mechanisms, + * separated by spaces. */ + for (i = 0; i mechlist-nelts; i++) +{ + svn_ra_svn_item_t *elt = APR_ARRAY_IDX(mechlist, i, + svn_ra_svn_item_t); + mechstring = apr_pstrcat(result_pool, + mechstring, + i == 0 ? : , + elt-u.word, NULL); +} +} + + return mechstring; +} + svn_error_t * svn_ra_svn__do_cyrus_auth(svn_ra_svn__session_baton_t *sess, apr_array_header_t *mechlist, @@ -734,7 +796,6 @@ svn_ra_svn__do_cyrus_auth(svn_ra_svn__session_bato array terminator). */ sasl_callback_t callbacks[3]; cred_baton_t cred_baton; - int i; if (!sess-is_tunneled) { @@ -742,24 +803,7 @@ svn_ra_svn__do_cyrus_auth(svn_ra_svn__session_bato sess-conn, pool)); } - /* Prefer EXTERNAL, then ANONYMOUS, then let SASL decide. */ - if (svn_ra_svn__find_mech(mechlist, EXTERNAL)) -mechstring = EXTERNAL; - else if (svn_ra_svn__find_mech(mechlist, ANONYMOUS)) -mechstring = ANONYMOUS; - else -{ - /* Create a string containing the list of mechanisms, separated by spaces.
Re: svn diff optimization to make blame faster?
On Sun, Sep 5, 2010 at 1:53 AM, Johan Corveleyn jcor...@gmail.com wrote: On Tue, Aug 24, 2010 at 11:11 AM, Philip Martin philip.mar...@wandisco.com wrote: Johan Corveleyn jcor...@gmail.com writes: On Sun, Aug 22, 2010 at 4:02 PM, Branko Čibej br...@xbc.nu wrote: svn_diff uses basically the same algorithm as GNU diff but implemented slightly differently and IIRC it doesn't have some of GNU diff's optimizations. I'm sure it can be speeded up, but haven't a clue about how much. Ok, thanks. In the meantime I saw that there is not that much difference anymore between GNU diff and svn_diff, after running the latter from a release build, and disabling my anti-virus (which makes me wonder why my anti-virus slows down svn_diff (impact when opening the modified datasource), but not on GNU diff). The big difference between Subversion's diff and GNU diff is that GNU uses heuristics to cut short the diff algorithm whereas Subversion grinds on to the end. Certain files trigger the heuristics and then GNU diff is much faster than Subversion. Typically the file has a large number of matches and non-matches repeated through the file, machine generated files sometimes fit this pattern. GNU diff's heuristics work well so when they trigger the resulting diff is usually good enough. They can be disabled using the --minimal option and using that makes GNU diff performance much more like Subversion. Hmm, I thought harder about this. If I try --minimal with GNU diff, it's just as fast, still significantly faster than svn diff. Then I read a little bit more about diff algorithms, like the blog entry that Greg Hudson suggested in the other thread [1], about the Patience Diff algorithm [2]. In there the following two lines caught my eye, first steps in the diff algo: [[[ 1) Match the first lines of both if they're identical, then match the second, third, etc. until a pair doesn't match. 2) Match the last lines of both if they're identical, then match the next to last, second to last, etc. until a pair doesn't match. ]]] (then the longest common subsequence algorithm kicks in) Now, these steps are not specific to Patience Diff, they are common to most diff algorithms. And I bet it's a much more efficient to exclude the common prefix/suffix this way, than to include them into the entire lcs algorithm. Right now, svn diff doesn't do this AFAICS. However, GNU diff does, and I think that's where most of the performance difference comes from. For big files, where a couple of lines are modified somewhere in the middle (and the first 3 and last 3 lines are identical), I think this can make a big difference. Thoughts? If no-one else fancies this, I think I'll take a stab at implementing this optimization (first with a POC to see if it's significant). Unless feedback tells me it's not worth it, not a good idea, ... Some update on this: I have implemented this for svn_diff (excluding the identical prefix and suffix of both files, and only then starting to fill up the token tree and let the lcs-agorithm to its thing). It makes a *huge* difference. On my bigfile.xml (1.5 Mb) with only one line changed, the call to svn_diff_diff is ~10 times faster (15-20 ms vs. 150-170 ms). This means blame is much faster for such big files as well, because diff accounts for 90% of the client-side work of blame. This means that in most cases the client cpu won't be the bottleneck anymore, so it's more constrained by server-side and network. In my tests, blame was about twice as fast for bigfile.xml (since all my tests were done with client and server on the same machine, I'm guessing the server is now the bottleneck (as well as them running on the same machine)). Same factor of performance increase for a medium-sized file (a java source file of 300 Kb). The patch is still very crude, with lots of code duplication, and everything stuffed in one big function (and I may have overlooked some edge conditions), so I don't feel it's ready to post yet (don't want to waste people's time :)). I'll try to clean it up as soon as I can and post a reasonable version to the list. Some notes already: - I basically wrote a new function datasources_open in diff_file.c (similar to datasource_open), that opens both files (original and modified), and compares bytes from the beginning until the first nonmatching byte (counting newlines when it sees them), and backwards from the end to find the first (or last, depending how you look at it) nonmatching byte. The number of prefix_lines is then used to determine the starting offset of the tokens (lines) that will later be extracted. - I think I should generalize this function to take an array of datasources, and not just two, so it can be used by diff3 and diff4 as well. But for now I didn't want to take on that extra complication. - The code is complicated by the fact that the svn_diff code (diff_file.c) works with chunks of data (doesn't read the entire file(s)
Re: svn diff optimization to make blame faster?
On Wed, Sep 15, 2010 at 02:20:05PM +0200, Johan Corveleyn wrote: Just to know how hard/fast I should pursue this: is there any chance that a change like this could still make it into 1.7 (maybe even backported to 1.6?), provided that the patch is of good quality and all... ? Depending on how invasive the change is, and how quickly we get it reviewed and tested, there's a chance that this can make the 1.7 release. For 1.6 it's probably not worth the effort. Stefan
Re: svn commit: r993183 - in /subversion/trunk/subversion: include/ libsvn_diff/ libsvn_ra_serf/ libsvn_subr/ mod_dav_svn/reports/
On Tue, 2010-09-07, s...@apache.org wrote: Author: stsp Date: Tue Sep 7 00:09:46 2010 New Revision: 993183 URL: http://svn.apache.org/viewvc?rev=993183view=rev Log: Introduce a new family of functions to parse numbers from strings. Looks useful. Thanks. The goal is to replace direct calls to functions like atoi() and apr_atoi64(), which have very inconvenient error handling to the point where a lot of our code simply ignores conversion errors and continues operating with the converted data in potentially undefined state. The new functions, on the other hand, report conversion errors via the standard Subversion error handling interface, thereby not allowing callers to ignore conversion errors. This commit also switches selected pieces of code over to the new functions. More conversion to come. * subversion/include/svn_string.h (svn_cstring_strtoi64, svn_cstring_atoi64, svn_cstring_atoi, svn_cstring_strtoui64, svn_cstring_atoui64, svn_cstring_atoui): Declare. * subversion/libsvn_subr/svn_string.c (): Include svn_private_config.h for the _() gettext macro. (svn_cstring_strtoi64, svn_cstring_strtoui64, svn_cstring_atoi64, svn_cstring_atoi): New. And the other two functions as well? (Just missing from the log message.) * subversion/libsvn_ra_serf/serf.c (dirent_walker): Use svn_cstring_atoi64() instead of apr_atoi64(). * subversion/libsvn_diff/parse-diff.c (parse_offset): Call svn_cstring_strtoui64() instead of calling apr_atoi64() and performing manual overflow checking. * subversion/mod_dav_svn/reports/log.c (dav_svn__log_report): Use svn_cstring_atoi() instead of atoi() for parsing CDATA of the limit element. * subversion/mod_dav_svn/reports/replay.c (dav_svn__replay_report): Use svn_cstring_strtoi64() instead of atoi() for parsing CDATA of the send-deltas element. * subversion/libsvn_subr/win32_xlate.c (get_page_id_from_name): Use svn_cstring_atoui() instead of atoi() to parse the number of a codepage. * subversion/libsvn_subr/io.c (svn_io_read_version_file): Use svn_cstring_atoi() instead of atoi(). * subversion/libsvn_subr/hash.c (svn_hash_read): Use svn_cstring_atoi() instead of atoi(). Modified: subversion/trunk/subversion/include/svn_string.h +/** + * Parse the C string @a str into a 64 bit number, and return it in @a *n. + * Assume that the number is represented in base @a base. + * Raise an error if conversion fails (e.g. due to overflow), or if the + * converted number is smaller than @a minval or larger than @a maxval. What are the semantics regarding 'str' - does it ignore leading spaces, a plus sign, decimal point, trailing spaces, trailing non-spaces? Preferably refer to a standard function to make this clear without trying to spell it all out in full. + * @since New in 1.7. + */ +svn_error_t * +svn_cstring_strtoi64(apr_int64_t *n, const char *str, + apr_int64_t minval, apr_int64_t maxval, + int base); + +/** + * Parse the C string @a str into a 64 bit number, and return it in @a *n. + * Assume that the number is represented in base 10. + * Raise an error if conversion fails (e.g. due to overflow). + * + * @since New in 1.7. + */ +svn_error_t * +svn_cstring_atoi64(apr_int64_t *n, const char *str); + +/** + * Parse the C string @a str into a 32 bit number, and return it in @a *n. What does parse into a 32 bit number mean when 'int' is not 32 bits? If the intention is to limit this to 32-bit numbers its name should reflect it. (The data type *could* stay as 'int' because we are allowed to assume it's at least 32 bits, but I don't know if we'd want to do that.) Otherwise it should support the range of 'int' and not assume 32 bits. + * Assume that the number is represented in base 10. + * Raise an error if conversion fails (e.g. due to overflow). + * + * @since New in 1.7. + */ +svn_error_t * +svn_cstring_atoi(int *n, const char *str); + +/** + * Parse the C string @a str into an unsigned 64 bit number, and return + * it in @a *n. Assume that the number is represented in base @a base. + * Raise an error if conversion fails (e.g. due to overflow), or if the + * converted number is smaller than @a minval or larger than @a maxval. + * + * @since New in 1.7. + */ +svn_error_t * +svn_cstring_strtoui64(apr_uint64_t *n, const char *str, + apr_uint64_t minval, apr_uint64_t maxval, + int base); + +/** + * Parse the C string @a str into an unsigned 64 bit number, and return + * it in @a *n. Assume that the number is represented in base 10. + * Raise an error if conversion fails (e.g. due to overflow). + * + * @since New in 1.7. + */ +svn_error_t * +svn_cstring_atoui64(apr_uint64_t *n, const char *str); + +/** + * Parse the C string @a str into an unsigned 32 bit number, and return + * it in @a *n. Assume that the number is represented in base 10. + * Raise
Re: svn commit: r993183 - in /subversion/trunk/subversion: include/ libsvn_diff/ libsvn_ra_serf/ libsvn_subr/ mod_dav_svn/reports/
Also printf formatting type mismatches: subversion/libsvn_subr/svn_string.c: In function 'svn_cstring_strtoui64': subversion/libsvn_subr/svn_string.c:662: format '%lu' expects type 'long unsigned int', but argument 5 has type 'apr_uint64_t' subversion/libsvn_subr/svn_string.c:662: format '%lu' expects type 'long unsigned int', but argument 6 has type 'apr_uint64_t' subversion/libsvn_subr/svn_string.c: In function 'svn_cstring_strtoi64': subversion/libsvn_subr/svn_string.c:706: format '%ld' expects type 'long int', but argument 5 has type 'apr_int64_t' subversion/libsvn_subr/svn_string.c:706: format '%ld' expects type 'long int', but argument 6 has type 'apr_int64_t' (These should use APR_[U]INT64_T_FMT. Doing so is complicated by the fact that the format string localization tools don't like string concatenation with macros, and there's no pool easily available to do a temporary printf.) - Julian
Re: Public API documentation
On Mi, 2010-09-15 at 10:35 +0100, Julian Foad wrote: Hyrum K. Wright wrote: I've been thinking about ways that our API documentation could be made a bit cleaner. ... Thoughts? -Hyrum +1. You know what, I've been a bit skeptical of the list-of-params style of documentation, as sometimes I've seen documentation that says @param revision - the revision number; @param path - the path, without saying the path of what, what kind of path, relative to what. But your mock-up is very good. I particularly like that it refers to the canonical definitions of depth, client_ctx (saying which parts of that are relevant) and operative/peg revisions, instead of trying to explain them yet again in yet another way. Indeed, I find it very good and it adds a lot of value by referencing important concepts and giving a concise list of return values. This style also feels a lot easier to handle, in my opinion, if I refer to the documentation in order to use other language bindings. Could be due to fewer uses of C syntax and marking parameters as in/out. I don't feel burdened with the task to read everything and can quickly check a single parameter or return value's description. This style won't fit all functions - for example, I don't think it would be useful to rewrite the doc string of svn_client_checkout2() which is currently: Similar to svn_client_checkout3() but with allow_unver_obstructions always set to FALSE, and depth set according to recurse: if recurse is TRUE, depth is svn_depth_infinity, if recurse is FALSE, depth is svn_depth_files. I think that is true. It would fit many of the current version functions but describing differences to older versions, for example, may be easier done with a descriptive text. Cheers, Uwe
Re: svn diff optimization to make blame faster?
Johan Corveleyn wrote on Wed, Sep 15, 2010 at 14:20:05 +0200: The patch is still very crude, with lots of code duplication, and everything stuffed in one big function (and I may have overlooked some edge conditions), so I don't feel it's ready to post yet (don't want to waste people's time :)). I'll try to clean it up as soon as I can and post a reasonable version to the list. +1, sounds like you have some interesting stuff working there. :-)
Re: svn commit: r997249 - in /subversion/trunk/subversion: include/svn_client.h libsvn_client/patch.c svn/cl.h svn/main.c svn/patch-cmd.c tests/cmdline/patch_tests.py tests/libsvn_client/client-test.c
s...@apache.org wrote on Wed, Sep 15, 2010 at 10:05:15 -: Author: stsp Date: Wed Sep 15 10:05:14 2010 New Revision: 997249 URL: http://svn.apache.org/viewvc?rev=997249view=rev Log: Add an --old-patch-target-names option to svn patch. This option is useful in two cases: 1) A patch contains names like --- foo.c +++ foo.c.new and should be applied to foo.c. 2) A patch contains names like --- foo.c.orig +++ foo.c and should be applied in reverse to foo.c, e.g. to undo prior application of the patch. What happens if a user forgets to supply the new option? (Does svn complain that 'foo.c.new is nonexistent/unversioned'?) For case (2): suppose I have such a patch (was emailed to me). I applied it using 'svn patch $patchfile'. Now I want to unapply it; so so I use 'svn patch --rd $patchfile' or 'svn patch --rd --optn $patchfile'? AIUI, currently only the latter will work? Is the UI 'svn patch' always uses the filename in the /^+++/ line?
Re: svn commit: r993183 - in /subversion/trunk/subversion: include/ libsvn_diff/ libsvn_ra_serf/ libsvn_subr/ mod_dav_svn/reports/
On Wed, Sep 15, 2010 at 02:14:36PM +0100, Julian Foad wrote: On Tue, 2010-09-07, s...@apache.org wrote: Author: stsp Date: Tue Sep 7 00:09:46 2010 New Revision: 993183 URL: http://svn.apache.org/viewvc?rev=993183view=rev Log: Introduce a new family of functions to parse numbers from strings. Looks useful. Thanks. The goal is to replace direct calls to functions like atoi() and apr_atoi64(), which have very inconvenient error handling to the point where a lot of our code simply ignores conversion errors and continues operating with the converted data in potentially undefined state. The new functions, on the other hand, report conversion errors via the standard Subversion error handling interface, thereby not allowing callers to ignore conversion errors. This commit also switches selected pieces of code over to the new functions. More conversion to come. * subversion/include/svn_string.h (svn_cstring_strtoi64, svn_cstring_atoi64, svn_cstring_atoi, svn_cstring_strtoui64, svn_cstring_atoui64, svn_cstring_atoui): Declare. * subversion/libsvn_subr/svn_string.c (): Include svn_private_config.h for the _() gettext macro. (svn_cstring_strtoi64, svn_cstring_strtoui64, svn_cstring_atoi64, svn_cstring_atoi): New. And the other two functions as well? (Just missing from the log message.) Will fix that. Modified: subversion/trunk/subversion/include/svn_string.h +/** + * Parse the C string @a str into a 64 bit number, and return it in @a *n. + * Assume that the number is represented in base @a base. + * Raise an error if conversion fails (e.g. due to overflow), or if the + * converted number is smaller than @a minval or larger than @a maxval. What are the semantics regarding 'str' - does it ignore leading spaces, a plus sign, decimal point, trailing spaces, trailing non-spaces? Preferably refer to a standard function to make this clear without trying to spell it all out in full. It's C-string, i.e. NUL-terminated. The number format is the same as apr_strtoi64() expects, which is equivalent to the C standard function strtol(). + * @since New in 1.7. + */ +svn_error_t * +svn_cstring_strtoi64(apr_int64_t *n, const char *str, + apr_int64_t minval, apr_int64_t maxval, + int base); + +/** + * Parse the C string @a str into a 64 bit number, and return it in @a *n. + * Assume that the number is represented in base 10. + * Raise an error if conversion fails (e.g. due to overflow). + * + * @since New in 1.7. + */ +svn_error_t * +svn_cstring_atoi64(apr_int64_t *n, const char *str); + +/** + * Parse the C string @a str into a 32 bit number, and return it in @a *n. What does parse into a 32 bit number mean when 'int' is not 32 bits? If the intention is to limit this to 32-bit numbers its name should reflect it. (The data type *could* stay as 'int' because we are allowed to assume it's at least 32 bits, but I don't know if we'd want to do that.) Otherwise it should support the range of 'int' and not assume 32 bits. I'll just change that to say integer. Internally, it uses an int, and assumes that any value from APR_INT32_MIN to APR_INT32_MAX will fit. Modified: subversion/trunk/subversion/libsvn_subr/svn_string.c +svn_error_t * +svn_cstring_strtoui64(apr_uint64_t *n, const char *str, + apr_uint64_t minval, apr_uint64_t maxval, + int base) +{ + apr_int64_t parsed; + + /* We assume errno is thread-safe. */ + errno = 0; /* APR-0.9 doesn't always set errno */ + + /* ### We're throwing away half the number range here. + * ### Maybe implement our own number parser? */ What use is this function, compared to _strtoi64(), if it doesn't actually support the top half of the range? Forward planning :) As soon as apr_strtoui64() exists, we can switch this function to use it without much effort. + parsed = apr_strtoi64(str, NULL, base); + if (errno != 0) +return svn_error_return( + svn_error_createf(SVN_ERR_INCORRECT_PARAMS, NULL, + _(Could not convert '%s' into a number), + str)); + if (parsed 0 || parsed minval || parsed maxval) Forget parsed 0; the parsed minval term will always catch that case because minval can't be negative. I'd prefer to keep it for clarity until this function can work with unsigned integers only. Note also a later commit which forced the minval and maxval comparisons to be signed. +return svn_error_return( + svn_error_createf(SVN_ERR_INCORRECT_PARAMS, NULL, + _(Number '%s' is out of range), str)); + *n = parsed; + return SVN_NO_ERROR; +} + +svn_error_t * +svn_cstring_atoui64(apr_uint64_t *n, const char *str) +{ + return svn_error_return(svn_cstring_strtoui64(n, str, 0, +
Re: svn commit: r993183 - in /subversion/trunk/subversion: include/ libsvn_diff/ libsvn_ra_serf/ libsvn_subr/ mod_dav_svn/reports/
On Wed, Sep 15, 2010 at 02:27:21PM +0100, Julian Foad wrote: Also printf formatting type mismatches: subversion/libsvn_subr/svn_string.c: In function 'svn_cstring_strtoui64': subversion/libsvn_subr/svn_string.c:662: format '%lu' expects type 'long unsigned int', but argument 5 has type 'apr_uint64_t' subversion/libsvn_subr/svn_string.c:662: format '%lu' expects type 'long unsigned int', but argument 6 has type 'apr_uint64_t' subversion/libsvn_subr/svn_string.c: In function 'svn_cstring_strtoi64': subversion/libsvn_subr/svn_string.c:706: format '%ld' expects type 'long int', but argument 5 has type 'apr_int64_t' subversion/libsvn_subr/svn_string.c:706: format '%ld' expects type 'long int', but argument 6 has type 'apr_int64_t' (These should use APR_[U]INT64_T_FMT. Doing so is complicated by the fact that the format string localization tools don't like string concatenation with macros, and there's no pool easily available to do a temporary printf.) The translation issues is why I didn't use the APR_[U]INT64_T_FMT. It's either using it and make life hard for translators, or have these warnings... neither solution is perfect. Or is there a better solution? We have a similar define for svn_revnum_t BTW, but it's been deprecated for the same reason. Stefan
Re: Public API documentation
On 09/15/2010 05:35 AM, Julian Foad wrote: Hyrum K. Wright wrote: I've been thinking about ways that our API documentation could be made a bit cleaner. [...] It's is really difficult to determine what each parameter does or what possible errors are returned, without reading the entire docstring. Maybe that's intended, but perhaps a strategy which enumerates the parameters would be a bit better. I've mocked up with this may look like: http://people.apache.org/~hwright/svn/doc/api/temp/group__clnt__wc__checkout.html#ga31e87d0db53bf1158eaceb6552c63bae I would suggest that yes, it is intended that folks read the entire docstring for a function before using it. But I get that sometimes it might be nice to more quickly lookup a particular parameter's general meaning without swimming through tons of context. Julian Foad wrote: But your mock-up is very good. I particularly like that it refers to the canonical definitions of depth, client_ctx (saying which parts of that are relevant) and operative/peg revisions, instead of trying to explain them yet again in yet another way. +1 I also like the way we can easily check whether the correct set of parameters is documented. Yup. -- C. Michael Pilato cmpil...@collab.net CollabNet www.collab.net Distributed Development On Demand signature.asc Description: OpenPGP digital signature
Re: Public API documentation
Hyrum K. Wright hyrum_wright at mail.utexas.edu writes: Thanks. Though I'll probably wait a day or two, to let any dissenters make themselves known. :) -1/2. There are parts of your idea that I like, such as linking to common documentation, enumerating all of the errors that can be returned, and specifying whether parameters are in, out, or in/out, but in general I do not like the idea of changing the style of documentation to focus on documentation of the parameters. I am a new user of the API, so in some ways I think that my initial impression is representative of others newbies' initial impression of the documentation. And, my initial impression was that I liked how the all of the parameters were explained in enough detail to be able to actually use the function. In the case of `svn_client_checkout3`, for example, I like how the documentation currently points out that the most important field of the ctx is `notify_func2`. The proposed idea of simply linking to the documentation of `svn_client_ctx_t` would not be very helpful to a beginner, I think, because there are *so many* fields of the `svn_client_ctx_t` structure, and it is not immediately clear which ones affect the behavior of `svn_client_checkout3`. Also, other details were trimmed away in the mock-up (e.g.: the basic description went from the detailed Checkout a working copy of `URL` at `revision`, looked up at `peg_revision`, using `path` as the root directory of the newly checked out working copy, and authenticating with the authentication baton cached in `ctx` to Checkout a working copy from a repository; the description of `result_rev` is currently more detailed than in the proposed description). The use of structures to pass in bundles of parameters is problematic for the proposed style of documentation, because it is not obvious how the fields of the bundles of parameters should be listed as parameters. Do you document `ctx-notify_func2` as an [in] parameter? Also, Julian's example of `svn_client_checkout2` is another problem with the proposed style.
Re: svn commit: r993183 - in /subversion/trunk/subversion: include/ libsvn_diff/ libsvn_ra_serf/ libsvn_subr/ mod_dav_svn/reports/
Stefan Sperling s...@elego.de writes: The translation issues is why I didn't use the APR_[U]INT64_T_FMT. It's either using it and make life hard for translators, or have these warnings... neither solution is perfect. Or is there a better solution? Have two format strings and pick the right one based on sizeof(long). It's tricky to select at compile time but it could be done at run-time as this string only occurs in error messages and so is not performance critical. Or it could be done at configure time, borrowing code from APR, or we could get APR to provide APR_LONG_IS[32|64]BITS. -- Philip
Re: Copies via update -- net win or net loss? (Was: svn commit: r996914 ...)
C. Michael Pilato wrote on Tue, Sep 14, 2010 at 10:58:20 -0400: On 09/14/2010 10:25 AM, rhuij...@apache.org wrote: Author: rhuijben Date: Tue Sep 14 14:25:52 2010 New Revision: 996914 URL: http://svn.apache.org/viewvc?rev=996914view=rev Log: When mixing the two major hacks of the update editor (copy_from location and file externals), we can get into an unexpected state. Update an assertion to handle this state properly and slightly update the expected test result. In light of our vision for the future regarding the pristine cache, and the seeming flakiness of the special-case code added to handle copies during update, should we kill that feature? Could someone summarize the feature/hack/special-case being discussed here? Is it about using the copyfrom info to optimize some over-the-wire transmissions during an update, as opposed to always asking for the fulltext of a copied file? (e.g., I found locate_copyfrom() in update-editor.c) Thanks, Daniel. I've never been convinced that it was truly beneficial as written anyway -- it seems to just sorta throws what ifs across the wire and then burdens the client with verification and handling before falling back to doing what it has always done for non-copied files. -- C. Michael Pilato cmpil...@collab.net CollabNet www.collab.net Distributed Development On Demand
Re: svn commit: r993183 - in /subversion/trunk/subversion: include/ libsvn_diff/ libsvn_ra_serf/ libsvn_subr/ mod_dav_svn/reports/
Stefan Sperling wrote: On Wed, Sep 15, 2010 at 02:14:36PM +0100, Julian Foad wrote: On Tue, 2010-09-07, s...@apache.org wrote: * subversion/include/svn_string.h (svn_cstring_strtoi64, svn_cstring_atoi64, svn_cstring_atoi, svn_cstring_strtoui64, svn_cstring_atoui64, svn_cstring_atoui): Declare. * subversion/libsvn_subr/svn_string.c (): Include svn_private_config.h for the _() gettext macro. (svn_cstring_strtoi64, svn_cstring_strtoui64, svn_cstring_atoi64, svn_cstring_atoi): New. And the other two functions as well? (Just missing from the log message.) Will fix that. Modified: subversion/trunk/subversion/include/svn_string.h +/** + * Parse the C string @a str into a 64 bit number, and return it in @a *n. + * Assume that the number is represented in base @a base. + * Raise an error if conversion fails (e.g. due to overflow), or if the + * converted number is smaller than @a minval or larger than @a maxval. What are the semantics regarding 'str' - does it ignore leading spaces, a plus sign, decimal point, trailing spaces, trailing non-spaces? Preferably refer to a standard function to make this clear without trying to spell it all out in full. It's C-string, i.e. NUL-terminated. The number format is the same as apr_strtoi64() expects, which is equivalent to the C standard function strtol(). Great, either of those will make a good reference from the doc string. + * @since New in 1.7. + */ +svn_error_t * +svn_cstring_strtoi64(apr_int64_t *n, const char *str, + apr_int64_t minval, apr_int64_t maxval, + int base); +/** + * Parse the C string @a str into a 32 bit number, and return it in @a *n. What does parse into a 32 bit number mean when 'int' is not 32 bits? If the intention is to limit this to 32-bit numbers its name should reflect it. (The data type *could* stay as 'int' because we are allowed to assume it's at least 32 bits, but I don't know if we'd want to do that.) Otherwise it should support the range of 'int' and not assume 32 bits. I'll just change that to say integer. Great. Internally, it uses an int, and assumes that any value from APR_INT32_MIN to APR_INT32_MAX will fit. That's the inconsistency I mentioned below: it claims to support 'int' but the implementation enforces a 32-bit range limit regardless of the size of 'int'. It should enforce the range of the actual 'int' type instead. (If the code uses 64-bit internally and assumes that 'int' is = 64 bits, I'd be OK with that.) Modified: subversion/trunk/subversion/libsvn_subr/svn_string.c +svn_error_t * +svn_cstring_strtoui64(apr_uint64_t *n, const char *str, + apr_uint64_t minval, apr_uint64_t maxval, + int base) +{ + apr_int64_t parsed; + + /* We assume errno is thread-safe. */ + errno = 0; /* APR-0.9 doesn't always set errno */ + + /* ### We're throwing away half the number range here. + * ### Maybe implement our own number parser? */ What use is this function, compared to _strtoi64(), if it doesn't actually support the top half of the range? Forward planning :) As soon as apr_strtoui64() exists, we can switch this function to use it without much effort. OK, that's fine if we're only using it to read counts of real things, like real file sizes. If we want to read arbitrary 64-bit numbers such as digests or virtual memory addresses, then we might need the upper range to work. What's the failure mode? From a quick look at the present code I think it will return an error like Number '(2^63)' is out of range '[0, (2^64 - 1)]' which is not exactly correct but at least it leads the developer to the right part of the source code. Please add a note in the doc string that these two functions only support numbers up to 2^63-1. We can remove the restiction and the note in some later version. + parsed = apr_strtoi64(str, NULL, base); + if (errno != 0) +return svn_error_return( + svn_error_createf(SVN_ERR_INCORRECT_PARAMS, NULL, + _(Could not convert '%s' into a number), + str)); + if (parsed 0 || parsed minval || parsed maxval) Forget parsed 0; the parsed minval term will always catch that case because minval can't be negative. I'd prefer to keep it for clarity until this function can work with unsigned integers only. Note also a later commit which forced the minval and maxval comparisons to be signed. Oh, oops - I see: the comparison with minval can't catch it because it would convert both args to unsigned. The present version says: if ((errno == ERANGE (val == APR_INT64_MIN || val == APR_INT64_MAX)) || val 0 || (apr_uint64_t)val minval || (apr_uint64_t)val maxval) return svn_error_return(
Re: svn commit: r996581 - in /subversion/trunk/subversion: libsvn_diff/parse-diff.c tests/cmdline/patch_tests.py
Stefan Sperling wrote on Tue, Sep 14, 2010 at 21:18:26 +0200: On Tue, Sep 14, 2010 at 12:49:06PM +0100, Julian Foad wrote: Right. The issue seems to be How do I determine what path I should apply the patch to? If the patch style is --- wc/file +++ wc/file or --- old_wc/file +++ new_wc/file then you're going to be stripping the first component so it doesn't matter which path you use. If the patch is just to one file and looks something like --- wc/file.old +++ wc/file or --- wc/file +++ wc/file.new or --- wc/file.old +++ wc/file.new then obviously it's not so simple and the patch consumer (person) may need to be able to tell svn patch which path it should look at if we're to be able to handle cases like this. Yes. I've been trying to come up with a good UI for selecting the right filename for svn patch to use. Any ideas? Paraphrasing my commit reply a few hours ago, how about: * Always use the name from the /^+++/ line by default (regardless of --rd). * Have a --strip-extension flag. (Effect: filename loses everything after the last dot) * Have a the following config option: [[[ # ~/.subversion/config [miscellany] patch-strip-extensions = .new .old .orig .org ~ \ .merge-left .merge-right .working ]]] for extensions that are stripped automagically. Sounds sane? (I haven't thought about this /too/ much, so beware of bugs in the above idea) Daniel
Re: Copies via update -- net win or net loss? (Was: svn commit: r996914 ...)
On 09/15/2010 12:01 PM, Daniel Shahaf wrote: C. Michael Pilato wrote on Tue, Sep 14, 2010 at 10:58:20 -0400: On 09/14/2010 10:25 AM, rhuij...@apache.org wrote: Author: rhuijben Date: Tue Sep 14 14:25:52 2010 New Revision: 996914 URL: http://svn.apache.org/viewvc?rev=996914view=rev Log: When mixing the two major hacks of the update editor (copy_from location and file externals), we can get into an unexpected state. Update an assertion to handle this state properly and slightly update the expected test result. In light of our vision for the future regarding the pristine cache, and the seeming flakiness of the special-case code added to handle copies during update, should we kill that feature? Could someone summarize the feature/hack/special-case being discussed here? Is it about using the copyfrom info to optimize some over-the-wire transmissions during an update, as opposed to always asking for the fulltext of a copied file? (e.g., I found locate_copyfrom() in update-editor.c) That's the one. The server does a little extra work to calculate a copyfrom path, a little extra work to authorize that path. If it all checks out, the server sends that copyfrom info plus any additional content/prop deltas against that copyfrom source to the client instead of just transmitting the new contents of the copied thing in full. If the client sees copyfrom info, it then goes to try to find that same thing in the working copy. If it manages to find something, it does a local copy and then applies those extra deltas (if any) from the server. If it doesn't find the copyfrom source locally, it then asks the server for the pristine version of the copyfrom source so it has something to apply those deltas to. Yeah. locate_copyfrom(). Eek. I don't doubt that in some cases, this code might save someone a bit of network traffic. But our vision for this code in the future involves much more reliable, less-heuristically-driven approaches. SHA1 checksums into a database keyed on as much, etc. I seriously believe we'd be better off losing this codepath today, if only so we can simplify what is already a very complex operation (update). At this point, I'm not even really asking for permission any more. If somebody doesn't beat me to it, I'll gut the feature myself after I wrap up some other things that hold my attention right now. -- C. Michael Pilato cmpil...@collab.net CollabNet www.collab.net Distributed Development On Demand signature.asc Description: OpenPGP digital signature
[PATCH] NODES table presence values
Bert, Erik, Greg... I think the schema should not disallow the 'excluded' presence in NODES table where op_depth 0 (which corresponds roughly to old WORKING_NODE). There are already 'copy' cases where it is used, and seems useful and right. I also think the schema should not disallow 'absent'. We have already talked about how, if you allow a sub-tree copy (for example) to have a child that is 'absent' (unauthorized), you won't be able to commit that copy. But I don't think that means that the schema should forbid ever representing such a case. I think if we forbid it at this level we would possibly be making the rules more complex than they need to be and would possibly run into problems later when we might find that actually there are good cases to be made for representing such a state. So I think we should simplify the NODES table doc string as follows: [[[ Index: subversion/libsvn_wc/wc-metadata.sql === --- subversion/libsvn_wc/wc-metadata.sql(revision 997335) +++ subversion/libsvn_wc/wc-metadata.sql(working copy) @@ -709,38 +709,13 @@ CREATE TABLE NODES ( /* Is this node present or has it been excluded for some reason? In case 'op_depth' is equal to 0, this is part of the BASE tree; in that case, all presence values except 'base-deleted' are allowed. In case 'op_depth' is greater than 0, this is part of a layer of - working nodes; in that case, the following presence values apply: - - Only allowed values: normal, not-present, incomplete, base-deleted. - (the others do not make sense for the WORKING tree) - - normal: this node has been added/copied/moved-here. There may be an - underlying BASE node at this location, implying this is a replace. - Scan upwards from here looking for copyfrom or moved_here values - to detect the type of operation constructing this node. - - not-present: the node (or parent) was originally copied or moved-here. - A subtree of that source has since been deleted. There may be - underlying BASE node to replace. For a move-here or copy-here, the - records are simply removed rather than switched to not-present. - Note this reflects a deletion only. It is not possible move-away - nodes from the WORKING tree. The purported destination would receive - a copy from the original source of a copy-here/move-here, or if the - nodes were plain adds, those nodes would be shifted to that target - for addition. - - incomplete: nodes are being added into the WORKING tree, and the full - information about this node is not (yet) present. - - base-deleted: the underlying BASE node has been marked for deletion due - to a delete or a move-away (see the moved_to column to determine - which), and has not been replaced. */ + working nodes; in that case, all presence values are allowed. */ presence TEXT NOT NULL, /* NULL depth means default (typically svn_depth_infinity) */ /* ### depth on WORKING? seems this is a BASE-only concept. how do ### you do files on an added-directory? can't really ignore ### the subdirs! */ ]]] Thoughts? - Julian
Re: svn commit: r997249 - in /subversion/trunk/subversion: include/svn_client.h libsvn_client/patch.c svn/cl.h svn/main.c svn/patch-cmd.c tests/cmdline/patch_tests.py tests/libsvn_client/client-test.c
On Wed, Sep 15, 2010 at 04:51:15PM +0200, Daniel Shahaf wrote: s...@apache.org wrote on Wed, Sep 15, 2010 at 10:05:15 -: Author: stsp Date: Wed Sep 15 10:05:14 2010 New Revision: 997249 URL: http://svn.apache.org/viewvc?rev=997249view=rev Log: Add an --old-patch-target-names option to svn patch. This option is useful in two cases: 1) A patch contains names like --- foo.c +++ foo.c.new and should be applied to foo.c. 2) A patch contains names like --- foo.c.orig +++ foo.c and should be applied in reverse to foo.c, e.g. to undo prior application of the patch. What happens if a user forgets to supply the new option? (Does svn complain that 'foo.c.new is nonexistent/unversioned'?) $ svn patch test.diff Skipped missing target: 'foo.new' Summary of conflicts: Skipped paths: 1 For case (2): suppose I have such a patch (was emailed to me). I applied it using 'svn patch $patchfile'. Now I want to unapply it; so so I use 'svn patch --rd $patchfile' or 'svn patch --rd --optn $patchfile'? AIUI, currently only the latter will work? Which one works depends on the way path names appear in the patch. Say you have a diff produced with svn diff. That will work with either combination of options. $ svn diff Index: alpha === --- alpha (revision 2) +++ alpha (working copy) @@ -1 +1,2 @@ alpha +aaa You need the new option only to handle cases where people used e.g. the UNIX diff tool to create a patch using left/right names they made up while producing the diff. Is the UI 'svn patch' always uses the filename in the /^+++/ line? Yes. By default, that's the name that is used. If you reverse the diff, the filenames get swapped as well. So if you use the --optn option, the filenames get swapped again, or not at all, depending which way you look at it :) This UI may not be ideal, and I'm open for suggestions on how we could improve it. Maybe there's a way to handle this without user interaction, or a better name for the new option? The UNIX patch tool prompts for a filename when it cannot find a target. We could do the same, defining yet another prompt callback type. But I don't like the fact that UNIX patch does not do tab-completion in its prompt. I think the lack of tab-completion really affects usability. So if svn had a prompt for this, I'd like it to support tab-completion. So the new --optn option is an easier way to implement this. Maybe it's sufficient? Note that having the option around doesn't hurt even if we decide to implement prompting later. Stefan
Re: Public API documentation
On Wed, Sep 15, 2010 at 5:07 PM, Daniel Trebbien dtrebb...@gmail.com wrote: Hyrum K. Wright hyrum_wright at mail.utexas.edu writes: Thanks. Though I'll probably wait a day or two, to let any dissenters make themselves known. :) -1/2. There are parts of your idea that I like, such as linking to common documentation, enumerating all of the errors that can be returned, and specifying whether parameters are in, out, or in/out, but in general I do not like the idea of changing the style of documentation to focus on documentation of the parameters. I am a new user of the API, so in some ways I think that my initial impression is representative of others newbies' initial impression of the documentation. And, my initial impression was that I liked how the all of the parameters were explained in enough detail to be able to actually use the function. In the case of `svn_client_checkout3`, for example, I like how the documentation currently points out that the most important field of the ctx is `notify_func2`. The Actually, the important fields are notify_func2, and the auth_baton, but the current docs make zero mention of the later. proposed idea of simply linking to the documentation of `svn_client_ctx_t` would not be very helpful to a beginner, I think, because there are *so many* fields of the `svn_client_ctx_t` structure, and it is not immediately clear which ones affect the behavior of `svn_client_checkout3`. Then we need to make it clear, such as saying used for authentication and notification. In the client context docs, we'd have something like if used for authentication... and if used for notification Also, other details were trimmed away in the mock-up (e.g.: the basic description went from the detailed Checkout a working copy of `URL` at `revision`, looked up at `peg_revision`, using `path` as the root directory of the newly checked out working copy, and authenticating with the authentication baton cached in `ctx` to Checkout a working copy from a repository; the description of `result_rev` is currently more detailed than in the proposed description). But that's my point. We spent a lot of time writing the *exact same* documentation for many APIs. Instead of write of `URL` at `revision`, looked up at `peg_revision`, using `path` as the root directory of the newly checked out working copy, for every single API which takes a URL, peg revision and revision, I'd prefer to only write that text once, where it can then be referenced from every API which needs it. (Kinda like a documentation subroutine. :) In the bits that are referenced often, such as svn_client_ctx_t, it would be important that we are explicit about what the members mean, and what they do, much more than we currently are. And I pray we're currently being consistent on how we use those members between APIs. By the way, I think your perspective is very valuable. It's been a long time since I was new to the Subversion APIs, and have since lost that perspective. But I also use the API documentation on a regular basis, demonstrating that we should cater to both new and old API consumers. The use of structures to pass in bundles of parameters is problematic for the proposed style of documentation, because it is not obvious how the fields of the bundles of parameters should be listed as parameters. Do you document `ctx-notify_func2` as an [in] parameter? We document all of the client context as an [in] parameter. But yeah, having a massive bunch of stuff is not very conducive for documentation (or for maintenance, it turns out). Also, Julian's example of `svn_client_checkout2` is another problem with the proposed style. We'd leave that as-is. A side goal of this little project is to make the documentation better parable, so that automagic docs of the bindings, for instance, could be generated. But that's really tangential to human readability. Thanks for the feedback, -Hyrum
Re: svn commit: r997249 - in /subversion/trunk/subversion: include/svn_client.h libsvn_client/patch.c svn/cl.h svn/main.c svn/patch-cmd.c tests/cmdline/patch_tests.py tests/libsvn_client/client-test.c
On Wed, Sep 15, 2010 at 8:42 PM, Stefan Sperling s...@elego.de wrote: On Wed, Sep 15, 2010 at 04:51:15PM +0200, Daniel Shahaf wrote: s...@apache.org wrote on Wed, Sep 15, 2010 at 10:05:15 -: Author: stsp Date: Wed Sep 15 10:05:14 2010 New Revision: 997249 URL: http://svn.apache.org/viewvc?rev=997249view=rev Log: Add an --old-patch-target-names option to svn patch. This option is useful in two cases: 1) A patch contains names like --- foo.c +++ foo.c.new and should be applied to foo.c. 2) A patch contains names like --- foo.c.orig +++ foo.c and should be applied in reverse to foo.c, e.g. to undo prior application of the patch. What happens if a user forgets to supply the new option? (Does svn complain that 'foo.c.new is nonexistent/unversioned'?) $ svn patch test.diff Skipped missing target: 'foo.new' Summary of conflicts: Skipped paths: 1 For case (2): suppose I have such a patch (was emailed to me). I applied it using 'svn patch $patchfile'. Now I want to unapply it; so so I use 'svn patch --rd $patchfile' or 'svn patch --rd --optn $patchfile'? AIUI, currently only the latter will work? Which one works depends on the way path names appear in the patch. Say you have a diff produced with svn diff. That will work with either combination of options. $ svn diff Index: alpha === --- alpha (revision 2) +++ alpha (working copy) @@ -1 +1,2 @@ alpha +aaa You need the new option only to handle cases where people used e.g. the UNIX diff tool to create a patch using left/right names they made up while producing the diff. Is the UI 'svn patch' always uses the filename in the /^+++/ line? Yes. By default, that's the name that is used. If you reverse the diff, the filenames get swapped as well. So if you use the --optn option, the filenames get swapped again, or not at all, depending which way you look at it :) This UI may not be ideal, and I'm open for suggestions on how we could improve it. Maybe there's a way to handle this without user interaction, or a better name for the new option? The UNIX patch tool prompts for a filename when it cannot find a target. We could do the same, defining yet another prompt callback type. But I don't like the fact that UNIX patch does not do tab-completion in its prompt. I think the lack of tab-completion really affects usability. So if svn had a prompt for this, I'd like it to support tab-completion. So the new --optn option is an easier way to implement this. Maybe it's sufficient? Note that having the option around doesn't hurt even if we decide to implement prompting later. How about just adding '--patch-left' and '--patch-right' or something. The either both have to used, or neither, and if they are, use those names as the once in the patch. If people can use the UNIX diff tool to create left/right names they just made up, let's just let them worry about feeding those names back into Subversion (rather than guessing). -Hyrum
Fwd: Regression for issue:2362 ?
Forwarding to dev@ as this seems to me an obvious regression (unfortunately I got no reply from users@) -- Forwarded message -- From: Weirdan weir...@gmail.com Date: Wed, Sep 8, 2010 at 2:50 PM Subject: Regression for issue:2362 ? To: us...@subversion.apache.org Hello I'm using svn v1.6.12 and it seems that calling 'svn info --xml' on a file in an unversioned directory still causes svn to produce broken xml: [weir...@localhost /home/sam/trunk] mkdir dir [weir...@localhost /home/sam/trunk] touch dir/file [weir...@localhost /home/sam/trunk] svn info --xml dir/file ?xml version=1.0? info svn: 'dir' is not a working copy [weir...@localhost /home/sam/trunk] svn --version svn, version 1.6.12 (r955767) compiled Aug 5 2010, 15:04:19 Copyright (C) 2000-2009 CollabNet. Subversion is open source software, see http://subversion.tigris.org/ This product includes software developed by CollabNet ( http://www.Collab.Net/). The following repository access (RA) modules are available: * ra_neon : Module for accessing a repository via WebDAV protocol using Neon. - handles 'http' scheme - handles 'https' scheme * ra_svn : Module for accessing a repository using the svn network protocol. - with Cyrus SASL authentication - handles 'svn' scheme * ra_local : Module for accessing a repository on local disk. - handles 'file' scheme * ra_serf : Module for accessing a repository via WebDAV protocol using serf. - handles 'http' scheme - handles 'https' scheme [weir...@localhost /home/sam/trunk] While the output is not exactly the same as in this ticket: http://subversion.tigris.org/issues/show_bug.cgi?id=2887 , it seems to be similar enough to call this a regression. I'd comment on original issue, but it doesn't seem to be possible, so I'm posting this to the mailing list hoping someone from dev team would pick this up. -- Best regards, Bruce Weirdan mailto: weir...@gmail.com -- Best regards, Bruce Weirdan mailto: weir...@gmail.com
Re: svn commit: r993183 - in /subversion/trunk/subversion: include/ libsvn_diff/ libsvn_ra_serf/ libsvn_subr/ mod_dav_svn/reports/
On Wed, Sep 15, 2010 at 05:12:43PM +0100, Julian Foad wrote: + * @since New in 1.7. + */ +svn_error_t * +svn_cstring_strtoi64(apr_int64_t *n, const char *str, + apr_int64_t minval, apr_int64_t maxval, + int base); +/** + * Parse the C string @a str into a 32 bit number, and return it in @a *n. What does parse into a 32 bit number mean when 'int' is not 32 bits? If the intention is to limit this to 32-bit numbers its name should reflect it. (The data type *could* stay as 'int' because we are allowed to assume it's at least 32 bits, but I don't know if we'd want to do that.) Otherwise it should support the range of 'int' and not assume 32 bits. I'll just change that to say integer. Great. Internally, it uses an int, and assumes that any value from APR_INT32_MIN to APR_INT32_MAX will fit. That's the inconsistency I mentioned below: it claims to support 'int' but the implementation enforces a 32-bit range limit regardless of the size of 'int'. It should enforce the range of the actual 'int' type instead. (If the code uses 64-bit internally and assumes that 'int' is = 64 bits, I'd be OK with that.) Should we use INT_MIN and INT_MAX so? Modified: subversion/trunk/subversion/libsvn_subr/svn_string.c +svn_error_t * +svn_cstring_strtoui64(apr_uint64_t *n, const char *str, + apr_uint64_t minval, apr_uint64_t maxval, + int base) +{ + apr_int64_t parsed; + + /* We assume errno is thread-safe. */ + errno = 0; /* APR-0.9 doesn't always set errno */ + + /* ### We're throwing away half the number range here. + * ### Maybe implement our own number parser? */ What use is this function, compared to _strtoi64(), if it doesn't actually support the top half of the range? Forward planning :) As soon as apr_strtoui64() exists, we can switch this function to use it without much effort. OK, that's fine if we're only using it to read counts of real things, like real file sizes. If we want to read arbitrary 64-bit numbers such as digests or virtual memory addresses, then we might need the upper range to work. What's the failure mode? From a quick look at the present code I think it will return an error like Number '(2^63)' is out of range '[0, (2^64 - 1)]' which is not exactly correct but at least it leads the developer to the right part of the source code. Please add a note in the doc string that these two functions only support numbers up to 2^63-1. We can remove the restiction and the note in some later version. If we allowed ourselves to use strtoull() directly instead of using apr_strtoi64(), this problem would go away. Apparently it's part of C99 only though. If we decide to do this, I'll skip apr_strtoi64() as well, and just call strtoll() directly. We could also use strtol() and strtoul(), which are C89. + parsed = apr_strtoi64(str, NULL, base); + if (errno != 0) +return svn_error_return( + svn_error_createf(SVN_ERR_INCORRECT_PARAMS, NULL, + _(Could not convert '%s' into a number), + str)); + if (parsed 0 || parsed minval || parsed maxval) Forget parsed 0; the parsed minval term will always catch that case because minval can't be negative. I'd prefer to keep it for clarity until this function can work with unsigned integers only. Note also a later commit which forced the minval and maxval comparisons to be signed. Oh, oops - I see: the comparison with minval can't catch it because it would convert both args to unsigned. The present version says: if ((errno == ERANGE (val == APR_INT64_MIN || val == APR_INT64_MAX)) || val 0 || (apr_uint64_t)val minval || (apr_uint64_t)val maxval) return svn_error_return( svn_error_createf(SVN_ERR_INCORRECT_PARAMS, NULL, _(Number '%s' is out of range '[%lu, %lu]'), str, minval, maxval)); It forces the comparisons to be unsigned. AFAIK that's the same as C's rules for signed/unsigned comparisons of the same size anyway. It might be, but I can't keep all rules of C in my head. So it's easier to be explicit. What's with the ERANGE (val == ...) test? I notice the Posix spec for 'strtol' says when it sets errno to ERANGE it also returns LONG_MIN or LONG_MAX, but what's the point of testing that, even if apr_strtoi64() does the same (which it doesn't document)? When reading this code, it looks like you're deliberately excluding some other ERANGE conditions from this test and allowing them to be treated as success conditions, but I don't think that's what you mean. Strip the redundant tests! POSIX does that to allow the caller to tell whether underflow or overflow
Re: svn commit: r995475 - /subversion/trunk/subversion/libsvn_repos/load.c
On Fri, Sep 10, 2010 at 04:02:52PM +0200, Branko Čibej wrote: On 10.09.2010 15:26, Stefan Sperling wrote: I know we're using C89, but maybe it's time to move on and upgrade to C99 where the benefits are desirable? When Subversion was started, C89 was about a decade old, and C99 is just as old now... Microsoft's C compiler, to name only one, still does not provide most of the C99 features. I don't know about standard library support. However, I don't see where you gain with using strtol(). First of all, it was already in C89 and has effectively the same interface as the APR conversions. C99 added strtoll() but it has the same interface. What's the benefit? Raising this thread again, because there is a benefit to using strtol()'s unsigned cousin, strtoul(). APR does not provide an interface for it. Can we use it? Can we also use its 64-bit cousin strtoull()? If we can use those two, we might as well use strtol() and strtoul() and skip apr_strtoi64() completely. Stefan
Re: svn commit: r995475 - /subversion/trunk/subversion/libsvn_repos/load.c
On Wed, Sep 15, 2010 at 9:03 PM, Stefan Sperling s...@elego.de wrote: On Fri, Sep 10, 2010 at 04:02:52PM +0200, Branko Čibej wrote: On 10.09.2010 15:26, Stefan Sperling wrote: I know we're using C89, but maybe it's time to move on and upgrade to C99 where the benefits are desirable? When Subversion was started, C89 was about a decade old, and C99 is just as old now... Microsoft's C compiler, to name only one, still does not provide most of the C99 features. I don't know about standard library support. However, I don't see where you gain with using strtol(). First of all, it was already in C89 and has effectively the same interface as the APR conversions. C99 added strtoll() but it has the same interface. What's the benefit? Raising this thread again, because there is a benefit to using strtol()'s unsigned cousin, strtoul(). APR does not provide an interface for it. Can we use it? Can we also use its 64-bit cousin strtoull()? If we can use those two, we might as well use strtol() and strtoul() and skip apr_strtoi64() completely. Adding an implementation to APR isn't a problem, but it would take a while to trickle down (we'd still need a private implementation). -Hyrum
Re: svn commit: r997249 - in /subversion/trunk/subversion: include/svn_client.h libsvn_client/patch.c svn/cl.h svn/main.c svn/patch-cmd.c tests/cmdline/patch_tests.py tests/libsvn_client/client-test.c
On Wed, Sep 15, 2010 at 08:53:17PM +0200, Hyrum K. Wright wrote: How about just adding '--patch-left' and '--patch-right' or something. The either both have to used, or neither, and if they are, use those names as the once in the patch. I don't like these names because it's not clear what left and right refer to. Also, why would you need to use both these options together? If people can use the UNIX diff tool to create left/right names they just made up, let's just let them worry about feeding those names back into Subversion (rather than guessing). One goal of svn patch is to facilitate application of arbitrary unidiffs. The person generating the patch may not be the same person worrying about applying the patch to a Subversion working copy. In any case, people can always edit the patch file. But having an option like this saves a bit of manual edits in the cases described. Stefan
Re: svn commit: r996581 - in /subversion/trunk/subversion: libsvn_diff/parse-diff.c tests/cmdline/patch_tests.py
On Wed, Sep 15, 2010 at 06:15:58PM +0200, Daniel Shahaf wrote: Stefan Sperling wrote on Tue, Sep 14, 2010 at 21:18:26 +0200: On Tue, Sep 14, 2010 at 12:49:06PM +0100, Julian Foad wrote: Right. The issue seems to be How do I determine what path I should apply the patch to? If the patch style is --- wc/file +++ wc/file or --- old_wc/file +++ new_wc/file then you're going to be stripping the first component so it doesn't matter which path you use. If the patch is just to one file and looks something like --- wc/file.old +++ wc/file or --- wc/file +++ wc/file.new or --- wc/file.old +++ wc/file.new then obviously it's not so simple and the patch consumer (person) may need to be able to tell svn patch which path it should look at if we're to be able to handle cases like this. Yes. I've been trying to come up with a good UI for selecting the right filename for svn patch to use. Any ideas? Paraphrasing my commit reply a few hours ago, how about: * Always use the name from the /^+++/ line by default (regardless of --rd). * Have a --strip-extension flag. (Effect: filename loses everything after the last dot) * Have a the following config option: [[[ # ~/.subversion/config [miscellany] patch-strip-extensions = .new .old .orig .org ~ \ .merge-left .merge-right .working ]]] for extensions that are stripped automagically. Sounds sane? (I haven't thought about this /too/ much, so beware of bugs in the above idea) Looking at the patch(1) man page, it sounds like we don't really need to reinvent the wheel here: patch will choose the file name by performing the following steps, with the first match used: 1. If patch is operating in strict IEEE Std 1003.1-2008 (``POSIX'') mode, the first of the ``old'', ``new'' and ``index'' file names that exist is used. Otherwise, patch will examine either the ``old'' and ``new'' file names or, for a non-context diff, the ``index'' file name, and choose the file name with the fewest path components, the shortest basename, and the shortest total file name length (in that order). 2. If no file exists, patch checks for the existence of the files in an SCCS or RCS directory (using the appropriate prefix or suffix) using the criteria specified above. If found, patch will attempt to get or check out the file. 3. If no suitable file was found to patch, the patch file is a context or unified diff, and the old file was zero length, the new file name is created and used. 4. If the file name still cannot be determined, patch will prompt the user for the file name to use. I suppose we could implement the method described in step 1. That would make us compatible to UNIX patch, and should resolve the two use cases I have in mind. Step 2 doesn't apply to us. Step 3 is used already. Step 4 could be implemented as well (but I'd prefer to do that post-1.7). Thoughts? Stefan
Pending 1.6.x backport proposals.
I've committed a bunch of changes for issue #3709 and related situations (r996884, r997026, r997457, r997466, r997471, and r997474). I *do* plan to propose them for backport to 1.6.x, but that work will have to wait until tomorrow. Just letting folks know. -- C. Michael Pilato cmpil...@collab.net CollabNet www.collab.net Distributed Development On Demand signature.asc Description: OpenPGP digital signature
Pending 1.6.x backport proposals.
I've committed a bunch of changes for issue #3709 and related situations (r996884, r997026, r997457, r997466, r997471, and r997474). I *do* plan to propose them for backport to 1.6.x, but that work will have to wait until tomorrow. Just letting folks now. -- C. Michael Pilato cmpil...@collab.net CollabNet www.collab.net Distributed Development On Demand signature.asc Description: OpenPGP digital signature
Re: svn commit: r997403 - /subversion/trunk/subversion/libsvn_wc/wc-metadata.sql
On Wed, Sep 15, 2010 at 13:19, julianf...@apache.org wrote: ... - - ### A comment on 'repos_id' and 'repos_relpath' says they may be null; - is this true and wanted? Yes. That meant use the parent's information, and extend it to compute this node's value. + - The Node-Rev group points to the corresponding repository node-rev. - - ### A comment on 'revnum' says, this could be NULL for non-present - nodes -- no info; is this true and wanted? + - If not used (as specified by the 'presence' table above), the values + are undefined. + ### Perhaps we should set them to null to make it clearer. Yes, we should. Content columns: (kind, properties, depth, target, checksum) -- - -- @@ -165,6 +177,7 @@ Overview of BASE_NODE columns: - Includes a copy of the corresponding date and author rev-props. - If not used (as specified by the 'presence' table above), all null. + ### Not checked; in practice these columns may have undefined values. They should never be undefined. They should be null. We should never have undefined information in our schema. That leads to madness... Cheers, -g
Re: svn commit: r997334 - /subversion/trunk/subversion/libsvn_wc/wc-metadata.sql
On Wed, Sep 15, 2010 at 09:45, julianf...@apache.org wrote: ... +++ subversion/trunk/subversion/libsvn_wc/wc-metadata.sql Wed Sep 15 13:45:12 2010 @@ -618,6 +618,21 @@ PRAGMA user_version = BASE nodes and on top of other WORKING nodes, due to nested tree structure changes. The layers are modelled using the op_depth column. + An 'operation depth' refers to the number of directory levels down from + the WC root at which a tree-change operation (delete, add?, copy, move) + was performed. A row's 'op_depth' does NOT refer to the depth of its own + 'local_relpath', but rather to the depth of the nearest tree change that + affects that node. + + The row with op_depth=0 for any given local relpath represents the base + node that is created and updated by checkout, update, switch and commit + post-processing. The row with the highest op_depth for a particular + local_relpath represents the working version. Any rows with intermediate Strictly speaking: wc_id, local_relpath We just happen to only ever have one wc_id value right now. In the future, should one metadata store be used for multiple working copies, then the wc_id value will become very meaningful. ... Cheers, -g
Re: svn commit: r996811 - in /subversion/trunk/subversion/libsvn_wc: upgrade.c wc.h
On Tue, Sep 14, 2010 at 06:08, rhuij...@apache.org wrote: Author: rhuijben Date: Tue Sep 14 10:08:35 2010 New Revision: 996811 URL: http://svn.apache.org/viewvc?rev=996811view=rev Log: Move some defines that are only used on upgrades from wc.h to upgrade.c. * subversion/libsvn_wc/upgrade.c (SVN_WC__BASE_EXT, SVN_WC__WORK_EXT, SVN_WC__REVERT_EXT): Move here... In the past, I've dropped the SVN prefix when these became local. Cheers, -g
Re: svn commit: r997493 - /subversion/object-model/
On Wed, Sep 15, 2010 at 16:39, hwri...@apache.org wrote: Author: hwright Date: Wed Sep 15 20:39:44 2010 New Revision: 997493 URL: http://svn.apache.org/viewvc?rev=997493view=rev Log: Create the object-model branch, wherein we will attempt to put into practice some of the ideas defined in notes/object-model.txt. This is not just because I like syncing branches with trunk. No, really. (And yes, gstein, there will be a BRANCH-README forthcoming.) Oh, I don't need to see a BRANCH-README in that directory. Maybe a TOPLEVEL-README or somesuch would be appropriate. :-P Cheers, -g
Re: svn commit: r997493 - /subversion/object-model/
On Wed, Sep 15, 2010 at 10:43 PM, Greg Stein gst...@gmail.com wrote: On Wed, Sep 15, 2010 at 16:39, hwri...@apache.org wrote: Author: hwright Date: Wed Sep 15 20:39:44 2010 New Revision: 997493 URL: http://svn.apache.org/viewvc?rev=997493view=rev Log: Create the object-model branch, wherein we will attempt to put into practice some of the ideas defined in notes/object-model.txt. This is not just because I like syncing branches with trunk. No, really. (And yes, gstein, there will be a BRANCH-README forthcoming.) Oh, I don't need to see a BRANCH-README in that directory. Maybe a TOPLEVEL-README or somesuch would be appropriate. Har har. I'm just trying to inflate rev numbers to get r1M. See r997494, -5. -Hyrum
UTF-8 NFC/NFD paths issue
Yesterday, I was talking to CMike about our long-standing issue with UTF-8 strings designating a certain path not neccessarily being equal to other strings designating the same path. The issue has to do with NFC (composed) and NFD (decomposed) representation of Unicode characters. CMike nicely called the issue the Erik Huelsmann issue yesterday :-) The issue consists of two parts: 1. The repository which should determine that paths being added by a commit are unique, regardless of their encoding (NFC/NFD) 2. The client which should detect that the pathnames coming in from the filesystem may differ in encoding from what's in the working copy administrative files [this is mainly an issue on the Mac: http://subversion.tigris.org/issues/show_bug.cgi?id=2464] Mike, the thing I have been trying to find around our filesystem implementation is where an editor drive adding a path [add_directory() or add_file()] checks whether the file already exists. The check at that point should be encoding independent, for example by making all paths NFC (or NFD) before comparison. You could use utf8proc ( http://www.flexiguided.de/publications.utf8proc.en.html) to do the normalization - it's very light-weight in contrast to ICU which provides the same fuctionality, but has a much broader scope. The problem I was telling you about is that I was looking in libsvn_fs_base to find where the existence check is performed, but I couldn't find it. Basically what I was trying to do is: do what we do now (ie fail if the path exists and succeed if it doesn't), with the only difference that the paths used for comparison are guarenteed to be the same normalization - meaning they are the same byte sequence when they're equal unicode. Bye, Erik.
Re: UTF-8 NFC/NFD paths issue
Erik Huelsmann wrote on Wed, Sep 15, 2010 at 23:20:06 +0200: Yesterday, I was talking to CMike about our long-standing issue with UTF-8 strings designating a certain path not neccessarily being equal to other strings designating the same path. The issue has to do with NFC (composed) and NFD (decomposed) representation of Unicode characters. CMike nicely called the issue the Erik Huelsmann issue yesterday :-) The issue consists of two parts: 1. The repository which should determine that paths being added by a commit are unique, regardless of their encoding (NFC/NFD) Will you assume that all paths in the repository have been Unicode-canonicalized prior to entering the repository? If yes, then we infer that no two in-repository paths (which are bytewise different) canonicalize to the same byte sequence. Which is pretty useful precondition to have, i.e., what /can/ svn do on a legacy repository where some two paths are bytewise-different yet Unicode-equal? 2. The client which should detect that the pathnames coming in from the filesystem may differ in encoding from what's in the working copy administrative files [this is mainly an issue on the Mac: http://subversion.tigris.org/issues/show_bug.cgi?id=2464] ... Basically what I was trying to do is: do what we do now (ie fail if the path exists and succeed if it doesn't), with the only difference that the paths used for comparison are guarenteed to be the same normalization - meaning they are the same byte sequence when they're equal unicode.
Re: UTF-8 NFC/NFD paths issue
On Wed, Sep 15, 2010 at 23:35, Daniel Shahaf d...@daniel.shahaf.name wrote: Erik Huelsmann wrote on Wed, Sep 15, 2010 at 23:20:06 +0200: Yesterday, I was talking to CMike about our long-standing issue with UTF-8 strings designating a certain path not neccessarily being equal to other strings designating the same path. The issue has to do with NFC (composed) and NFD (decomposed) representation of Unicode characters. CMike nicely called the issue the Erik Huelsmann issue yesterday :-) The issue consists of two parts: 1. The repository which should determine that paths being added by a commit are unique, regardless of their encoding (NFC/NFD) Will you assume that all paths in the repository have been Unicode-canonicalized prior to entering the repository? If yes, then we infer that no two in-repository paths (which are bytewise different) canonicalize to the same byte sequence. Which is pretty useful precondition to have, i.e., what /can/ svn do on a legacy repository where some two paths are bytewise-different yet Unicode-equal? This will be *very* difficult to manage. Even if a given repository somehow manages to rewrite history to fix the paths, then you may have an unknown number of downstream synchronized repositories to similarly fix. I think an answer might be to rely on the upcoming obliterate feature's out of band change descriptions. For example, a repository might tell a working copy, hey: file XYZ was obliterated since you last talked to me. if you happen to have it, then get rid of it. I won't recognize it henceforth. You can see a similar descriptor sent to working copies or repositories that says I recoded XYZ. update to the new encoding. These change descriptors are effectively annotations that occur outside the standard revision history of a repository. We could use them to transmit path-encoding changes, along with obliteration notices. Cheers, -g