Re: Branching slow 1.8.11 https

2015-04-19 Thread Greg Stein
On Wed, Apr 01, 2015 at 09:41:53AM +0200, Johan Corveleyn wrote:
 On Tue, Mar 31, 2015 at 8:49 PM, Johan Corveleyn jcor...@gmail.com wrote:
  On Tue, Mar 31, 2015 at 2:19 AM, Johan Corveleyn jcor...@gmail.com wrote:
  ...
  I think I've found a workaround: it seems the tree walk by mod_dav is
  avoided when the request has a header Depth with value 0. I've tried
  adding
 
  <If "%{REQUEST_METHOD} == 'COPY'">
    RequestHeader set Depth 0
  </If>
 
  Apparently this workaround is specific to httpd 2.4 or higher
  (<If>...</If> is only available as of 2.4). Since the problem also exists
  in httpd 2.2.25 or higher, this might be a better way to do this:
 
  SetEnvIf Request_Method COPY method_is_copy
  RequestHeader set Depth 0 env=method_is_copy
 
  This should work both in 2.4 and 2.2.
 
 
 This problem and its workaround are now documented in our FAQ:
 http://subversion.apache.org/faq.html#dav-slow-copy

It has also been fixed on trunk (1.10), and nominated for backport to
1.8 and 1.9 (I lay good odds on that happening).

(reference: issue 4531)

Cheers,
-g


branching over mod_dav 2.4.6 is O(tree) (was: Re: Branching slow 1.8.11 https)

2015-04-01 Thread Daniel Shahaf
[ moving to dev@, please remove users@ from replies ]

Johan Corveleyn wrote on Sun, Mar 29, 2015 at 19:57:34 +0200:
 On Sat, Mar 28, 2015 at 5:09 PM, Bert Huijben b...@qqmail.nl wrote:
  Httpd's mod_dav was updated in some recent version to do a full lock
  traversal on copies and moves. I think we already applied some
  optimizations, but the real fix would be that mod_dav shouldn't do
  this work (which our repos layer already does).
 
  I'm not sure which release we applied the first set of optimizations.
 
 
 Thanks for refreshing my memory.
 
 So the problem is known as issue #4531 (server-side copy (over dav)
 uses too much memory) [1]. The memory usage issue has been fixed in
 SVN 1.8.11 and 1.7.19 (see CHANGES), but a performance problem remains
 (copy is no longer O(1), but depends on the size of the tree being
 copied). That's a direct violation of one of Subversion's old selling
 points vs. CVS: that branching / tagging is O(1). Branching / tagging
 taking several minutes brings back fond memories from CVS' days.
 
 As Philip pointed out in his last comment on #4531 [2]: This issue is
 related to a change in mod_dav in 2.2.25 to fix PR54610 which
 added a walk over the copy source looking for lock tokens. (also
 released in 2.4.5; so both httpd 2.2.25+ and 2.4.5+ are affected --
 older httpd's won't have this problem I guess).
 
 Again quoting Philip: Apache knows in advance that the walk is
 redundant in cases such as Subversion's URL-to-URL copy but Subversion
 cannot avoid the read access. We should attempt to fix mod_dav to
 avoid the walk where possible.
 
 So my hope rests with Philip and others who might have the necessary
 knowledge to fix this in mod_dav. It's really not acceptable that
 branching / tagging (or I'm guessing also: moving a large tree with a
 server-side move) takes several minutes.

So, what will a mod_dav fix look like?  I understand the issue is that
it walks the copy source for locks.  Should it stop doing that?  Should
it allow the backend module (mod_dav_svn / mod_dav_fs) to implement the
walk in a more efficient manner, for example by adding an "Are there any
locks under path X?" hook that the backend module could implement?

(Rather than the current design, which AIUI is that the backend walks the
tree and mod_dav calls "Is path X locked?" for each path reported by the
backend.)
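
To make the difference between the two designs concrete, here is a rough
Python sketch. Everything in it is hypothetical: the Backend class and the
names walk, is_locked and any_locks_under are made up for illustration and
are not mod_dav's real provider API. It only shows why a per-path check
costs one query per node, while a subtree hook lets the backend answer from
its own lock data.

```python
from typing import Iterable, Iterator, Set


class Backend:
    """Hypothetical stand-in for a backend such as mod_dav_svn."""

    def __init__(self, paths: Iterable[str], locked: Iterable[str]):
        self.paths = sorted(paths)
        self.locked: Set[str] = set(locked)
        self.lock_queries = 0  # counts per-path "is X locked?" calls

    def walk(self, root: str) -> Iterator[str]:
        """Current design: report every path at or below root."""
        prefix = root.rstrip("/") + "/"
        for path in self.paths:
            if path == root or path.startswith(prefix):
                yield path

    def is_locked(self, path: str) -> bool:
        """Per-path question asked once for each walked path."""
        self.lock_queries += 1
        return path in self.locked

    def any_locks_under(self, root: str) -> bool:
        """Proposed hook: answer for the whole subtree in one call.

        The cost depends on the number of locks, not on the tree size.
        """
        prefix = root.rstrip("/") + "/"
        return any(lk == root or lk.startswith(prefix) for lk in self.locked)


def has_locks_by_walk(backend: Backend, root: str) -> bool:
    """O(tree): one lock query per path under root."""
    found = False
    for path in backend.walk(root):
        if backend.is_locked(path):
            found = True
    return found
```

With a 1000-file trunk and no locks below it, has_locks_by_walk issues 1001
lock queries, while any_locks_under only has to look at the existing locks.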

Daniel


Re: Branching slow 1.8.11 https

2015-03-31 Thread Johan Corveleyn
On Tue, Mar 31, 2015 at 2:19 AM, Johan Corveleyn jcor...@gmail.com wrote:
 I think I've found a workaround: it seems the tree walk by mod_dav is
 avoided when the request has a header Depth with value 0. I've 

Re: Branching slow 1.8.11 https

2015-03-31 Thread Mark Phippard

 On Mar 31, 2015, at 8:13 AM, Johan Corveleyn jcor...@gmail.com wrote:
 
 On Tue, Mar 31, 2015 at 2:19 AM, Johan Corveleyn jcor...@gmail.com wrote:
 I think I've found a workaround: it seems the tree walk by 

Re: Branching slow 1.8.11 https

2015-03-31 Thread Johan Corveleyn
On Tue, Mar 31, 2015 at 2:19 AM, Johan Corveleyn jcor...@gmail.com wrote:
...
 I think I've found a workaround: it seems the tree walk by mod_dav is
 avoided when the request has a header Depth with value 0. I've tried
 adding

 <If "%{REQUEST_METHOD} == 'COPY'">
   RequestHeader set Depth 0
 </If>

Apparently this workaround is specific to httpd 2.4 or higher
(<If>...</If> is only available as of 2.4). Since the problem also exists
in httpd 2.2.25 or higher, this might be a better way to do this:

SetEnvIf Request_Method COPY method_is_copy
RequestHeader set Depth 0 env=method_is_copy

This should work both in 2.4 and 2.2.

-- 
Johan


RE: Branching slow 1.8.11 https

2015-03-31 Thread Bert Huijben


 -Original Message-
 From: Johan Corveleyn [mailto:jcor...@gmail.com]
 Sent: Tuesday 31 March 2015 14:13
 To: users@subversion.apache.org
 Cc: Bert Huijben; Philip Martin; Ben Reser
 Subject: Re: Branching slow 1.8.11 https
 

Re: Branching slow 1.8.11 https

2015-03-31 Thread Branko Čibej
On 31.03.2015 14:43, Mark Phippard wrote:
 On Mar 31, 2015, at 8:13 AM, Johan Corveleyn jcor...@gmail.com wrote:

 On Tue, Mar 31, 2015 at 2:19 AM, Johan Corveleyn jcor...@gmail.com wrote:
 I think I've found a workaround: it 

Re: Branching slow 1.8.11 https

2015-03-30 Thread Johan Corveleyn
On Sun, Mar 29, 2015 at 7:57 PM, Johan Corveleyn jcor...@gmail.com wrote:
 So my hope rests with Philip and others who might have the necessary
 knowledge to fix this in mod_dav. It's really not acceptable that
 branching / tagging (or I'm guessing also: moving a large tree with a
 server-side move) takes several minutes.

 [1] http://subversion.tigris.org/issues/show_bug.cgi?id=4531
 [2] http://subversion.tigris.org/issues/show_bug.cgi?id=4531#desc12

I think I've found a workaround: it seems the tree walk by mod_dav is
avoided when the request has a header Depth with value 0. I've tried
adding

If %{REQUEST_METHOD} == 'COPY'
RequestHeader set 

Re: Branching slow 1.8.11 https

2015-03-29 Thread Johan Corveleyn
On Sat, Mar 28, 2015 at 5:09 PM, Bert Huijben b...@qqmail.nl wrote:


 -Original Message-
 From: Johan Corveleyn [mailto:jcor...@gmail.com]
 Sent: Friday 27 March 2015 22:03
 To: users@subversion.apache.org
 Subject: Branching slow 1.8.11 https

 Does the following ring a bell for someone?

 Recently upgraded our server (on Solaris 10 SPARC) from 1.5.4 to
 1.8.11 (CollabNet package). Some time after that, we discovered that
 branching was very slow. I'm talking about pure server-side branching
 ('svn copy $URL/trunk $URL/branches/br1'). I'm testing with a 1.8.11
 client (tried both from same machine as the server, and from another
 machine on the LAN (100 Mbit)).

 - Branching trunk (containing many directories and files): 6-8 minutes
 - Branching a subfolder of trunk: 20-30 seconds (still very slow)
 - Branching a single file is fast ( 0.5s or so).

 So it seems the performance degrades depending on the depth or size of the
 tree.

 Now, it gets more interesting:
 - The resulting rev file on the server is always very small (as it
 should be, it contains only a lightweight 'copy' of the trunk node).
 - Our repos is currently served via https (Apache 2.2.29).
 - Branching with file:/// urls is fast (branching trunk takes 0.6s).
 - When starting an svnserve instance serving the same repository, and
 branching with svn:// urls, it's fast as well (also 0.6s).
 - We reproduced it on a copy of the production repo.
 - Experimenting with the test copy, we found that
 $repos/dav/activities.d contains ~2000 files. When we clear that
 directory, the branching times go down by more than half (~2 minutes
 for trunk, ~10s for subdir of trunk --- i.e. still slow, but it
 definitely has an impact).
 - With a 1.7 client connecting with neon, the problem is the same.
 - During the 'svn copy', an httpd child consumes a lot of cpu (around
 half a core).
 - There is no authz configured for this repo (SVNPathAuthz off).
 - Backend is still in 1.5 format (we have not run svnadmin upgrade
 yet, a dump+load is planned in a couple of weeks).

 So it seems clearly mod_dav_svn related (and not for instance related
 to the FSFS backend).

 I don't think we have anything special in our httpd config:
 [[[
 <Location /test_svn>
    SVNInMemoryCacheSize 131072
    SVNCacheFullTexts on
    SVNCacheTextDeltas on
    SSLRequireSSL
    AuthName "TEST Subversion Repository"
    AuthType Basic
    AuthBasicProvider ldap
    AuthBasicAuthoritative off
    AuthLDAPURL ldap://redacted:389;
    AuthLDAPBindDN redacted
    AuthLDAPBindPassword redacted
    Require ldap-group redacted
    DAV svn
    SVNPath /path/to/test_repos
    SVNPathAuthz off
 </Location>
 ]]]

 Any ideas?
 Why the cpu usage by the server, what's it doing?
 What is the dav/activities.d directory for? How come it contains so
 many files? Is it ok to purge the old files from that directory?

 Httpd's mod_dav was updated in some recent version to do a full lock 
 traversal on copies and moves. I think we already applied some optimizations, 
 but the real fix would be that mod_dav shouldn't do this work (which our 
 repos layer already does).

 I'm not sure which release we applied the first set of optimizations.


Thanks for refreshing my memory.

So the problem is known as issue #4531 (server-side copy (over dav)
uses too much memory) [1]. The memory usage issue has been fixed in
SVN 1.8.11 and 1.7.19 (see CHANGES), but a performance problem remains
(copy is no longer O(1), but depends on the size of the tree being
copied). That's a direct violation of one of Subversion's old selling
points vs. CVS: that branching / tagging is O(1). Branching / tagging
taking several minutes brings back fond memories from CVS' days.

As Philip pointed out in his last comment on #4531 [2]: "This issue is
related to a change in mod_dav in 2.2.25 to fix PR54610 which
added a walk over the copy source looking for lock tokens." (The change
was also released in 2.4.5, so both httpd 2.2.25+ and 2.4.5+ are
affected; older httpd versions won't have this problem, I guess.)

Again quoting Philip: "Apache knows in advance that the walk is
redundant in cases such as Subversion's URL-to-URL copy but Subversion
cannot avoid the read access. We should attempt to fix mod_dav to
avoid the walk where possible."
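
To illustrate why a walk over the copy source turns an O(1) branch into
an O(tree) operation, here is a minimal toy model in Python. It is only
a sketch: it is not mod_dav or Subversion code, and every name in it
(Node, build_tree, cheap_copy, walk_for_lock_tokens) is hypothetical.
It assumes a "cheap copy" only records a reference to the source node,
while the walk has to visit every node below the source.

```python
from __future__ import annotations

from dataclasses import dataclass, field


# Toy model only: NOT mod_dav or Subversion code. All names are hypothetical.
@dataclass
class Node:
    name: str
    children: list[Node] = field(default_factory=list)


def build_tree(depth: int, fanout: int, name: str = "trunk") -> Node:
    """Build a full tree with `fanout` children per directory level."""
    node = Node(name)
    if depth > 0:
        node.children = [
            build_tree(depth - 1, fanout, f"{name}/{i}") for i in range(fanout)
        ]
    return node


def cheap_copy(source: Node, new_name: str) -> tuple[Node, int]:
    """Model a cheap copy: the new node shares the source's children,
    so exactly one node is created regardless of tree size."""
    return Node(new_name, source.children), 1


def walk_for_lock_tokens(source: Node) -> int:
    """Model a pre-copy walk that visits every node under the source.
    Returns the number of nodes visited."""
    visited = 0
    stack = [source]
    while stack:
        node = stack.pop()
        visited += 1
        stack.extend(node.children)
    return visited


if __name__ == "__main__":
    for depth in (2, 4, 6):
        tree = build_tree(depth, fanout=4)
        _, copy_cost = cheap_copy(tree, "branches/br1")
        walk_cost = walk_for_lock_tokens(tree)
        print(f"depth={depth}: copy creates {copy_cost} node, "
              f"walk visits {walk_cost} nodes")
```

In this model the copy cost stays at one node while the walk cost grows
with the number of nodes in the tree, which matches the pattern in the
timings above (single file fast, subfolder slower, trunk slowest).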

So my hope rests with Philip and others who might have the necessary
knowledge to fix this in mod_dav. It's really not acceptable that
branching / tagging (or I'm guessing also: moving a large tree with a
server-side move) takes several minutes.

[1] http://subversion.tigris.org/issues/show_bug.cgi?id=4531
[2] http://subversion.tigris.org/issues/show_bug.cgi?id=4531#desc12

-- 
Johan


RE: Branching slow 1.8.11 https

2015-03-28 Thread Bert Huijben


 -----Original Message-----
 From: Johan Corveleyn [mailto:jcor...@gmail.com]
 Sent: Friday, 27 March 2015 22:03
 To: users@subversion.apache.org
 Subject: Branching slow 1.8.11 https
 
 Does the following ring a bell for someone?
 
 Recently upgraded our server (on Solaris 10 SPARC) from 1.5.4 to
 1.8.11 (CollabNet package). Some time after that, we discovered that
 branching was very slow. I'm talking about pure server-side branching
 ('svn copy $URL/trunk $URL/branches/br1'). I'm testing with a 1.8.11
 client (tried both from same machine as the server, and from another
 machine on the LAN (100 Mbit)).
 
 - Branching trunk (containing many directories and files): 6-8 minutes
 - Branching a subfolder of trunk: 20-30 seconds (still very slow)
 - Branching a single file is fast (< 0.5s or so).
 
 So it seems the performance degrades depending on the depth or size of the
 tree.
 
 Now, it gets more interesting:
 - The resulting rev file on the server is always very small (as it
 should be, it contains only a lightweight 'copy' of the trunk node).
 - Our repos is currently served via https (Apache 2.2.29).
 - Branching with file:/// urls is fast (branching trunk takes 0.6s).
 - When starting an svnserve instance serving the same repository, and
 branching with svn:// urls, it's fast as well (also 0.6s).
 - We reproduced it on a copy of the production repo.
 - Experimenting with the test copy, we found that
 $repos/dav/activities.d contains ~2000 files. When we clear that
 directory, the branching times go down by more than half (~2 minutes
 for trunk, ~10s for subdir of trunk --- i.e. still slow, but it
 definitely has an impact).
 - With a 1.7 client connecting with neon, the problem is the same.
 - During the 'svn copy', an httpd child consumes a lot of cpu (around
 half a core).
 - There is no authz configured for this repo (SVNPathAuthz off).
 - Backend is still in 1.5 format (we have not run svnadmin upgrade
 yet, a dump+load is planned in a couple of weeks).
 
 So it seems clearly mod_dav_svn related (and not for instance related
 to the FSFS backend).
 
 I don't think we have anything special in our httpd config:
 [[[
<Location /test_svn>
   SVNInMemoryCacheSize 131072
   SVNCacheFullTexts on
   SVNCacheTextDeltas on
   SSLRequireSSL
   AuthName TEST Subversion Repository
   AuthType Basic
   AuthBasicProvider ldap
   AuthBasicAuthoritative off
   AuthLDAPURL ldap://redacted:389;
   AuthLDAPBindDN redacted
   AuthLDAPBindPassword redacted
   Require ldap-group redacted
   DAV svn
   SVNPath /path/to/test_repos
   SVNPathAuthz off
</Location>
 ]]]
 
 Any ideas?
 Why the cpu usage by the server, what's it doing?
 What is the dav/activities.d directory for? How come it contains so
 many files? Is it ok to purge the old files from that directory?

Httpd's mod_dav was updated in some recent version to do a full lock traversal 
on copies and moves. I think we already applied some optimizations, but the 
real fix would be that mod_dav shouldn't do this work (which our repos layer 
already does).

I'm not sure which release we applied the first set of optimizations.

Bert
 
 --
 Johan