Hi guys,

[root@repos ~]# rpm -qa | grep subversion
subversion-libs-1.14.5-3.el10.x86_64_v2
subversion-1.14.5-3.el10.x86_64_v2
subversion-tools-1.14.5-3.el10.x86_64_v2
[root@repos ~]#
[root@repos ~]#
[root@repos ~]# cat /etc/redhat-release
AlmaLinux release 10.1 (Heliotrope Lion)
[root@repos ~]#
[root@repos ~]# rpm -qa | grep svn
mod_dav_svn-1.14.5-3.el10.x86_64_v2


I think I have hit a bug.  We have a relatively large SVN repository,
about 2 TB uncompressed.  Two weeks ago, someone noticed that we could no
longer commit.  When we looked at it, it looked like a permission issue.

This is the error we are getting from the SVN client.

svn: E000001: Commit failed (details follow):
svn: E000001: Can't set permissions on
'/var/repos/svn/application/db/revs/270'
william@william-ryzen:~/Documents/application$

So we first attempted to "fix" the permissions.  That didn't work.  Then,
while reading up on it, I noticed that opening up the permissions is not a
good idea, as I can see from strace that svn itself tries to narrow down
the file permissions.  So we decided to do a fresh restore to undo all the
chmod changes we had made to the files and directories.

This time, when restoring, we created the repo using an apache account.  We
also loaded the dump file using the svn account.  This ruled out any
possibility that this is a file permission issue.  The repo was restored
successfully, BUT the problem remained.
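
For anyone who wants to double-check the ownership claim on a similar
setup, here is a minimal sketch.  The helper name find_foreign_entries is
made up for illustration, and the path and account in the usage comment
are assumptions based on the error message above, not something taken from
a real run.

```python
import os


def find_foreign_entries(root, expected_uid):
    """Return every directory and file under root not owned by expected_uid."""
    mismatched = []
    for dirpath, _dirnames, filenames in os.walk(root):
        # Check the directory itself plus the files directly inside it.
        candidates = [dirpath] + [os.path.join(dirpath, f) for f in filenames]
        for path in candidates:
            if os.lstat(path).st_uid != expected_uid:
                mismatched.append(path)
    return sorted(mismatched)


# Hypothetical usage (account name and path are assumptions):
#   import pwd
#   uid = pwd.getpwnam("apache").pw_uid
#   print(find_foreign_entries("/var/repos/svn/application/db", uid))
```

An empty list means every entry under the given root is owned by the
expected account.  It says nothing about mode bits, ACLs or other access
controls.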

We also looked at the file limit and, to make sure that wasn't causing any
issues, changed it to unlimited for the apache user.  The problem
remained.
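
A limit changed for a user account may not be the one the running server
process actually has, so it can be worth reading the effective values from
/proc/<pid>/limits on Linux.  The sketch below parses that file's text;
the function name parse_proc_limits and the sample values are illustrative
assumptions, not output from this server.

```python
import re


def parse_proc_limits(text):
    """Map each limit name to its (soft, hard) values, kept as strings."""
    limits = {}
    for line in text.splitlines()[1:]:  # the first line is the column header
        # Columns are separated by runs of two or more spaces.
        fields = re.split(r"\s{2,}", line.strip())
        if len(fields) < 3:
            continue
        limits[fields[0]] = (fields[1], fields[2])
    return limits


# Made-up sample in the layout of /proc/<pid>/limits:
SAMPLE = (
    "Limit                     Soft Limit           Hard Limit           Units\n"
    "Max cpu time              unlimited            unlimited            seconds\n"
    "Max open files            1024                 524288               files\n"
)
print(parse_proc_limits(SAMPLE)["Max open files"])
```

Against a live system, the text would come from reading
/proc/<pid>/limits for the pid of the server process.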

It was at this point that we used strace to see what svn was doing.
Oddly, it seemed to be looking for files in directory 270, which didn't
exist at the time.  We came to that conclusion because the following calls
showed up whenever someone attempted to commit.

openat(AT_FDCWD, "/var/repos/svn/projects/db/revs/270/270336",
O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/var/repos/svn/projects/db/revs/270/270336",
O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/var/repos/svn/projects/db/revs/270/270080",
O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/var/repos/svn/projects/db/revs/270/270080",
O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/var/repos/svn/projects/db/revs/270/270016",
O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/var/repos/svn/projects/db/revs/270/270016",
O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/var/repos/svn/projects/db/revs/270/270000",
O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/var/repos/svn/projects/db/revs/270/270000",
O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
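
To make the directory numbering in these paths concrete, here is a minimal
sketch of how a sharded FSFS layout maps a revision to a directory.  It
assumes 1000 revisions per shard, which is the usual FSFS default; the
real value for a given repository is recorded in its db/format file.  The
function names are made up for illustration, and packed shards are not
covered.

```python
REVS_PER_SHARD = 1000  # assumption: FSFS default, not read from this repo


def shard_of(revision, revs_per_shard=REVS_PER_SHARD):
    """Shard directory number that holds the given revision."""
    return revision // revs_per_shard


def starts_new_shard(revision, revs_per_shard=REVS_PER_SHARD):
    """True if committing this revision needs a new shard directory."""
    return revision % revs_per_shard == 0


def rev_path(repo_root, revision, revs_per_shard=REVS_PER_SHARD):
    """Path of an unpacked revision file under the repository root."""
    shard = shard_of(revision, revs_per_shard)
    return f"{repo_root}/db/revs/{shard}/{revision}"


print(shard_of(269999), shard_of(270000), starts_new_shard(270000))
print(rev_path("/var/repos/svn/projects", 270336))
```

Under that assumption, r269999 is the last revision in shard 269 and
r270000 is the first one that needs the directory revs/270 to be created.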

Note that the directory 270 didn't exist.  According to Google, this can
happen when either a hook or CI commits too often and creates corruption.
We can't seem to clear the corruption though.  We are also a bit skeptical
that it is corruption at all, as it should have been cleared on a freshly
restored repository.


So over the weekend, we attempted to split the projects in the repo into
individual repos.  This was to reduce the size of the repo, in case that
was causing the issue.  We kept the revision numbers, however, as they are
referenced in documentation and tags.  Yet on the now much smaller repo,
only 187 GB in size but with the last successful commit still # 269999, we
still see the problem.

This now looks more like a software bug than a setup issue.  We have run
"yum update" on the system to make sure we have all the available bug
fixes, but nothing has helped.

Would anyone know if there is a transaction limit in the current SVN code?
Does anyone out there have a repository with more than 270000 revisions?
Any pointers to something we may have overlooked so far?  One more thing:
the svn dump comes out at a clean 2 TB, so is there any chance there is a
limit on how big a repo can grow?  This was our first thought, but since we
have created smaller repos and the issue remains, it now looks far more
like a sharding logic error to us.
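
Since the suspicion is about sharding, the shard size in effect can be
confirmed from the repository's db/format file.  Below is a minimal sketch
of a parser for it; the function name parse_fsfs_format is made up, and
the sample text is an assumption about what a typical FSFS format file
looks like, not the contents of this repository's file.

```python
def parse_fsfs_format(text):
    """Extract the format number and revision layout from db/format text."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    info = {"format": int(lines[0]), "layout": "linear", "shard_size": None}
    for line in lines[1:]:
        parts = line.split()
        if parts[0] == "layout":
            info["layout"] = parts[1]
            # A sharded layout carries the number of revisions per shard.
            if parts[1] == "sharded" and len(parts) > 2:
                info["shard_size"] = int(parts[2])
    return info


# Assumed sample contents, for illustration only:
SAMPLE_FORMAT = "8\nlayout sharded 1000\naddressing logical\n"
print(parse_fsfs_format(SAMPLE_FORMAT))
```

If the shard size comes back as 1000, the first commit after # 269999 is
exactly the one that has to create a new shard directory.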


Regards,
William
