> Hang on. In your previous reply you wrote that commits that "complete
> with a post-commit error" was a problem the patch was designed to
> solve, but now you say that's not an immediate concern. Which is it?
> We can't design a solution if it's not clear what set of problems the
> solution is supposed to solve.
I don't think there is a contradiction. What I was trying to say is:

- The problem is that rep-cache verification leads to errors for other
  writes to the rep-cache. For example, this happens during commit
  post-processing.

- One visible consequence of this problem is that commits can hang.

- Another visible consequence is that commits complete with post-commit
  errors.

- Another visible consequence is that new data will not be deduplicated.

- I think the root cause of the problem is that the entire rep-cache
  verification runs while holding a SQLite lock for a long time. (See
  the P.S. below for a sketch of reading in small batches instead.)

- I don't think there are issues with the existing way of reporting
  post-commit errors to clients, and I don't think that changing it
  would be relevant to the problem.

> > For example, 'svnadmin verify' can be checking rep-cache for a long
> > time and 'svnadmin build-repcache' will keep completing with an
> > error in this case. All commits during this time will work without
> > deduplication for new data.
>
> I take it that in your use-case it's important that a commit's
> rep-cache rows should be added as soon after the commit as possible,
> then?

I think it is important that entries don't get lost and that
deduplication doesn't get worse. The rep-cache verification can take a
long time for real-world repositories, for example an hour or more. All
commits during this time will work without deduplication for new data,
and I think that currently there is no way to fix that. In the case of
regular verifications, this can be a significant issue.

> Yes, and it's up to the repository administrator to provide such
> guarantees when scheduling «verify» and «build-repcache» operations on
> their repository.

I think that in general it may be difficult to provide such guarantees.
For example, verification can be performed through the API or by
third-party tools. It would also mean that commits have to be
prohibited during verification. That limitation seems too strong, so I
think it would be better to fix the underlying problem of holding a
SQLite lock during the entire verification.

> For example, how about sharding rep-cache.db to multiple separate
> databases, according to the most significant nibbles of the hash value?
> This way, instead of having one big database, we'd have 2**(4N) smaller
> databases.

If I am not mistaken, the size of each such shard still grows linearly
with the total number of entries, since sharding only divides it by the
constant factor 2**(4N). With the patch, by contrast, the amount of data
read under a single lock is O(1). So the problem can still occur once
the rep-cache grows large enough.

> I understand this argument. However, at the end of the day, the patch
> makes «verify» read a database as it's being written to, which is more
> likely to run into bugs (either in our code or in SQLite's), so I'm not
> a fan of it.
>
> Maybe I'm being overly careful.

I think that concurrent reading and writing of rep-cache.db is already
possible in the case of concurrent commits. It is a bit unclear to me
why the (optional) verification has to be different from a commit in
this respect, especially considering that the current approach leads to
the problem described above.

> P.S. How about creating a temporary copy of rep-cache.db, and then
> have «verify» iterate that at its leisure and delete it when done with
> it?

I think that, compared to the patch, this approach has the problem that
every verify leads to a lot of write operations. For example, the size
of rep-cache.db in the ASF repository is 1.1 GB.

Just in case: I will be happy to work on any specific parts of the
patch if there are things you think can be improved.

Regards,
Denis Kovalchuk
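P.S. To make the "batched reading" idea concrete, here is a minimal
sketch in Python (the patch itself is C, so this is only an
illustration, not the patch's code). The table and column names follow
rep-cache.db's schema; BATCH_SIZE, verify_rep_cache(), and
check_entry() are made-up names for the example.

[[[
import sqlite3

BATCH_SIZE = 1000  # illustrative; the actual patch may use another size

def verify_rep_cache(db_path, check_entry):
    """Iterate over rep_cache in small keyset-paginated batches.

    Each SELECT is a short-lived read, so no SQLite lock is held for
    the duration of the whole verification; concurrent writers (commit
    post-processing, 'svnadmin build-repcache') can proceed between
    batches.
    """
    conn = sqlite3.connect(db_path)
    last_hash = ''  # 'hash' is the primary key, so we can resume after it
    try:
        while True:
            rows = conn.execute(
                "SELECT hash, revision, offset, size, expanded_size "
                "FROM rep_cache WHERE hash > ? ORDER BY hash LIMIT ?",
                (last_hash, BATCH_SIZE)).fetchall()
            if not rows:
                break
            for row in rows:
                check_entry(row)  # caller-supplied verification callback
            last_hash = rows[-1][0]  # resume after the last hash seen
    finally:
        conn.close()
]]]

Note that, at least in this sketch, correctness does not depend on
seeing a frozen snapshot: an entry added concurrently is either picked
up when the iteration reaches its position or skipped for this run, and
in both cases each batch holds the shared lock only for one short
query.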