Re: svn commit: r1915519 [1/4] - in /subversion/branches/pristine-checksum-salt: ./ build/ build/ac-macros/ build/generator/ build/generator/swig/ contrib/client-side/svn_load_dirs/ contrib/hook-scrip
Daniel Sahlberg writes: > Log: > On branch pristine-checksum-salt: > > Catchup merge with trunk

`pristine-checksum-salt` is branched off from branches/pristine-checksum-kind (rather than from trunk), so in this case it would probably be better to merge the trunk changes into the parent branch first, and propagate them into the `pristine-checksum-salt` branch from the parent. Thanks, Evgeny Kotkov
Re: Changing the permission checks in libsvn_subr/io.c
Daniel Sahlberg writes: >> Index: subversion/libsvn_subr/io.c >> === >> --- subversion/libsvn_subr/io.c (revision 1915064) >> +++ subversion/libsvn_subr/io.c (working copy) >> @@ -2535,7 +2535,14 @@ svn_io__is_finfo_read_only(svn_boolean_t *read_onl >> apr_uid_t uid; >> apr_gid_t gid; >> >> - *read_only = FALSE; >> + *read_only = (access(file_info->fname, W_OK) != 0);

This approach introduces an I/O syscall (one that checks access to a file by its name) into a fairly low-level function. Previously, this function only analyzed the file information from the passed-in apr_finfo_t structure. The new version, however, performs an I/O call, and that leads to a couple of problems:

1) A potential performance regression, not limited to just the revert. For example, all calls to open_pack_or_rev_file() in libsvn_fs_fs would now start to make additional access() calls, thus slowing down the repository operations.

2) An opportunity for a desynchronization/race between the information in apr_finfo_t and the result of access(), because access() works by filename. For example, if the file is recreated between the two calls, the result of access() will correspond to a completely different file than the one that produced the apr_finfo_t.

Overall, I have a feeling that solving this specific edge case might not be worth introducing these regressions (which look significant, at least to me). Thanks, Evgeny Kotkov
Re: Switching from SHA1 to a checksum type without known collisions in 1.15 working copy format
Daniel Sahlberg writes: > As far as I understand, the point of multi-hash is to keep the WC format > between versions (so older clients can continue to use the WC). Just as a minor note, the working copies created using the implementation on the `pristine-checksum-salt` branch don't multi-hash the contents, but rather make the [single] used checksum kind configurable and persist it at the moment when a working copy is created or upgraded. > I need some help to understand how that would work in practice. Let's say > that 1.15 adds SHAABC, 1.16 adds SHAXYZ. Then 1.17 drops SHA1. But... > - A 1.17 client will only use SHAABC or SHAXYZ hashes. > - A 1.16 client can use SHA1, SHAABC and SHAXYZ hashes. > - A 1.15 client can only use SHA1 and SHAABC hashes. > > How can these work together? A WC created in 1.17 can't be used by a 1.15 > client and a WC created in 1.15 (with SHA1) can't be used by a 1.17 client. > How is this different from bumping the format? How do we detect this? In the current design available on the `pristine-checksum-salt` branch, the supported checksum kinds are tied to a working copy format, and any supported checksum kind may additionally use a dynamic salt. For example, format 33 supports only SHA-1 (regular or dynamically salted), but a newer format 34 can add support for another checksum kind such as SHA-2 if necessary. When an existing working copy is upgraded to a newer format, its current checksum kind is retained as is (we can't rehash the content in a `--store-pristine=no` case because the pristines are not available). I don't know if we'll find ourselves having to forcefully phase out SHA-1 *even* for such working copies that retain an older checksum kind, i.e., it might be enough to use the new checksum kind only for freshly created working copies. 
However, there would be a few options to consider: milder options could include warning the user to check out a new working copy (one that would use a different checksum kind), and a harsher option could be adding a new format that doesn't support SHA-1 under any circumstances, and declaring all previously available working copy formats unsupported. Regards, Evgeny Kotkov
Re: Getting to first release of pristines-on-demand feature (#525).
Evgeny Kotkov writes: > Merged in https://svn.apache.org/r1905955 > > I'm going to respond on the topic of SHA1 a bit later.

For the history: thread [1] proposes the `pristine-checksum-salt` branch that adds the infrastructure to support new pristine checksum kinds in the working copy and makes a switch to the dynamically-salted SHA1. From the technical standpoint, I think that it would be better to release the first version of the pristines-on-demand feature having this branch merged, because now we rely on the checksum comparison to determine if a file has changed — and currently it's a checksum kind with known collisions. At the same time, having that branch merged probably isn't a formal release blocker for the pristines-on-demand feature. Also, considering that the `pristine-checksum-salt` branch is currently vetoed by danielsh (presumably, for an indefinite period of time), I'd like to note that personally I have no objections to proceeding with a release of the pristines-on-demand feature without this branch. [1] https://lists.apache.org/thread/xmd7x6bx2mrrbw7k5jr1tdmhhrlr9ljc Regards, Evgeny Kotkov
Re: Switching from SHA1 to a checksum type without known collisions in 1.15 working copy format
Daniel Shahaf writes: > Procedurally, the long hiatus is counterproductive. This reminds me that the substantive discussion of your veto ended with my email from 8 Feb 2023 that had four direct questions to you and was left without an answer: `` > That's not how design discussions work. A design discussion doesn't go > "state decision; state pros; implement"; it goes "state problem; discuss > potential solutions, pros, cons; decide; implement" (cf. [4, 5, 6]). Well, I think it may not be as simple as it seems to you. Who decided that we should follow the process you're describing? Is there a thread with a consensus on this topic? Or do you insist on using this specific process because it's the only process that seems obvious to you? What alternatives to it have been considered? As far as I can tell, the process you're suggesting is effectively a waterfall-like process, and there are quite a lot of concerns about its effectiveness, because the decisions have to be made in the conditions of a lack of information. `` It's been more than 11 months since that email, and those questions still don't have an answer. So if we are to resume this discussion, let's do it from the proper point. > You guys are welcome to try to /convince/ me to change my opinion, or to > have the veto invalidated. In either case, you will be more likely to > succeed should your arguments relate not only to the veto's implications > but also to its /sine qua non/ component: its rationale. Just in case, my personal opinion here is that the veto is invalid. Firstly, based on my understanding, the ASF rules prohibit casting a veto without an appropriate technical justification (see [1], which I personally agree with). Secondly, it seems that the process you are imposing hasn't been accepted in this community. 
As far as I know, this topic was tangentially discussed before (see [2], for example), and it looks like there hasn't been a consensus to change our current Commit-Then-Review process into some sort of Review-Then-Commit. (At the same time I won't even try to /convince/ you, sorry.) [1] https://www.apache.org/foundation/voting.html [2] https://lists.apache.org/thread/ow2x68g2k4lv2ycr81d14p8r8w2jj1xl Regards, Evgeny Kotkov
Re: E160056 Item index too large in revision
Daniel Sahlberg writes: > In the TortoiseSVN-dev group it is claimed to be because of APR 1.7.3 and > that reverting back to 1.7.2 restores normal operation. I haven't yet > verified this, nor tried to figure out which of the changes between 1.7.2 > and 1.7.3 may be responsible for this issue. As I know some of the > APR committers are also here, I'm hoping for some help.

The problem seems to be caused by a change in APR 1.7.3, authored by me in an attempt to fix wrong file offsets reported after opening files on Windows and Unix: *) Don't seek to the end when opening files with APR_FOPEN_APPEND on Windows. Unfortunately, I missed a case where flushing an APR_APPEND | APR_BUFFERED file was implicitly relying on that specific file offset, and that is what caused the regression you're seeing. I have now reverted this change in https://svn.apache.org/r1909088 and https://svn.apache.org/r1909089, so hopefully this regression will be fixed in the next APR 1.7.x patch release. Thanks, Evgeny Kotkov
Re: Switching from SHA1 to a checksum type without known collisions in 1.15 working copy format
Daniel Shahaf writes: > What's the question or action item to/for me? Thanks. I'm afraid I don't fully understand your question. As you probably remember, the change is blocked by your veto. To my knowledge, this veto hasn't been revoked as of now, and I simply mentioned that in my email. It is entirely your decision whether or not to take any action regarding this matter. Thanks, Evgeny Kotkov
Re: [PROPOSAL] Allow plaintext passwords again.
Nathan Hartman writes: > I think a good middle ground is: > > * Build with --enable-plaintext-password-storage by default; users who > want to harden their system can do so, but will need to build their > own client. +1. > * Set the default run-time config to store-plaintext-passwords = no > (if it isn't already; haven't checked) and instruct users on how to > change it. This makes the decision to store in plaintext an explicit > one that the user can opt into. (I appreciate that this could be > changed without the user's knowledge; perhaps the systemwide config > should always take precedence over the user-controlled one for this > setting?)

So, apparently, the current default is "ask". I haven't checked all the details, but I think that defaulting to "ask" already makes the user's decision explicit and allows it to happen naturally, without requiring any additional instructions or knowledge. If we change the default to "no", this part of the experience could become worse, because to end users it might look like the credentials aren't being stored for unknown reasons, or like a bug in the software. Thanks, Evgeny Kotkov
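For reference, the knob being discussed lives in the Subversion runtime configuration (the servers file). A sketch of the relevant section, with the option values as I recall them; double-check against the commented template your client generates:

```ini
# ~/.subversion/servers (sketch; see the generated file for the
# authoritative option names and commentary)
[global]
# Controls caching of passwords on disk in unencrypted form:
#   yes - always store plaintext passwords when no encrypted store exists
#   no  - never store passwords in plaintext
#   ask - prompt the user each time before storing in plaintext
store-plaintext-passwords = ask
```

With "ask", the explicit opt-in happens at the prompt itself, which is the behavior the paragraph above argues should stay the default.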
Re: Switching from SHA1 to a checksum type without known collisions in 1.15 working copy format
Evgeny Kotkov writes: > > Now, how hard would this be to actually implement? > > To have a more or less accurate estimate, I went ahead and prepared the > first-cut implementation of an approach that makes the pristine checksum > kind configurable in a working copy. > > The current implementation passes all tests in my environment and seems to > work in practice. It is available on the branch: > > https://svn.apache.org/repos/asf/subversion/branches/pristine-checksum-kind > > The implementation on the branch allows creating working copies that use a > checksum kind other than SHA-1.

I extended the current implementation to use a dynamically salted SHA-1 checksum, rather than a SHA-1 with a statically hardcoded salt. The dynamic salt is generated during the creation of a wc.db. The implementation is available on a separate branch: https://svn.apache.org/repos/asf/subversion/branches/pristine-checksum-salt

The change is a bit massive, but I think that it should solve the potential problem without any practical drawbacks, except for the lack of the mentioned ra_serf fetch optimization. So overall I'd propose to bring this change to trunk, to improve the current state around checksum collisions in the working copy, and to also have the infrastructure for supporting different checksum kinds in place, in case we need it in the future. This change is still being blocked by a veto, but if danielsh changes his mind and there are no other objections, I'm ready to complete the few remaining bits and merge it to trunk. Thanks, Evgeny Kotkov
Re: svn commit: r1907965 - in /subversion/trunk/subversion: include/ include/private/ libsvn_client/ libsvn_wc/ svn/ tests/cmdline/
Jun Omae writes: > After r1907965, build on Windows is failing with the following errors. > > [[[ > svn_client-1.lib(upgrade.obj) : error LNK2001: unresolved external symbol > svn_wc__version_string_from_format > [D:\a\subversion\subversion\build\win32\vcnet-vcproj\libsvn_client_dll.vcxproj] > D:\a\subversion\subversion\Release\subversion\libsvn_client\libsvn_client-1.dll > : fatal error LNK1120: 1 unresolved externals > [D:\a\subversion\subversion\build\win32\vcnet-vcproj\libsvn_client_dll.vcxproj] > ]]] Thanks, should be fixed in r1908042. Regards, Evgeny Kotkov
Re: Initial patch for storing the pristines-on-demand setting in the working copy (Issue 4889)
Evgeny Kotkov writes: > While working on the patch, I have stumbled across a couple of issues: > > A) `svn upgrade` without arguments fails for a working copy with latest format > > $ svn checkout --compatible-version=1.15 wc > $ svn upgrade wc > $ svn: E155021: Working copy '…' is already at version 1.15 (format 32) > and cannot be downgraded to version 1.8 (format 31) While the work on pristine checksum kinds is blocked by a veto, I took a look at other things we'd probably want to handle before the release. I think that the above case qualifies as such, so I made a related improvement by introducing a new config option and also fixed the described error: - r1907964 introduces a new `compatible-version` config setting, to allow configuring the desired default wc compatibility level globally. - r1907965 fixes an error when `svn upgrade` is called without any arguments for a working copy of the newer format. Thanks, Evgeny Kotkov
Re: Switching from SHA1 to a checksum type without known collisions in 1.15 working copy format
Daniel Shahaf writes: > Look, it's pretty simple. You said "We should do Y because it > addresses X". You didn't explain why X needs to be addressed, didn't > consider what alternatives there are to Y, didn't consider any cons that > Y may have… and when people had questions, you just began to > implement Y, without responding to or even acknowledging those > questions. > > That's not how design discussions work. A design discussion doesn't go > "state decision; state pros; implement"; it goes "state problem; discuss > potential solutions, pros, cons; decide; implement" (cf. [4, 5, 6]).

Well, I think it may not be as simple as it seems to you. Who decided that we should follow the process you're describing? Is there a thread with a consensus on this topic? Or do you insist on using this specific process because it's the only process that seems obvious to you? What alternatives to it have been considered?

As far as I can tell, the process you're suggesting is effectively a waterfall-like process, and there are quite a lot of concerns about its effectiveness, because decisions have to be made while lacking information. Personally, I prefer an alternative process that starts from finding out all available bits of information, which are then used to make informed decisions. The unfortunate reality, however, is that the only guaranteed way of collecting all information is to implement all (or almost all) significant parts in code. Roughly speaking, this process looks like a research project that gets completed by trial and error.

Based on what you've been saying so far, I wouldn't be surprised if you disagree. But I still think that forcing others to follow a certain process by such means as vetoing a code change is maybe a bit over the top. (In the meantime, I certainly won't object if you're going to use this waterfall-like process for the changes that you implement yourself.) Regards, Evgeny Kotkov
Re: Switching from SHA1 to a checksum type without known collisions in 1.15 working copy format
Daniel Shahaf writes: > > (I'm not saying that the above rules have to be used in this particular case > > and that a veto is invalid, but still thought it’s worth mentioning.) > > > > I vetoed the change because it hadn't been designed on the dev@ list, > had not garnered dev@'s consensus, and was being railroaded through. > (as far as I could tell)

I have *absolutely* no idea where "being railroaded through" comes from. Really, it's the wrong way of portraying and thinking about the events that have happened so far. Reiterating over those events: I wrote an email containing my thoughts and explaining the motivation for such a change. I didn't reply to some of the questions (including some tricky questions, such as the one featuring a theoretical hash function), because they had been at least partly answered by others in the thread, and I didn't have anything valuable to add at that time. During that time, I was actively coding the core part of the change, to check whether it was technically possible. Which is important, as far as I believe, because not all theoretically possible solutions can be implemented without facing significant practical or implementation-related issues, and it seems to me that you significantly undervalue such an approach.

I'm not saying my actions were exemplary, but as far as I can tell, they're pretty much in line with how svn-dev has been operating so far. But it all resulted in an unclear veto without any _technical_ arguments, where what's being vetoed is unclear as well, because the change was not ready at the moment the veto was cast. And because your veto goes in favor of a specific process (considering that no other arguments were given), the only thing that's *actually* being railroaded is an odd form of an RTC (review-then-commit) process that is against our usual CTR (commit-then-review) [1,2]. That's railroading, because it hasn't been explicitly discussed anywhere and a consensus on it has not been reached. 
[1] https://www.apache.org/foundation/glossary.html#CommitThenReview [2] https://www.apache.org/foundation/glossary.html#ReviewThenCommit Regards, Evgeny Kotkov
Re: Switching from SHA1 to a checksum type without known collisions in 1.15 working copy format
Daniel Shahaf writes: > > That could happen after a public disclosure of a pair of executable > > files/scripts where the forged version allows for remote code execution. > > Or maybe something similar with a file format that is often stored in > > repositories and that can be executed or used by a build script, etc. > > > > Err, hang on. Your reference described a chosen-prefix attack, while > this scenario concerns a single public collision. These are two > different things.

A chosen-prefix attack allows finding more meaningful collisions such as working executables/scripts. When such collisions are made public, they would have a greater exploitation potential than just a random collision.

> Disclosure of a pair of executable files/scripts isn't by itself > a problem unless one of the pair ("file A") is in a repository > somewhere. Now, was the colliding file ("file B") generated _before_ or > _after_ file A was committed? > > - If _before_, then it would seem Mallory had somehow managed to: > > 1. get a file of his choosing committed to Alice's repository; and > > 2. get a wc of Alice's repository into one of the codepaths that > assume SHA-1 is one-to-one / collission-free (currently that's the > ra_serf optimization and the 1.15 wc status).

Not only those. There are cases where the working copy itself installs the working file with a hash lookup in the pristine store. This is more true for 1.14 than trunk, because in trunk we have the streamy checkout/update code paths that avoid such lookups by writing straight to the working file. However, some of the code paths still install the contents from the pristine store by hash. Examples include reverting a file, copying an unmodified file, switching a file with keywords, the mentioned ra_serf optimization, etc.

> Now, step #1 seems plausible enough. 
As to step #2, it's not clear to > me how file B would reach the wc in step #2… If Mallory has write access, she could commit both files, thus arranging for a possible content change if both files are checked out to a single working copy. This isn't the same as just directly modifying the target file, because file content isn't expected to change due to changes in other files (that can be of any type), so this attack has much better chances of being unnoticed. If Mallory doesn't have write access, there should be other vectors, such as distributing a pair of files (harmless in the context of their respective file formats) separately via two upstream channels. Then, if both of the upstream distributions are committed into a repository and their files are checked out together, the content will change, allowing for a malicious action. Regards, Evgeny Kotkov
Re: Switching from SHA1 to a checksum type without known collisions in 1.15 working copy format
Daniel Shahaf writes: > > I can complete the work on this branch and bring it to a production-ready > > state, assuming there are no objections. > > Your assumption is counterfactual: > > https://mail-archives.apache.org/mod_mbox/subversion-dev/202301.mbox/%3C20230119152001.GA27446%40tarpaulin.shahaf.local2%3E > > https://mail-archives.apache.org/mod_mbox/subversion-dev/202212.mbox/%3CCAMHy98NqYBLZaTL5-FAbf24RR6bagPN1npC5gsZenewZb0-EuQ%40mail.gmail.com%3E

I don't see any explicit objections in these two emails (here I assume that if something is not clear to a PMC member, it doesn't automatically become an objection). If the "why?" question is indeed an objection, then I would say it has already been discussed and responded to in the thread.

Now, returning to the problem: As described in the advisory [1], we have a supported configuration that makes data forgery possible:

- A repository with disabled rep-sharing allows storing different files with colliding SHA-1 values.
- Having a repository with disabled rep-sharing is a supported configuration. There may be a certain number of such repositories in the wild (for example, created with SVN < 1.6 and not upgraded afterwards).
- A working copy assumes that the pristine contents are equal if their SHA-1 hashes are equal.
- So committing different files with colliding SHA-1 values makes it possible to forge the contents of a file that will be checked out and used by the client.

I would say that this state is worrying just by itself. However, with the feasibility of chosen-prefix attacks on SHA-1 [2], it's probably only a matter of time until the situation becomes worse. That could happen after a public disclosure of a pair of executable files/scripts where the forged version allows for remote code execution. Or maybe something similar with a file format that is often stored in repositories and that can be executed or used by a build script, etc. 
[1] https://subversion.apache.org/security/sha1-advisory.txt [2] https://sha-mbles.github.io/

Speaking of the proposed switch to SHA-256 or a different checksum, there's an argument by contradiction: if we were designing the pristineless working copy from scratch today, would we choose SHA-1 as the best available hash that can be used to assert content equality? If yes, how can one prove that?

> Objections have been raised, been left unanswered, and now implementation > work has commenced following the original design. That's not acceptable. > I'm vetoing the change until a non-rubber-stamp design discussion has > been completed on the public dev@ list.

I would like to note that vetoing a code modification should be accompanied by a technical justification, and I have certain doubts that the above arguments qualify as such: https://www.apache.org/foundation/voting.html [[[ To prevent vetoes from being used capriciously, the voter must provide with the veto a technical justification showing why the change is bad (opens a security exposure, negatively affects performance, etc.). A veto without a justification is invalid and has no weight. ]]] (I'm not saying that the above rules have to be used in this particular case and that the veto is invalid, but still thought it’s worth mentioning.)

Anyway, I'll stop working on the branch, because a veto has been cast. Regards, Evgeny Kotkov
Re: Switching from SHA1 to a checksum type without known collisions in 1.15 working copy format
Karl Fogel writes: > Now, how hard would this be to actually implement?

To have a more or less accurate estimate, I went ahead and prepared the first-cut implementation of an approach that makes the pristine checksum kind configurable in a working copy. The current implementation passes all tests in my environment and seems to work in practice. It is available on the branch: https://svn.apache.org/repos/asf/subversion/branches/pristine-checksum-kind

The implementation on the branch allows creating working copies that use a checksum kind other than SHA-1. The checksum kind is persisted in the settings table. Upgraded working copies of the older formats will have SHA-1 recorded as their pristine checksum kind and will continue to use it for compatibility. Newly created working copies of the latest format (with --compatible-version=1.15 or --store-pristine=no), as currently implemented, will use the new pristine checksum kind.

Currently, as a proof-of-concept, the branch uses salted SHA-1 as the new pristine checksum kind. For the production-ready state, I plan to support multiple new checksum types such as SHA-256. I think that it would be useful for future compatibility, because if we encounter any issues with one checksum kind, we could then switch to a different kind without having to change the working copy format.

One thing worth noting is that ra_serf contains a specific optimization for the skelta-style updates that allows skipping a GET request if the pristine store already contains an entry with the specified SHA-1 checksum. Switching to a different checksum type for the pristine entries is going to disable that specific optimization. Re-enabling it would require an update on the server side. I consider this to be out of scope for this branch.

I can complete the work on this branch and bring it to a production-ready state, assuming there are no objections. Thanks, Evgeny Kotkov
Re: Switching from SHA1 to a checksum type without known collisions in 1.15 working copy format
Karl Fogel writes: > Now, how hard would this be to actually implement? I plan to take a more detailed look at that, but I'm currently on vacation for the New Year holidays. Thanks, Evgeny Kotkov
Switching from SHA1 to a checksum type without known collisions in 1.15 working copy format (was: Re: Getting to first release of pristines-on-demand feature (#525).)
Karl Fogel writes: > > While here, I would like to raise a topic of incorporating a switch from > > SHA1 to a different checksum type (without known collisions) for the new > > working copy format. This topic is relevant to the pristines-on-demand > > branch, because the new "is the file modified?" check relies on the > > checksum comparison, instead of comparing the contents of working and > > pristine files. > > > > And so while I consider it to be out of the scope of the pristines-on- > > demand branch, I think that we might want to evaluate if this is something > > that should be a part of the next release. > > Good point. Maybe worth a new thread?

[Moving discussion to a new thread]

We currently have a problem that a working copy relies on the checksum type with known collisions (SHA1). A solution to that problem is to switch to a different checksum type without known collisions in one of the newer working copy formats. Since we plan on shipping a new working copy format in 1.15, this seems to be an appropriate moment to decide whether we'd also want to switch to a checksum type without known collisions in that new format.

Below are the arguments for including a switch to a different checksum type in the working copy format for 1.15:

1) Since the "is the file modified?" check now compares checksums, leaving everything as-is may be considered a regression, because it would introduce additional cases where a working copy currently relies on comparing checksums with known collisions.

2) We already need a working copy format bump for the pristines-on-demand feature. So using that format bump to solve the SHA1 issue might reduce the overall number of required bumps for users (assuming that we'll still need to switch from SHA1 at some point later).

3) While the pristines-on-demand feature is not released, upgrading with a switch to the new checksum type seems to be possible without requiring a network fetch. 
But if some of the pristines are optional, we lose the ability to rehash all contents in place. So we might find ourselves having to choose between two unattractive alternatives: either requiring a network fetch during upgrade, or prohibiting an upgrade of working copies with optional pristines entirely. Thoughts? Thanks, Evgeny Kotkov
Re: Getting to first release of pristines-on-demand feature (#525).
Evgeny Kotkov writes: > I think that the `pristines-on-demand-on-mwf` branch is now ready for a > merge to trunk. I could do that, assuming there are no objections. Merged in https://svn.apache.org/r1905955 I'm going to respond on the topic of SHA1 a bit later. Thanks, Evgeny Kotkov
Re: Getting to first release of pristines-on-demand feature (#525).
Evgeny Kotkov writes: > > IMHO, once the tests are ready, we could merge it and release > > it to the world. > > Apart from the required test changes, there are some technical > TODOs that remain from the initial patch and should be resolved. > I'll try to handle them as well. I think that the `pristines-on-demand-on-mwf` branch is now ready for a merge to trunk. I could do that, assuming there are no objections. https://svn.apache.org/repos/asf/subversion/branches/pristines-on-demand-on-mwf The branch includes the following: – Core implementation of the new mode where required pristines are fetched at the beginning of the operation. – A new --store-pristine=yes/no option for `svn checkout` that is persisted as a working copy setting. – An update for `svn info` to display the value of this new setting. – A standalone test harness that tests main operations in both --store-pristine modes and gets executed on every test run. – A new --store-pristine=yes/no option for the test suite that forces all tests to run with a specific pristine mode. The branch passes all tests in my Windows and Linux environments, in both --store-pristine=yes and =no modes. While here, I would like to raise a topic of incorporating a switch from SHA1 to a different checksum type (without known collisions) for the new working copy format. This topic is relevant to the pristines-on-demand branch, because the new "is the file modified?" check relies on the checksum comparison, instead of comparing the contents of working and pristine files. And so while I consider it to be out of the scope of the pristines-on-demand branch, I think that we might want to evaluate if this is something that should be a part of the next release. Thanks, Evgeny Kotkov
Re: svn commit: r1905663 - in /subversion/branches/pristines-on-demand-on-mwf/subversion: include/ include/private/ libsvn_client/ libsvn_wc/
Bert Huijben writes: > All the now deprecated functions now fail unconditionally when the setting > is enabled. Isn’t it possible to do this more gracefully whenever a file is > encountered which misses its pristine version?

The problem with this approach is that the functions are going to work, but only *sometimes*, and will fail unpredictably, depending on whether a pristine was fetched or removed by any previous API call. With that in mind, failing consistently seems to be a more appropriate choice for a deprecated function than failing randomly.

> As far as I know it is expected that some of the files do have pristines, > while others don’t… That would allow things like diffs on old clients that > didn’t switch apis yet.

Thinking about this, the currently documented assumption is that a file has no pristine if and only if it's being added/replaced without a copyfrom. So maybe we cannot really extend that to "any file might not have a pristine" without it being an incompatible change. Thanks, Evgeny Kotkov
Re: svn commit: r1905663 - in /subversion/branches/pristines-on-demand-on-mwf/subversion: include/ include/private/ libsvn_client/ libsvn_wc/
Daniel Sahlberg writes:

> > +  /** @since New in 1.15 */
> > +  SVN_ERRDEF(SVN_ERR_WC_DEPRECATED_API_STORE_PRISTINE,
> > +             SVN_ERR_WC_CATEGORY_START + 43,
> > +             "This client was not updated to support working copies "
> > +             "without local pristines")
> > +
> >    /* fs errors */
>
> Is it really "This client"? It looks more to be based on the WC setting.

This new error should only occur when someone (a third-party client) uses the deprecated API function for a working copy without pristines.  From that point of view, I would say that the problem is in the client that is still using the deprecated API and thus cannot properly handle such working copies.

Also, the current message aims to be somewhat consistent with another existing error for the case when a client is too old to read a newer working copy format:

  "This client is too old to work with the working copy […] (format 32)"

(After writing this, I realized that the "was not updated" part of the message may be misinterpreted as a requirement to update to a newer version of the software.  In r1905682, I slightly rephrased that part so that it now says: "This client uses a deprecated API that does not support working copies without local pristines".)

Thanks,
Evgeny Kotkov
Re: Getting to first release of pristines-on-demand feature (#525).
Karl Fogel writes:

> Thank you, Evgeny!  Just to make sure I understand correctly --
> the status now on the 'pristines-on-demand-on-mwf' branch is:
>
> 1) One can do 'svn checkout --store-pristines=no' to get an
>    entirely pristine-less working copy.  In that working copy,
>    individual files will be hydrated/dehydrated automagically on an
>    as-needed basis.
>
> 2) There is no command to hydrate or dehydrate a particular file.
>    Hydration and dehydration only happen as a side effect of other
>    regular Subversion operations.
>
> 3) There is no way to rehydrate the entire working copy.  E.g.,
>    something like 'svn update --store-pristines=yes' or 'svn hydrate
>    --depth=infinity' does not exist yet.
>
> 4) Likewise, there is no way to dehydrate an existing working copy
>    that currently has its pristines (even if that working copy is at
>    a high-enough version format to support pristinelessness).  E.g.,
>    something like 'svn update --store-pristines=no' or 'svn dehydrate
>    --depth=infinity' does not exist yet.
>
> Is that all correct?

Yes, I believe that is correct.

> By the way, I do not think (2), (3), and (4) are blockers.  Just
> (1) by itself is a huge step forward and solves issue #525;

+1 on keeping the scope of the feature to just (1) for now.

> IMHO, once the tests are ready, we could merge it and release
> it to the world.

Apart from the required test changes, there are some technical TODOs that remain from the initial patch and should be resolved.  I'll try to handle them as well.

Thanks,
Evgeny Kotkov
Re: Getting to first release of pristines-on-demand feature (#525).
Evgeny Kotkov writes:

> Perhaps we could transition into that state by committing the patch
> and maybe re-evaluate things from there.  I could do that, assuming
> no objections, of course.

Committed the patch in https://svn.apache.org/r1905324

I'll try to handle the related tasks in the near future.

Thanks,
Evgeny Kotkov
Re: Getting to first release of pristines-on-demand feature (#525).
Karl Fogel writes:

> By the way, in that thread, Evgeny Kotkov -- whose initial work
> much of this is based on -- follows up with a patch that does a
> first-pass implementation of 'svn checkout --store-pristines=no'
> (by implementing a new persistent setting in wc.db).

Perhaps we could transition into that state by committing the patch and maybe re-evaluate things from there.  I could do that, assuming no objections, of course.

Thanks,
Evgeny Kotkov