On 19.12.2014 13:23, Julian Foad wrote:
> I believe the following symmetries should be true, and testable, and we
> should test them.
>
> For any valid repository:
>
> * we can dump it
> * we can load the dump file into a new repository
> * the new repo is equivalent to the old repo
>
> For any valid dump file:
>
> * we can load it into a new repository
> * we can dump that repository
> * the new dump file is equivalent to the old dump file

I agree that this should be our goal. However, consider that some of
these symmetries depend on specific features of the repository
implementation.

For example, at some point you mentioned dump files with non-UTF-8
paths. Such dump files are clearly invalid, since we've maintained the
restriction that all strings used internally must be encoded in UTF-8.
So, such a dump file can only be the result of manual fiddling, or of a
bug in some version of some repository back-end implementation. A
different and/or fixed back-end will not accept non-UTF-8 paths at all;
thus, we cannot maintain this particular symmetry.

Conversely, if we decide that maintaining strict dump/load symmetry is
more important, we're (unnecessarily, IMO) complicating future
development (e.g., the idea that repos path lookup should preserve but
ignore differences in Unicode character representation).

I'm sure there are other cases where maintaining strict symmetry will
turn out to be too constraining. An example from your own bailiwick:
when we store mergeinfo in a more reasonable structure than a versioned
property, a load from an older dump file will most likely lose details
of exactly how the mergeinfo was represented, so a later dump may
produce svn:mergeinfo values that are different from, but semantically
equivalent to, the original.

Clearly, dump/load symmetry can be preserved even in the cases I
mentioned, at the cost of maintaining more complex metadata (and
related code) in the repository back-end. The question we have to
answer is: what's the point, as long as semantics are not affected?

-- Brane
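
To make "different but semantically equivalent" concrete, here is a minimal Python sketch of the kind of comparison a round-trip test might use in place of a byte-for-byte diff. It is not Subversion code: the function names (parse_mergeinfo, mergeinfo_equivalent, paths_equivalent) are hypothetical, the parser assumes only the basic svn:mergeinfo syntax (one "path:revisions" entry per line, ranges written as N-M, a trailing "*" marking a non-inheritable range), and the choice of NFC for path comparison is an assumption made purely for illustration.

```python
import unicodedata


def parse_mergeinfo(value):
    """Parse an svn:mergeinfo value into {path: {(revision, inheritable)}}.

    Hypothetical helper: expanding every range into single revisions
    discards exactly the representation details (entry order, how ranges
    are split) that a dump/load cycle might not preserve.
    """
    result = {}
    for line in value.splitlines():
        line = line.strip()
        if not line:
            continue
        # A merge source path may itself contain ':', so split on the last one.
        path, _, revlist = line.rpartition(":")
        revs = result.setdefault(path, set())
        for item in revlist.split(","):
            item = item.strip()
            inheritable = not item.endswith("*")
            item = item.rstrip("*")
            start, _, end = item.partition("-")
            end = end or start  # a single revision is a range of length one
            for rev in range(int(start), int(end) + 1):
                revs.add((rev, inheritable))
    return result


def mergeinfo_equivalent(a, b):
    """True if two svn:mergeinfo values describe the same merged revisions."""
    return parse_mergeinfo(a) == parse_mergeinfo(b)


def paths_equivalent(a, b):
    """True if two paths differ only in Unicode representation.

    Assumption: NFC is used as the common form for the comparison.
    """
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)
```

With a comparison like this, "/trunk:1-3,4-5" and "/trunk:1-5" count as the same mergeinfo, while "/trunk:1-5" and "/trunk:1-5*" do not, because inheritability is a semantic difference. Likewise, a path written with a precomposed "é" would match the same path written as "e" followed by a combining acute accent.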