On Wed, 2010-03-31, Stefan Sperling wrote: > On Wed, Mar 31, 2010 at 01:44:23PM +0100, Julian Foad wrote: > > The Source of the Incoming Change > > ================================= > > > > I suggest something like (? means an optional field): > > > > * A diff between nodes that have a location in a repository. > > > > left=(repo, rev, relpath, sha1?) > > right=(repo, rev, relpath, sha1?) > > > > # This source description is normally from merge/up/sw. > > # Full texts are (at least in principle) available. > > > > * A Patch File > > > > patch-file=(path, (info to identify which part of the > > patch file applies to this node - e.g. byte > > offsets or other info)?) > > reported-left=(repo?, rev?, relpath?) > > reported-right=(repo?, rev?, relpath?) > > > > # This source description is normally from operation 'patch'. > > # The reported-{left,right} are taken from info in the patch > > # file if available: e.g. from "+++ foo (r1000)" we can fill > > # in at least 'rev' and maybe 'relpath'. These cannot be > > # considered trustworthy even when given, as patch files can > > # be constructed and edited at will. > > > > * A diff between nodes that don't have a repository. > > # e.g. a merge from a diff between two locally-modified WCs. > > # We have no need of this yet (not supported) but we should > > # expect to support this kind of source of change some time. > > This sounds very interesting. > > Can you try to adjust notes/wc-ng/conflict-storage accordingly? > The notation you used is different from what is used there, > so I'm not sure I really understand whether the above is meant > to supplement the current design or not. > I'm making some suggestions below but maybe you have better ideas.
OK I've attached a patch with ideas and suggestions. (The above was meant to be a concept that should be embodied in the design, not a design in itself.) > > A Change Affects the Whole Node > > =============================== > > > > It is important to record the whole-node source of the change, as far as > > possible, even when only one property conflicts, because I want to be > > able to see what the other properties and text were on {left, right, > > mine} to help me resolve the conflict. So we must not think of "a > > property conflict" as being a fundamental thing: the fundamental thing > > is "a change to this node conflicts with another change to this node". > > The assumption that all properties are independent from each other and > > independent from the text is a high-level approximation which the client > > makes, to be more convenient in simple cases, but not a fundamental > > truth, and we must not tie the conflict storage design to only > > supporting this assumption. > > I think the whole-node information you want is stored by the OPERATION > skel I added today (we can still augment the OPERATION skel if you want > more information in it). Yup, that looks more like it. > The OPERATION skel contains information as to what caused the conflict > (the source), and the TYPE skel records details about the nature of the > conflict. See notes/wc-ng/conflict-storage. > > > In the same way, when we record a tree conflict, we want to record the > > {left, right, mine} versions of the whole node. We don't have the > > infrastructure to do this yet - we don't have a pristine store for sets > > of properties, let alone for directory trees - but we could do in > > principle and I hope we will do in practice one day. > > Recording left,right,mine trees would be nice, yes. > > > The Part(s) of the Node Affected By The Change > > ============================================== > > > > Briefly: the WC probably needs to know only "this node is in conflict" > > or perhaps tree-conflict versus content-conflict, and no more detail > > than that. By "know" I mean what the WC layer itself needs to > > understand and use. > > In the current model answering "is this node in conflict" means comparing > the conflict_data column to NULL. To find out what kinds of conflicts > there are, libsvn_wc can iterate the list of TYPE skels and look at the > very first atom in each skel. Great. > > It needs to be able to *supply* much more detail, > > of course, to the client layer, but that can (and should) be in a blob > > whose content is opaque to libsvn_wc. Layering is Good (unless the > > layers are too many, too thin; but the WC layer can hardly be accused of > > that). So I think it IS important to have a common part of the info that > > libsvn_wc understands itself, and a detailed part that libsvn_wc simply > > reports to libsvn_client and which libsvn_client composes and > > decomposes. > > Then maybe we should split the TYPE-specific and OPERATION skels > into a separate list, such as: > > ((KIND (OPERATION KIND-SPECIFIC)) ((KIND (OPERATION KIND-SPECIFIC) ...) > > The WC layer would only ever look at KIND, and not parse the skels any > further than that. > The client layer would supply and interpret data in (OPERATION KIND-SPECIFIC), > setting the "blob" via libsvn_wc functions. (As I said in my follow-up, I was confused between libsvn_client with upper layer of libsvn_wc. In my dream world, the logic of "update" is handled in the client layer as a sibling of "merge".) > Would this sufficiently fulfill the layering requirement? > Probably, but there's a slight problem with this scheme. > > E.g. I'd like the client layer to be able to ask about a conflict > on a specific property. Or about the reject conflict caused by a > specific hunk. This means that the wc layer will need to record some > sort of identifier (e.g. a property's name) to tell several recorded > property conflicts apart. It's all just a question of where you draw the line above/below that API and all the other ones you need. I don't see how this particular calling out of property names is special. > So, we'd have something like this instead: > > ((KIND [ID] (OPERATION KIND-SPECIFIC)) > (KIND [ID] (OPERATION KIND-SPECIFIC)) ...) > > For types which can only be recorded once, the ID can be ommitted. > > In such as scheme the client layer would use a low-level function > interface like: > > svn_wc__add_text_conflict(char *conflict_data) > svn_wc__add_prop_conflict(char *property_name, char *conflict_data) > svn_wc__add_tree_conflict(char *conflict_data) > svn_wc__add_reject_conflict(char *hunk_header, char* conflict_data) > > svn_wc__list_prop_conflicts(apr_array_header_t **property_names) > svn_wc__list_reject_conflicts(apr_array_header_t **hunk_headers) > > svn_wc__get_text_conflict(char *conflict_data) > svn_wc__get_prop_conflict(char *property_name, char *conflict_data) > svn_wc__get_tree_conflict(char *conflict_data) > svn_wc__get_reject_conflict(char *hunk_name, char *conflict_data) > > svn_wc__del_text_conflict(char *conflict_data) > svn_wc__del_prop_conflict(char *property_name, char *conflict_data) > svn_wc__del_tree_conflict(char *conflict_data) > svn_wc__del_reject_conflict(char *hunk_header, char* conflict_data) > > (parameter types above are just for illustration, I know much of them > could be const, and we'll also need pools etc.) > > Higher-level conflict handling functions would then live entirely > within libsvn_client. No comment on API ideas. > Note though that the current libsvn_client/libsvn_wc layering is > not really clean (e.g. update editor in libsvn_wc, merge editor > in libsvn_client), which may impose restrictions upon how dumb > libsvn_wc can really be about conflicts. Yup. > > A reject is one way to record that some text change could not be applied > > because it did not match the text found in the target file. That is > > logically a text conflict, but recorded in a different way, and with > > less information about which lines of text go where. It is certainly a > > conflict of text rather than of properties or of tree structure. > > > > So, we could have said: > > > > There are five types of conflict *description* (text conflict with > > full texts, text conflict from a patch, property conflict, tree > > conflict, and obstruction). > > > > But property and tree conflicts can (later) come from a patch, so: > > > > There are seven types of conflict *description* (text conflict with > > full texts, text conflict from a patch, property conflict with full > > sources, property conflict from a patch, tree conflict with full > > sources, tree conflict from a patch, and obstruction). > > > > But in my opinion that not good. As I said above, it's better to > > distinguish the source of change from the kind of change as two separate > > dimensions. > > I believe we already distinguish the source of the change from the kind. > I regard the reject as a separate kind of conflict at the storage layer > because the information we store about rejects is different than what we > store about text conflicts. I think we're just arguing terminology and categorization. We both now understand that both the "reject" and "text" kinds of conflict description describe a conflict in the text of the file. - Julian
Index: notes/wc-ng/conflict-storage =================================================================== --- notes/wc-ng/conflict-storage (revision 929653) +++ notes/wc-ng/conflict-storage (working copy) @@ -5,18 +5,46 @@ Conflict meta data is stored in the ACTUAL_NODE table, within the 'conflict_data' column. The data in this column is a skel containing -conflict information, or NULL (meaning no conflict is present). +conflict information (meaning the node is in conflict, and the details +are inside), or NULL (meaning no conflict is present). -There are five types of conflicts (text conflicts, property conflicts, -tree conflicts, reject conflicts, and obstructions). The conflict skel -has the form: +### JAF: What if the node is neither in WORKING nor in BASE? Currently, +### an entry in ACTUAL implies an entry in WORKING or BASE. +### Bert said: "We can just allow that case in WC-NG, by enumerating +### over the nodes in ACTUAL in svn_wc__db_read_conflict_victims(), +### while svn_wc__db_read_children() doesn't look at ACTUAL." +### JAF: Is it a good idea to special-case like that? I don't like the +### sound of the words "just allow that case" as it sounds like losing +### sight of the concepts, and would prefer to talk in terms like +### "adjust the meaning of 'child' to include ...". + +The conflict skel has the form: ((KIND OPERATION KIND-SPECIFIC) (KIND OPERATION KIND-SPECIFIC) ...) -where KIND is one of "text", "prop", "tree", "reject", or "obstructed". +KIND indicates the kind of conflict description that follows and is one +of: + + "text" - meaning a "normal" text conflict of the whole node (which must be a + file), with left/right/mine full texts saved, and (unless it's + "binary") conflict markers in the working text; + "prop" - meaning a "normal" property conflict, with left/right/mine full + values saved; + "tree" - meaning a tree conflict; + "reject" - meaning a text conflict for a single hunk of text, with the source + being a patch file (rather than left/right full texts), and with a + "reject" file being saved (?); + "obstructed" - meaning ### TODO + OPERATION indicates the operation which caused the conflict and is -detailed below. KIND-SPECIFIC is specific to each KIND, and is detailed -below. +detailed below. + +KIND-SPECIFIC is specific to each KIND, and is detailed below. + +There are restrictions on what mixture of conflicts can meaningfully be +recorded - e.g. there must not be two "text" nor one "text" and one +"reject". These restrictions are implied but not spelled out here. + ### stsp: need conflict data format version info inside skel, too? ### or do we bump the entire wc.db format number if we need to tweak @@ -42,11 +70,21 @@ Operation skel -------------- +Meaning: The Operation skel indicates what kind of operation was being +performed that resulted in a conflict, including the format and content +(or reference to content) of the diff that was being applied to this +node. + The OPERATION skel has the following form: (NAME OPERATION-SPECIFIC) -NAME is "update", "switch", "merge", or "patch". +NAME is one of: + + "update" - meaning a 3-way merge as in "svn update"; + "switch" - meaning a 3-way merge as in "svn switch"; + "merge" - meaning a 3(4?)-way merge as in "svn merge"; + "patch" - meaning application of a context-diff, as in "svn patch". OPERATION-SPECIFIC is as follows: @@ -125,6 +163,10 @@ ("text" OPERATION LEFT_SHA1 RIGHT_SHA1 MINE_SHA1) +### JAF: this skel is meant to be the one designated as KIND-SPECIFIC in +### the master skel description above, so the "text" atom at the +### beginning is redundant, isn't it? Not sure if that's intended. + {LEFT,RIGHT,MINE}_SHA1 are sha1 checksums of the full texts of the {left,right,mine} version of the file. File version's content can be obtained from the pristine store. @@ -161,6 +203,9 @@ -------------- Tree conflicts exist on files or directories. + +### JAF: And symlinks, I presume - or, if not, why not? + The following information is stored if there is a tree conflict on the node: ("tree" OPERATION