RE: Rename conflicts in the index
Edward Thomson [ethom...@microsoft.com] wrote: Junio C Hamano [mailto:gis...@pobox.com] wrote: * Path A may have only stage #1, while path B and C has only stage #2 and stage #3 (the user would have to notice these three correspond to each other, and resolve manually). You would want to annotate B at stage #2 seems to have been at A in the original (similarly for C#3) if you choose to do so. If we're going to make changes to the way conflicts are recorded in the main index, then I would prefer this approach. It is unambiguous and all data about all sides are recorded, including the names that items had in their respective branches. Junio, did you have additional thoughts on this? What would you like from me to proceed? If the aforementioned seems reasonable, I can update Documentation/technical/index-format.txt and we can iron out the details in that fashion? Thanks- -ed -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Rename conflicts in the index
Edward Thomson ethom...@microsoft.com writes: Edward Thomson [ethom...@microsoft.com] wrote: Junio C Hamano [mailto:gis...@pobox.com] wrote: * Path A may have only stage #1, while path B and C has only stage #2 and stage #3 (the user would have to notice these three correspond to each other, and resolve manually). You would want to annotate B at stage #2 seems to have been at A in the original (similarly for C#3) if you choose to do so. If we're going to make changes to the way conflicts are recorded in the main index, then I would prefer this approach. It is unambiguous and all data about all sides are recorded, including the names that items had in their respective branches. Junio, did you have additional thoughts on this? Not at this moment. I think we have covered the principles (do not unnecessarily duplicate information, do not break existing implementations unnecessarily, etc.) already, and we know how we want to record one side renamed A to B, the other side renamed A to C case, but I do not think the discussion covered all cases yet. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: Rename conflicts in the index
Junio C Hamano [mailto:gits...@pobox.com] wrote: Edward Thomson ethom...@microsoft.com writes: Junio, did you have additional thoughts on this? Not at this moment. I think we have covered the principles (do not unnecessarily duplicate information, do not break existing implementations unnecessarily, etc.) already, and we know how we want to record one side renamed A to B, the other side renamed A to C case, but I do not think the discussion covered all cases yet. Sorry, I'm not sure what you're asking for - would you just like some more examples of what this looks like with aforementioned exotic conflict types? Or are you looking for something more strict - BNF format, for example? -ed -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Rename conflicts in the index
Edward Thomson ethom...@microsoft.com writes: Junio C Hamano [mailto:gits...@pobox.com] wrote: Edward Thomson ethom...@microsoft.com writes: Junio, did you have additional thoughts on this? Not at this moment. I think we have covered the principles (do not unnecessarily duplicate information, do not break existing implementations unnecessarily, etc.) already, and we know how we want to record one side renamed A to B, the other side renamed A to C case, but I do not think the discussion covered all cases yet. Sorry, I'm not sure what you're asking for - would you just like some more examples of what this looks like with aforementioned exotic conflict types? Or are you looking for something more strict - BNF format, for example? Ehh, I wasn't asking for anything ;-) You asked if I had any additional thoughts, I answered there is nothing at this moment based on what I saw so far. It is not my immediate itch to update the index with more rename information, but it is yours, so I would imagine you would know what cases you would want to improve the end user experience better than I do ;-). If I were solving the issue, I would probably proceed like this: * Start from a rough sketch of what extra information I would want to store in the new index extension section. * Teach read-cache.c to read from the new extension and keep it in an in-core data structure, and read from the in-core data structure and seriealize it to write to the extension section. * Perhaps enhance update-index so that it can read textual representation of the contents of the new extension section, turn it into the in-core representation, so that it can write it out to the index file, as a debugging/development aid. * Teach read-cache.c to read from the new extension and keep it in an in-core data structure. * Teach wt-status.c to read from that in-core data structure and improve the presentation of the cases I care about using that information. Use the update-index development aid to prepare various cases you care about. - If the kind of information that is stored in the new extension turns out to be insufficient, go back to the beginning and iterate. - If the use the in-core data structure here turns out to be awkward, go back one step and iterate. - As I cover one more case, I would add a test to the test suite so that we would know what cases are covered and what the expected end-user presentation should be. * Once the result of the above covers all the cases I care about, then update merge-recursive.c to prepare the in-core data structure to be written out as the extension section. As I iterate, the rough sketch will hopefully cover all the cases I care about and I'll be ready to write them down as an update to the document somewhere in Documentation/technical/api-*. Thanks. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Rename conflicts in the index
Junio C Hamano gits...@pobox.com writes: If I were solving the issue, I would probably proceed like this: * Start from a rough sketch of what extra information I would want to store in the new index extension section. * Teach read-cache.c to read from the new extension and keep it in an in-core data structure, and read from the in-core data structure and seriealize it to write to the extension section. * Perhaps enhance update-index so that it can read textual representation of the contents of the new extension section, turn it into the in-core representation, so that it can write it out to the index file, as a debugging/development aid. * Teach read-cache.c to read from the new extension and keep it in an in-core data structure. Sorry, this is a dup of the second one. Please ignore. Also, all you in this section should read I (because this is a description of If I were solving it). * Teach wt-status.c to read from that in-core data structure and improve the presentation of the cases I care about using that information. Use the update-index development aid to prepare various cases you care about. - If the kind of information that is stored in the new extension turns out to be insufficient, go back to the beginning and iterate. - If the use the in-core data structure here turns out to be awkward, go back one step and iterate. - As I cover one more case, I would add a test to the test suite so that we would know what cases are covered and what the expected end-user presentation should be. * Once the result of the above covers all the cases I care about, then update merge-recursive.c to prepare the in-core data structure to be written out as the extension section. As I iterate, the rough sketch will hopefully cover all the cases I care about and I'll be ready to write them down as an update to the document somewhere in Documentation/technical/api-*. Thanks. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: Rename conflicts in the index
Junio C Hamano [mailto:gis...@pobox.com] wrote: As long as the format will be backward compatible to allow existing users use existing tools to deal with cases the existing tools can handle, then that is OK. I didn't get that impression which is where my non starter came from. I see now. Thank you for the clarification. I apologize if I was not clear about this; indeed, the duplication of data in my proposed extension was specifically to avoid any compatibility problems amongst clients. In particular, when we have a rename in ours, edit in theirs conflict, we store the conflict at the new (ours) path. If, for example, I rename a-b in my branch and merge a branch that edits a: mode hash 1 b mode hash 2 b mode hash 3 b This prohibits us from storing anything else in the theirs side at that path, so if I were to have added b in their branch in addition to modifying b, I cannot record it. I was assuming that any change to this behavior would be a breaking one, which is where the new section came from. * Path A may have only stage #1, while path B and C has only stage #2 and stage #3 (the user would have to notice these three correspond to each other, and resolve manually). You would want to annotate B at stage #2 seems to have been at A in the original (similarly for C#3) if you choose to do so. If we're going to make changes to the way conflicts are recorded in the main index, then I would prefer this approach. It is unambiguous and all data about all sides are recorded, including the names that items had in their respective branches. I would think that this might be a burden on current tools, however. Now if I rename a-b my just my branch, my conflict will be recorded as: mode hash 1 a mode hash 2 b mode hash 3 a And current git-status will not look at any rename annotations to know how to report this. However, maybe this is not as big a problem as I'm concerned it would be. * You can choose to favor our choice, and have path B with three stages (if we guessed wrong and the user wants to move it to C, the user can resolve and then git mv the path). I think this approach suffers from the drawback that the current approach has, wherein this conflicts when they had path B, also, as noted above. I think that if you were to put both B and C with all three stages, this would be problematic for the same reason. *1* Instead of a three-way merge that inspects only the endpoints, you might get a better rename trail if you looked at the histories of both branches. It would be a lot more expensive than the simple three-way, but burning CPU cycles is better than burning human neurons. For the record, I like this approach very much. It's not something that libgit2 will be able to tackle in the near future; we're in a sort of walk-before-you-can-run situation with merge at the moment, as you can probably see. But any improvement that avoids burning neurons is a valuable one. Thanks- -ed -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: Rename conflicts in the index
Junio C Hamano [mailto:gits...@pobox.com] wrote: Edward Thomson ethom...@microsoft.com writes: I would propose that this not simply track rename conflicts, but all conflicts. That is a no starter. So. Can you explain to me why this would be a non starter? Can you suggest some alternate strategy here? Maybe there's something I'm fundamentally misunderstanding. It seems that at present, git will: 1. Detect rename conflicts when performing a merge (at least, git-merge-recursive will, which is the default.) 2. If the rename itself caused a conflict (eg, renamed in one side, added in the other) then the merge cannot succeed. 3. The resultant index is written as if renames were not detected, which means - at best - records the files that went in to the name conflict and git status reports an added in ours conflict, which is a pretty disappointing conflict. Often, though, many of the files will not exist at higher stage entries, since without rename detection, they would have not been conflicts. At worst, one side is staged, there are no conflicts in the index and the user can commit (and thus lose the other side.) Thus it's not like we could add some extension that merely records the names that produced the rename conflicts and point them at the higher stage entries in the index. That would require that they actually be in the index. Thanks- -ed -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Rename conflicts in the index
Edward Thomson ethom...@microsoft.com writes: Junio C Hamano [mailto:gits...@pobox.com] wrote: Edward Thomson ethom...@microsoft.com writes: I would propose that this not simply track rename conflicts, but all conflicts. That is a no starter. So. Can you explain to me why this would be a non starter? At least two, IIRC. One is the consequence of the other. We do not gratuitously break existing implementations. If no conflict is stored as higher-stage index entries in an index that has your index extension, no existing implementation can read a conflicted index written by your implementation and have users resolve conflicts. When a path originally at A is moved to B on only one branch, and there are content-level conflicts between the changes made by one branch (while going from A to B) and by the other branch (while keeping it at A), we would end up having three stages for path B without any trace of path A. I do not offhand know how much it helps to learn A in such a situation in the real life, but we are indeed losing information, and I do not have any problem with an extension that records in the index the fact that in the two (of the three) commits involved in the merge, the path was at A. But people have been successfully using existing versions of Git without that information to merge branches with renames, and resolving the content-level conflicts. Your tool that _additionally_ records This path that currently has three stages for B was at A in the common ancestor (i.e. stage #1) and that branch (either stage #2 or stage #3) does not _have_ _to_ break these users by removing the three stages for B from the main index. Also we do not duplicate information unnecessarily. Nowhere in the above we have been losing the fact that two of the three had the contents we have at path B in the resulting unmerged index at path A, and that information might be useful as well, there is a reason to write another copy of mode or SHA-1 for any of the three variants. As I said, you do not live in the world where you are writing something like Git from scratch. Perhaps you do, but then the result will not be Git and we wouldn't be discussing that system on this mailing list. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: Rename conflicts in the index
Junio C Hamano [mailto:gits...@pobox.com] wrote: We do not gratuitously break existing implementations. If no conflict is stored as higher-stage index entries in an index that has your index extension, no existing implementation can read a conflicted index written by your implementation and have users resolve conflicts. I'm not suggesting that anybody stop writing 0 stage entries. When a path originally at A is moved to B on only one branch, and there are content-level conflicts between the changes made by one branch (while going from A to B) and by the other branch (while keeping it at A), we would end up having three stages for path B without any trace of path A. I do not offhand know how much it helps to learn A in such a situation in the real life, but we are indeed losing information, and I do not have any problem with an extension that records in the index the fact that in the two (of the three) commits involved in the merge, the path was at A. What you've described is true only for a certain class of rename conflicts, for example the rename/edit conflict you've described above. It's also true if you were to rename some item 'a' to 'b' in both branches. But when 'b' is sufficiently dissimilar to become a rewrite, then I end up with a rename of a-b on one side and deleting a and adding b on the other. The result is a mysterious added by us conflict: 100644 e2dd530c9f31550a2b0c90773ccde056929d6d66 2 b Worse yet is if I don't do the rename in my side, but I just add a new b so that in theirs I've renamed a to b and in mine I have both a and b. When I do the merge, I'm told I have conflicts, except that I don't: 100644 08d4f831774aed5d4c6cb496affefd4020dce40c 0 b The other branch's b is long gone and exists only as a dirty file in the workdir. But people have been successfully using existing versions of Git without that information to merge branches with renames, and resolving the content-level conflicts. But you aren't afforded the option to resolve content-level conflicts if you don't know where the conflict came from. For example, in a rename 1-2 conflict, we dutifully detect that a was renamed to both b and c and fail, but that fact is never given to the index. This conflict could be fed into a merge tool or, better, automerged, with the user only needing to pick a path: 100644 421c9102b8562ad227ba773ab1cf6bbed7b7496d 1 a 100644 421c9102b8562ad227ba773ab1cf6bbed7b7496d 3 b 100644 421c9102b8562ad227ba773ab1cf6bbed7b7496d 2 c I hate to sound like a broken record here, but without some more data in the index - anywhere, really - any tool that doesn't have the luxury of emitting data about what happened to stdout certainly can't infer anything about what happened in the merge. -ed -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Rename conflicts in the index
Edward Thomson ethom...@microsoft.com writes: Junio C Hamano [mailto:gits...@pobox.com] wrote: We do not gratuitously break existing implementations. If no conflict is stored as higher-stage index entries in an index that has your index extension, no existing implementation can read a conflicted index written by your implementation and have users resolve conflicts. I'm not suggesting that anybody stop writing 0 stage entries. Ah, OK, then I misread your original message. You said Having a single canonical location is preferable - if the index contains a CONF section (and the client supports it), it would use that. Otherwise, the client would look at stage 0 entries. which I read as an index with this extension would not have higher stage entries, an index without the extension records higher stage entries. As long as the format will be backward compatible to allow existing users use existing tools to deal with cases the existing tools can handle, then that is OK. I didn't get that impression which is where my non starter came from. What you've described is true only for a certain class of rename conflicts, for example the rename/edit conflict you've described above. As you asked me to explain why it was a non starter, I only illustrated with a renamed trivially, with content level conflict example that shows why dropping higher-stage entries in the main index would not be acceptable. The previous message did not even mean to cover any cases the *new* feature you have in mind is trying to address. Again, if it hurts existing users handling cases existing tools used to handle, that makes it a non starter. How new feature is designed, and extension is added to help that new feature, is a different matter. My original That's a non starter message didn't even go that far. In any case, the principle of always record the state 'merge' stopped to ask for help as higher stage entries to give existing tools and users a chance to manually resolve, and augment with optional extension to record additional information that might help, but do not gratiutously waste bytes on redundant information would apply to other exotic cases you would want to tackle with the new feature, I would think. If one branch moves path A in the original to path B and the other one moved it to path C, for example, we can record it in different ways, even in the main index. * Path A may have only stage #1, while path B and C has only stage #2 and stage #3 (the user would have to notice these three correspond to each other, and resolve manually). You would want to annotate B at stage #2 seems to have been at A in the original (similarly for C#3) if you choose to do so. * You can choose to favor our choice, and have path B with three stages (if we guessed wrong and the user wants to move it to C, the user can resolve and then git mv the path). You would want to annotate the other side wanted to have B at stage #3 at C in that case. * Or you may want to have in the main index both B and C (but not A) with all three stages (the user would have to choose which one survives, but discarding the other side with git rm would be easy). You would want to annotate the origin of the stage #1 for path B and C (these were originally at A), stage #2 for B (the other branch wants to have it at C), stage #3 for C (we want to have it at B). There may be other ways, and I do not offhand know what the current merge-recursive implementation does, but both of the latter two sound equally usable and reasonable ways, even without the annotation. And with your annotation that records different paths, the conflict may become even easier to resolve. I still do not need to duplicate mode, SHA-1 in the extensions to do the above, or do I? If the original path A was removed and a new path B was added, with contents that are modified from A beyond recognition, at the merge time you wouldn't know where B it came from or where A went, so annotating A at stage #1 to say it went to B is a nonsense. If you have algorithm to do so [*1*], you would be better off detecting it as a rename. [Footnote] *1* Instead of a three-way merge that inspects only the endpoints, you might get a better rename trail if you looked at the histories of both branches. It would be a lot more expensive than the simple three-way, but burning CPU cycles is better than burning human neurons. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: Rename conflicts in the index
Junio C Hamano [mailto:gits...@pobox.com] wrote: Edward Thomson ethom...@microsoft.com writes: I would propose that we store the data about the file in conflict as it occurred through the renames. For example, in a rename 1-2 conflict where A was renamed to both B and C, you would have a single conflict entry containing the data for A, B and C. This would allow us to provide more detailed information to the user - and allow them to (say) choose a single name to proceed with. Is this something that has value to core git as well? Alternately, is there something particularly stupid about this proposal? I do not offhand see anything particularly stupid; a new optional index extension section CACHE_EXT_RENAME_CONFLICT might be a good addition. Is one side moves A to B while the other side moves it to C the only case, or is it just an example? Off the top of my head, one side moves A to x while the other side moves B to x/y would also be something we would want to know. I am sure there are other cases that need to be considered. I do not think we can discuss the design at the concrete level until the proposal spells out to cover all interesting cases in order for implementations to agree on the common semantics. Sorry about the delay here: besides getting busy with some other things, I wanted both a complete writeup and to have taken a pass at a test implementation this in libgit2 to make sure seemed like a reasonably sensible approach. I would propose a new extension, 'CONF', to handle conflict data, differing from the stage 0 entries in the index in that this extension tracks the conflicting file across names if the underlying merge engine has support for renames. I made an attempt to keep the entry data similar to other entries in the index. I would propose that entries in the conflict are as follows: Flags Four octets that describe the conflict. Data includes: 0x01 HAS_ANCESTOR There is a file in the common ancestor branch that contributes to this conflict. Its data will follow. 0x02 HAS_OURS There is a file in our branch that contributes to this conflict. Its data will follow. 0x04 HAS_THEIRS There is a file in their branch that contributes to this conflict. Its data will follow. 0x08 NAME_CONFLICT_OURS This item has a path in our branch that overlaps a different item in their branch. (Eg, this conflict represents the our side of a rename/add conflict.) 0x10 NAME_CONFLICT_THEIRS This item has a path in their branch that overlaps a different item in our branch. (Eg, this conflict represents the theirs side of a rename/add conflict.) 0x20 DF_CONFLICT_FILE This is the file involved in a directory/file conflict. 0x40 DF_CONFLICT_CHILD This is a child of a directory involved in a directory/file conflict. Other bits are reserved. Conflict Sides The data about one side of a conflict will contain: mode (ASCII string representation of octal, null-terminated) path (null terminated) sha1 (raw bytes) The conflict sides will be written in this order: Ancestor (if HAS_ANCESTOR is set) Ours (if HAS_OURS is set) Theirs (if HAS_THEIRS is set) I would propose that this not simply track rename conflicts, but all conflicts. Having a single canonical location is preferable - if the index contains a CONF section (and the client supports it), it would use that. Otherwise, the client would look at stage 0 entries. I would propose that another extension, 'RSVD', track these conflicts once they are resolved. The format would be the same - when a conflict is resolved from the CONF the entry will be placed as-is in the RSVD. Examples are not an exhaustive list, but should help elucidate the name and d/f conflicts: Normal edit / edit conflict, where A is edited in ours and theirs: Conflict one: Flags = HAS_ANCESTOR|HAS_OURS|HAS_THEIRS Entry 1 = A [Ancestor] Entry 2 = B [Ancestor] Entry 3 = C [Ancestor] Rename / add conflict, where A is renamed to B in ours and B is added in theirs: Conflict one: Flags = HAS_ANCESTOR|HAS_OURS|NAME_CONFLICT_OURS Entry 1 = A [Ancestor] Entry 2 = B [Ours] Entry 3 = A [Theirs] Conflict two: Flags = HAS_THEIRS|NAME_CONFLICT_THEIRS Entry 1 = File B [Theirs] D/F conflict, where some file A is deleted in theirs, and a directory A is created with file child: Conflict one: Flags = HAS_ANCESTOR|HAS_OURS|HAS_THEIRS|DF_CONFLICT_FILE Entry 1 = A [Ancestor] Entry 2 = A [Ours] Conflict two: Flags = HAS_THEIRS|DF_CONFLICT_CHILD Entry 1 = A/child [Theirs] Thanks for your input on this. -ed -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Rename conflicts in the index
Edward Thomson ethom...@microsoft.com writes: I would propose a new extension, 'CONF', to handle conflict data, differing from the stage 0 entries in the index in that this extension tracks the conflicting file across names if the underlying merge engine has support for renames. I made an attempt to keep the entry data similar to other entries in the index. I would propose that entries in the conflict are as follows: Flags Four octets that describe the conflict. Data includes: 0x01 HAS_ANCESTOR There is a file in the common ancestor branch that contributes to this conflict. Its data will follow. 0x02 HAS_OURS There is a file in our branch that contributes to this conflict. Its data will follow. 0x04 HAS_THEIRS There is a file in their branch that contributes to this conflict. Its data will follow. 0x08 NAME_CONFLICT_OURS This item has a path in our branch that overlaps a different item in their branch. (Eg, this conflict represents the our side of a rename/add conflict.) 0x10 NAME_CONFLICT_THEIRS This item has a path in their branch that overlaps a different item in our branch. (Eg, this conflict represents the theirs side of a rename/add conflict.) 0x20 DF_CONFLICT_FILE This is the file involved in a directory/file conflict. 0x40 DF_CONFLICT_CHILD This is a child of a directory involved in a directory/file conflict. Other bits are reserved. Conflict Sides The data about one side of a conflict will contain: mode (ASCII string representation of octal, null-terminated) path (null terminated) sha1 (raw bytes) The conflict sides will be written in this order: Ancestor (if HAS_ANCESTOR is set) Ours (if HAS_OURS is set) Theirs (if HAS_THEIRS is set) Puzzled. Most of the above, except NAME_CONFLICT_{OURS,THEIRS} bits, look totally pointless duplication. When you are working with Git, you have to be prepared to read from the datafile like the index that other people (and your previous version) created, and you also have to make sure you do not make what you write out unusable by other people without a good reason. So your tool needs code to see higher stage entries in the main index to find mode,sha1 for the conflicted paths even without the index extension anyway, and if your tool does also perform merges, you would need to strive for writing the main index with conflicted entries and implementations that do not yet understand your extension can keep operating. For some types of extensions, the latter may be hard (and that is why I stopped at you would need to strive for, and not you must), but for the one under discussion, I do not think it is the case (by the way CONF sounds as if it is some sort of configuration data). If you are starting a brand new system from scratch, keeping only the resolved entries in the main index and having a separate section for conflicts might be also a valid design choice, but you do not live in that world if you are discussing the design on this mailing list. I would propose that this not simply track rename conflicts, but all conflicts. That is a no starter. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: Rename conflicts in the index
Junio C Hamano [mailto:gits...@pobox.com] wrote: Edward Thomson ethom...@microsoft.com writes: I would propose a new extension, 'CONF', to handle conflict data, differing from the stage 0 entries in the index in that this extension tracks the conflicting file across names if the underlying merge engine has support for renames. I made an attempt to keep the entry data similar to other entries in the index. I would propose that entries in the conflict are as follows: Flags Four octets that describe the conflict. Data includes: 0x01 HAS_ANCESTOR There is a file in the common ancestor branch that contributes to this conflict. Its data will follow. 0x02 HAS_OURS There is a file in our branch that contributes to this conflict. Its data will follow. 0x04 HAS_THEIRS There is a file in their branch that contributes to this conflict. Its data will follow. 0x08 NAME_CONFLICT_OURS This item has a path in our branch that overlaps a different item in their branch. (Eg, this conflict represents the our side of a rename/add conflict.) 0x10 NAME_CONFLICT_THEIRS This item has a path in their branch that overlaps a different item in our branch. (Eg, this conflict represents the theirs side of a rename/add conflict.) 0x20 DF_CONFLICT_FILE This is the file involved in a directory/file conflict. 0x40 DF_CONFLICT_CHILD This is a child of a directory involved in a directory/file conflict. Other bits are reserved. Conflict Sides The data about one side of a conflict will contain: mode (ASCII string representation of octal, null-terminated) path (null terminated) sha1 (raw bytes) The conflict sides will be written in this order: Ancestor (if HAS_ANCESTOR is set) Ours (if HAS_OURS is set) Theirs (if HAS_THEIRS is set) Puzzled. Most of the above, except NAME_CONFLICT_{OURS,THEIRS} bits, look totally pointless duplication. Obviously HAS_ANCESTOR / HAS_OURS / HAS_THEIRS is to indicate to a reader whether there is data to be read or not. Similar to how a mode of 0 in the REUC indicates that the rest of the record should not be read.) When you are working with Git, you have to be prepared to read from the datafile like the index that other people (and your previous version) created, and you also have to make sure you do not make what you write out unusable by other people without a good reason. I'm acutely aware that you need to be able to read an index that other people created - that's the problem at hand. git does not produce an index that allows anyone (including itself) to reason about rename conflicts. It doesn't even bother to write high-stage conflict entries in many instances, so you can have an instance where git tells you that a conflict occurred but one of those files is staged anyway, the other is just dirty in the workdir and you can commit immediately thereafter. While obviously it's possible to handle this situation (is file A in conflict? Look in the rename conflict extension. Not there? Okay, look in the index.) That's not exactly elegant. My goal here was to have a single source for conflicts. (by the way CONF sounds as if it is some sort of configuration data). There's only four letters, and not everything's as easy as TREE. REUC, for example, sounds like a donkey, though I suppose it depends on the language in question. -ed -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Rename conflicts in the index
Edward Thomson ethom...@microsoft.com writes: I would propose that we store the data about the file in conflict as it occurred through the renames. For example, in a rename 1-2 conflict where A was renamed to both B and C, you would have a single conflict entry containing the data for A, B and C. This would allow us to provide more detailed information to the user - and allow them to (say) choose a single name to proceed with. Is this something that has value to core git as well? Alternately, is there something particularly stupid about this proposal? I do not offhand see anything particularly stupid; a new optional index extension section CACHE_EXT_RENAME_CONFLICT might be a good addition. Is one side moves A to B while the other side moves it to C the only case, or is it just an example? Off the top of my head, one side moves A to x while the other side moves B to x/y would also be something we would want to know. I am sure there are other cases that need to be considered. I do not think we can discuss the design at the concrete level until the proposal spells out to cover all interesting cases in order for implementations to agree on the common semantics. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: Rename conflicts in the index
Junio C Hamano [mailto:gits...@pobox.com] wrote: I do not offhand see anything particularly stupid; a new optional index extension section CACHE_EXT_RENAME_CONFLICT might be a good addition. Is one side moves A to B while the other side moves it to C the only case, or is it just an example? Off the top of my head, one side moves A to x while the other side moves B to x/y would also be something we would want to know. I am sure there are other cases that need to be considered. Yes, that was just an example. Certainly I was intending that all conflicts that arose from renames would end up here since one can't really reason why the merge tool created a conflict by looking at the index alone - even knowing the merge tool's similarity algorithms, this would be awfully expensive to piece back together, even if the index did contain non-zero stage entries for all the items that were involved in the conflicts. That said, my rather naive initial thought was that we could repeat *all* conflicts in this area. This would give tools that knew how to understand this the ability to go to a single place for conflict data, rather than producing some merge of high-stage entries that comprise non-rename conflicts and data from the rename conflict area for rename conflicts. Thanks- -ed -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html