Re: [PATCH] fast-export: Allow pruned-references in mark file
On Sat, Apr 6, 2013 at 11:04 AM, Antoine Pelisse wrote: > fast-export can fail because of some pruned-reference when importing a > mark file. > > The problem happens in the following scenario: > > $ git fast-export --export-marks=MARKS master > (rewrite master) > $ git prune > $ git fast-export --import-marks=MARKS master > > This might fail if some references have been removed by prune > because some marks will refer to no longer existing commits. > git-fast-export will not need these objects anyway as they were no > longer reachable. > > We still need to update last_numid so we don't change the mapping > between marks and objects for remote-helpers. > Unfortunately, the mark file should not be rewritten without lost marks > if no new objects has been exported, as we could lose track of the last > last_numid. Makes sense to me. Reviewed-by: Felipe Contreras -- Felipe Contreras -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] fast-export: Allow pruned-references in mark file
> Yeah, I think I agree that you would need to make sure that the > other side does not use the revision marked with :2, once you retire > the object you originally marked with :2 by pruning. Shouldn't the > second export show :1 and :3 but not :2? It feels like a bug in the > exporter to me that the mark number is reused in such a case. It depends what you call a bug. If the last item from the list is pruned, and no new objects are exported, you will lose both reference and count to mark :2. In this situation, incrementing last_idnum was pointless. Assuming that we can't do anything about that, marks should be considered mutable (and I don't think there is any way it shouldn't). Then incrementing last_idnum is always useless. Now, if marks can change, I don't understand why we use them at all. (or don't provide the possibility to not use them at least). In the "hg <-> git" case, it seems like an unecessary step: hg revs <-> git marks <-> git sha1 Potentially forces the remote-helper to re-read the "marks <-> sha1" everytime. Also in the remote-helper, the "list" command requires sha1 for each heads, while "import/export" can't work with sha1 but only marks, which seems inconsistent. My last point is about "git-remote-hg" and still mutable revs. It seems like Felipe is using revs() rather than node() or hex() to refer to mercurial changeset while those revs are also mutable, and there exists immutable references: hex. To sum up, the whole idea is, why would we use unsafe mutable marks when we can use safer immutable references ? Cheers, Antoine -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] fast-export: Allow pruned-references in mark file
Antoine Pelisse writes: >> I am not sure I follow the above, but anyway, I think the patch does >> is safe because (1) future "fast-export" will not refer to these >> pruned objects in its output (we have decided that these pruned >> objects are not used anywhere in the history so nobody will refer to >> them) and (2) we still need to increment the id number so that later >> objects in the marks file get assigned the same id number as they >> were assigned originally (otherwise we will not name these objects >> consistently when we later talk about them). > > I fully agree on (1), not so much on (2) though. > ... > Both "commit mark :2" and "commit mark :3" end up being marked :2. > Any tool like git-remote-hg that is using a mapping from mark <-> hg changeset > could then fail. Yeah, I think I agree that you would need to make sure that the other side does not use the revision marked with :2, once you retire the object you originally marked with :2 by pruning. Shouldn't the second export show :1 and :3 but not :2? It feels like a bug in the exporter to me that the mark number is reused in such a case. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] fast-export: Allow pruned-references in mark file
On Mon, Nov 26, 2012 at 9:04 PM, Antoine Pelisse wrote: >> I am not sure I follow the above, but anyway, I think the patch does >> is safe because (1) future "fast-export" will not refer to these >> pruned objects in its output (we have decided that these pruned >> objects are not used anywhere in the history so nobody will refer to >> them) and (2) we still need to increment the id number so that later >> objects in the marks file get assigned the same id number as they >> were assigned originally (otherwise we will not name these objects >> consistently when we later talk about them). > > I fully agree on (1), not so much on (2) though. > > I have the following behavior using my patch and running that script > that doesn't look correct. > > echo "Working scenario" > git init test && > (cd test && > git commit --allow-empty -m "Commit mark :1" && > git commit --allow-empty -m "Commit mark :2" && > git fast-export --export-marks=MARKS master > /dev/null && > cat MARKS && > git reset HEAD~1 && > sleep 1 && > git reflog expire --all --expire=now && > git prune --expire=now && > git commit --allow-empty -m "Commit mark :3" && > git fast-export --import-marks=MARKS \ > --export-marks=MARKS master > /dev/null && > cat MARKS) && > rm -rf test > > echo "Non-working scenario" > git init test && > (cd test && > git commit --allow-empty -m "Commit mark :1" && > git commit --allow-empty -m "Commit mark :2" && > git fast-export --export-marks=MARKS master > /dev/null && > cat MARKS && > git reset HEAD~1 && > sleep 1 && > git reflog expire --all --expire=now && > git prune --expire=now && > git fast-export --import-marks=MARKS \ > --export-marks=MARKS master > /dev/null && > git commit --allow-empty -m "Commit mark :3" && > git fast-export --import-marks=MARKS \ > --export-marks=MARKS master > /dev/null && > cat MARKS) && > rm -rf test > > outputs something like this: > Working scenario > Initialized empty Git repository in /home/antoine/test/.git/ > [master (root-commit) 6cf350d] Commit mark :1 > [master 8f97f85] Commit mark :2 > :1 6cf350d7ecb3dc6573b00f839a6a51625ed28966 > :2 8f97f85e1e7badf6a3daf411cf8d1133b00d522e > [master 21cadfd] Commit mark :3 > warning: Could not read blob 8f97f85e1e7badf6a3daf411cf8d1133b00d522e > :1 6cf350d7ecb3dc6573b00f839a6a51625ed28966 > :3 21cadfd87d90c05ce8770c968e5ed3d072ead4ae > Non-working scenario > Initialized empty Git repository in /home/antoine/test/.git/ > [master (root-commit) 5b5f7ec] Commit mark :1 > [master b224390] Commit mark :2 > :2 b224390daee199644495c15503882eb84df07df5 > :1 5b5f7ec77768393aab2a0c2c11b4b8f7773f8678 > warning: Could not read blob b224390daee199644495c15503882eb84df07df5 > [master 181a774] Commit mark :3 > :1 5b5f7ec77768393aab2a0c2c11b4b8f7773f8678 > :2 181a7744c6d3428edb01a1adc9df247e9620be5f > > Both "commit mark :2" and "commit mark :3" end up being marked :2. > Any tool like git-remote-hg that is using a mapping from mark <-> hg changeset > could then fail. I don't understand. "commit mark :2" 'git fast-export' would never point to that object again, the new commit would override that mark: commit refs/heads/master mark :2 ... commit mark :3 Then 'git remote-hg' should override that mark as well. But it doesn't matter, because that would be the case only for the last object, as soon as you find another valid object, that object's mark will be considered the last one. And what Junio said is consistent with what you want: last_idnum should be updated even if the object is not valid. Cheers. -- Felipe Contreras -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] fast-export: Allow pruned-references in mark file
> I am not sure I follow the above, but anyway, I think the patch does > is safe because (1) future "fast-export" will not refer to these > pruned objects in its output (we have decided that these pruned > objects are not used anywhere in the history so nobody will refer to > them) and (2) we still need to increment the id number so that later > objects in the marks file get assigned the same id number as they > were assigned originally (otherwise we will not name these objects > consistently when we later talk about them). I fully agree on (1), not so much on (2) though. I have the following behavior using my patch and running that script that doesn't look correct. echo "Working scenario" git init test && (cd test && git commit --allow-empty -m "Commit mark :1" && git commit --allow-empty -m "Commit mark :2" && git fast-export --export-marks=MARKS master > /dev/null && cat MARKS && git reset HEAD~1 && sleep 1 && git reflog expire --all --expire=now && git prune --expire=now && git commit --allow-empty -m "Commit mark :3" && git fast-export --import-marks=MARKS \ --export-marks=MARKS master > /dev/null && cat MARKS) && rm -rf test echo "Non-working scenario" git init test && (cd test && git commit --allow-empty -m "Commit mark :1" && git commit --allow-empty -m "Commit mark :2" && git fast-export --export-marks=MARKS master > /dev/null && cat MARKS && git reset HEAD~1 && sleep 1 && git reflog expire --all --expire=now && git prune --expire=now && git fast-export --import-marks=MARKS \ --export-marks=MARKS master > /dev/null && git commit --allow-empty -m "Commit mark :3" && git fast-export --import-marks=MARKS \ --export-marks=MARKS master > /dev/null && cat MARKS) && rm -rf test outputs something like this: Working scenario Initialized empty Git repository in /home/antoine/test/.git/ [master (root-commit) 6cf350d] Commit mark :1 [master 8f97f85] Commit mark :2 :1 6cf350d7ecb3dc6573b00f839a6a51625ed28966 :2 8f97f85e1e7badf6a3daf411cf8d1133b00d522e [master 21cadfd] Commit mark :3 warning: Could not read blob 8f97f85e1e7badf6a3daf411cf8d1133b00d522e :1 6cf350d7ecb3dc6573b00f839a6a51625ed28966 :3 21cadfd87d90c05ce8770c968e5ed3d072ead4ae Non-working scenario Initialized empty Git repository in /home/antoine/test/.git/ [master (root-commit) 5b5f7ec] Commit mark :1 [master b224390] Commit mark :2 :2 b224390daee199644495c15503882eb84df07df5 :1 5b5f7ec77768393aab2a0c2c11b4b8f7773f8678 warning: Could not read blob b224390daee199644495c15503882eb84df07df5 [master 181a774] Commit mark :3 :1 5b5f7ec77768393aab2a0c2c11b4b8f7773f8678 :2 181a7744c6d3428edb01a1adc9df247e9620be5f Both "commit mark :2" and "commit mark :3" end up being marked :2. Any tool like git-remote-hg that is using a mapping from mark <-> hg changeset could then fail. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] fast-export: Allow pruned-references in mark file
Antoine Pelisse writes: > On Mon, Nov 26, 2012 at 12:37 PM, Felipe Contreras > wrote: >> On Mon, Nov 26, 2012 at 5:03 AM, Junio C Hamano wrote: >>> Is this a safe and sane thing to do, and if so why? Could you >>> describe that in the log message here? >> Why would fast-export try to export something that was pruned? Doesn't >> that mean it wasn't reachable? > > Hello Junio, > Hello Felipe, > > Actually the issue happened while using Felipe's branch with his > git-remote-hg. Everything was going fine until I (or did it run > automatically, I dont remember) ran git gc that pruned unreachable > objects. Of course some of the branch I had pushed to the hg remote > had been changed (most likely rebased). References no longer exists > in the repository (cleaned by gc), but the reference still exists in > mark file, as it was exported earlier. Thus the failure when git > fast-export reads the mark file. You described that part very well in your proposed log message and I got it just fine. > Then, is it safe ? > Updating the last_idnum as I do in the patch doesn't work because > if the reference is the last, the number is going to be overwriten > in the next run. > From git point of view, I guess it is fine. The file is fully read at > the beginning of fast-export and fully written at the end. I am not sure I follow the above, but anyway, I think the patch does is safe because (1) future "fast-export" will not refer to these pruned objects in its output (we have decided that these pruned objects are not used anywhere in the history so nobody will refer to them) and (2) we still need to increment the id number so that later objects in the marks file get assigned the same id number as they were assigned originally (otherwise we will not name these objects consistently when we later talk about them). And I wanted to see that kind of reasoning behind the patch in the proposed log message, because other people will need to refer to it when they read "git log" output to understand the change. Thanks. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] fast-export: Allow pruned-references in mark file
> If that's the case, I don't think it should throw a warning even just skip > them. Removing the warning seems fine to me. > Then, in the actual export if some of these objects are referenced the > export would fail anyway (but they won't). Of course it will fail to export anything that requires the missing object. As they are unreachable, it will be hard to provide a ref that needs it anyway. On the other hand, I'm afraid that your file '.git/hg//marks-hg' needs consistent references to mark. If a mark is removed, and then replaced by another object, can it break somehow git-remote-hg ? If not, I can provide a simpler patch. If it does, it will be more complicated. Cheers, Antoine -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] fast-export: Allow pruned-references in mark file
On Mon, Nov 26, 2012 at 2:23 PM, Antoine Pelisse wrote: > On Mon, Nov 26, 2012 at 12:37 PM, Felipe Contreras > wrote: >> On Mon, Nov 26, 2012 at 5:03 AM, Junio C Hamano wrote: >>> Is this a safe and sane thing to do, and if so why? Could you >>> describe that in the log message here? >> Why would fast-export try to export something that was pruned? Doesn't >> that mean it wasn't reachable? > > Hello Junio, > Hello Felipe, > > Actually the issue happened while using Felipe's branch with his > git-remote-hg. Everything was going fine until I (or did it run > automatically, I dont remember) ran git gc that pruned unreachable > objects. Of course some of the branch I had pushed to the hg remote > had been changed (most likely rebased). References no longer exists > in the repository (cleaned by gc), but the reference still exists in > mark file, as it was exported earlier. Thus the failure when git > fast-export reads the mark file. Ah, I see, so these objects are _before_ fast-export tries to do anything, it's just importing the marks without any knowledge if these objects are going to be used in the export or not. If that's the case, I don't think it should throw a warning even just skip them. Then, in the actual export if some of these objects are referenced the export would fail anyway (but they won't). Cheers. -- Felipe Contreras -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] fast-export: Allow pruned-references in mark file
On Mon, Nov 26, 2012 at 12:37 PM, Felipe Contreras wrote: > On Mon, Nov 26, 2012 at 5:03 AM, Junio C Hamano wrote: >> Is this a safe and sane thing to do, and if so why? Could you >> describe that in the log message here? > Why would fast-export try to export something that was pruned? Doesn't > that mean it wasn't reachable? Hello Junio, Hello Felipe, Actually the issue happened while using Felipe's branch with his git-remote-hg. Everything was going fine until I (or did it run automatically, I dont remember) ran git gc that pruned unreachable objects. Of course some of the branch I had pushed to the hg remote had been changed (most likely rebased). References no longer exists in the repository (cleaned by gc), but the reference still exists in mark file, as it was exported earlier. Thus the failure when git fast-export reads the mark file. Then, is it safe ? Updating the last_idnum as I do in the patch doesn't work because if the reference is the last, the number is going to be overwriten in the next run. >From git point of view, I guess it is fine. The file is fully read at the beginning of fast-export and fully written at the end. The issue is more for git-remote-hg that keeps track of matches between git marks and hg commits. The marks are going to change and be overriden. It will most likely need to read the mark file to see if a ref has changed, and update it's dictionary. One of the solution I'm thinking of, is to update the mark file with marks of newly exported objects instead of recreating it, and let obsolete references in the file. But of course that is not acceptable. Cheers, Antoine -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] fast-export: Allow pruned-references in mark file
On Mon, Nov 26, 2012 at 5:03 AM, Junio C Hamano wrote: > Antoine Pelisse writes: > >> fast-export can fail because of some pruned-reference when importing a >> mark file. >> >> The problem happens in the following scenario: >> >> $ git fast-export --export-marks=MARKS master >> (rewrite master) >> $ git prune >> $ git fast-export --import-marks=MARKS master >> >> This might fail if some references have been removed by prune >> because some marks will refer to non-existing commits. >> >> Let's warn when we have a mark for a commit we don't know. >> Also, increment the last_idnum before, so we don't override >> the mark. > > Is this a safe and sane thing to do, and if so why? Could you > describe that in the log message here? Why would fast-export try to export something that was pruned? Doesn't that mean it wasn't reachable? Essentially, if 'git rev-list $foo' can't possibly export this pruned object, why would 'git fast-export $foo' would? Cheers. -- Felipe Contreras -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] fast-export: Allow pruned-references in mark file
Antoine Pelisse writes: > fast-export can fail because of some pruned-reference when importing a > mark file. > > The problem happens in the following scenario: > > $ git fast-export --export-marks=MARKS master > (rewrite master) > $ git prune > $ git fast-export --import-marks=MARKS master > > This might fail if some references have been removed by prune > because some marks will refer to non-existing commits. > > Let's warn when we have a mark for a commit we don't know. > Also, increment the last_idnum before, so we don't override > the mark. Is this a safe and sane thing to do, and if so why? Could you describe that in the log message here? > Signed-off-by: Antoine Pelisse > --- > builtin/fast-export.c | 11 +++ > 1 file changed, 7 insertions(+), 4 deletions(-) > > diff --git a/builtin/fast-export.c b/builtin/fast-export.c > index 12220ad..141b245 100644 > --- a/builtin/fast-export.c > +++ b/builtin/fast-export.c > @@ -607,16 +607,19 @@ static void import_marks(char *input_file) > || *mark_end != ' ' || get_sha1(mark_end + 1, sha1)) > die("corrupt mark line: %s", line); > > + if (last_idnum < mark) > + last_idnum = mark; > + > object = parse_object(sha1); > - if (!object) > - die ("Could not read blob %s", sha1_to_hex(sha1)); > + if (!object) { > + warning("Could not read blob %s", sha1_to_hex(sha1)); > + continue; > + } > > if (object->flags & SHOWN) > error("Object %s already has a mark", > sha1_to_hex(sha1)); > > mark_object(object, mark); > - if (last_idnum < mark) > - last_idnum = mark; > > object->flags |= SHOWN; > } -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html