Re: Avoiding broken Gitweb links and deleted objects

2013-05-22 Thread Junio C Hamano
Matt McClure  writes:

> On Fri, May 10, 2013 at 12:22 PM, Junio C Hamano  wrote:
>> I think what I missed is that the same logic to ignore side branches
>> whose history gets cauterised with such an "ours" merge may apply to
>> an "ours" merge that people already make, but the latter may want to
>> take both histories into account.
>>
>> So I guess it is not such a great idea.
>
> The particular proposed implementation? Or the broader idea to save
> loose commits more permanently? I'm still interested in a solution for
> the latter.

Recording such an "otherwise should not be recorded as a merge" side
history as if it were "-s ours" merge is what I judged as "not a
great idea".

If you want to keep older commits, either make sure you point at
them with some refs, or do not prune the repository.  I cannot think
of any other solution offhand.
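
A minimal sketch of the first option, assuming a throwaway repository
and a made-up ref name (refs/archive/topic-v1 is an illustration, not
a Git convention). It pins the old tip under an extra ref, rewrites
the branch, and then prunes as aggressively as possible:

```shell
#!/bin/sh
# Sketch: pin a soon-to-be-orphaned commit under an extra ref.
# "refs/archive/topic-v1" is a hypothetical name chosen for this example.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/topic
git config user.name "Example Dev"
git config user.email "dev@example.com"

git commit -q --allow-empty -m "base"
git commit -q --allow-empty -m "old tip"
old=$(git rev-parse HEAD)

# Point a ref at the old tip before the branch is rewritten.
git update-ref refs/archive/topic-v1 "$old"

# Rewrite the branch so that the old tip leaves its history.
git reset -q --hard HEAD~1
git commit -q --allow-empty -m "new tip"

# Drop all reflog entries and prune without a grace period.
git reflog expire --expire=now --all
git gc --quiet --prune=now

# The archived commit is still in the object database.
git cat-file -e "$old^{commit}" && echo "old tip kept"
```

Any ref under refs/ protects the commit and its ancestors, so the
namespace is a matter of taste; a tag would work as well.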
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Avoiding broken Gitweb links and deleted objects

2013-05-22 Thread William Swanson
On Wed, May 22, 2013 at 6:32 AM, Matt McClure  wrote:
> Is there a way to push/pull reflogs among different repositories?

Not that I am aware of, at least not in core git.

> In my original scenario:
>
> 1. the commits are created on a developer machine
> 2. pushed to a central origin repository running Gitweb
> 3. the branch is rebased on the developer machine
> 4. the branch is push --force'd to the origin
>
> Later, git push tells me:
>
> warning: There are too many unreachable loose objects; run 'git
> prune' to remove them.

You don't need to share reflogs in this case. Assuming the server were
to keep logs of its own, the forced update would create a new reflog
entry showing something like "Forced push", so the pre-rebase version
would still be reachable from the reflogs, keeping it around.
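
A rough sketch of that setup, assuming a local bare repository stands
in for the server (the paths and the branch name "topic" are invented
for the example). Bare repositories do not keep reflogs unless
core.logAllRefUpdates is turned on, so the sketch enables it first:

```shell
#!/bin/sh
# Sketch: let the "server" keep reflogs so a forced push leaves a trail.
# origin.git, dev and the branch name "topic" are illustrative only.
set -e
work=$(mktemp -d)
git init -q --bare "$work/origin.git"
git -C "$work/origin.git" config core.logAllRefUpdates true

git init -q "$work/dev"
cd "$work/dev"
git symbolic-ref HEAD refs/heads/topic
git config user.name "Example Dev"
git config user.email "dev@example.com"

git commit -q --allow-empty -m "base"
git commit -q --allow-empty -m "first version"
old=$(git rev-parse HEAD)
git push -q "$work/origin.git" topic

# Rewrite the tip locally, then overwrite the published branch.
git commit -q --amend --allow-empty -m "second version"
git push -q --force "$work/origin.git" topic

# The server-side reflog still names the pre-rewrite tip.
git -C "$work/origin.git" reflog show topic
git -C "$work/origin.git" rev-parse 'topic@{1}'
```

The old tip stays protected only as long as that reflog entry exists,
so the reflog expiry settings on the server matter too.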

> or I want to delete old topic branch HEADs to improve performance.
>
> But I never want to let Git delete the underlying commit objects since
> there could be Gitweb links pointing at them.

The reflog thing won't help you in this case, since reflogs are
deleted when their branches are deleted. It sounds like you never want
to delete anything, so it would make more sense to just disable
garbage collection entirely.
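
One way this could look, as a sketch under the assumption that the
settings are applied per repository on the server (the repository and
branch names below are invented). gc.auto, gc.pruneExpire,
gc.reflogExpire and gc.reflogExpireUnreachable are existing
configuration keys:

```shell
#!/bin/sh
# Sketch: configure a repository so that gc never prunes anything.
# The repository layout and branch names are made up for the example.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/topic
git config user.name "Example Dev"
git config user.email "dev@example.com"

git config gc.auto 0                          # no automatic gc
git config gc.pruneExpire never               # manual gc keeps loose objects
git config gc.reflogExpire never              # keep reflog entries
git config gc.reflogExpireUnreachable never

git commit -q --allow-empty -m "base"
git checkout -q -b side
git commit -q --allow-empty -m "side work"
orphan=$(git rev-parse HEAD)

# Deleting the branch also deletes its reflog.
git checkout -q topic
git branch -q -D side
# Stand-in for the HEAD reflog entry being removed by other means.
git reflog expire --expire=now --all

# With gc.pruneExpire=never this gc run does not prune.
# (A fresh object would also survive the default grace period;
# the setting matters once objects age.)
git gc --quiet
git cat-file -e "$orphan^{commit}" && echo "orphan kept"
```

The cost is that the object store only ever grows, and an explicit
"git prune" or "git gc --prune=now" still removes objects.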

-William


Re: Avoiding broken Gitweb links and deleted objects

2013-05-22 Thread Matt McClure
On Fri, May 10, 2013 at 3:34 AM, William Swanson  wrote:
> I started working on something like this a few weeks ago, but
> eventually came to the conclusion that this information does not
> belong in the commit graph itself.
>
> A better approach, I think, would be to enhance the reflogs to the
> point where they can provide this information in a reliable manner.

Is there a way to push/pull reflogs among different repositories?

In my original scenario:

1. the commits are created on a developer machine
2. pushed to a central origin repository running Gitweb
3. the branch is rebased on the developer machine
4. the branch is push --force'd to the origin

Later, git push tells me:

warning: There are too many unreachable loose objects; run 'git
prune' to remove them.

or I want to delete old topic branch HEADs to improve performance.

But I never want to let Git delete the underlying commit objects since
there could be Gitweb links pointing at them.

-- 
Matt McClure
http://matthewlmcclure.com
http://www.mapmyfitness.com/profile/matthewlmcclure


Re: Avoiding broken Gitweb links and deleted objects

2013-05-22 Thread Matt McClure
On Fri, May 10, 2013 at 12:22 PM, Junio C Hamano  wrote:
> I think what I missed is that the same logic to ignore side branches
> whose history gets cauterised with such an "ours" merge may apply to
> an "ours" merge that people already make, but the latter may want to
> take both histories into account.
>
> So I guess it is not such a great idea.

The particular proposed implementation? Or the broader idea to save
loose commits more permanently? I'm still interested in a solution for
the latter.

-- 
Matt McClure
http://matthewlmcclure.com
http://www.mapmyfitness.com/profile/matthewlmcclure


Re: Avoiding broken Gitweb links and deleted objects

2013-05-10 Thread Junio C Hamano
Junio C Hamano  writes:

> Duy Nguyen  writes:
>
>> On Fri, May 10, 2013 at 1:37 PM, Junio C Hamano  wrote:
>>> Johannes Sixt  writes:
>>> Imagine that a user runs "git rebase" on a history leading to commit
>>> X to create an alternate, improved history that leads to commit Y.
>>> What if we teach "git rebase" to record, perhaps by default, an
>>> "ours" merge on top of Y that takes the tree state of Y but has X as
>>> its second parent, and "git log" and its family to ignore such an
>>> artificial "ours" merge that records a tree that is identical to one
>>> of its parents, again perhaps by default?  "git log" works more or
>>> less in such a way already, but we might want to teach other modes
>>> like --full-history and --simplify-merges to ignore "ours" to hide
>>> such an artificial merge by default, with an audit option to
>>> unignore them.
>>
>> What about git-merge? Will it be fooled by these merges while looking
>> for merge bases?
>
> I thought it was obvious that we should ignore the side branches
> that were superseded this way, as by definition they did not
> contribute to the end result at all.
>
> But there must be something huge that I missed...

I think what I missed is that the same logic to ignore side branches
whose history gets cauterised with such an "ours" merge may apply to
an "ours" merge that people already make, but the latter may want to
take both histories into account.

So I guess it is not such a great idea.



Re: Avoiding broken Gitweb links and deleted objects

2013-05-10 Thread Duy Nguyen
On Fri, May 10, 2013 at 2:16 PM, Junio C Hamano  wrote:
> Duy Nguyen  writes:
>
>> On Fri, May 10, 2013 at 1:37 PM, Junio C Hamano  wrote:
>>> Johannes Sixt  writes:
>>> Imagine that a user runs "git rebase" on a history leading to commit
>>> X to create an alternate, improved history that leads to commit Y.
>>> What if we teach "git rebase" to record, perhaps by default, an
>>> "ours" merge on top of Y that takes the tree state of Y but has X as
>>> its second parent, and "git log" and its family to ignore such an
>>> artificial "ours" merge that records a tree that is identical to one
>>> of its parents, again perhaps by default?  "git log" works more or
>>> less in such a way already, but we might want to teach other modes
>>> like --full-history and --simplify-merges to ignore "ours" to hide
>>> such an artificial merge by default, with an audit option to
>>> unignore them.
>>
>> What about git-merge? Will it be fooled by these merges while looking
>> for merge bases?
>
> I thought it was obvious that we should ignore the side branches
> that were superseded this way, as by definition they did not
> contribute to the end result at all.
>
> But there must be something huge that I missed; otherwise you
> wouldn't be asking such a question. It is already late and my brain
> is no longer quite working, so I cannot figure out what it is X-<.

No, I was at work and could not spend more time thinking about it (I
asked stupid questions all the time, you should know ;). You were
right, these multiple parent commits have nothing to do with merge
bases.

Although I think this is an abuse of merge commits. Maybe git-notes is
a better way to publish rebase history.
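
A small sketch of what that might look like, assuming the default
notes ref and a made-up "rebased-from:" label (nothing in Git assigns
meaning to that text). An amend stands in for the rebase:

```shell
#!/bin/sh
# Sketch: record where a rewritten commit came from in a note.
# The "rebased-from:" label is a hypothetical convention.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/topic
git config user.name "Example Dev"
git config user.email "dev@example.com"

git commit -q --allow-empty -m "base"
git commit -q --allow-empty -m "draft"
X=$(git rev-parse HEAD)

# Stand-in for a rebase: rewrite the tip, producing a new commit Y.
git commit -q --amend --allow-empty -m "polished"
Y=$(git rev-parse HEAD)

# Attach the old commit name to the new commit.
git notes add -m "rebased-from: $X" "$Y"
git notes show "$Y"
```

Notes live under refs/notes/ and have to be pushed explicitly. The
note only stores the name of X as text, so by itself it does not keep
X from being pruned.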
--
Duy


Re: Avoiding broken Gitweb links and deleted objects

2013-05-10 Thread William Swanson
On Thu, May 9, 2013 at 11:37 PM, Junio C Hamano  wrote:
>>> What's a good strategy for avoiding breaking those links?
>>
>> Do not rebase published history.
>
> All true, but I think we could do a bit "better", although I am
> still on the fence if what I am going to suggest in this message is
> truly "better".
>
> Let me idly speculate and think aloud, "what if".
>
> Imagine that a user runs "git rebase" on a history leading to commit
> X to create an alternate, improved history that leads to commit Y.
> What if we teach "git rebase" to record, perhaps by default, an
> "ours" merge on top of Y that takes the tree state of Y but has X as
> its second parent, and "git log" and its family to ignore such an
> artificial "ours" merge that records a tree that is identical to one
> of its parents, again perhaps by default?  "git log" works more or
> less in such a way already, but we might want to teach other modes
> like --full-history and --simplify-merges to ignore "ours" to hide
> such an artificial merge by default, with an audit option to
> unignore them.
>
> The history transfer will not break, as there is a true ancestry
> that preserves the superseded history leading to X, while in the
> daily use and inspection of the history, such a superseded history
> will not bother the user by default.  When the user really wants to
> see it (e.g. following a stale gitweb link, or with "git log $X"),
> such a superseded side history is still there.
>
> Private history rewriting lets us pretend to be perfect, which is a
> major plus in the distributed workflow Git gives us, and such a mode
> of operation will defeat that in a big way, which might turn out to
> be a major downside, of course.
>
> Also, rebases and filter branches that are done in order to excise
> unwanted objects from the history (committed a password in a file,
> anybody?) need a way to turn it off.

I started working on something like this a few weeks ago, but
eventually came to the conclusion that this information does not
belong in the commit graph itself. You have already identified some of
the same problems I found, so I will not repeat them. In the end, you
either publish everything (including bad things like passwords or
dead-ends), or you leave the rebase history-preservation feature
turned off all the time and then forget to turn it on when it really
matters.

A better approach, I think, would be to enhance the reflogs to the
point where they can provide this information in a reliable manner.
The Git garbage collector already skips objects mentioned in the
reflogs, so "git reflog expire" just needs to learn how to avoid
deleting topologically-interesting entries like rebases. For a shared
scenario like github, this would prevent the server from expiring
published commits and creating broken links.

Since Git maintains reflogs for all heads, including those in
refs/remotes, this strategy for preserving history also works in a
collaborative environment. Each repository remembers what it has seen,
including rebases from remotes (which appear as "forced updates"). On
the other hand, work-in-progress commits only appear in the local
reflogs, and won't appear in other repositories unless someone pulls
or pushes them.

If it does become necessary to delete some published historical
information (like passwords), it is still possible to delete reflog
entries by hand. They are not part of the object database, so doing
this doesn't break any hashes.
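
As a sketch of that manual cleanup, assuming a throwaway repository in
which the unwanted commit has already been removed from the branch
(names are invented). "git reflog delete" is an existing subcommand;
--rewrite keeps the "old" field of the following entry from still
naming the deleted commit:

```shell
#!/bin/sh
# Sketch: remove the reflog entries that mention an unwanted commit.
# Repository and branch names are illustrative only.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/topic
git config user.name "Example Dev"
git config user.email "dev@example.com"

git commit -q --allow-empty -m "base"
git commit -q --allow-empty -m "oops: contains a secret"
bad=$(git rev-parse HEAD)

# Take the bad commit off the branch; the reflogs still remember it.
git reset -q --hard HEAD~1

# Entry {0} is the reset, entry {1} is the bad commit.
git reflog delete --rewrite 'topic@{1}'
git reflog delete --rewrite 'HEAD@{1}'

# Remaining entries for the branch.
git reflog show topic
```

No commit is rewritten by this, so no hashes change. The object itself
stays in the repository until a later prune removes it.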

-William


Re: Avoiding broken Gitweb links and deleted objects

2013-05-10 Thread Junio C Hamano
Duy Nguyen  writes:

> On Fri, May 10, 2013 at 1:37 PM, Junio C Hamano  wrote:
>> Johannes Sixt  writes:
>> Imagine that a user runs "git rebase" on a history leading to commit
>> X to create an alternate, improved history that leads to commit Y.
>> What if we teach "git rebase" to record, perhaps by default, an
>> "ours" merge on top of Y that takes the tree state of Y but has X as
>> its second parent, and "git log" and its family to ignore such an
>> artificial "ours" merge that records a tree that is identical to one
>> of its parents, again perhaps by default?  "git log" works more or
>> less in such a way already, but we might want to teach other modes
>> like --full-history and --simplify-merges to ignore "ours" to hide
>> such an artificial merge by default, with an audit option to
>> unignore them.
>
> What about git-merge? Will it be fooled by these merges while looking
> for merge bases?

I thought it was obvious that we should ignore the side branches
that were superseded this way, as by definition they did not
contribute to the end result at all.

But there must be something huge that I missed; otherwise you
wouldn't be asking such a question. It is already late and my brain
is no longer quite working, so I cannot figure out what it is X-<.

Other things that I thought were obvious include format-patch (side
branch and the capping merge did not exist), another rebase (just
rebase the primary history ignoring the side branch and the capping
merge, and then cap the result with another artificial merge), and
shortlog (it should pretend that the side branch and the capping
merge never happened).

Of course, there should be a way for any of these to take the side
branch into account as if they are normal side branches as an
option.


Re: Avoiding broken Gitweb links and deleted objects

2013-05-10 Thread Duy Nguyen
On Fri, May 10, 2013 at 1:37 PM, Junio C Hamano  wrote:
> Johannes Sixt  writes:
> Imagine that a user runs "git rebase" on a history leading to commit
> X to create an alternate, improved history that leads to commit Y.
> What if we teach "git rebase" to record, perhaps by default, an
> "ours" merge on top of Y that takes the tree state of Y but has X as
> its second parent, and "git log" and its family to ignore such an
> artificial "ours" merge that records a tree that is identical to one
> of its parents, again perhaps by default?  "git log" works more or
> less in such a way already, but we might want to teach other modes
> like --full-history and --simplify-merges to ignore "ours" to hide
> such an artificial merge by default, with an audit option to
> unignore them.

What about git-merge? Will it be fooled by these merges while looking
for merge bases?
--
Duy


Re: Avoiding broken Gitweb links and deleted objects

2013-05-10 Thread Johannes Sixt
Am 5/10/2013 8:37, schrieb Junio C Hamano:
> What if we teach "git rebase" to record, perhaps by default, an
> "ours" merge on top of Y that takes the tree state of Y but has X as
> its second parent, ...

Please let's not go that route...

-- Hannes


Re: Avoiding broken Gitweb links and deleted objects

2013-05-09 Thread Junio C Hamano
Johannes Sixt  writes:

> Am 5/8/2013 18:16, schrieb Matt McClure:
>> That begs a follow-up question. It sounds as though Git will typically 
>> delete unreachable objects. My team often shares links like 
>> https://git.example.com/foo.git/log/d59051721bb0a3758f7c6ea0452bac122a377645?hp=0055e0959cd13780494fe33832bae9bcf91e4a90
>>
>> . If I later rebase the branch containing those commits and d590517
>> becomes unreachable, do I risk that link breaking when Git deletes 
>> d590517?
>
> Yes.
>
> When we explain 'rebase', we usually say "you make the life hard for
> people who build on (published) history that you later rebase". But you
> inconvenience not only people who build their own history on top of your
> outdated history, but also those who operate with (web) links into that
> history.
>
>> What's a good strategy for avoiding breaking those links?
>
> Do not rebase published history.

All true, but I think we could do a bit "better", although I am
still on the fence if what I am going to suggest in this message is
truly "better".

Let me idly speculate and think aloud, "what if".

Imagine that a user runs "git rebase" on a history leading to commit
X to create an alternate, improved history that leads to commit Y.
What if we teach "git rebase" to record, perhaps by default, an
"ours" merge on top of Y that takes the tree state of Y but has X as
its second parent, and "git log" and its family to ignore such an
artificial "ours" merge that records a tree that is identical to one
of its parents, again perhaps by default?  "git log" works more or
less in such a way already, but we might want to teach other modes
like --full-history and --simplify-merges to ignore "ours" to hide
such an artificial merge by default, with an audit option to
unignore them.

The history transfer will not break, as there is a true ancestry
that preserves the superseded history leading to X, while in the
daily use and inspection of the history, such a superseded history
will not bother the user by default.  When the user really wants to
see it (e.g. following a stale gitweb link, or with "git log $X"),
such a superseded side history is still there.
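
The capping merge described above can be approximated by hand today.
The following is only a sketch of the idea, not something "git rebase"
does: an amend stands in for the rebase, and the file and branch names
are invented.

```shell
#!/bin/sh
# Sketch: cap a rewritten history Y with an "ours" merge that keeps X.
# notes.txt and the branch name "topic" are illustrative only.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/topic
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo base > notes.txt
git add notes.txt
git commit -q -m "base"

echo draft > notes.txt
git commit -q -a -m "draft"
X=$(git rev-parse HEAD)

# Stand-in for a rebase: rewrite the tip, producing Y.
echo polished > notes.txt
git commit -q -a --amend -m "polished"
Y=$(git rev-parse HEAD)

# Merge X in with the "ours" strategy: the tree stays that of Y,
# and X becomes the second parent.
git merge -q -s ours -m "Record superseded history" "$X"

# Following first parents walks the new history only.
git log --oneline --first-parent
```

Because X is now an ancestor of the branch tip, it stays reachable and
is transferred on fetch and push like any other commit.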

Private history rewriting lets us pretend to be perfect, which is a
major plus in the distributed workflow Git gives us, and such a mode
of operation will defeat that in a big way, which might turn out to
be a major downside, of course.

Also, rebases and filter branches that are done in order to excise
unwanted objects from the history (committed a password in a file,
anybody?) need a way to turn it off.


Re: Avoiding broken Gitweb links and deleted objects

2013-05-09 Thread Johannes Sixt
Am 5/8/2013 18:16, schrieb Matt McClure:
> That begs a follow-up question. It sounds as though Git will typically 
> delete unreachable objects. My team often shares links like 
> https://git.example.com/foo.git/log/d59051721bb0a3758f7c6ea0452bac122a377645?hp=0055e0959cd13780494fe33832bae9bcf91e4a90
>
> . If I later rebase the branch containing those commits and d590517
> becomes unreachable, do I risk that link breaking when Git deletes 
> d590517?

Yes.

When we explain 'rebase', we usually say "you make the life hard for
people who build on (published) history that you later rebase". But you
inconvenience not only people who build their own history on top of your
outdated history, but also those who operate with (web) links into that
history.

> What's a good strategy for avoiding breaking those links?

Do not rebase published history.

-- Hannes


Re: Avoiding broken Gitweb links and deleted objects

2013-05-09 Thread Matt McClure
On Wed, May 8, 2013 at 12:05 PM, Matt McClure  wrote:
> On Wed, May 8, 2013 at 10:41 AM, Johannes Sixt  wrote:
>> git gc moves unreachable objects that were packed before to the loose
>> object store, from where they can be pruned.
>
> Thanks. That was the piece I was missing. I assumed `git gc` did the opposite.

That begs a follow-up question. It sounds as though Git will typically
delete unreachable objects. My team often shares links like
https://git.example.com/foo.git/log/d59051721bb0a3758f7c6ea0452bac122a377645?hp=0055e0959cd13780494fe33832bae9bcf91e4a90
. If I later rebase the branch containing those commits and d590517
becomes unreachable, do I risk that link breaking when Git deletes
d590517?

What's a good strategy for avoiding breaking those links?

-- 
Matt McClure
http://matthewlmcclure.com
http://www.mapmyfitness.com/profile/matthewlmcclure


Avoiding broken Gitweb links and deleted objects

2013-05-08 Thread Matt McClure
On Wed, May 8, 2013 at 12:05 PM, Matt McClure  wrote:
> On Wed, May 8, 2013 at 10:41 AM, Johannes Sixt  wrote:
>> git gc moves unreachable objects that were packed before to the loose
>> object store, from where they can be pruned.
>
> Thanks. That was the piece I was missing. I assumed `git gc` did the opposite.

That begs a follow-up question. It sounds as though Git will typically
delete unreachable objects. My team often shares links like
https://git.example.com/foo.git/log/d59051721bb0a3758f7c6ea0452bac122a377645?hp=0055e0959cd13780494fe33832bae9bcf91e4a90
. If I later rebase the branch containing those commits and d590517
becomes unreachable, do I risk that link breaking when Git deletes
d590517?

What's a good strategy for avoiding breaking those links?

-- 
Matt McClure
http://matthewlmcclure.com
http://www.mapmyfitness.com/profile/matthewlmcclure