I haven't tried to follow that example properly yet. However, one other thing to look at is the "History Simplification" section, including parent rewriting, which lives in the rev-list-options.txt file and is included in a number of man pages (log, show, shortlog, ...). There are some slippery concepts in there, often context dependent!
https://git-scm.com/docs/git-log#_history_simplification

On Wednesday, February 22, 2023 at 5:00:51 AM UTC s...@codeapprove.com wrote:

> Thank you both for getting back to me. The discussion in the docs about
> flattening was really interesting! I should note that the git clone / git
> log command pair I provided gives me almost exactly what I want, but I need
> to combine the diffs. It seems to contain the correct changes, and the
> speed is pretty good too.
>
> Let me give an example of the situation I am optimizing for. I apologize
> in advance: I am going to use GitHub terms which I know are not pure git,
> but in the end my question is a git question.
>
> Say you're a developer working in a many-developer repository. Here's the
> sequence:
>
> - On Day 0 you check out "main" and create "my-topic-branch". You add
>   commits A, B, C, D to that branch.
> - Now you open a pull request on GitHub asking to merge your branch
>   "my-topic-branch" into "main".
> - You see a collaborator has landed a change to "main" since you
>   started. So you do "git fetch origin main && git merge origin/main" and
>   make a merge commit in your branch.
> - Then you add three more commits E, F, G on top of that and push your
>   branch again. So you have: A, B, C, D, (merge main), E, F, G.
> - A coworker has already looked at commits A, B, C and wants to see
>   what you've done since then. So they ask GitHub to show the diff from
>   commits D through G (including the merge).
>
> When you do this, GitHub does something which (to me, anyway) is pretty
> magical. You are shown only the changes that you committed to your branch
> in D, E, F, and G. Changes which you merged in, which may or may not
> involve the files in your Pull Request, are not shown at all since they're
> not "yours".
>
> Here's a public example showing a team using this pattern. This one has
> multiple merges, so I may need to find a cleaner example, but hopefully
> this makes sense.
>
> - Consider this PR:
>   https://github.com/firebase/firebase-tools/pull/5478/files
>   - This is the full diff (according to GitHub) and we can see
>     exactly one added line in CHANGELOG.md
> - Here's a merge commit:
>   https://github.com/firebase/firebase-tools/pull/5478/commits/ebce28ceb799f721d36b986705c54cbcd597a27a
>   - We can see that on the base branch, "master", a line was added to
>     the *end* of the CHANGELOG.md file. There is no such addition
>     displayed in the full diff.
> - Here's a "magic" diff where I selected three commits (before merge,
>   merge, and after merge):
>   https://github.com/firebase/firebase-tools/pull/5478/files/28b8a72561b266a2086059c0d9840ab25f03d8ae..b2d89ebd67e3f8c17c4c607c630c18096303096b
>   - We can see that the changes from the merge commit are not shown
>     as additions! But they are present as context lines.
>
> I need to find a sequence of git commands to produce the same exact diff
> that GitHub produces (and ideally do it very quickly even in a large
> repository) and I just can't figure it out.
>
> Thanks,
> Sam
>
> On Tuesday, 21 February 2023 at 14:12:48 UTC-8 philip...@iee.email wrote:
>
>> This may also be an issue of the History Simplification process and / or
>> the 'flattening' processes for history linearisation and rebases.
>>
>> The flattening is a known phenomenon and is currently being discussed on
>> the Git List, so I have noted this there.
>> [1] https://lore.kernel.org/git/a856dd16-9876-509b-6a99-11ea0020633c@iee.email/
>>
>> There is a technical discussion of flattening in the docs at
>> https://github.com/git/git/blob/master/Documentation/howto/keep-canonical-history-correct.txt
>>
>> Do note the original email title "Pull is mostly evil" ;-) (whole thread
>> at https://lore.kernel.org/git/5363bb9f.40...@xiplink.com/)
>>
>> Clarifying the "excluding merge commit changes" part (or any
>> misunderstandings, if there were some) would be really useful. The
>> existing devs do have the 'curse of knowledge', so they often can't see
>> the problems.
>>
>> On Tuesday, February 21, 2023 at 5:29:36 PM UTC Konstantin Khomoutov
>> wrote:
>>
>>> On Mon, Feb 20, 2023 at 09:27:20PM -0800, 'Samuel Stern' via Git for
>>> human beings wrote:
>>>
>>> > This is an *extremely* specific question which I've been trying to get
>>> > an answer to for quite a while now, so hopefully someone here knows the
>>> > answer.
>>> >
>>> > Let's say I am starting from nothing, an empty directory on a server.
>>> > I have:
>>> >
>>> > - The URL for a public git repository
>>> > - Two endpoint SHAs (commits on the same branch)
>>> >
>>> > I want to get the complete diff between those commits *excluding*
>>> > merge commit changes, and I want to do this as fast as possible (so
>>> > much faster than cloning everything and diffing).
>>> >
>>> > I am able to get almost there with the following sequence:
>>> >
>>> >     # Fast clone
>>> >     git clone --verbose --no-checkout --filter=blob:limit=250k \
>>> >       --single-branch --branch=${branch} --depth=${depth} $REPO_URL
>>> >
>>> >     # Get a series of patches
>>> >     git log --no-merges --first-parent --patch ${base.sha}..${head.sha}
>>> >
>>> > However I need to get a *single* patch that represents all the changes
>>> > combined, not a series of patches from the log.
>>>
>>> Isn't a mere
>>>
>>>     git diff ${base.sha} ${head.sha}
>>>
>>> what you're looking for?
>>>
>>> Otherwise, I'm with Philip in that your statements (rephrased)
>>>
>>> - I want to get a single combined change ("patch") describing the
>>>   literal set of changes between such and such commits.
>>>
>>> - I want changes brought in by merge commits excluded.
>>>
>>> contradict each other. I could in principle envision some algorithm
>>> which would try to incrementally produce a diff as it walks a chain of
>>> commits and tries to ignore the changes introduced by merge commits
>>> located in that chain, but leaving aside the fact that such an algorithm
>>> would be very brittle for any real-world case, I simply see no use for
>>> it, even a theoretical one.
>>>
>>> You might have been trapped by the fact that you found `git log` first
>>> in your search, and this command traverses all individual commits in the
>>> subgraph it's told to traverse, including "sidelines" brought in by
>>> merge commits. In contrast, plain old `git diff` does not traverse
>>> anything: it takes two states of the project and compares them.
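
For anyone reading along later, the difference between `git log` walking commits and `git diff` comparing two states can be tried out in a throwaway repository. What follows is only a sketch under assumptions: the file names, commit messages and the tiny branch layout (one commit A, a merge from the main branch, one commit B) are made up for illustration and stand in for the larger A..G scenario. It does not claim to reproduce how GitHub computes its own view.

```shell
#!/bin/sh
set -eu

# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"
git config commit.gpgsign false

# Base commit shared by both branches.
printf 'base\n' > topic.txt
printf 'base\n' > other.txt
git add topic.txt other.txt
git commit -q -m "base"
main_branch=$(git symbolic-ref --short HEAD)   # "main" or "master"

# Topic branch: commit A touches topic.txt.
git checkout -q -b my-topic-branch
printf 'A\n' >> topic.txt
git commit -q -am "A"
old=$(git rev-parse HEAD)

# Meanwhile a collaborator lands a change to other.txt on the main branch.
git checkout -q "$main_branch"
printf 'upstream\n' >> other.txt
git commit -q -am "upstream change"

# Merge the main branch into the topic branch, then add commit B.
git checkout -q my-topic-branch
git merge -q --no-edit "$main_branch"
printf 'B\n' >> topic.txt
git commit -q -am "B"
new=$(git rev-parse HEAD)

# 1) git log walks commits: with --first-parent and --no-merges only
#    commit B is listed, so only topic.txt is reported.
log_files=$(git log --no-merges --first-parent --format= --name-only \
  "$old..$new" | sed '/^$/d' | sort -u)

# 2) git diff compares two snapshots: the merged-in change to other.txt
#    is part of the newer snapshot, so it is reported as well.
diff_files=$(git diff --name-only "$old" "$new" | sort)

# 3) The three-dot form compares against the merge base with the main
#    branch, which gives the whole-branch view (commits A and B only).
branch_files=$(git diff --name-only "$main_branch...$new" | sort)

echo "log, per commit:      $log_files"
echo "diff, two snapshots:  $diff_files"
echo "diff, vs merge base:  $branch_files"
```

In this toy layout the per-commit log reports only `topic.txt`, while the plain two-commit diff also reports `other.txt`, because the newer snapshot contains the merged-in change. The three-dot diff covers the whole branch rather than an arbitrary sub-range of it, so it does not by itself answer the "D through G" question from the thread.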