Re: [git-users] git rev-list --objects doesn't show moves

2014-10-01 Thread Roman Neuhauser
# wor...@alum.mit.edu / 2014-10-01 11:37:38 -0400:
> > From: Roman Neuhauser 
> 
> > yup, i'd like a plumbing equivalent of `git log --raw ...`.  AFAICT
> > the closest to that is git-diff-tree, except that implies N invocations
> > instead of one, a sad loss of efficiency i'd love to avoid.

> [...] there is no stored summary of "what is changed by this commit",
> the only way to determine that information is to compare each file
> reference of each commit with the cognate file reference in its
> predecessor commit.

> There's no way to do that whose run time is not proportional to both
> the number of commits and the number of files.

the inefficiency i'd like to avoid is in the diff-tree initialization.
while the strace output is nice and short, most of it is loading shared
libraries and reading the various .git* files; i hoped there would be
a way to spend that energy once per N commits described.

it's a "storm in a tea cup" in this particular use case, but i've been
thinking about the distinction between plumbing and porcelain, and how
well the plumbing fulfills its promise of enabling other porcelains.
it looks like any theoretical git-log competition written on top of the
plumbing is quite badly undercut by git-diff-tree being a single-pair
operation.
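
For what it's worth, git-diff-tree(1) documents a --stdin option that reads
a list of commits from standard input, so one process can describe a whole
range. A minimal sketch in Python, assuming --stdin, -r, -M, -C and --root
behave as documented. The names parse_raw and log_raw are made up for
illustration, and only simple, unquoted paths are handled:

```python
import re
import subprocess

# A full commit id on a line of its own, as printed by diff-tree --stdin.
COMMIT_LINE = re.compile(r"^[0-9a-f]{40,64}$")
# One raw diff record: ":<mode> <mode> <id> <id> <status>[score]\t<path>[\t<path>]"
RAW_LINE = re.compile(
    r"^:(?P<src_mode>\d{6}) (?P<dst_mode>\d{6}) "
    r"(?P<src_id>[0-9a-f]+) (?P<dst_id>[0-9a-f]+) "
    r"(?P<status>[A-Z])(?P<score>\d*)\t(?P<paths>.*)$"
)


def parse_raw(text):
    """Group raw diff records under the commit id line that precedes them."""
    commits = []  # [(commit_id, [(status, src_id, dst_id, [paths])])]
    for line in text.splitlines():
        if COMMIT_LINE.match(line):
            commits.append((line, []))
            continue
        m = RAW_LINE.match(line)
        if m and commits:
            commits[-1][1].append((
                m.group("status"),
                m.group("src_id"),
                m.group("dst_id"),
                m.group("paths").split("\t"),  # two paths for renames/copies
            ))
    return commits


def log_raw(rev_range):
    """Describe every commit in rev_range with two git processes in total."""
    revs = subprocess.run(
        ["git", "rev-list", "--reverse", rev_range],
        check=True, capture_output=True, text=True,
    ).stdout
    out = subprocess.run(
        ["git", "diff-tree", "--stdin", "-r", "-M", "-C", "--root", "--raw"],
        input=revs, check=True, capture_output=True, text=True,
    ).stdout
    return parse_raw(out)
```

Calling log_raw("origin/master..HEAD") inside a repository would return one
entry per commit that has changes. Merge commits produce no output unless
-m or -c is added, so they are simply absent from the result.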

-- 
roman

-- 
You received this message because you are subscribed to the Google Groups "Git 
for human beings" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to git-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [git-users] Is it possible to git add a set of files as non-text, irrespective of any .gitattributes files?

2014-10-01 Thread Dale R. Worley
> From: Sam Roberts 

> I could write a local .gitattributes, do the `git add -A -f .`, and
> remove the .gitattributes... but that's not "atomic". I'm doing this
> all in a much larger script, I'm worried about damaging the local
> user's repo.

I'm no expert here, but how would it "damage the local user's repo"?
All you have to ensure is that the .gitattributes not be part of the
commit, and that during the git-add, it contains the right contents to
get the effect you want.

You *might* leave the modified .gitattributes file in the user's
working directory, but as long as you arrange that the modifications
only describe the "temporary" text files, that shouldn't interfere
with any "real" files.
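
One way to sidestep the worry might be to leave the working directory alone
entirely: gitattributes(5) says that $GIT_DIR/info/attributes takes
precedence over any .gitattributes file, and it is never part of a commit.
The Python sketch here appends the temporary lines to that file and restores
the previous contents afterwards, even if the step in between fails. The
helper name temporary_attributes and the example pattern are assumptions for
illustration, not anything taken from Sam's script:

```python
import contextlib
import os


@contextlib.contextmanager
def temporary_attributes(git_dir, lines):
    """Append attribute lines to <git_dir>/info/attributes, then undo it."""
    path = os.path.join(git_dir, "info", "attributes")
    os.makedirs(os.path.dirname(path), exist_ok=True)
    original = None  # None means the file did not exist beforehand
    if os.path.exists(path):
        with open(path, "rb") as f:
            original = f.read()
    try:
        with open(path, "ab") as f:
            if original and not original.endswith(b"\n"):
                f.write(b"\n")
            # Later lines override earlier ones for the same path.
            f.write(("\n".join(lines) + "\n").encode("utf-8"))
        yield path
    finally:
        if original is None:
            with contextlib.suppress(FileNotFoundError):
                os.remove(path)
        else:
            with open(path, "wb") as f:
                f.write(original)
```

Usage would look something like this, with ".git" standing in for whatever
`git rev-parse --git-dir` reports:

    with temporary_attributes(".git", ["* -text"]):
        subprocess.run(["git", "add", "-A", "-f", "."], check=True)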

Dale



Re: [git-users] Re: File dates after CHECKOUT

2014-10-01 Thread Dale R. Worley
Ultimately, the reason that Git (and other Un*x-focused SCMs) behave
this way is so that if you check out a new version and then run
'make', everything needed is rebuilt.  If a file had new contents but
its modification time were in the past, then files derived from it
might not be rebuilt.

> From: Gergely Polonkai 
> 
> That said, it still can be done, although it is not natively supported, you
> may do it with some custom tool. By finding the last commit a specific file
> was modified in, you may apply the date of the commit to that file.
> However, if you have a large repository, looking at this information for
> each single file may take really long. Still, it looks like an interesting
> project if your build environment really requires it…

You know, a C program that walked up the tree of ancestor commits
reading out the blob IDs recorded in each commit's tree to assemble
that data would run rather fast, because there's not a lot of data to
be processed.
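
A rough sketch of that walk, in Python rather than C for brevity. It leans
on `git log --name-only` to do the tree comparison instead of reading tree
objects directly, so it is an approximation of the idea and not the C
program itself. The "@" marker in the format string and both function names
are assumptions made for this example, and paths that git would quote are
not handled:

```python
import os
import subprocess


def last_commit_times(log_text):
    """Map each path to the timestamp of the first commit listing it.

    Expects the output of `git log --format=@%ct --name-only`, which is
    newest-first, so the first hit for a path is its latest change.
    """
    times = {}
    current = None
    for line in log_text.splitlines():
        if line.startswith("@") and line[1:].isdigit():
            current = int(line[1:])  # committer date, seconds since epoch
        elif line and current is not None:
            times.setdefault(line, current)  # keep the newest only
    return times


def apply_commit_times(repo="."):
    """Set each tracked file's mtime to the date of its last commit."""
    log_text = subprocess.run(
        ["git", "-C", repo, "log", "--format=@%ct", "--name-only"],
        check=True, capture_output=True, text=True,
    ).stdout
    for path, ts in last_commit_times(log_text).items():
        full = os.path.join(repo, path)
        if os.path.isfile(full):  # skip files deleted since that commit
            os.utime(full, (ts, ts))
```

This uses the committer date; %at would give the author date instead.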

There is a treacherous design question of defining exactly what date
you want on each file.  What *exactly* is the debugger looking for?
(It probably isn't the date the file contents were first committed to
the repository.)  If the answer is "I need the file creation dates to
match the dates recorded in XYZ." probably the easiest solution is to
read the dates out of XYZ and apply them to the files.

Dale



Re: [git-users] git rev-list --objects doesn't show moves

2014-10-01 Thread Dale R. Worley
> From: Roman Neuhauser 

> yup, i'd like a plumbing equivalent of `git log --raw ...`.  AFAICT
> the closest to that is git-diff-tree, except that implies N invocations
> instead of one, a sad loss of efficiency i'd love to avoid.

You may be beyond my knowledge here, but if you want to list the
changes that were made to the file-tree by each of a series of
commits, you will pretty much have to do one invocation of diff-tree
for each commit, or something else that is functionally equivalent.
The reason is that there is no stored summary of "what is changed by
this commit"; the only way to determine that information is to compare
each file reference of each commit with the cognate file reference in
its predecessor commit.  There's no way to do that whose run time is
not proportional to both the number of commits and the number of
files.

Dale



Re: [git-users] Re: File dates after CHECKOUT

2014-10-01 Thread Gergely Polonkai
I was thinking about this approach only for the debugging purposes the OP
mentioned. In usual environments I wouldn't dare do it. If the
build/debug system is strange enough and cannot be changed, this seems to
be an (almost) reliable solution.
On 1 Oct 2014 17:19, "Konstantin Khomoutov" wrote:

> On Tue, 30 Sep 2014 21:51:12 +0200
> Gergely Polonkai  wrote:
>
> > That said, it still can be done, although it is not natively
> > supported, you may do it with some custom tool. By finding the last
> > commit a specific file was modified in, you may apply the date of the
> > commit to that file. However, if you have a large repository, looking
> > at this information for each single file may take really long. Still,
> > it looks like an interesting project if your build environment really
> > requires it…
>
> I'm afraid this approach is wrong.
>
> The main issue with it is that Git is only concerned about the file's
> contents, not its modification date.  If you change the modification
> time of a file in the work tree without touching its contents the only
> effect this will have is forcing Git to perform an extra check of the
> file's contents against the hash value currently recorded for that file
> in the index to verify the file's contents haven't really changed (and
> the corresponding entry in the index will then be updated with the
> actual file's mtime to avoid such check the next time).
>
> The second problem is that *if* Git stored time values from the
> filesystem's metadata along with blobs it would need to deal with
> the fact a file might legitimately have different time values but the
> same contents in different commits.
>
> Hence the only approach that works is performing a recursive crawl of
> the work tree, recording the time values from the filesystem metadata
> for each file of interest and then somehow storing the properly
> represented result in the repository -- linked with the commit that
> crawl has been done for.
>
> The format of the data produced is not really interesting (it should be
> easy to parse by a tool which would update the metadata of the files in
> the work tree, when requested) but choosing the way to store the
> resulting file(s) in the object database is a more interesting topic.
>
> One approach which seems to be rather common is gathering the metadata
> before the commit is done (say, in a pre-commit hook), storing it in
> the repository as an object and writing a special machine-parsable
> header to the commit message referring to that object.  The problem
> which arises is the need to keep live references to those objects so
> that they are not garbage-collected.  This can be done in a number of ways
> (tags, a dedicated programmatically updated branch).
>
> Another possibility is adding this metadata directly to commits
> themselves.  Say, before a commit is recorded, gather the metadata
> and write a file with it under a directory named ".metadata", then
> add it to the index and commit.  The file should have a well-known
> static name.  The upside of this approach is its simplicity, the
> downside is that the metainformation is tightly coupled with the
> contents (and will get in the way when merging etc).
>
> One could also explore if `git notes` can be (ab)used to attach such
> metainformation to commits.  Git notes are not pushed and fetched by
> default, but this can be done.
>



Re: [git-users] Re: File dates after CHECKOUT

2014-10-01 Thread Konstantin Khomoutov
On Tue, 30 Sep 2014 21:51:12 +0200
Gergely Polonkai  wrote:

> That said, it still can be done, although it is not natively
> supported, you may do it with some custom tool. By finding the last
> commit a specific file was modified in, you may apply the date of the
> commit to that file. However, if you have a large repository, looking
> at this information for each single file may take really long. Still,
> it looks like an interesting project if your build environment really
> requires it…

I'm afraid this approach is wrong.

The main issue with it is that Git is only concerned about the file's
contents, not its modification date.  If you change the modification
time of a file in the work tree without touching its contents the only
effect this will have is forcing Git to perform an extra check of the
file's contents against the hash value currently recorded for that file
in the index to verify the file's contents haven't really changed (and
the corresponding entry in the index will then be updated with the
actual file's mtime to avoid such check the next time).

The second problem is that *if* Git stored time values from the
filesystem's metadata along with blobs it would need to deal with
the fact a file might legitimately have different time values but the
same contents in different commits.

Hence the only approach that works is performing a recursive crawl of
the work tree, recording the time values from the filesystem metadata
for each file of interest and then somehow storing the properly
represented result in the repository -- linked with the commit that
crawl has been done for.
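
The crawl itself is the easy part. A minimal sketch in Python, assuming a
JSON manifest keyed by path relative to the top of the work tree; the
function names and the manifest layout are invented for this example, and
symbolic links are skipped to keep it short:

```python
import json
import os


def record_mtimes(root, skip=(".git",)):
    """Return {relative_path: mtime_ns} for regular files under root."""
    manifest = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune directories in place so os.walk does not descend into them.
        dirnames[:] = [d for d in dirnames if d not in skip]
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full):
                continue
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            manifest[rel] = os.stat(full).st_mtime_ns
    return manifest


def restore_mtimes(root, manifest):
    """Apply recorded times to the files that still exist under root."""
    for rel, mtime_ns in manifest.items():
        full = os.path.join(root, *rel.split("/"))
        if os.path.isfile(full):
            os.utime(full, ns=(mtime_ns, mtime_ns))


def save_manifest(manifest, path):
    """Write the manifest as sorted JSON so successive versions diff well."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(manifest, f, indent=1, sort_keys=True)
```

Where the saved manifest then lives (a blob kept alive by a ref, a file in
the tree, a note) is independent of how it is produced.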

The format of the data produced is not really interesting (it should be
easy to parse by a tool which would update the metadata of the files in
the work tree, when requested) but choosing the way to store the
resulting file(s) in the object database is a more interesting topic.

One approach which seems to be rather common is gathering the metadata
before the commit is done (say, in a pre-commit hook), storing it in
the repository as an object and writing a special machine-parsable
header to the commit message referring to that object.  The problem
which arises is the need to keep live references to those objects so
that they are not garbage-collected.  This can be done in a number of ways
(tags, a dedicated programmatically updated branch).

Another possibility is adding this metadata directly to commits
themselves.  Say, before a commit is recorded, gather the metadata
and write a file with it under a directory named ".metadata", then
add it to the index and commit.  The file should have a well-known
static name.  The upside of this approach is its simplicity, the
downside is that the metainformation is tightly coupled with the
contents (and will get in the way when merging etc).

One could also explore if `git notes` can be (ab)used to attach such
metainformation to commits.  Git notes are not pushed and fetched by
default, but this can be done.
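
A sketch of what the notes variant might look like, in Python. The notes
ref name refs/notes/filetimes and the function names are assumptions made
for this example; the git-notes options used (--ref, add -f -F, show) and
the explicit refspecs for push and fetch are the documented ones:

```python
import subprocess

NOTES_REF = "refs/notes/filetimes"  # hypothetical ref, kept apart from refs/notes/commits


def notes_commands(commit, manifest_path, remote="origin", ref=NOTES_REF):
    """Build the git command lines for attaching and sharing a manifest."""
    return {
        # Attach (or overwrite, because of -f) the manifest as a note.
        "add": ["git", "notes", "--ref=" + ref, "add", "-f",
                "-F", manifest_path, commit],
        # Read the manifest back for a given commit.
        "show": ["git", "notes", "--ref=" + ref, "show", commit],
        # Notes refs are not transferred by default, so name them explicitly.
        "push": ["git", "push", remote, ref],
        "fetch": ["git", "fetch", remote, ref + ":" + ref],
    }


def run(cmd):
    """Run one of the commands above and return its standard output."""
    return subprocess.run(
        cmd, check=True, capture_output=True, text=True
    ).stdout
```

For example, run(notes_commands("HEAD", "filetimes.json")["add"]) would
attach the file's contents to the current commit. Note that rewriting a
commit (amend, rebase) leaves the note attached to the old commit id unless
notes rewriting is configured.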



Re: [git-users] git rev-list --objects doesn't show moves

2014-10-01 Thread Roman Neuhauser
# wor...@alum.mit.edu / 2014-09-29 11:45:00 -0400:
> > From: Roman Neuhauser 
> > 
> > i'm writing an alternative to git-request-pull.  its output includes
> > a log of the commit range, eg:
> > 
> >   1/3 76a23b86 043603cc README fancier
> >   162441d0 README
> >   2/3 87990615 ab984c9b ignore vim swapfiles
> >   32682119 .gitignore
> >   3/3 2c842d2d 2ab371a4 README is now README.txt
> > 
> > each commit is represented by a line giving its position in the range,
> > the treeid, the commitid and the subject line, followed by a series of
> > lines identifying affected files, each line with the objectid and path.
> > 
> > i'm gathering the data with `git-rev-list --objects`, but it doesn't
> > mention objects that were moved (git mv) in a given commit; this is
> > visible in the last (3/3) commit in the example above: that commit was
> > just `git mv README README.txt`.
> > 
> > i want the output to identify moves and copies.  what are my options?
> > am i missing an option in git-rev-list(1)?  should i use a different
> > piece of plumbing?
> 
> The fundamental problem is that Git's data structures don't list moves
> and copies.  For that matter, they don't list adds and deletes,
> either.  As stored, each commit just tells the contents of the
> directory tree.  What you appear to want is something that compares
> one or more commits and tells what the differences between them are.

yup, i'd like a plumbing equivalent of `git log --raw ...`.  AFAICT
the closest to that is git-diff-tree, except that implies N invocations
instead of one, a sad loss of efficiency i'd love to avoid.
 
> OTOH, is that what you *really* want?  You say that you're "writing an
> alternative to git-request-pull".  What is the definition of this
> output?  What purposes do you expect the output to be put to?
> 
> For instance, when you're pulling commit 3/3 from the remote, you
> don't *need* to download the blob that is the current contents of
> README.txt (and the former contents of README) because you already
> have it in your repository.  So "git-rev-list --objects" doesn't list
> it.

this is for human consumption in an email-based code review process.
think git-request-pull for the overall picture plus git-format-patch
for individual commits.  from the readme:

  Pull requests are often sent repeatedly: Alice clones Bob's
  repository, commits some changes and sends him a pull request.
  Bob reviews the proposed changes and requests a few modifications.

  Alice tweaks her branch as requested and sends another pull request.
  Bob is a busy person and wants a very quick overview of the
  differences between the old and new pull request. Alice would do well
  to tell Bob which parts of the patch series changed in the second
  iteration.

-- 
roman



Re: [git-users] any suggestions for pruning all upstream branches after a github fork?

2014-10-01 Thread Thomas Ferris Nicolaisen
On Tuesday, September 30, 2014 8:49:11 PM UTC+2, Sam Roberts wrote:
>
> So what I want to do in effect is:
>
> for each $branch in remote "origin",
>   if head(origin/$branch) is in ancestry of upstream/$branch
>     git push --delete origin $branch
>   end
> end

How about this approach:

(The idea is to avoid GitHub's Fork button and instead create the fork
manually with only the branches you want.)

* git clone github:joyent/node
* cd node
* git branch #one branch: master
* git remote add sam github:sam/node # where sam/node is a freshly initialized GH repo
* git push sam master # pushes only one branch to your GitHub repo

Now you can clone the one-branch repo:
* git clone github:sam/node
* cd node
* git remote add joyent github:joyent/node
* git fetch joyent

Now you have your fork with just one branch in it. The branches from joyent 
are available in their remote, and I believe they will be updated with 
every `fetch joyent` that you make. Note that once you start creating 
tracking branches, you need to update them manually or use a script (like 
git-up).

If you want to see all remote branches in your github repo, but not see the 
50 remote joyent branches, you can use

git branch --list -a origin\* 

(might want to make an alias for it).
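
If the fork already exists, the loop from the question could be approximated
along these lines. This is a sketch in Python, assuming remotes named
"origin" and "upstream" as in the pseudocode and that both have been fetched
recently; the function names are made up. It relies on
`git merge-base --is-ancestor`, which exits with status 0 when the first
commit is an ancestor of the second, and it only prints the delete commands
instead of running them:

```python
import subprocess


def git(*args):
    """Run git and return the completed process without raising."""
    return subprocess.run(["git", *args], capture_output=True, text=True)


def remote_branches(remote):
    """List branch names known locally for a remote, excluding HEAD."""
    prefix = "refs/remotes/" + remote + "/"
    out = git("for-each-ref", "--format=%(refname)",
              "refs/remotes/" + remote).stdout
    names = [ref[len(prefix):] for ref in out.splitlines()]
    return [n for n in names if n != "HEAD"]


def is_ancestor(older, newer):
    """True if commit `older` is reachable from commit `newer`."""
    return git("merge-base", "--is-ancestor", older, newer).returncode == 0


def prunable(branches, merged, origin="origin", upstream="upstream"):
    """Branches whose origin head is contained in the upstream branch."""
    return [b for b in branches
            if merged(origin + "/" + b, upstream + "/" + b)]


def main():
    # Dry run: show what would be deleted and let a human decide.
    for branch in prunable(remote_branches("origin"), is_ancestor):
        print("git push --delete origin " + branch)
```

A branch that does not exist in upstream makes merge-base fail, so it is
treated as not merged and is left alone. Call main() from inside the clone
to see the list.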
