[git-users] Re: Odd performance problem with git diff

2011-04-01 Thread Thomas Ferris Nicolaisen
Yay! These .svn folders are causing havoc to our workspace in general. 
Amazing how much they slow it down. Another good reason to switch to git :)

-- 
You received this message because you are subscribed to the Google Groups "Git 
for human beings" group.
To post to this group, send email to git-users@googlegroups.com.
To unsubscribe from this group, send email to 
git-users+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/git-users?hl=en.



[git-users] Re: Odd performance problem with git diff

2011-03-31 Thread Dave R
In the course of messing around with this issue, I discovered
something interesting.  There were a number of .svn (Subversion)
directories in my repository because the git repository was created
from directories checked out of subversion.  Also, a number of
those .svn directories contained binary files such as .jpg, .gif,
etc.  So that content was essentially doubled in the repository.

I deleted the .git repository and rebuilt it, but this time with .svn
in the .gitignore file.
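
A minimal sketch of that fix, run in a throwaway directory so it touches nothing real. The layout and file names (site/demo.html, site/.svn/entries) are made up for illustration, and the `.svn/` pattern is just one common way to ignore that directory at any depth.

```shell
# Build a throwaway working tree that mimics a Subversion checkout.
work=$(mktemp -d)
cd "$work"
mkdir -p site/.svn
echo '<p>hello</p>' > site/demo.html       # hypothetical content file
echo 'pristine copy' > site/.svn/entries   # hypothetical Subversion metadata

# Ignore every .svn directory, at any depth, before the first "git add".
echo '.svn/' > .gitignore

git init -q .
git add .

# List what git actually staged; nothing under .svn should appear.
tracked=$(git ls-files)
echo "$tracked"
```

If rebuilding the repository is not an option, `git rm -r --cached` on the .svn directories should drop them from the index instead, though commits made earlier will still carry the old copies.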

Lo and behold the original git diff command returned in less than a
second on the new repository.

So even though there weren't *that* many binary files in this
repository, there were enough to badly slow down the diff command.

All is well now.  :-)

On Mar 24, 10:30 am, Dave R dran...@yahoo-inc.com wrote:
 Thanks for your help on this.  Maybe I'll look into using git log
 instead of git diff.  I've never seen git whatchanged before so I'll
 check that out too.

 On Mar 24, 1:33 am, Thomas Ferris Nicolaisen tfn...@gmail.com wrote:

  Actually this was a much better guide to the git date formats:

 http://www.alexpeattie.com/blog/working-with-dates-in-git/




[git-users] Re: Odd performance problem with git diff

2011-03-24 Thread Thomas Ferris Nicolaisen
Yes, git log (http://www.kernel.org/pub/software/scm/git/docs/git-log.html) can 
take two times as arguments:

git log --since="two weeks ago" --until="yesterday"

You might also want to consider git whatchanged 
(http://www.kernel.org/pub/software/scm/git/docs/git-whatchanged.html):

git whatchanged --since="two weeks ago" --until="yesterday"
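
To get just the file names touched in a time window, the same date options can be combined with --name-only. This is only a sketch: the function name changed_between is made up here, and it lists every file touched by any commit in the window, including files that were later deleted.

```shell
# Print each file touched by commits in a time window, once, sorted.
# --pretty=format: suppresses the commit headers so only file names remain.
changed_between() {
    git log --since="$1" --until="$2" --name-only --pretty=format: |
        sed '/^$/d' | sort -u
}

# Example call; run it from inside the repository:
# changed_between "two weeks ago" "yesterday"
```

Note that --since and --until filter on the commit date, so with one mass commit per night the result is only as fine-grained as the nightly commits.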

The arguments accepted are pretty versatile. For details, see the Specifying 
Revisions section of git-rev-parse 
(http://www.kernel.org/pub/software/scm/git/docs/v1.5.0.7/git-rev-parse.html).

I don't have any more ideas on figuring out what the performance bottleneck 
could be though, sorry. Maybe someone on the main git list 
(http://vger.kernel.org/vger-lists.html#git) can help more?




[git-users] Re: Odd performance problem with git diff

2011-03-24 Thread Thomas Ferris Nicolaisen
Actually this was a much better guide to the git date formats: 

http://www.alexpeattie.com/blog/working-with-dates-in-git/




[git-users] Re: Odd performance problem with git diff

2011-03-24 Thread Dave R
Thanks for your help on this.  Maybe I'll look into using git log
instead of git diff.  I've never seen git whatchanged before so I'll
check that out too.

On Mar 24, 1:33 am, Thomas Ferris Nicolaisen tfn...@gmail.com wrote:
 Actually this was a much better guide to the git date formats:

 http://www.alexpeattie.com/blog/working-with-dates-in-git/




[git-users] Re: Odd performance problem with git diff

2011-03-23 Thread Dave R
Wow, thanks for your detailed response, Thomas.  I'll try to answer
some of your questions:

Both repositories are on a remote mounted filesystem.  Several people
have told me that defeats the purpose of using git, but I claim
innocence by virtue of inheriting the system as it is.  :-)  Also, I
doubt that's a factor anyway since the fast repository is perfectly
fast on that same filesystem.  That seems to rule out any filesystem
related issues such as fragmentation too.

The files in the slow repository are mostly text (.html .css etc.)
files.  There are .gif .png and other such files sprinkled in.  By far
the biggest directory in the repository has 32509 regular files with
2208 that don't have obvious text suffixes.  A few of those 2208 are
probably also text files, but with suffixes I don't recognize.  So,
the repository certainly isn't dominated by binary files.  There are
currently 26 directories in all.  Most of them have a few hundred
files in them.

Maybe a little more background might help.  Changes are made to a
given directory (new files, updated files, deleted files) throughout
the day.  Every night a cron job runs which does a git add . and
git commit in that directory.  Later a PHP program runs via a
browser that looks at the git repository and shows all the pertinent
changes.  Depending on the URL, the script can say, "Give me the names
of all files in the repository changed since the beginning of time,"
but it usually says, "Give me the names of all files in the repository
changed between two given timestamps."  It's the "give me all changed
files since the beginning of time" query that is vexing me.  It's super fast
for the larger repository and slow for the other one.  So the system
is interested in things at the file level rather than the commit level
since a mass commit is done once a day.

I'm running git version 1.7.0.6 on an RHEL system.

I've run git gc in the past and it has no effect.

Is there a way to run git that might reveal where it's spending all
its time?

git log is super speedy.  Is there a good way to use that to get all
the files changed between two times?



[git-users] Re: Odd performance problem with git diff

2011-03-23 Thread Dave R
And, just to clarify, the following text:

On Mar 17, 10:32 am, Dave R dran...@yahoo-inc.com wrote:
 Here is the command that's slow on repository #1, and fast on
 repository #2:

is wrong.  The command is fast on repository #1 and slow on repository
#2.




[git-users] Re: Odd performance problem with git diff

2011-03-18 Thread Thomas Ferris Nicolaisen
Hi Dave,

Just to clarify: The normal way to check for changes in a repository is 
usually this:

cd /path/to/repository
git status

There are many factors that come into play with git performance. Although 
git is inherently fast compared to a centralized system like Subversion, it 
relies on a lot of clever techniques on the file-system, with heavy use of 
indexing and compression.

This means, when it slows down, it's kind of hard to trace what's wrong 
without being a bit of a git/filesystem/operating-system wizard. 

For example, it could be due to heavy fragmentation of the disk sectors 
where your slow repository is stored. It could also be that your filesystem 
is slowing down the procedure. Which filesystem are you running? What are 
your hard-drive specs? Operating system? Git version? 

It does sound odd that the larger repository is fast compared to the small 
one. Can you say something about the nature of the two repos? Is there a lot 
of binary (images) data in one of them? How many files are in there? How big 
are the .git folders compared to the rest?
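
One way to collect those numbers for both repositories is sketched below. The function name repo_stats is made up for this example, and the extension tally is only a rough hint at how much binary content is tracked, not a real binary check.

```shell
# Summarize a repository so two of them can be compared side by side.
repo_stats() {
    (
        cd "$1" || exit 1
        echo "tracked files: $(git ls-files | wc -l | tr -d ' ')"
        echo "commits: $(git rev-list HEAD | wc -l | tr -d ' ')"
        echo ".git size (KiB): $(du -sk .git | cut -f1)"
        # Loose vs. packed object counts and their sizes on disk.
        git count-objects -v
        # Five most common file extensions among tracked files.
        git ls-files | sed -n 's/.*\.\([^./]*\)$/\1/p' |
            sort | uniq -c | sort -rn | head -n 5
    )
}

# Example call (the path is a placeholder):
# repo_stats /path/to/repository
```

Running it against the fast and the slow repository and comparing the output line by line may show where they differ, for instance an unexpectedly large .git size or a file type that should not be tracked at all.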

Sometimes it helps asking git to repackage its index/repository (garbage 
collection):

git gc

If you want additional ways of asking for the diff between two points in 
time in the repository, the general syntax is this:

git diff commit1 commit2

where the commits can be SHAs, tags, or branch names.

If you leave out one of them like this:

git diff commit

.. the diff is between commit and your working tree.
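
If you only know timestamps rather than commits, one possible approach is to resolve each timestamp to a commit first and then diff the two. The sketch below assumes a linear history with increasing commit dates; the function names commit_at and files_changed_between are made up for illustration.

```shell
# Resolve a timestamp to the most recent commit made at or before it.
commit_at() {
    git rev-list -1 --before="$1" HEAD
}

# List files that differ between the tree as of time $1 and as of time $2,
# each prefixed with its status (A = added, M = modified, D = deleted).
files_changed_between() {
    git diff --name-status "$(commit_at "$1")" "$(commit_at "$2")"
}

# Example call; run it from inside the repository:
# files_changed_between "2011-03-01 00:00:00" "2011-03-15 00:00:00"
```

This reports the net difference between the two points, so a file that was added and then removed again inside the window will not show up. If the first timestamp is older than the first commit, commit_at prints nothing and the diff will not do what you want.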

You can also use *git log* for seeing the commit log between two points in 
time, as well as git show, for looking at single commits.

For example, here's the log with diff (-p) for the last three commits:

git log -p -3

And here, showing only the changed files:

git log -3 --name-status
