> From: Roman Neuhauser <neuhau...@sigpipe.cz>

> the inefficiency i'd like to avoid is in the diff-tree initialization.
> while the strace output is nice and short, most of it is loading shared
> libraries and reading the various .git* files; i hoped there would be
> a way to spend that energy once per N commits described.

That might be less important than it seems:

I once wrote a revision of "tar" that could write compressed tar files
to a tape drive.  (The tape drive did not have built-in compression.)
The standard "tar" can do that, but it compresses the entire tar file
as a unit, so if any part of the file is corrupted, you can't read the
remainder of the file.  This is not good for a backup tape.

So I modified tar to compress each file individually before putting it
into the tar file.  As a first implementation, I had tar simply create
a subprocess which ran gzip -- once for each file that was written.
So gzip was run tens to hundreds of thousands of times when writing a
backup tape.

The effect on performance was zero.  (And this was on a processor that
ran at tens of MHz.)  The reason seemed to be that once all the files
that are needed to get a gzip process running are in the disk cache,
the process starts *very* quickly.  Yes, it takes a zillion CPU
cycles, but that's not the slow part of the computer.

So don't fret about the efficiency of one part of your system until
you have measured that it is actually performance-limiting.
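
If you want a number before deciding, something like the following
rough sketch would do.  It is only an illustration: the helper name
mean_spawn_seconds is made up, and the command it times (the Python
interpreter doing nothing) is a stand-in for whatever you actually run
per commit.  It does one untimed warm-up run so the executable and its
libraries are in the disk cache, then averages the wall-clock time of
starting the process and waiting for it to exit.

```python
import subprocess
import sys
import time


def mean_spawn_seconds(cmd, runs=20):
    """Return the mean wall-clock seconds to start `cmd` and wait for it to exit."""
    # Untimed warm-up run, so the files the process needs are already cached.
    subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)

    start = time.perf_counter()
    for _ in range(runs):
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
    return (time.perf_counter() - start) / runs


if __name__ == "__main__":
    # Stand-in command (an assumption): substitute the real per-commit command.
    cmd = [sys.executable, "-c", "pass"]
    print(f"{mean_spawn_seconds(cmd) * 1000:.2f} ms per process")
```

Multiply the result by the number of commits you expect to describe
and compare that with the total run time; if it is a small fraction,
the startup cost is probably not worth engineering around.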

Dale
