Hi!

Although I've read a lot of resources on handling binary files with Git, I'm 
still confused. Hopefully you can help me find definitive answers :-)

If you ask people about "Git and binary files", the answers are often "Git 
is an SCM, not a backup solution" or "dependencies should only be referenced 
via Maven/NuGet/etc.". Regarding the second argument: it is not always 
possible for us to do that, and the situation is (unfortunately) not going 
to change anytime soon. Therefore we want to know Git's limits before 
switching from Subversion to Git.

Quantity of binary data: Some of our projects have up to 500MB of library 
dependencies, which are partially updated every two to three weeks. The 
individual files are not particularly large, though: there are around 250 
of them, ranging from 200KB to 10MB.
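
In case it helps to double-check such numbers on a converted repository, I'd 
expect something like the following (run from Git Bash) to list the largest 
blobs in the history. It's only a sketch; the "head -20" is arbitrary:

    # list the 20 largest blobs in the whole history (size in bytes, then path)
    git rev-list --objects --all |
      git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' |
      awk '/^blob/ {print $3, $4}' |
      sort -rn |
      head -20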

Main question:
The existence of tools like git-annex, git-fat, git-media, etc. suggests that 
Git has some kind of problem with binary files. Although I've studied as much 
of the internals documentation as I could find, I couldn't figure out why Git 
should handle binary files any worse than Subversion does. Yes, the repository 
size may get huge, but initial cloning is a one-time process and does not 
affect our company too much.
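
If initial clone time ever did become a concern, my understanding is that a 
shallow clone would at least limit how much history gets transferred. This is 
just a sketch with a placeholder URL; I haven't tested whether the rest of 
our tooling copes with shallow repositories:

    # fetch only the latest commit instead of the full history
    git clone --depth 1 https://example.com/big-repo.git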

Does Git even have a problem with binary files? What exactly is the problem? 
How does Subversion handle this better? Is it about single files that are 
very large (e.g. 3D models of more than 500MB), or are many small binary 
files a problem as well? Is it only about initial cloning time, or does it 
affect everyday work (committing, branching, etc.) as well?

Note: we're using Git on Windows, if that's important in any way.

Follow-up questions:
(a) What's the maximum recommended overall repository size? We have 
repositories (converted with svn2git and already compressed) ranging from 
1GB to 15GB. Is this already problematic, assuming everybody has an SSD? 
(A sketch for measuring comparable sizes follows after this list.)
(b) What's the maximum *single* file size you would recommend? We don't 
have any binary files larger than 10MB.
(c) What's the maximum checkout size you would recommend? We have 
checkouts ranging from 1KB to around 1GB. Is this already hitting Git's 
limits on modern developer machines?
(d) What's the maximum number of files/directories you would recommend?
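
Regarding (a) and (c): a simple way to get comparable numbers would be 
something like this in Git Bash ("size-pack" is reported in KiB; the du 
option assumes the GNU coreutils that Git Bash ships):

    # packed repository size and object counts
    git count-objects -v
    # size of the working tree (checkout) without the .git directory
    du -sh --exclude=.git .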

First empirical tests with one of our biggest projects have not shown any 
problems: a checkout takes around one to three seconds, and diffs settle at 
around one second. Stashing takes a little longer, up to five seconds. These 
values are not optimal compared to small, source-only repositories, but they 
are still faster than Subversion was. Even so, we're afraid of switching and 
then running into a big show-stopper soon after.
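
For anyone who wants to compare, timings like these can be reproduced with 
something along these lines in Git Bash (the branch name is a placeholder):

    time git checkout some-feature-branch
    time git diff
    time git stash
    time git stash pop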

I hope you can point out the specific problems we need to know about. Thank 
you!

Best regards,
Dominik

(PS: I asked similar questions on unofficial forums a few weeks ago and 
haven't received any satisfying answers.)
