Hi!

Thank you for your initial reply.

(1) If those pack files don't get too big, this should not be a problem.
However, that ties into my follow-up questions about "known" Git upper limits.
(2) Yeah, no difference to Subversion here.
(3) We do not require any locking of those binaries.

> On the other hand, you seem to have fallen into the usual pitfall
> of wanting someone else to just look at your requirement and somehow
> know will it work or not.

I don't think so. We're already testing with our biggest repositories;
however, I want to know

(a) whether there are any structural disadvantages when handling binary
files compared to Subversion.
(b) whether anyone has already run into performance problems with
binary files.
(c) what the estimated upper limits are for working with a repository in
reasonable time on a "normal" machine (e.g. whether commits suddenly take
five minutes each once my repository reaches 20GB).

I guess we're not the first to use Git for bigger repositories or with
binary files. We want to gather as much know-how as possible and find
the real reason why people felt the need to create git-annex,
git-media, and all those projects in the first place.

I hope some people can share their experiences of working with big
repositories, know why tools like git-annex were introduced, or know of
any under-the-hood problems in comparison to Subversion.

Of course all of that is combined with empirical tests of our own :-)
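
For what it's worth, this is roughly how such timings can be collected.
It is only a minimal sketch in Python, not a proper benchmark: the helper
names (time_command, measure_repo) are made up for this mail, and it
assumes git is on the PATH. It only runs read-only commands, so it is
safe to repeat on a working copy.

```python
import subprocess
import time

# Read-only git commands to time; extend the list as needed.
GIT_COMMANDS = (
    ["git", "status"],
    ["git", "diff", "--stat"],
    ["git", "count-objects", "-v"],
)


def time_command(cmd, cwd=None, repeats=3):
    """Run cmd `repeats` times; return the fastest wall-clock time in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        # check=True raises if the command fails, so a failed run is never
        # mistaken for a fast one.
        subprocess.run(cmd, cwd=cwd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return min(timings)


def measure_repo(repo, commands=GIT_COMMANDS):
    """Return a dict mapping each command line to its best time for `repo`."""
    return {" ".join(cmd): time_command(cmd, cwd=repo) for cmd in commands}
```

Calling measure_repo() with the path of a checkout after each larger
import of binaries would show whether the everyday operations slow down
as the repository grows.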

Best regards,
Dominik

On Wednesday, July 16, 2014 2:21:50 PM UTC+2, Dominik Rauch wrote:
>
> Hi!
>
> Although I've read a lot of resources concerning the topic "Handling 
> binary files with Git" I'm still confused. Hopefully you can help me to 
> find definitive answers :-)
>
> If you ask people about "Git and binary files" the answers are often: "Git 
> is an SCM and not a backup solution" or "dependencies should be referenced 
> via Maven/NuGet/etc. only". Regarding the second argument: it is not always 
> possible for us to do that, and the situation is (unfortunately) not going 
> to change anytime soon. Therefore we want to know Git's limits before 
> switching from Subversion to Git.
>
> Quantity of binary data: Some of our projects have up to 500MB of library 
> dependencies which are updated (in parts) every two to three weeks. 
> However, the files are not too big by themselves, they are around 250 files 
> ranging from 200KB to 10MB.
>
> Main question:
> The existence of tools like git-annex, git-fat, git-media, etc. hints that 
> Git has problems with binary files in some way. Although I've studied as 
> many internal docs as I could find, I could not find a clue why Git should 
> handle binary files any worse than Subversion did. - Yes the repository 
> size may get huge, however, initial cloning is a one-time process and does 
> not affect our company too much.
>
> Does Git even have a problem with binary files? What's the problem 
> exactly? How does Subversion handle this in a better way? Is it about 
> single files which are very large (e.g. 3D models with more than 500MB file 
> size) or are many small binary files a problem as well? Is it about initial 
> cloning time only or does it affect the everyday work (committing, 
> branching, etc.) as well?
>
> Note: we're using Git on Windows, if that's important in any way.
>
> Follow-up questions:
> (a) What's the maximum recommended repository overall size? We have 
> repositories (converted with svn2git and already compressed) ranging from 
> 1GB to 15GB. Is this already problematic assuming that everybody has an SSD?
> (b) What's the maximum *single* file size you would recommend? We don't 
> have any binary files larger than 10MB.
> (c) What's the maximum checkout size you would recommend? We have 
> repositories ranging from 1KB to around 1GB checkouts. Is this already 
> hitting the limits of Git on current modern developer computers?
> (d) What's the maximum number of files/directories you would recommend?
>
> Initial empirical tests have not shown any problems with one of our biggest 
> projects, checkout takes around one to three seconds, diffs settle at 
> around one second. Stashing takes a little longer, up to five seconds. The 
> values are not optimal compared to small (source-only) repositories, 
> however, still faster than Subversion has been. However, we're scared to 
> switch and run into big show stoppers soon after.
>
> Hope you can point out the specific problems we have to know about, thank 
> you!
>
> Best regards,
> Dominik
>
> (PS: I've asked similar questions on non-official forums a few weeks ago 
> and haven't got any satisfying answers)
>
