On Mon, 4 Apr 2016 13:50:26 -0700 (PDT) display_name_taken <[email protected]> wrote:
> Thanks for the post Konstantin.
> There won't be many concurrent users (say 10) hence not many cloning
> at the same time.
> My main requirement was using git as some type of storage space with
> versioning capability.

Well, Git is optimized for handling small-to-middle-sized files and assumes they don't change a lot between commits (this is understandable once you consider the use case it was created -- and tailored -- for: working on the Linux kernel source code).

Git's capability of "diffing" (comparing in a human-sensible way) the contents of files recorded in different commits also relies on these files being textual, where "textual" means some 8-bit encoding (typically UTF-8, but various ISO-* and Windows-* encodings would work as well).

So, while Git is able to work with large files, and it's able to work with binary files (including files containing text in weird encodings such as UTF-16/UCS-2 etc), these are not the things it's optimized for, and you might find yourself dancing around Git trying to make it do things with your data which you expected to get "for free".

What I'm actually leading you to is that you might not really need a full-blown version-control system, because those are typically tailored for working on source code and other "plain text" stuff. Hence you might consider light-weight solutions intended for versioned backups. I suggest looking at rdiff-backup, attic and obnam -- to name just a few.

All these systems allow you to periodically "push" a new state of a filesystem hierarchy rooted in a directory to "a server" (which might reside on the local machine). The server stores this new snapshot using various deduplication/delta compression techniques, and lets you inspect the list of "revisions" and extract files from any selected revision. A "winning" feature compared to Git is that in most cases they allow you to prune past revisions -- which might or might not be useful for your use case.
All in all, if all you're concerned with is disk storage, then Git is relatively OK with it -- provided your data does not change too much between adjacent commits. Obviously, if you store 100MB worth of data in a commit and then store 100MB of completely different data in the next commit, that second commit won't compress well against the previous one, and you'll end up with some 200MB of data in the repository.
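
To make the last point a bit more tangible, here is a rough Python sketch. It is only an analogy, not how Git stores objects internally: it primes zlib with the "previous version" as a preset dictionary to mimic compressing new data against old data. The names compressed_size, first, similar and different are made up for this illustration, and the sizes are kept tiny (16 KiB) so the old version fits into zlib's window.

```python
import os
import zlib

def compressed_size(data: bytes, base: bytes) -> int:
    """Size of `data` after zlib compression primed with `base`.

    Priming with a preset dictionary is a stand-in for delta compression
    against a previous version; it is an assumption of this sketch, not
    Git's actual pack format.
    """
    comp = zlib.compressobj(level=9, zdict=base)
    return len(comp.compress(data) + comp.flush())

SIZE = 16 * 1024

# "Commit 1": some incompressible payload (think: an already-compressed file).
first = os.urandom(SIZE)

# "Commit 2a": the same payload with a small edit in the middle.
similar = first[:8000] + b"a small edit" + first[8000:]

# "Commit 2b": completely different payload of the same size.
different = os.urandom(SIZE)

similar_cost = compressed_size(similar, base=first)
different_cost = compressed_size(different, base=first)

print(f"original size:             {SIZE} bytes")
print(f"slightly changed version:  {similar_cost} bytes on top of the first")
print(f"completely different data: {different_cost} bytes on top of the first")
```

The slightly changed version should cost only a small fraction of its size to store next to the original, while the completely different data costs roughly its full size again, which is the 100MB + 100MB = 200MB situation described above.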
