On Mon, 4 Apr 2016 13:50:26 -0700 (PDT)
display_name_taken <kenneth.kulathil...@gmail.com> wrote:
> Thanks for the post Konstantin.
> There won't be many concurrent users (say 10) hence not many cloning
> at the same time.
> My main requirement was using git as some type of storage space with
> versioning capability.
Well, Git is optimized for handling small-to-medium-sized files and
assumes they don't change a lot between commits (this is
understandable once you consider the use case it was created --
and tailored -- for: working on the Linux kernel source code).
Git's ability to "diff" (compare in a human-sensible way) the
contents of files recorded in different commits also relies
on these files being textual, where "textual" means some 8-bit encoding
(typically UTF-8 but various ISO-* and Windows-* encodings would work
as well).
So, while Git is able to work with large files, and it's able to work
with binary files (including files containing text in weird encodings
such as UTF-16/UCS-2 etc.), these are not the things it's optimized for,
and you might find yourself dancing around Git trying to make it do
things with your data which you expected to get "for free".
What I'm actually leading you to is that it might turn out you do
not really need a full-blown version-control system, because inherently
those are typically tailored for working on source code and other "plain
text" stuff. Hence you might consider light-weight solutions intended
for versioned backups. I suggest looking at rdiff-backup, attic and
obnam -- to name just a few. All these systems allow you to
periodically "push" a new state of a filesystem hierarchy rooted in a
directory to "a server" (which might reside on a local machine), which
would effectively store this new snapshot using various
deduplication/delta compression techniques. They also let you inspect the
list of "revisions" and extract files from any selected revision.
A "winning" feature compared to Git is that in most cases they allow you
to prune past revisions -- which might or might not be useful for your
case.
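
To make the "push a snapshot, list revisions, extract a file, prune"
workflow a bit more concrete, here is a toy sketch in Python. It is NOT how
rdiff-backup, attic or obnam are implemented; the class name SnapshotStore
and all of its methods are made up for illustration, and the only
deduplication it does is storing identical file contents once, keyed by
their SHA-256 hash.

```python
import hashlib


class SnapshotStore:
    """Toy in-memory snapshot store (hypothetical, for illustration only)."""

    def __init__(self):
        self.blobs = {}      # SHA-256 hex digest -> file content
        self.revisions = {}  # revision id -> {path: digest}
        self._next_id = 0

    def push(self, tree):
        """Record a new snapshot of {path: content}; return its revision id."""
        manifest = {}
        for path, content in tree.items():
            digest = hashlib.sha256(content).hexdigest()
            # Identical content is stored only once, whatever its path
            # or the revision it belongs to.
            self.blobs.setdefault(digest, content)
            manifest[path] = digest
        rev = self._next_id
        self._next_id += 1
        self.revisions[rev] = manifest
        return rev

    def list_revisions(self):
        return sorted(self.revisions)

    def extract(self, rev, path):
        """Return the content of `path` as it was in revision `rev`."""
        return self.blobs[self.revisions[rev][path]]

    def prune(self, rev):
        """Drop a revision and any blobs no remaining revision refers to."""
        del self.revisions[rev]
        live = {d for m in self.revisions.values() for d in m.values()}
        for digest in list(self.blobs):
            if digest not in live:
                del self.blobs[digest]


store = SnapshotStore()
r0 = store.push({"a.txt": b"hello", "b.bin": b"\x00" * 10})
r1 = store.push({"a.txt": b"hello, world", "b.bin": b"\x00" * 10})
blobs_before_prune = len(store.blobs)  # b.bin is shared by both revisions
old_a = store.extract(r0, "a.txt")
store.prune(r0)
blobs_after_prune = len(store.blobs)
```

The point of prune() here is the contrast with Git: dropping an old
revision is a first-class operation, and the space it alone occupied is
reclaimed right away.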
All-in-all, if all you're concerned with is disk storage then Git is
relatively OK with it -- provided your data does not change too much
between adjacent commits (obviously, if you store 100MB worth of data
in a commit and then store 100MB of completely different data in the
next commit, that second commit won't be well-compressible compared to
the previous one and you'll end up with some 200MB of data in the
repository).
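
A rough way to see this effect for yourself is the Python sketch below. It
is only an analogy: it uses zlib with the previous version supplied as a
preset dictionary, which is not Git's actual delta/packfile algorithm, and
the helper name stored_size is made up. The sizes are kept small (8000
bytes) so that the previous version fits into zlib's 32 KiB window.

```python
import random
import zlib


def stored_size(previous, current):
    """Compressed size of `current` when `previous` is already known.

    `previous` is used as a zlib preset dictionary, so parts of `current`
    that also occur in `previous` cost almost nothing to store.
    """
    comp = zlib.compressobj(level=9, zdict=previous)
    return len(comp.compress(current) + comp.flush())


rng = random.Random(0)  # fixed seed so the run is repeatable

# Pseudo-random bytes stand in for data that does not compress by itself.
base = bytes(rng.getrandbits(8) for _ in range(8000))

# Case 1: the next "commit" is the same data with a small edit in the middle.
slightly_changed = base[:4000] + b"a small edit" + base[4000:]

# Case 2: the next "commit" is completely different data of the same size.
unrelated = bytes(rng.getrandbits(8) for _ in range(8000))

small_delta = stored_size(base, slightly_changed)
large_delta = stored_size(base, unrelated)

print("slightly changed:", small_delta, "bytes")
print("unrelated:       ", large_delta, "bytes")
```

The first number should come out as a small fraction of the second, which
stays close to the full 8000 bytes: when nothing is shared with the
previous version, there is nothing for a delta to exploit.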