This option is just like --batch-check, but shows the
on-disk size rather than the true object size. In other
words, it makes the "disk_size" query of sha1_object_info_extended
available via the command-line.

This can be used for rough attribution of disk usage to
particular refs, though see the caveats in the documentation.

This patch does not include any tests, as the exact numbers
returned are volatile and subject to zlib and packing decisions.

Signed-off-by: Jeff King <>
---
I sort of tacked this onto the --batch-check format by replacing the
"real" object size with the on-disk size when this option is used. I'm
open to suggestions. Two other things I considered were:

  1. Having the option simply output an extra field with the on-disk
     size. But then you are paying for the true object size lookup, even
     if you don't necessarily care.

  2. Simply outputting the disk-size and object name. For my purposes, I
     do not care about the object type, and finding the type takes non-trivial
     resources (we have to walk delta chains to find the true type).

Perhaps we need

  git cat-file --batch-format="%(disk-size) %(object)"

or similar.
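
To make the cost argument in the two alternatives above concrete, here is a
minimal sketch of the "fill in only what the caller asked for" pattern that
the disk_sizep query relies on. Everything in it is a hypothetical stand-in
(mock_object, mock_object_info, mock_object_info_extended, and the
expensive_lookups counter are invented for illustration); it is not git's
real struct object_info or sha1_object_info_extended, and the "cost" is only
simulated.

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not git's real types. */
enum mock_type { MOCK_BLOB = 1, MOCK_TREE = 2 };

struct mock_object {
	enum mock_type type;
	unsigned long true_size; /* size of the fully reconstructed object */
	unsigned long disk_size; /* bytes the stored form occupies */
};

/* The caller sets only the pointers it cares about; NULL means "skip". */
struct mock_object_info {
	unsigned long *sizep;
	unsigned long *disk_sizep;
};

/* Counts how often the simulated "costly" true-size path was taken. */
static int expensive_lookups;

static enum mock_type
mock_object_info_extended(const struct mock_object *obj,
			  struct mock_object_info *oi)
{
	if (oi->sizep) {
		/* Assumption: stands in for the work of finding the true size. */
		expensive_lookups++;
		*oi->sizep = obj->true_size;
	}
	if (oi->disk_sizep)
		*oi->disk_sizep = obj->disk_size; /* simulated cheap path */
	return obj->type;
}
```

A caller that sets only disk_sizep never triggers the simulated true-size
path, which is the saving that alternative 1 (always printing both sizes)
would give up.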

 Documentation/git-cat-file.txt | 16 ++++++++++++++++
 builtin/cat-file.c             |  9 +++++++++
 2 files changed, 25 insertions(+)

diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
index 30d585a..d4af1fc 100644
--- a/Documentation/git-cat-file.txt
+++ b/Documentation/git-cat-file.txt
@@ -65,6 +65,22 @@ OPTIONS
        Print the SHA-1, type, and size of each object provided on stdin. May 
        be combined with any other options or arguments.
+
+--batch-disk-sizes::
+       Like `--batch-check`, but print the on-disk size of each object
+       (including zlib and delta compression) rather than the object's
+       true size. May not be combined with any other options or
+       arguments.
++
+NOTE: The on-disk size reported is accurate, but care should be taken in
+drawing conclusions about which refs or objects are responsible for disk
+usage. The size of a packed non-delta object may be much larger than the
+size of objects which delta against it, but the choice of which object
+is the base and which is the delta is arbitrary and is subject to change
+during a repack. Note also that multiple copies of an object may be
+present in the object database; in this case, it is undefined which
+copy's size will be reported.
+
 If '-t' is specified, one of the <type>.
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 045cee7..5112c64 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -15,6 +15,7 @@
 #define BATCH 1
 #define BATCH_CHECK 2
+#define BATCH_DISK_SIZES 3
 static int cat_one_file(int opt, const char *exp_type, const char *obj_name)
@@ -135,6 +136,11 @@ static int batch_one_object(const char *obj_name, int 
        if (print_contents == BATCH)
                contents = read_sha1_file(sha1, &type, &size);
+       else if (print_contents == BATCH_DISK_SIZES) {
+               struct object_info oi = {0};
+               oi.disk_sizep = &size;
+               type = sha1_object_info_extended(sha1, &oi);
+       }
        else
                type = sha1_object_info(sha1, &size);
@@ -206,6 +212,9 @@ int cmd_cat_file(int argc, const char **argv, const char 
                OPT_SET_INT(0, "batch-check", &batch,
                            N_("show info about objects fed from the standard 
+               OPT_SET_INT(0, "batch-disk-sizes", &batch,
+                           N_("show on-disk size of objects fed from standard 
+                           BATCH_DISK_SIZES),
