Re: [PATCH 1/3] repack: rewrite the shell script in C

2013-09-17 Thread Junio C Hamano
Stefan Beller  writes:

> On 09/17/2013 08:17 PM, Junio C Hamano wrote:
>> Stefan Beller  writes:
>> 
>>> +   struct option builtin_repack_options[] = {
>>> +   OPT_BIT('a', NULL, &pack_everything,
>>> +   N_("pack everything in a single pack"), 
>>> ALL_INTO_ONE),
>>> +   OPT_BIT('A', NULL, &pack_everything,
>>> +   N_("same as -a, and turn unreachable objects 
>>> loose"),
>>> +  LOOSEN_UNREACHABLE),
>> 
>> Micronit.
>> 
>> With the current version of the code in cmd_repack() that uses the
>> pack_everything variable this may not make a difference, but I think
>> this should logically be "LOOSEN_UNREACHABLE | ALL_INTO_ONE" instead,
>> and the code should check (pack_evertying & ALL_INTO_ONE) instead of
>> checking "!pack_everything".  You may want to add to this flag variable
>> a new bit that does _not_ cause it to pack everything into one.
>> 
>
> I do understand the "LOOSEN_UNREACHABLE | ALL_INTO_ONE" here, as that
> is the logical thing we are doing. Combined with your second idea this 
> would result in
> ---8<---
>
> From 4bbbfb312bf23efa7e702e200fbc2d4479e3477e Mon Sep 17 00:00:00 2001
> From: Stefan Beller 
> Date: Tue, 17 Sep 2013 22:04:35 +0200
> Subject: [PATCH 2/2] Suggestions by Junio
>
> Signed-off-by: Stefan Beller 
> ---
>  builtin/repack.c | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/builtin/repack.c b/builtin/repack.c
> index e5f90c6..a0ff5c7 100644
> --- a/builtin/repack.c
> +++ b/builtin/repack.c
> @@ -143,7 +143,7 @@ int cmd_repack(int argc, const char **argv, const char 
> *prefix)
>   N_("pack everything in a single pack"), 
> ALL_INTO_ONE),
>   OPT_BIT('A', NULL, &pack_everything,
>   N_("same as -a, and turn unreachable objects 
> loose"),
> -LOOSEN_UNREACHABLE),
> +LOOSEN_UNREACHABLE | ALL_INTO_ONE),
>   OPT_BOOL('d', NULL, &delete_redundant,
>   N_("remove redundant packs, and run 
> git-prune-packed")),
>   OPT_BOOL('f', NULL, &no_reuse_delta,
> @@ -197,10 +197,7 @@ int cmd_repack(int argc, const char **argv, const char 
> *prefix)
>   if (no_reuse_object)
>   argv_array_pushf(&cmd_args, "--no-reuse-object");
>  
> - if (!pack_everything) {
> - argv_array_push(&cmd_args, "--unpacked");
> - argv_array_push(&cmd_args, "--incremental");
> - } else {
> + if (pack_everything & ALL_INTO_ONE) {
>   get_non_kept_pack_filenames(&existing_packs);
>  
>   if (existing_packs.nr && delete_redundant) {
> @@ -212,6 +209,9 @@ int cmd_repack(int argc, const char **argv, const char 
> *prefix)
>   argv_array_push(&cmd_args,
>   "--unpack-unreachable");
>   }
> + } else {
> + argv_array_push(&cmd_args, "--unpacked");
> + argv_array_push(&cmd_args, "--incremental");
>   }
>  
>   if (local)
> -- 
> 1.8.4.273.ga194ead

Yes.

Above was what I would have expected from a straight *.sh to *.c
conversion.

But I didn't think about the change in the patch below.

> However I assume you mean to even ease up the conditions now, because now
> both -a as well as -A set ALL_INTO_ONE we could apply the following 
> on top of the previous.
> ---8<---
>
> From 80199368ab6c7ab72f81a5c531f79073a99d2498 Mon Sep 17 00:00:00 2001
> From: Stefan Beller 
> Date: Tue, 17 Sep 2013 22:11:08 +0200
> Subject: [PATCH] Further improvements by reducing nested ifs
>
> This may pass --unpacked and --unpack-unreachable to pack-objects in one
> command, which is redundant. On the other hand we may gain simplicity in
> repack.
>
> Signed-off-by: Stefan Beller 
> ---
>  builtin/repack.c | 24 
>  1 file changed, 12 insertions(+), 12 deletions(-)
>
> diff --git a/builtin/repack.c b/builtin/repack.c
> index a0ff5c7..3e56614 100644
> --- a/builtin/repack.c
> +++ b/builtin/repack.c
> @@ -197,23 +197,23 @@ int cmd_repack(int argc, const char **argv, const char 
> *prefix)
>   if (no_reuse_object)
>   argv_array_pushf(&cmd_args, "--no-reuse-object");
>  
> - if (pack_everything & ALL_INTO_ONE) {
> + if (pack_everything & ALL_INTO_ONE)
>   get_non_kept_pack_filenames(&existing_packs);
> -
> - if (existing_packs.nr && delete_redundant) {
> - if (unpack_unreachable)
> - argv_array_pushf(&cmd_args,
> - "--unpack-unreachable=%s",
> - unpack_unreachable);
> - else if (pack_everything & LOOSEN_UNREACHABLE)
> - argv_array_push(&cmd_args,
> - "--unpack

Re: [PATCH 1/3] repack: rewrite the shell script in C

2013-09-17 Thread Stefan Beller
On 09/17/2013 08:17 PM, Junio C Hamano wrote:
> Stefan Beller  writes:
> 
>> +struct option builtin_repack_options[] = {
>> +OPT_BIT('a', NULL, &pack_everything,
>> +N_("pack everything in a single pack"), 
>> ALL_INTO_ONE),
>> +OPT_BIT('A', NULL, &pack_everything,
>> +N_("same as -a, and turn unreachable objects 
>> loose"),
>> +   LOOSEN_UNREACHABLE),
> 
> Micronit.
> 
> With the current version of the code in cmd_repack() that uses the
> pack_everything variable this may not make a difference, but I think
> this should logically be "LOOSEN_UNREACHABLE | ALL_INTO_ONE" instead,
> and the code should check (pack_evertying & ALL_INTO_ONE) instead of
> checking "!pack_everything".  You may want to add to this flag variable
> a new bit that does _not_ cause it to pack everything into one.
> 

I do understand the "LOOSEN_UNREACHABLE | ALL_INTO_ONE" here, as that
is the logical thing we are doing. Combined with your second idea this 
would result in
---8<---

From 4bbbfb312bf23efa7e702e200fbc2d4479e3477e Mon Sep 17 00:00:00 2001
From: Stefan Beller 
Date: Tue, 17 Sep 2013 22:04:35 +0200
Subject: [PATCH 2/2] Suggestions by Junio

Signed-off-by: Stefan Beller 
---
 builtin/repack.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/builtin/repack.c b/builtin/repack.c
index e5f90c6..a0ff5c7 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -143,7 +143,7 @@ int cmd_repack(int argc, const char **argv, const char 
*prefix)
N_("pack everything in a single pack"), 
ALL_INTO_ONE),
OPT_BIT('A', NULL, &pack_everything,
N_("same as -a, and turn unreachable objects 
loose"),
-  LOOSEN_UNREACHABLE),
+  LOOSEN_UNREACHABLE | ALL_INTO_ONE),
OPT_BOOL('d', NULL, &delete_redundant,
N_("remove redundant packs, and run 
git-prune-packed")),
OPT_BOOL('f', NULL, &no_reuse_delta,
@@ -197,10 +197,7 @@ int cmd_repack(int argc, const char **argv, const char 
*prefix)
if (no_reuse_object)
argv_array_pushf(&cmd_args, "--no-reuse-object");
 
-   if (!pack_everything) {
-   argv_array_push(&cmd_args, "--unpacked");
-   argv_array_push(&cmd_args, "--incremental");
-   } else {
+   if (pack_everything & ALL_INTO_ONE) {
get_non_kept_pack_filenames(&existing_packs);
 
if (existing_packs.nr && delete_redundant) {
@@ -212,6 +209,9 @@ int cmd_repack(int argc, const char **argv, const char 
*prefix)
argv_array_push(&cmd_args,
"--unpack-unreachable");
}
+   } else {
+   argv_array_push(&cmd_args, "--unpacked");
+   argv_array_push(&cmd_args, "--incremental");
}
 
if (local)
-- 
1.8.4.273.ga194ead


However I assume you mean to even ease up the conditions now, because now
both -a as well as -A set ALL_INTO_ONE we could apply the following 
on top of the previous.
---8<---

From 80199368ab6c7ab72f81a5c531f79073a99d2498 Mon Sep 17 00:00:00 2001
From: Stefan Beller 
Date: Tue, 17 Sep 2013 22:11:08 +0200
Subject: [PATCH] Further improvements by reducing nested ifs

This may pass --unpacked and --unpack-unreachable to pack-objects in one
command, which is redundant. On the other hand we may gain simplicity in
repack.

Signed-off-by: Stefan Beller 
---
 builtin/repack.c | 24 
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/builtin/repack.c b/builtin/repack.c
index a0ff5c7..3e56614 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -197,23 +197,23 @@ int cmd_repack(int argc, const char **argv, const char 
*prefix)
if (no_reuse_object)
argv_array_pushf(&cmd_args, "--no-reuse-object");
 
-   if (pack_everything & ALL_INTO_ONE) {
+   if (pack_everything & ALL_INTO_ONE)
get_non_kept_pack_filenames(&existing_packs);
-
-   if (existing_packs.nr && delete_redundant) {
-   if (unpack_unreachable)
-   argv_array_pushf(&cmd_args,
-   "--unpack-unreachable=%s",
-   unpack_unreachable);
-   else if (pack_everything & LOOSEN_UNREACHABLE)
-   argv_array_push(&cmd_args,
-   "--unpack-unreachable");
-   }
-   } else {
+   else {
argv_array_push(&cmd_args, "--unpacked");
argv_array_push(&cmd_args, "--incremental");
}
 
+   if (existing_packs.nr && delete_redundant) {
+   if (unpack_unreachable)
+   

Re: [PATCH 1/3] repack: rewrite the shell script in C

2013-09-17 Thread Junio C Hamano
Stefan Beller  writes:

> + struct option builtin_repack_options[] = {
> + OPT_BIT('a', NULL, &pack_everything,
> + N_("pack everything in a single pack"), 
> ALL_INTO_ONE),
> + OPT_BIT('A', NULL, &pack_everything,
> + N_("same as -a, and turn unreachable objects 
> loose"),
> +LOOSEN_UNREACHABLE),

Micronit.

With the current version of the code in cmd_repack() that uses the
pack_everything variable this may not make a difference, but I think
this should logically be "LOOSEN_UNREACHABLE | ALL_INTO_ONE" instead,
and the code should check (pack_evertying & ALL_INTO_ONE) instead of
checking "!pack_everything".  You may want to add to this flag variable
a new bit that does _not_ cause it to pack everything into one.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] repack: rewrite the shell script in C

2013-09-17 Thread Junio C Hamano
Ramkumar Ramachandra  writes:

> Is $GIT_OBJECT_DIRECTORY a standard variable, or should it be
> $GIT_DIR/objects?

"man git" ;-)  It has been there since early May 2005
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] repack: rewrite the shell script in C

2013-09-15 Thread Ramkumar Ramachandra
Stefan Beller wrote:
>  Makefile   |   2 +-
>  builtin.h  |   1 +
>  builtin/repack.c   | 387 
> +
>  contrib/examples/git-repack.sh | 194 +
>  git-repack.sh  | 194 -
>  git.c  |   1 +
>  6 files changed, 584 insertions(+), 195 deletions(-)
>  create mode 100644 builtin/repack.c
>  create mode 100755 contrib/examples/git-repack.sh
>  delete mode 100755 git-repack.sh

Looks like repack.c is significantly larger than git-repack.sh. I look
forward to reading the code.

> diff --git a/builtin/repack.c b/builtin/repack.c
> new file mode 100644
> index 000..a15bd9b
> --- /dev/null
> +++ b/builtin/repack.c
> @@ -0,0 +1,387 @@
> +#include "builtin.h"
> +#include "cache.h"
> +#include "dir.h"
> +#include "parse-options.h"
> +#include "run-command.h"
> +#include "sigchain.h"
> +#include "strbuf.h"
> +#include "string-list.h"
> +#include "argv-array.h"
> +
> +static int delta_base_offset = 1;
> +static char *packdir, *packtmp;
> +
> +static const char *const git_repack_usage[] = {
> +   N_("git repack [options]"),
> +   NULL
> +};
> +
> +static int repack_config(const char *var, const char *value, void *cb)
> +{
> +   if (!strcmp(var, "repack.usedeltabaseoffset")) {
> +   delta_base_offset = git_config_bool(var, value);
> +   return 0;
> +   }
> +   return git_default_config(var, value, cb);
> +}

Configuration option: one bool. I wonder what other configuration
options the future will bring in.

> +/*
> + * Remove temporary $GIT_OBJECT_DIRECTORY/pack/.tmp-$$-pack-* files.
> + */

Is $GIT_OBJECT_DIRECTORY a standard variable, or should it be
$GIT_DIR/objects? A quick grep tells me that there are a few
references to it, but I'm yet to be convinced that it can be something
other than $GIT_DIR/objects.

> +static void remove_temporary_files(void)
> +{
> +   struct strbuf buf = STRBUF_INIT;
> +   size_t dirlen, prefixlen;
> +   DIR *dir;
> +   struct dirent *e;
> +
> +   dir = opendir(packdir);
> +   if (!dir)
> +   return;

Wait a minute: where did we initalize packdir? Ah, it's static, so it
must have been initalized before the function was called (in some sort
of setup function), right? Why don't you return an int from the
function so it's possible to differentiate between success and
failure?

> +   /* Point at the slash at the end of ".../objects/pack/" */

Is packdir a relative or absolute path? The ... isn't helping.

> +   dirlen = strlen(packdir) + 1;
> +   strbuf_addstr(&buf, packtmp);

Mysterious initalization of packtmp.

> +   /* Hold the length of  ".tmp-%d-pack-" */
> +   prefixlen = buf.len - dirlen;

Okay, so that's what packtmp contains: a reading of repack.sh told me
that you didn't even change the variable names.

> +   while ((e = readdir(dir))) {
> +   if (strncmp(e->d_name, buf.buf + dirlen, prefixlen))
> +   continue;

Skip the dentry that points to the .tmp-* thing.

> +   strbuf_setlen(&buf, dirlen);
> +   strbuf_addstr(&buf, e->d_name);
> +   unlink(buf.buf);
> +   }
> +   closedir(dir);
> +   strbuf_release(&buf);
> +}

Okay.

> +static void remove_pack_on_signal(int signo)
> +{
> +   remove_temporary_files();
> +   sigchain_pop(signo);
> +   raise(signo);
> +}

Okay, although I'd have named the variable "signal", not "signo".

> +/*
> + * Adds all packs hex strings to the fname list, which do not
> + * have a corresponding .keep file.
> + */

The packs which don't have a corresponding .keep file must be removed/
repacked. Okay.

> +static void get_non_kept_pack_filenames(struct string_list *fname_list)
> +{
> +   DIR *dir;
> +   struct dirent *e;
> +   char *fname;

Filename, I assume.

> +   size_t len;
> +
> +   if (!(dir = opendir(packdir)))
> +   return;

An int return, please.

> +   while ((e = readdir(dir)) != NULL) {
> +   if (suffixcmp(e->d_name, ".pack"))
> +   continue;

If we didn't find a .pack file, skip over?

> +   len = strlen(e->d_name) - strlen(".pack");
> +   fname = xmemdupz(e->d_name, len);

You can probably use the pathbufs to save some memory here, but I
wouldn't worry about it at this stage.

> +   if (!file_exists(mkpath("%s/%s.keep", packdir, fname)))
> +   string_list_append_nodup(fname_list, fname);
> +   else
> +   free(fname);
> +   }
> +   closedir(dir);
> +}

Okay.

> +static void remove_redundant_pack(const char *path_prefix, const char *hex)
> +{
> +   const char *exts[] = {".pack", ".idx", ".keep"};
> +   int i;
> +   struct strbuf buf = STRBUF_INIT;
> +   size_t plen;
> +
> +   strbuf_addf(&buf, "%s/%s", path_p