Re: [PATCH v2 01/12] commit-graph: add 'verify' subcommand

2018-05-20 Thread Jakub Narebski
Derrick Stolee  writes:

> If the commit-graph file becomes corrupt, we need a way to verify
> that its contents match the object database. In the manner of
> 'git fsck' we will implement a 'git commit-graph verify' subcommand
> to report all issues with the file.
>
> Add the 'verify' subcommand to the 'commit-graph' builtin and its
> documentation. The subcommand is currently a no-op except for
> loading the commit-graph into memory, which may trigger run-time
> errors that would be caught by normal use. Add a simple test that
> ensures the command returns a zero error code.
>
> If no commit-graph file exists, this is an acceptable state. Do
> not report any errors.

All right.  Nice introductory patch.

>
> During review, we noticed that a FREE_AND_NULL(graph_name) was
> placed after a possible 'return', and this pattern was also in
> graph_read(). Fix that case, too.

This should probably be a separate [micro-]patch.  Especially as Martin
Ågren noticed it is not correct...

>
> Signed-off-by: Derrick Stolee 
> ---
>  Documentation/git-commit-graph.txt |  6 ++
>  builtin/commit-graph.c | 40 
> +-
>  commit-graph.c |  5 +
>  commit-graph.h |  2 ++
>  t/t5318-commit-graph.sh| 10 ++
>  5 files changed, 62 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/git-commit-graph.txt 
> b/Documentation/git-commit-graph.txt
> index 4c97b555cc..a222cfab08 100644
> --- a/Documentation/git-commit-graph.txt
> +++ b/Documentation/git-commit-graph.txt
> @@ -10,6 +10,7 @@ SYNOPSIS
>  
>  [verse]
>  'git commit-graph read' [--object-dir ]
> +'git commit-graph verify' [--object-dir ]
>  'git commit-graph write'  [--object-dir ]
>  
>  
> @@ -52,6 +53,11 @@ existing commit-graph file.
>  Read a graph file given by the commit-graph file and output basic
>  details about the graph file. Used for debugging purposes.
>  
> +'verify'::
> +
> +Read the commit-graph file and verify its contents against the object
> +database. Used to check for corrupted data.

I wonder if it would be useful to have an option to verify commit-graph
file without accessing the object database, checking just that it is
well formed.

Anyway, it could be added later, if needed.

> +
>  
>  EXAMPLES
>  
> diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
> index 37420ae0fd..af3101291f 100644
> --- a/builtin/commit-graph.c
> +++ b/builtin/commit-graph.c
> @@ -8,10 +8,16 @@
>  static char const * const builtin_commit_graph_usage[] = {
>   N_("git commit-graph [--object-dir ]"),
>   N_("git commit-graph read [--object-dir ]"),
> + N_("git commit-graph verify [--object-dir ]"),
>   N_("git commit-graph write [--object-dir ] [--append] 
> [--stdin-packs|--stdin-commits]"),
>   NULL
>  };
>  
> +static const char * const builtin_commit_graph_verify_usage[] = {
> + N_("git commit-graph verify [--object-dir ]"),
> + NULL
> +};
> +
>  static const char * const builtin_commit_graph_read_usage[] = {
>   N_("git commit-graph read [--object-dir ]"),
>   NULL
> @@ -29,6 +35,36 @@ static struct opts_commit_graph {
>   int append;
>  } opts;
>  
> +
> +static int graph_verify(int argc, const char **argv)

A reminder for myself: exit code 0 means no errors.

> +{
> + struct commit_graph *graph = 0;
> + char *graph_name;
> +
> + static struct option builtin_commit_graph_verify_options[] = {
> + OPT_STRING(0, "object-dir", _dir,
> +N_("dir"),
> +N_("The object directory to store the graph")),
> + OPT_END(),
> + };
> +
> + argc = parse_options(argc, argv, NULL,
> +  builtin_commit_graph_verify_options,
> +  builtin_commit_graph_verify_usage, 0);
> +
> + if (!opts.obj_dir)
> + opts.obj_dir = get_object_directory();

All right, simple handling of a subcommand and its options.


I still think that '--object-dir=' should be a git wrapper option,
like '--git-dir=' and '--work-tree=' (and
'--namespace=') are.  It would be command-line option equivalent
to the GIT_OBJECT_DIRECTORY environment variable, just like
--git-dir= is for GIT_DIR, and --work-tree= is for
GIT_WORK_TREE, etc.

This way the code would be implemented once for all commands, and there
would be no duplicated code for each git-commit-graph subcommand.

But that may be a matter of a separate patch.

> +
> + graph_name = get_commit_graph_filename(opts.obj_dir);

This returns full path of commit-graph file, allocating it.

> + graph = load_commit_graph_one(graph_name);

This reads the file (no checking if core.commitGraph is set, no handling
of alternatives), and verifies that:
 - file is not too small, i.e. smaller than GRAPH_MIN_SIZE
 - it has correct signature
 - it has correct graph version
 - it has correct hash 

Re: [PATCH v2 01/12] commit-graph: add 'verify' subcommand

2018-05-14 Thread Derrick Stolee

On 5/12/2018 9:31 AM, Martin Ågren wrote:

On 11 May 2018 at 23:15, Derrick Stolee  wrote:


 graph_name = get_commit_graph_filename(opts.obj_dir);
 graph = load_commit_graph_one(graph_name);
+   FREE_AND_NULL(graph_name);

 if (!graph)
 die("graph file %s does not exist", graph_name);
-   FREE_AND_NULL(graph_name);

This is probably because of something I said, but this does not look
correct. The `die()` would typically print "(null)" or segfault. If the
`die()` means we don't free `graph_name`, that should be fine.

You're still leaking `graph` here (possibly nothing this patch should
worry about) and in `graph_verify()`. UNLEAK-ing it immediately before
calling `verify_commit_graph()` should be ok. I also think punting on
this UNLEAK-business entirely would be ok; I was just a bit surprised to
see one variable getting freed and the other one ignored.


Thanks, Martin. I was just blindly searching for FREE_AND_NULL() and 
shouldn't have been so careless.


-Stolee


Re: [PATCH v2 01/12] commit-graph: add 'verify' subcommand

2018-05-12 Thread Martin Ågren
On 11 May 2018 at 23:15, Derrick Stolee  wrote:

> graph_name = get_commit_graph_filename(opts.obj_dir);
> graph = load_commit_graph_one(graph_name);
> +   FREE_AND_NULL(graph_name);
>
> if (!graph)
> die("graph file %s does not exist", graph_name);
> -   FREE_AND_NULL(graph_name);

This is probably because of something I said, but this does not look
correct. The `die()` would typically print "(null)" or segfault. If the
`die()` means we don't free `graph_name`, that should be fine.

You're still leaking `graph` here (possibly nothing this patch should
worry about) and in `graph_verify()`. UNLEAK-ing it immediately before
calling `verify_commit_graph()` should be ok. I also think punting on
this UNLEAK-business entirely would be ok; I was just a bit surprised to
see one variable getting freed and the other one ignored.

Martin


[PATCH v2 01/12] commit-graph: add 'verify' subcommand

2018-05-11 Thread Derrick Stolee
If the commit-graph file becomes corrupt, we need a way to verify
that its contents match the object database. In the manner of
'git fsck' we will implement a 'git commit-graph verify' subcommand
to report all issues with the file.

Add the 'verify' subcommand to the 'commit-graph' builtin and its
documentation. The subcommand is currently a no-op except for
loading the commit-graph into memory, which may trigger run-time
errors that would be caught by normal use. Add a simple test that
ensures the command returns a zero error code.

If no commit-graph file exists, this is an acceptable state. Do
not report any errors.

During review, we noticed that a FREE_AND_NULL(graph_name) was
placed after a possible 'return', and this pattern was also in
graph_read(). Fix that case, too.

Signed-off-by: Derrick Stolee 
---
 Documentation/git-commit-graph.txt |  6 ++
 builtin/commit-graph.c | 40 +-
 commit-graph.c |  5 +
 commit-graph.h |  2 ++
 t/t5318-commit-graph.sh| 10 ++
 5 files changed, 62 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-commit-graph.txt 
b/Documentation/git-commit-graph.txt
index 4c97b555cc..a222cfab08 100644
--- a/Documentation/git-commit-graph.txt
+++ b/Documentation/git-commit-graph.txt
@@ -10,6 +10,7 @@ SYNOPSIS
 
 [verse]
 'git commit-graph read' [--object-dir ]
+'git commit-graph verify' [--object-dir ]
 'git commit-graph write'  [--object-dir ]
 
 
@@ -52,6 +53,11 @@ existing commit-graph file.
 Read a graph file given by the commit-graph file and output basic
 details about the graph file. Used for debugging purposes.
 
+'verify'::
+
+Read the commit-graph file and verify its contents against the object
+database. Used to check for corrupted data.
+
 
 EXAMPLES
 
diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
index 37420ae0fd..af3101291f 100644
--- a/builtin/commit-graph.c
+++ b/builtin/commit-graph.c
@@ -8,10 +8,16 @@
 static char const * const builtin_commit_graph_usage[] = {
N_("git commit-graph [--object-dir ]"),
N_("git commit-graph read [--object-dir ]"),
+   N_("git commit-graph verify [--object-dir ]"),
N_("git commit-graph write [--object-dir ] [--append] 
[--stdin-packs|--stdin-commits]"),
NULL
 };
 
+static const char * const builtin_commit_graph_verify_usage[] = {
+   N_("git commit-graph verify [--object-dir ]"),
+   NULL
+};
+
 static const char * const builtin_commit_graph_read_usage[] = {
N_("git commit-graph read [--object-dir ]"),
NULL
@@ -29,6 +35,36 @@ static struct opts_commit_graph {
int append;
 } opts;
 
+
+static int graph_verify(int argc, const char **argv)
+{
+   struct commit_graph *graph = 0;
+   char *graph_name;
+
+   static struct option builtin_commit_graph_verify_options[] = {
+   OPT_STRING(0, "object-dir", _dir,
+  N_("dir"),
+  N_("The object directory to store the graph")),
+   OPT_END(),
+   };
+
+   argc = parse_options(argc, argv, NULL,
+builtin_commit_graph_verify_options,
+builtin_commit_graph_verify_usage, 0);
+
+   if (!opts.obj_dir)
+   opts.obj_dir = get_object_directory();
+
+   graph_name = get_commit_graph_filename(opts.obj_dir);
+   graph = load_commit_graph_one(graph_name);
+   FREE_AND_NULL(graph_name);
+
+   if (!graph)
+   return 0;
+
+   return verify_commit_graph(graph);
+}
+
 static int graph_read(int argc, const char **argv)
 {
struct commit_graph *graph = NULL;
@@ -50,10 +86,10 @@ static int graph_read(int argc, const char **argv)
 
graph_name = get_commit_graph_filename(opts.obj_dir);
graph = load_commit_graph_one(graph_name);
+   FREE_AND_NULL(graph_name);
 
if (!graph)
die("graph file %s does not exist", graph_name);
-   FREE_AND_NULL(graph_name);
 
printf("header: %08x %d %d %d %d\n",
ntohl(*(uint32_t*)graph->data),
@@ -160,6 +196,8 @@ int cmd_commit_graph(int argc, const char **argv, const 
char *prefix)
 PARSE_OPT_STOP_AT_NON_OPTION);
 
if (argc > 0) {
+   if (!strcmp(argv[0], "verify"))
+   return graph_verify(argc, argv);
if (!strcmp(argv[0], "read"))
return graph_read(argc, argv);
if (!strcmp(argv[0], "write"))
diff --git a/commit-graph.c b/commit-graph.c
index a8c337dd77..b25aaed128 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -817,3 +817,8 @@ void write_commit_graph(const char *obj_dir,
oids.alloc = 0;
oids.nr = 0;
 }
+
+int verify_commit_graph(struct commit_graph *g)
+{
+   return !g;
+}
diff --git a/commit-graph.h b/commit-graph.h
index