Re: [PATCH 2/7] move setting of object-type to alloc_* functions

2014-07-14 Thread Ramsay Jones
On 14/07/14 06:57, Jeff King wrote:
 On Sun, Jul 13, 2014 at 08:27:51PM +0100, Ramsay Jones wrote:
 
 Thinking on this more, writing out the definitions is the only sane
 thing to do here, now that alloc_commit_node does not use the macro.
 Otherwise you are inviting people to modify the macro, but fail to
 notice that the commit allocator also needs updating.

 Hmm, well I could argue that using the macro for all allocators, apart
 from alloc_commit_node(), clearly shows which allocator is the odd-man
 out (and conversely, that all others are the same)! :-P

 No, I don't think this is a telling advantage; I don't think it makes
 that much difference. (six of one, half-a-dozen of the other).
 
 Yeah, I agree with your final statement in parentheses. I am OK with it
 either way (but I have a slight preference for what I posted).

Yep, once again, this looks good to me. :-)

 I was slightly concerned, when reading through this new series, that the
 alloc_node() function may no longer be inlined in the new allocators.
 However, I have just tested on Linux (only using gcc this time), and it
 was just fine. I will test the new series on the above systems later
 (probably tomorrow) but don't expect to find any problems.
 
 That should not be due to my patches (which are just expanding macros),
 but rather to your 1/8, right?

Ah, no. Over the years, I've used many compilers which would happily
inline this:

void *alloc_blob_node(void)
{
return alloc_node(blob_state, sizeof(struct blob));
}

but not this:

void *alloc_blob_node(void)
{
struct blob *b = alloc_node(blob_state, sizeof(struct blob));
return b;
}

Admittedly, it has been many years since I've had to use such a compiler,
but old habits die hard and I always check! (The compiler is not obliged
to inline the code anyway).

 I do not know that it matters that much anyway. Yes, we allocate a lot
 of objects in some workloads. But I think it is not so tight a loop that
 the extra function call is going to kill us (and we tend to _read_ the
 allocated objects much more than we allocate them).

Yes, my testing indicates that it wouldn't be noticeable if it wasn't
inlined, at least on my git.git repo. (The largest repo I have is an
ffmpeg repo, which is only about twice the size of git.git; but I used
git.git for the timing/profiling tests). So, I'm always slightly nervous
about affecting users of really large repos.

I don't think it will be an issue either way (famous last words!). ;-)

ATB,
Ramsay Jones


--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/7] move setting of object-type to alloc_* functions

2014-07-13 Thread Jeff King
On Sat, Jul 12, 2014 at 02:05:39PM -0400, Jeff King wrote:

  I don't particularly like 'flag' here. (not a massive dislike, mind you:)
  
  Perhaps: flag-object_type, type-node_type?
  Or, if that's too verbose, maybe just: flag-type, type-node?
 
 Me either, but as you noticed, type was taken. Your suggestions seem
 fine. We could also just do away with the macro as discussed earlier (we
 already do in the commit_node case, anyway...).

Thinking on this more, writing out the definitions is the only sane
thing to do here, now that alloc_commit_node does not use the macro.
Otherwise you are inviting people to modify the macro, but fail to
notice that the commit allocator also needs updating.

Here's a re-roll. The interesting bit is the addition of the second
patch (but the rest needed to be rebased on top).

  [1/8]: alloc.c: remove the alloc_raw_commit_node() function
  [2/8]: alloc: write out allocator definitions
  [3/8]: move setting of object-type to alloc_* functions
  [4/8]: parse_object_buffer: do not set object type
  [5/8]: add object_as_type helper for casting objects
  [6/8]: alloc: factor out commit index
  [7/8]: object_as_type: set commit index
  [8/8]: diff-tree: avoid lookup_unknown_object

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/7] move setting of object-type to alloc_* functions

2014-07-13 Thread Ramsay Jones
On 13/07/14 07:41, Jeff King wrote:
 On Sat, Jul 12, 2014 at 02:05:39PM -0400, Jeff King wrote:
 
 I don't particularly like 'flag' here. (not a massive dislike, mind you:)

 Perhaps: flag-object_type, type-node_type?
 Or, if that's too verbose, maybe just: flag-type, type-node?

 Me either, but as you noticed, type was taken. Your suggestions seem
 fine. We could also just do away with the macro as discussed earlier (we
 already do in the commit_node case, anyway...).
 
 Thinking on this more, writing out the definitions is the only sane
 thing to do here, now that alloc_commit_node does not use the macro.
 Otherwise you are inviting people to modify the macro, but fail to
 notice that the commit allocator also needs updating.

Hmm, well I could argue that using the macro for all allocators, apart
from alloc_commit_node(), clearly shows which allocator is the odd-man
out (and conversely, that all others are the same)! :-P

No, I don't think this is a telling advantage; I don't think it makes
that much difference. (six of one, half-a-dozen of the other).

BTW, I tested the previous series on Linux 32-bit, Cygwin 32-bit, MinGW
32-bit and Cygwin 64-bit. (I can't test on Linux 64-bit, since I can't
get Linux installed on my new laptop :( ). Admittedly, the testing on
MinGW and Cygwin was only fairly light (it takes *hours* to run the full
testsuite, and I just don't have the time).

I was slightly concerned, when reading through this new series, that the
alloc_node() function may no longer be inlined in the new allocators.
However, I have just tested on Linux (only using gcc this time), and it
was just fine. I will test the new series on the above systems later
(probably tomorrow) but don't expect to find any problems.

 
 Here's a re-roll. The interesting bit is the addition of the second
 patch (but the rest needed to be rebased on top).

Yep, this looks good. Thanks!

 
   [1/8]: alloc.c: remove the alloc_raw_commit_node() function
   [2/8]: alloc: write out allocator definitions
   [3/8]: move setting of object-type to alloc_* functions
   [4/8]: parse_object_buffer: do not set object type
   [5/8]: add object_as_type helper for casting objects
   [6/8]: alloc: factor out commit index
   [7/8]: object_as_type: set commit index
   [8/8]: diff-tree: avoid lookup_unknown_object

ATB,
Ramsay Jones



--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/7] move setting of object-type to alloc_* functions

2014-07-13 Thread Jeff King
On Sun, Jul 13, 2014 at 08:27:51PM +0100, Ramsay Jones wrote:

  Thinking on this more, writing out the definitions is the only sane
  thing to do here, now that alloc_commit_node does not use the macro.
  Otherwise you are inviting people to modify the macro, but fail to
  notice that the commit allocator also needs updating.
 
 Hmm, well I could argue that using the macro for all allocators, apart
 from alloc_commit_node(), clearly shows which allocator is the odd-man
 out (and conversely, that all others are the same)! :-P
 
 No, I don't think this is a telling advantage; I don't think it makes
 that much difference. (six of one, half-a-dozen of the other).

Yeah, I agree with your final statement in parentheses. I am OK with it
either way (but I have a slight preference for what I posted).

 I was slightly concerned, when reading through this new series, that the
 alloc_node() function may no longer be inlined in the new allocators.
 However, I have just tested on Linux (only using gcc this time), and it
 was just fine. I will test the new series on the above systems later
 (probably tomorrow) but don't expect to find any problems.

That should not be due to my patches (which are just expanding macros),
but rather to your 1/8, right?

I do not know that it matters that much anyway. Yes, we allocate a lot
of objects in some workloads. But I think it is not so tight a loop that
the extra function call is going to kill us (and we tend to _read_ the
allocated objects much more than we allocate them).

  Here's a re-roll. The interesting bit is the addition of the second
  patch (but the rest needed to be rebased on top).
 
 Yep, this looks good. Thanks!

Thanks for reviewing, as usual.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/7] move setting of object-type to alloc_* functions

2014-07-12 Thread Ramsay Jones
On 11/07/14 09:46, Jeff King wrote:
 The struct object type implements basic object
 polymorphism.  Individual instances are allocated as
 concrete types (or as a union type that can store any
 object), and a struct object * can be cast into its real
 type after examining its type enum.  This means it is
 dangerous to have a type field that does not match the
 allocation (e.g., setting the type field of a struct blob
 to OBJ_COMMIT would mean that a reader might read past the
 allocated memory).
 
 In most of the current code this is not a problem; the first
 thing we do after allocating an object is usually to set its
 type field by passing it to create_object. However, the
 virtual commits we create in merge-recursive.c do not ever
 get their type set. This does not seem to have caused
 problems in practice, though (presumably because we always
 pass around a struct commit pointer and never even look at
 the type).
 
 We can fix this oversight and also make it harder for future
 code to get it wrong by setting the type directly in the
 object allocation functions.
 
 This will also make it easier to fix problems with commit
 index allocation, as we know that any object allocated by
 alloc_commit_node will meet the invariant that an object
 with an OBJ_COMMIT type field will have a unique index
 number.
 
 Signed-off-by: Jeff King p...@peff.net
 ---
  alloc.c | 18 ++
  blob.c  |  2 +-
  builtin/blame.c |  1 -
  commit.c|  2 +-
  object.c|  5 ++---
  object.h|  2 +-
  tag.c   |  2 +-
  tree.c  |  2 +-
  8 files changed, 17 insertions(+), 17 deletions(-)
 
 diff --git a/alloc.c b/alloc.c
 index d7c3605..fd5fcb7 100644
 --- a/alloc.c
 +++ b/alloc.c
 @@ -18,11 +18,11 @@
  
  #define BLOCKING 1024
  
 -#define DEFINE_ALLOCATOR(name, type) \
 +#define DEFINE_ALLOCATOR(name, flag, type)   \
  static struct alloc_state name##_state;  \
  void *alloc_##name##_node(void)  \
  {\
 - return alloc_node(name##_state, sizeof(type)); \
 + return alloc_node(name##_state, flag, sizeof(type));   \
  }

I don't particularly like 'flag' here. (not a massive dislike, mind you:)

Perhaps: flag-object_type, type-node_type?
Or, if that's too verbose, maybe just: flag-type, type-node?

ATB,
Ramsay Jones


--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/7] move setting of object-type to alloc_* functions

2014-07-12 Thread Ramsay Jones
On 11/07/14 09:46, Jeff King wrote:

[snip]

Sorry, hit send too early ...

 diff --git a/blob.c b/blob.c
 index ae320bd..5720a38 100644
 --- a/blob.c
 +++ b/blob.c
 @@ -7,7 +7,7 @@ struct blob *lookup_blob(const unsigned char *sha1)
  {
   struct object *obj = lookup_object(sha1);
   if (!obj)
 - return create_object(sha1, OBJ_BLOB, alloc_blob_node());
 + return create_object(sha1, alloc_blob_node());
   if (!obj-type)
   obj-type = OBJ_BLOB;
   if (obj-type != OBJ_BLOB) {
 diff --git a/builtin/blame.c b/builtin/blame.c
 index d3b256e..8f3e311 100644
 --- a/builtin/blame.c
 +++ b/builtin/blame.c
 @@ -2287,7 +2287,6 @@ static struct commit *fake_working_tree_commit(struct 
 diff_options *opt,
   commit = alloc_commit_node();
   commit-object.parsed = 1;
   commit-date = now;
 - commit-object.type = OBJ_COMMIT;
   parent_tail = commit-parents;
  
   if (!resolve_ref_unsafe(HEAD, head_sha1, 1, NULL))
 diff --git a/commit.c b/commit.c
 index fb7897c..21ed310 100644
 --- a/commit.c
 +++ b/commit.c
 @@ -63,7 +63,7 @@ struct commit *lookup_commit(const unsigned char *sha1)
   struct object *obj = lookup_object(sha1);
   if (!obj) {
   struct commit *c = alloc_commit_node();
 - return create_object(sha1, OBJ_COMMIT, c);
 + return create_object(sha1, c);
   }

perhaps:
if (!obj)
return create_object(sha1, alloc_commit_node());

(increasing similarity with other calls here ...)

   if (!obj-type)
   obj-type = OBJ_COMMIT;
 diff --git a/object.c b/object.c
 index 9c31e9a..a950b85 100644
 --- a/object.c
 +++ b/object.c
 @@ -141,13 +141,12 @@ static void grow_object_hash(void)
   obj_hash_size = new_hash_size;
  }
  
 -void *create_object(const unsigned char *sha1, int type, void *o)
 +void *create_object(const unsigned char *sha1, void *o)
  {
   struct object *obj = o;
  
   obj-parsed = 0;
   obj-used = 0;
 - obj-type = type;
   obj-flags = 0;
   hashcpy(obj-sha1, sha1);
  
 @@ -163,7 +162,7 @@ struct object *lookup_unknown_object(const unsigned char 
 *sha1)
  {
   struct object *obj = lookup_object(sha1);
   if (!obj)
 - obj = create_object(sha1, OBJ_NONE, alloc_object_node());
 + obj = create_object(sha1, alloc_object_node());
   return obj;
  }
  
 diff --git a/object.h b/object.h
 index 6e12f2c..8020ace 100644
 --- a/object.h
 +++ b/object.h
 @@ -79,7 +79,7 @@ extern struct object *get_indexed_object(unsigned int);
   */
  struct object *lookup_object(const unsigned char *sha1);
  
 -extern void *create_object(const unsigned char *sha1, int type, void *obj);
 +extern void *create_object(const unsigned char *sha1, void *obj);
  
  /*
   * Returns the object, having parsed it to find out what it is.
 diff --git a/tag.c b/tag.c
 index 7b07921..79552c7 100644
 --- a/tag.c
 +++ b/tag.c
 @@ -40,7 +40,7 @@ struct tag *lookup_tag(const unsigned char *sha1)
  {
   struct object *obj = lookup_object(sha1);
   if (!obj)
 - return create_object(sha1, OBJ_TAG, alloc_tag_node());
 + return create_object(sha1, alloc_tag_node());
   if (!obj-type)
   obj-type = OBJ_TAG;
   if (obj-type != OBJ_TAG) {
 diff --git a/tree.c b/tree.c
 index c8c49d7..ed66575 100644
 --- a/tree.c
 +++ b/tree.c
 @@ -183,7 +183,7 @@ struct tree *lookup_tree(const unsigned char *sha1)
  {
   struct object *obj = lookup_object(sha1);
   if (!obj)
 - return create_object(sha1, OBJ_TREE, alloc_tree_node());
 + return create_object(sha1, alloc_tree_node());
   if (!obj-type)
   obj-type = OBJ_TREE;
   if (obj-type != OBJ_TREE) {
 

ATB,
Ramsay Jones



--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/7] move setting of object-type to alloc_* functions

2014-07-12 Thread Jeff King
On Sat, Jul 12, 2014 at 03:44:06PM +0100, Ramsay Jones wrote:

  -   return alloc_node(name##_state, sizeof(type)); \
  +   return alloc_node(name##_state, flag, sizeof(type));   \
   }
 
 I don't particularly like 'flag' here. (not a massive dislike, mind you:)
 
 Perhaps: flag-object_type, type-node_type?
 Or, if that's too verbose, maybe just: flag-type, type-node?

Me either, but as you noticed, type was taken. Your suggestions seem
fine. We could also just do away with the macro as discussed earlier (we
already do in the commit_node case, anyway...).

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/7] move setting of object-type to alloc_* functions

2014-07-12 Thread Jeff King
On Sat, Jul 12, 2014 at 03:55:35PM +0100, Ramsay Jones wrote:

  if (!obj) {
  struct commit *c = alloc_commit_node();
  -   return create_object(sha1, OBJ_COMMIT, c);
  +   return create_object(sha1, c);
  }
 
 perhaps:
   if (!obj)
   return create_object(sha1, alloc_commit_node());
 
 (increasing similarity with other calls here ...)

Yeah, I noticed that but didn't change it to keep the diff small. The
one that should have is 969eba6, but it is not a big deal to do it here.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/7] move setting of object-type to alloc_* functions

2014-07-11 Thread Jeff King
The struct object type implements basic object
polymorphism.  Individual instances are allocated as
concrete types (or as a union type that can store any
object), and a struct object * can be cast into its real
type after examining its type enum.  This means it is
dangerous to have a type field that does not match the
allocation (e.g., setting the type field of a struct blob
to OBJ_COMMIT would mean that a reader might read past the
allocated memory).

In most of the current code this is not a problem; the first
thing we do after allocating an object is usually to set its
type field by passing it to create_object. However, the
virtual commits we create in merge-recursive.c do not ever
get their type set. This does not seem to have caused
problems in practice, though (presumably because we always
pass around a struct commit pointer and never even look at
the type).

We can fix this oversight and also make it harder for future
code to get it wrong by setting the type directly in the
object allocation functions.

This will also make it easier to fix problems with commit
index allocation, as we know that any object allocated by
alloc_commit_node will meet the invariant that an object
with an OBJ_COMMIT type field will have a unique index
number.

Signed-off-by: Jeff King p...@peff.net
---
 alloc.c | 18 ++
 blob.c  |  2 +-
 builtin/blame.c |  1 -
 commit.c|  2 +-
 object.c|  5 ++---
 object.h|  2 +-
 tag.c   |  2 +-
 tree.c  |  2 +-
 8 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/alloc.c b/alloc.c
index d7c3605..fd5fcb7 100644
--- a/alloc.c
+++ b/alloc.c
@@ -18,11 +18,11 @@
 
 #define BLOCKING 1024
 
-#define DEFINE_ALLOCATOR(name, type)   \
+#define DEFINE_ALLOCATOR(name, flag, type) \
 static struct alloc_state name##_state;\
 void *alloc_##name##_node(void)\
 {  \
-   return alloc_node(name##_state, sizeof(type)); \
+   return alloc_node(name##_state, flag, sizeof(type));   \
 }
 
 union any_object {
@@ -39,7 +39,8 @@ struct alloc_state {
void *p;   /* first free node in current allocation */
 };
 
-static inline void *alloc_node(struct alloc_state *s, size_t node_size)
+static inline void *alloc_node(struct alloc_state *s, enum object_type type,
+  size_t node_size)
 {
void *ret;
 
@@ -52,20 +53,21 @@ static inline void *alloc_node(struct alloc_state *s, 
size_t node_size)
ret = s-p;
s-p = (char *)s-p + node_size;
memset(ret, 0, node_size);
+   ((struct object *)ret)-type = type;
return ret;
 }
 
-DEFINE_ALLOCATOR(blob, struct blob)
-DEFINE_ALLOCATOR(tree, struct tree)
-DEFINE_ALLOCATOR(tag, struct tag)
-DEFINE_ALLOCATOR(object, union any_object)
+DEFINE_ALLOCATOR(blob, OBJ_BLOB, struct blob)
+DEFINE_ALLOCATOR(tree, OBJ_TREE, struct tree)
+DEFINE_ALLOCATOR(tag, OBJ_TAG, struct tag)
+DEFINE_ALLOCATOR(object, OBJ_NONE, union any_object)
 
 static struct alloc_state commit_state;
 
 void *alloc_commit_node(void)
 {
static int commit_count;
-   struct commit *c = alloc_node(commit_state, sizeof(struct commit));
+   struct commit *c = alloc_node(commit_state, OBJ_COMMIT, sizeof(struct 
commit));
c-index = commit_count++;
return c;
 }
diff --git a/blob.c b/blob.c
index ae320bd..5720a38 100644
--- a/blob.c
+++ b/blob.c
@@ -7,7 +7,7 @@ struct blob *lookup_blob(const unsigned char *sha1)
 {
struct object *obj = lookup_object(sha1);
if (!obj)
-   return create_object(sha1, OBJ_BLOB, alloc_blob_node());
+   return create_object(sha1, alloc_blob_node());
if (!obj-type)
obj-type = OBJ_BLOB;
if (obj-type != OBJ_BLOB) {
diff --git a/builtin/blame.c b/builtin/blame.c
index d3b256e..8f3e311 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -2287,7 +2287,6 @@ static struct commit *fake_working_tree_commit(struct 
diff_options *opt,
commit = alloc_commit_node();
commit-object.parsed = 1;
commit-date = now;
-   commit-object.type = OBJ_COMMIT;
parent_tail = commit-parents;
 
if (!resolve_ref_unsafe(HEAD, head_sha1, 1, NULL))
diff --git a/commit.c b/commit.c
index fb7897c..21ed310 100644
--- a/commit.c
+++ b/commit.c
@@ -63,7 +63,7 @@ struct commit *lookup_commit(const unsigned char *sha1)
struct object *obj = lookup_object(sha1);
if (!obj) {
struct commit *c = alloc_commit_node();
-   return create_object(sha1, OBJ_COMMIT, c);
+   return create_object(sha1, c);
}
if (!obj-type)
obj-type = OBJ_COMMIT;
diff --git a/object.c b/object.c
index 9c31e9a..a950b85 100644
--- a/object.c
+++ b/object.c
@@ -141,13 +141,12 @@ static void