Re: segmentation fault (nullpointer) with git log --submodule -p

2013-01-25 Thread Junio C Hamano
Jonathon Mah  writes:

> Just to note, the proposals so far don't prevent a "smart-ass"
> function from freeing the buffer when it's called underneath the
> use/release scope, as in:
>
> with_commit_buffer(commit); {
>   fn1_needing_buffer(commit);
>   walk_rev_tree_or_something();
>   fn2_needing_buffer(commit);
> } done_with_commit_buffer(commit);

I think the goal of everybody discussing these ideas is to make sure
that all code follows the simple ownership policy proposed at the
beginning of this subthread: commit->buffer belongs to the revision
traversal machinery, and other users could borrow it when available.

Even though your sample code will break, from that point of view, I
do not think it is something worth worrying about.  If the function
"walk_rev_tree_or_something()" discards commit->buffer, it by
definition must be a part of the revision traversal machinery, and
any code that calls it inside with_commit_buffer() or uses the field
after such a call without revalidating commit->buffer, is already in
violation.  With or without such a macro, we would need to be
careful about enforcing the ownership rule, and I think a code
structure like the above example is easier to spot problems in
during the review than the current code.

Because retaining commit->buffer is done for the benefit of the
next/future users of the data, and not for the users that _are_
using them right now, I do not think the usual refcounting that
discards when nobody references the data is a good match to the
problem we are discussing.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: segmentation fault (nullpointer) with git log --submodule -p

2013-01-24 Thread Jonathon Mah
Just to note, the proposals so far don't prevent a "smart-ass" function from 
freeing the buffer when it's called underneath the use/release scope, as in:

with_commit_buffer(commit); {
fn1_needing_buffer(commit);
walk_rev_tree_or_something();
fn2_needing_buffer(commit);
} done_with_commit_buffer(commit);

walk_rev_tree_or_something() might need to read commits to do its thing, and it 
could still choose to free their buffers (as in rev-list.c finish_commit()). If 
those commits includes the one being "retained", the call to fn2 will still see 
NULL despite it being in a 'protected scope'.

Are the objections to using a reference count?



Jonathon Mah
m...@jonathonmah.com


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: segmentation fault (nullpointer) with git log --submodule -p

2013-01-24 Thread Junio C Hamano
Jonathan Nieder  writes:

> Hi,
>
> Junio C Hamano wrote:
>
>> I've been toying with an idea along this line.
>
> Heh.  Just for fun, here's an uglier version:

Much nicer, though.

>
>   struct wcb_data {
>   int had_buffer;
>   int using_buffer;
>   };
>   #define WITH_COMMIT_BUFFER_DATA_INIT { 0, 0 }
>
>   extern void acquire_commit_buffer(struct commit *, struct wcb_data *);
>   extern void done_with_commit_buffer(struct commit *, struct wcb_data *);
>
>   /*
>* usage:
>*  struct wcb_data buf = WITH_COMMIT_BUFFER_INIT;
>*
>*  with_commit_buffer(commit, buf) {
>*  ...
>*  }
>*/
>   #define with_commit_buffer(commit, i) \
>   for (acquire_commit_buffer(commit, &i); \
>i.using_buffer; \
>done_with_commit_buffer(commit, &i))
>
>   void acquire_commit_buffer(struct commit *commit, struct wcb_data *i)
>   {
>   enum object_type type;
>   unsigned long size;
>
>   assert(!i->using_buffer);
>   i->using_buffer = 1;
>   i->had_buffer = !!commit->buffer;
>
>   if (i->had_buffer)
>   return;
>   commit->buffer = read_sha1_file(commit->object.sha1, &type, 
> &size);
>   if (!commit->buffer)
>   die("unable to read commit %s", 
> sha1_to_hex(commit->object.sha1));
>   }
>
>   void done_with_commit_buffer(struct commit *commit, struct wcb_data *i)
>   {
>   assert(i->using_buffer);
>   i->using_buffer = 0;
>
>   if (!i->had_buffer) {
>   free(commit->buffer);
>   commit->buffer = NULL;
>   }
>   }
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: segmentation fault (nullpointer) with git log --submodule -p

2013-01-24 Thread Jonathan Nieder
Hi,

Junio C Hamano wrote:

> I've been toying with an idea along this line.

Heh.  Just for fun, here's an uglier version:

struct wcb_data {
int had_buffer;
int using_buffer;
};
#define WITH_COMMIT_BUFFER_DATA_INIT { 0, 0 }

extern void acquire_commit_buffer(struct commit *, struct wcb_data *);
extern void done_with_commit_buffer(struct commit *, struct wcb_data *);

/*
 * usage:
 *  struct wcb_data buf = WITH_COMMIT_BUFFER_INIT;
 *
 *  with_commit_buffer(commit, buf) {
 *  ...
 *  }
 */
#define with_commit_buffer(commit, i) \
for (acquire_commit_buffer(commit, &i); \
 i.using_buffer; \
 done_with_commit_buffer(commit, &i))

void acquire_commit_buffer(struct commit *commit, struct wcb_data *i)
{
enum object_type type;
unsigned long size;

assert(!i->using_buffer);
i->using_buffer = 1;
i->had_buffer = !!commit->buffer;

if (i->had_buffer)
return;
commit->buffer = read_sha1_file(commit->object.sha1, &type, 
&size);
if (!commit->buffer)
die("unable to read commit %s", 
sha1_to_hex(commit->object.sha1));
}

void done_with_commit_buffer(struct commit *commit, struct wcb_data *i)
{
assert(i->using_buffer);
i->using_buffer = 0;

if (!i->had_buffer) {
free(commit->buffer);
commit->buffer = NULL;
}
}
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: segmentation fault (nullpointer) with git log --submodule -p

2013-01-24 Thread Junio C Hamano
Jeff King  writes:

> ... I know it gets the job done, but in my
> experience, macros which do not behave syntactically like functions are
> usually a good sign that you are doing something gross and
> unmaintainable.

Yeah, it is a bit too cute for my taste, too, even though it was fun
;-)
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: segmentation fault (nullpointer) with git log --submodule -p

2013-01-24 Thread Jeff King
On Thu, Jan 24, 2013 at 07:59:58PM -0800, Junio C Hamano wrote:

> Junio C Hamano  writes:
> 
> > Jeff King  writes:
> >
> >> ... (e.g., how should "log" know that a submodule diff might later want
> >> to see the same entry? Should we optimistically free and then make it
> >> easier for the later user to reliably ensure the buffer is primed? Or
> >> should we err on the side of keeping it in place?).
> >
> > My knee-jerk reaction is that we should consider that commit->buffer
> > belongs to the revision traversal machinery.  Any other uses bolted
> > on later can borrow it if buffer still exists (I do not think pretty
> > code rewrites the buffer contents in place in any way), or they can
> > ask read_sha1_file() to read it themselves and free when they are
> > done.
> 
> I've been toying with an idea along this line.
> 
>  commit.h| 16 
>  builtin/blame.c | 27 ---
>  commit.c| 20 
>  3 files changed, 44 insertions(+), 19 deletions(-)

I think we are on the same page as far as what needs to happen at the
call sites.

My suggested implementation had a separate buffer, but you are right
that we may need to actually set "commit->buffer" because sub-functions
expect to find it there (the alternative might be cleaning up the
sub-function interfaces). I haven't looked at the call-sites yet.

This:

> +extern int ensure_commit_buffer(struct commit *);
> +extern void discard_commit_buffer(struct commit *);
> +
> +#define with_commit_buffer(commit) \
> + do { \
> + int had_buffer_ = !!commit->buffer; \
> + if (!had_buffer_) \
> + ensure_commit_buffer(commit); \
> + do
> +
> +#define done_with_commit_buffer(commit) \
> + while (0); \
> + if (!had_buffer_) \
> + discard_commit_buffer(commit); \
> + } while (0)

is pretty nasty, though. I know it gets the job done, but in my
experience, macros which do not behave syntactically like functions are
usually a good sign that you are doing something gross and
unmaintainable.

I dunno.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: segmentation fault (nullpointer) with git log --submodule -p

2013-01-24 Thread Junio C Hamano
Junio C Hamano  writes:

> Jeff King  writes:
>
>> ... (e.g., how should "log" know that a submodule diff might later want
>> to see the same entry? Should we optimistically free and then make it
>> easier for the later user to reliably ensure the buffer is primed? Or
>> should we err on the side of keeping it in place?).
>
> My knee-jerk reaction is that we should consider that commit->buffer
> belongs to the revision traversal machinery.  Any other uses bolted
> on later can borrow it if buffer still exists (I do not think pretty
> code rewrites the buffer contents in place in any way), or they can
> ask read_sha1_file() to read it themselves and free when they are
> done.

I've been toying with an idea along this line.

 commit.h| 16 
 builtin/blame.c | 27 ---
 commit.c| 20 
 3 files changed, 44 insertions(+), 19 deletions(-)

diff --git a/commit.h b/commit.h
index c16c8a7..b559535 100644
--- a/commit.h
+++ b/commit.h
@@ -226,4 +226,20 @@ extern void print_commit_list(struct commit_list *list,
  const char *format_cur,
  const char *format_last);
 
+extern int ensure_commit_buffer(struct commit *);
+extern void discard_commit_buffer(struct commit *);
+
+#define with_commit_buffer(commit) \
+   do { \
+   int had_buffer_ = !!commit->buffer; \
+   if (!had_buffer_) \
+   ensure_commit_buffer(commit); \
+   do
+
+#define done_with_commit_buffer(commit) \
+   while (0); \
+   if (!had_buffer_) \
+   discard_commit_buffer(commit); \
+   } while (0)
+
 #endif /* COMMIT_H */
diff --git a/builtin/blame.c b/builtin/blame.c
index b431ba3..8b2e4a5 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -1424,25 +1424,14 @@ static void get_commit_info(struct commit *commit,
 
commit_info_init(ret);
 
-   /*
-* We've operated without save_commit_buffer, so
-* we now need to populate them for output.
-*/
-   if (!commit->buffer) {
-   enum object_type type;
-   unsigned long size;
-   commit->buffer =
-   read_sha1_file(commit->object.sha1, &type, &size);
-   if (!commit->buffer)
-   die("Cannot read commit %s",
-   sha1_to_hex(commit->object.sha1));
-   }
-   encoding = get_log_output_encoding();
-   reencoded = logmsg_reencode(commit, encoding);
-   message   = reencoded ? reencoded : commit->buffer;
-   get_ac_line(message, "\nauthor ",
-   &ret->author, &ret->author_mail,
-   &ret->author_time, &ret->author_tz);
+   with_commit_buffer(commit) {
+   encoding = get_log_output_encoding();
+   reencoded = logmsg_reencode(commit, encoding);
+   message   = reencoded ? reencoded : commit->buffer;
+   get_ac_line(message, "\nauthor ",
+   &ret->author, &ret->author_mail,
+   &ret->author_time, &ret->author_tz);
+   } done_with_commit_buffer(commit);
 
if (!detailed) {
free(reencoded);
diff --git a/commit.c b/commit.c
index e8eb0ae..a627eea 100644
--- a/commit.c
+++ b/commit.c
@@ -1357,3 +1357,23 @@ void print_commit_list(struct commit_list *list,
printf(format, sha1_to_hex(list->item->object.sha1));
}
 }
+
+int ensure_commit_buffer(struct commit *commit)
+{
+   enum object_type type;
+   unsigned long size;
+
+   if (commit->buffer)
+   return 0;
+   commit->buffer = read_sha1_file(commit->object.sha1, &type, &size);
+   if (commit->buffer)
+   return -1;
+   else
+   return 0;
+}
+
+void discard_commit_buffer(struct commit *commit)
+{
+   free(commit->buffer);
+   commit->buffer = NULL;
+}
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: segmentation fault (nullpointer) with git log --submodule -p

2013-01-24 Thread Duy Nguyen
On Fri, Jan 25, 2013 at 7:55 AM, Jeff King  wrote:
> On Thu, Jan 24, 2013 at 03:56:23PM -0800, Junio C Hamano wrote:
>
>> Jeff King  writes:
>>
>> > ... (e.g., how should "log" know that a submodule diff might later want
>> > to see the same entry? Should we optimistically free and then make it
>> > easier for the later user to reliably ensure the buffer is primed? Or
>> > should we err on the side of keeping it in place?).
>>
>> My knee-jerk reaction is that we should consider that commit->buffer
>> belongs to the revision traversal machinery.  Any other uses bolted
>> on later can borrow it if buffer still exists (I do not think pretty
>> code rewrites the buffer contents in place in any way), or they can
>> ask read_sha1_file() to read it themselves and free when they are
>> done.
>
> Yeah, that is probably the sanest way forward. It at least leaves
> ownership in one place, and everybody else is an opportunistic consumer.
> We do need to annotate each user site though with something like the
> "ensure" function I mentioned.
>
> If they are not owners, then the better pattern is probably something
> like:

You probably should rename "buffer" (to _buffer or something) and let
the compiler catches all the places commit->buffer illegally used.

>
>   /*
>* Get the commit buffer, either opportunistically using
>* the cached version attached to the commit object, or loading it
>* from disk if necessary.
>*/
>   const char *use_commit_buffer(struct commit *c)
>   {
>   enum object_type type;
>   unsigned long size;
>
>   if (c->buffer)
>   return c->buffer;
>   /*
>* XXX check type == OBJ_COMMIT?
>* XXX die() on NULL as a convenience to callers?
>*/
>   return read_sha1_file(c->object.sha1, &type, &size);
>   }
>
>   void unuse_commit_buffer(const char *buf, struct commit *c)
>   {
>   /*
>* If we used the cached copy attached to the commit,
>* we don't want to free it; it's not our responsibility.
>*/
>   if (buf == c->buffer)
>   return;
>
>   free((char *)buf);
>   }
>
> I suspect that putting a use/unuse pair inside format_commit_message
> would handle most cases.
>
> -Peff
-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: segmentation fault (nullpointer) with git log --submodule -p

2013-01-24 Thread Jeff King
On Thu, Jan 24, 2013 at 03:56:23PM -0800, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > ... (e.g., how should "log" know that a submodule diff might later want
> > to see the same entry? Should we optimistically free and then make it
> > easier for the later user to reliably ensure the buffer is primed? Or
> > should we err on the side of keeping it in place?).
> 
> My knee-jerk reaction is that we should consider that commit->buffer
> belongs to the revision traversal machinery.  Any other uses bolted
> on later can borrow it if buffer still exists (I do not think pretty
> code rewrites the buffer contents in place in any way), or they can
> ask read_sha1_file() to read it themselves and free when they are
> done.

Yeah, that is probably the sanest way forward. It at least leaves
ownership in one place, and everybody else is an opportunistic consumer.
We do need to annotate each user site though with something like the
"ensure" function I mentioned.

If they are not owners, then the better pattern is probably something
like:

  /*
   * Get the commit buffer, either opportunistically using
   * the cached version attached to the commit object, or loading it
   * from disk if necessary.
   */
  const char *use_commit_buffer(struct commit *c)
  {
  enum object_type type;
  unsigned long size;

  if (c->buffer)
  return c->buffer;
  /*
   * XXX check type == OBJ_COMMIT?
   * XXX die() on NULL as a convenience to callers?
   */
  return read_sha1_file(c->object.sha1, &type, &size);
  }

  void unuse_commit_buffer(const char *buf, struct commit *c)
  {
  /*
   * If we used the cached copy attached to the commit,
   * we don't want to free it; it's not our responsibility.
   */
  if (buf == c->buffer)
  return;

  free((char *)buf);
  }

I suspect that putting a use/unuse pair inside format_commit_message
would handle most cases.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: segmentation fault (nullpointer) with git log --submodule -p

2013-01-24 Thread Junio C Hamano
Jeff King  writes:

> ... (e.g., how should "log" know that a submodule diff might later want
> to see the same entry? Should we optimistically free and then make it
> easier for the later user to reliably ensure the buffer is primed? Or
> should we err on the side of keeping it in place?).

My knee-jerk reaction is that we should consider that commit->buffer
belongs to the revision traversal machinery.  Any other uses bolted
on later can borrow it if buffer still exists (I do not think pretty
code rewrites the buffer contents in place in any way), or they can
ask read_sha1_file() to read it themselves and free when they are
done.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: segmentation fault (nullpointer) with git log --submodule -p

2013-01-24 Thread Jeff King
On Thu, Jan 24, 2013 at 09:14:47PM +0700, Nguyen Thai Ngoc Duy wrote:

> >>> I did. My bisection told me this is the suspect:
> >>>
> >>> ccdc603 (parse_object: try internal cache before reading object db)
> >>
> >> diff --git a/object.c b/object.c
> >> index d8d09f9..6b06297 100644
> >> --- a/object.c
> >> +++ b/object.c
> >> @@ -191,10 +191,15 @@ struct object *parse_object(const unsigned char 
> >> *sha1)
> >> enum object_type type;
> >> int eaten;
> >> const unsigned char *repl = lookup_replace_object(sha1);
> >> -   void *buffer = read_sha1_file(sha1, &type, &size);
> >> +   void *buffer;
> >> +   struct object *obj;
> >> +
> >> +   obj = lookup_object(sha1);
> >> +   if (obj && obj->parsed)
> >> +   return obj;
> >>
> >> Any chance obj->parsed is 1 but ((struct commit*)obj)->buffer is NULL?
> >> What if you change that "if" to
> >>
> >> if (obj && obj->parsed && (obj->type != OBJ_COMMIT || ((struct commit
> >> *)obj)->buffer))
> >>
> >
> > No more segfault!
> 
> Sweet. I have no idea how that fixes it. Maybe Jeff can give some
> explanation after he wakes up.

Ugh. I think I know why it fixes it. We free the commit's buffer as part
of the log traversal, but then later want to access it as part of the
diff. We presumably call parse_object somewhere in the middle to make
sure it is parsed.

Before ccdc603, a side effect of parse_object is that even for a parsed
object, we would fill in the buffer field of a commit or tree. See
parse_object_buffer:

} else if (type == OBJ_COMMIT) {
struct commit *commit = lookup_commit(sha1);
if (commit) {
if (parse_commit_buffer(commit, buffer, size))
return NULL;
if (!commit->buffer) {
commit->buffer = buffer;
eaten = 1;
}
obj = &commit->object;
}

When this patch was originally proposed, I wrote[1]:

  On Thu, Jan 05, 2012 at 01:55:22PM -0800, Junio C Hamano wrote:
  > > So I think it is safe short of somebody doing some horrible manual
  > > munging of a "struct object".
  >
  > Yeah, I was worried about codepaths like commit-pretty-printing
  > might be mucking with the contents of commit->buffer, perhaps
  > reencoding the text and then calling parse_object() to get the
  > unmodified original back, or something silly like that. But the
  > lookup_object() call at the beginning of the parse_object() already
  > prevents us from doing such a thing, so we should be OK, I would
  > think.

  [...]

  What saves you is that the parse_*_buffer functions all do nothing
  when the object.parsed flag is set, and the code I added makes sure
  that object.parsed is set in the object that lookup_object returns.

  So yeah, anytime you tweak the contents of commit->buffer but don't
  unset the "parsed" flag, you are asking for trouble.

Which is true, but obviously I missed that in addition to calling
parse_*_buffer, which will be a no-op, we _also_ set the buffer
independently. So parse_object was functioning in a belt-and-suspenders
for that case. And I think this is probably the same root cause as the
segfault which came up here:

  http://thread.gmane.org/gmane.comp.version-control.git/214366

So what to do?

We can revert ccdc603, but I do not think we need to. We can catch the
problematic cases with something like your patch, but still get the
optimization when the buffer really is already filled in. I think we'd
need to extend your patch to handle trees, too, to be totally correct.

But there are still some loose ends that I note:

  1. Making such a change would be parse_object erring on the side of
 providing the buffer. But it doesn't actually know if the buffer is
 desired or not. For instance, upload-pack benefited from this
 optimization, but does not need save_commit_buffer on at all. So
 commit->buffer is _always_ NULL there, and that's just fine; we
 really don't need to read the object.

 Now this may be a bad example, because due to my follow-on patches,
 we avoid calling parse_object at all in most cases, so I don't
 think it matters any longer to upload-pack. But I suspect there are
 other places with similar circumstances. Fundamentally parse_object
 doesn't know what the caller is interested in.

  2. This means that parse_commit and parse_object behave differently in
 this regard. The former will leave the buffer unfilled. Meaning we
 may still have issues with code paths that munge the buffer without
 resetting the parsed flag, independent of ccdc603 and fixing this.

To me, these highlight that our commit->buffer management is fragile and
is largely about guessing in various circumstances whether somebody will
later want the buffer. I'm not sure of the right solution, though. It
seems like something th

Re: segmentation fault (nullpointer) with git log --submodule -p

2013-01-24 Thread Duy Nguyen
On Thu, Jan 24, 2013 at 9:06 PM, Stefan Näwe
 wrote:
> Am Donnerstag, 24. Januar 2013 14:40:47 schrieb Duy Nguyen:
>> On Thu, Jan 24, 2013 at 7:11 PM, Stefan Näwe
>>  wrote:
 Does it fail with older versions of git? If so, can you bisect?
>>>
>>> I did. My bisection told me this is the suspect:
>>>
>>> ccdc603 (parse_object: try internal cache before reading object db)
>>
>> diff --git a/object.c b/object.c
>> index d8d09f9..6b06297 100644
>> --- a/object.c
>> +++ b/object.c
>> @@ -191,10 +191,15 @@ struct object *parse_object(const unsigned char *sha1)
>> enum object_type type;
>> int eaten;
>> const unsigned char *repl = lookup_replace_object(sha1);
>> -   void *buffer = read_sha1_file(sha1, &type, &size);
>> +   void *buffer;
>> +   struct object *obj;
>> +
>> +   obj = lookup_object(sha1);
>> +   if (obj && obj->parsed)
>> +   return obj;
>>
>> Any chance obj->parsed is 1 but ((struct commit*)obj)->buffer is NULL?
>> What if you change that "if" to
>>
>> if (obj && obj->parsed && (obj->type != OBJ_COMMIT || ((struct commit
>> *)obj)->buffer))
>>
>
> No more segfault!

Sweet. I have no idea how that fixes it. Maybe Jeff can give some
explanation after he wakes up.
-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: segmentation fault (nullpointer) with git log --submodule -p

2013-01-24 Thread Stefan Näwe
Am Donnerstag, 24. Januar 2013 14:40:47 schrieb Duy Nguyen:
> On Thu, Jan 24, 2013 at 7:11 PM, Stefan Näwe
>  wrote:
>>> Does it fail with older versions of git? If so, can you bisect?
>>
>> I did. My bisection told me this is the suspect:
>>
>> ccdc603 (parse_object: try internal cache before reading object db)
>
> diff --git a/object.c b/object.c
> index d8d09f9..6b06297 100644
> --- a/object.c
> +++ b/object.c
> @@ -191,10 +191,15 @@ struct object *parse_object(const unsigned char *sha1)
> enum object_type type;
> int eaten;
> const unsigned char *repl = lookup_replace_object(sha1);
> -   void *buffer = read_sha1_file(sha1, &type, &size);
> +   void *buffer;
> +   struct object *obj;
> +
> +   obj = lookup_object(sha1);
> +   if (obj && obj->parsed)
> +   return obj;
>
> Any chance obj->parsed is 1 but ((struct commit*)obj)->buffer is NULL?
> What if you change that "if" to
>
> if (obj && obj->parsed && (obj->type != OBJ_COMMIT || ((struct commit
> *)obj)->buffer))
>

No more segfault!

> Also you did not encode commits in any specific encoding,

We're using Git for Windows and some commits contain 'umlauts' (äöü).
But those characters should be encoded in UTF-8, shouldn't they?
But the 'git log...' only crashes on a Debian/Linux machine.

> nor set i18n.logOutputEncoding?

It's not set.

(only i18n.filesEncoding is set to utf-8 on my machine)

Oh, and it's not crashing if I do:

git log -p --submodule |cat

Stefan
--

/dev/random says: Dumb luck beats sound planning every time. Trust me.
python -c "print 
'73746566616e2e6e616577654061746c61732d656c656b74726f6e696b2e636f6d'.decode('hex')"
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: segmentation fault (nullpointer) with git log --submodule -p

2013-01-24 Thread Duy Nguyen
On Thu, Jan 24, 2013 at 7:11 PM, Stefan Näwe
 wrote:
>> Does it fail with older versions of git? If so, can you bisect?
>
> I did. My bisection told me this is the suspect:
>
> ccdc603 (parse_object: try internal cache before reading object db)

diff --git a/object.c b/object.c
index d8d09f9..6b06297 100644
--- a/object.c
+++ b/object.c
@@ -191,10 +191,15 @@ struct object *parse_object(const unsigned char *sha1)
enum object_type type;
int eaten;
const unsigned char *repl = lookup_replace_object(sha1);
-   void *buffer = read_sha1_file(sha1, &type, &size);
+   void *buffer;
+   struct object *obj;
+
+   obj = lookup_object(sha1);
+   if (obj && obj->parsed)
+   return obj;

Any chance obj->parsed is 1 but ((struct commit*)obj)->buffer is NULL?
What if you change that "if" to

if (obj && obj->parsed && (obj->type != OBJ_COMMIT || ((struct commit
*)obj)->buffer))

??

Also you did not encode commits in any specific encoding, nor set
i18n.logOutputEncoding?
-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: segmentation fault (nullpointer) with git log --submodule -p

2013-01-24 Thread Stefan Näwe
Am 23.01.2013 21:02, schrieb Jeff King:
> On Wed, Jan 23, 2013 at 03:38:16PM +0100, Armin wrote:
> 
>> Hello dear git people.
>>
>> I experience a reproducible segmentation fault on one of my
>> repositories when doing a "git log --submodule -p", tested with newest
>> version on Arch Linux (git version 1.8.1.1) and built fresh (git
>> version 1.8.1.1.347.g9591fcc), tried on 2 seperate systems:
>>
>>
>> Program terminated with signal 11, Segmentation fault.
>> #0  0x004b51e5 in parse_commit_header (context=0x769b6980) at 
>> pretty.c:752
>> 752 for (i = 0; msg[i]; i++) {
>> [...]
>> (gdb) l
>> 747 static void parse_commit_header(struct format_commit_context *context)
>> 748 {
>> 749 const char *msg = context->message;
>> 750 int i;
>> 751 
>> 752 for (i = 0; msg[i]; i++) {
>> 753 int eol;
>> 754 for (eol = i; msg[eol] && msg[eol] != '\n'; eol++)
>> 755 ; /* do nothing */
>> 756 
>> (gdb) p msg
>> $7 = 
>> (gdb) p context->message
>> $8 = 0x0
> 
> Yeah, that should definitely not be NULL. I can't reproduce here with a
> few simple examples, though.
> 
> Does it fail with older versions of git? If so, can you bisect?

I did. My bisection told me this is the suspect:

ccdc603 (parse_object: try internal cache before reading object db)

My git-fu is not good enough to analyze that...

> Is it possible for you to make your repo available?

Unfortunately not. It crashes with one particular repos (using submodules)
that I can't make available but not with another which is available
at https://github.com/snaewe/super.git

HTH

Stefan
-- 

/dev/random says: There must be more to life than compile-and-go.
python -c "print 
'73746566616e2e6e616577654061746c61732d656c656b74726f6e696b2e636f6d'.decode('hex')"
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: segmentation fault (nullpointer) with git log --submodule -p

2013-01-23 Thread Jeff King
On Wed, Jan 23, 2013 at 03:38:16PM +0100, Armin wrote:

> Hello dear git people.
> 
> I experience a reproducible segmentation fault on one of my
> repositories when doing a "git log --submodule -p", tested with newest
> version on Arch Linux (git version 1.8.1.1) and built fresh (git
> version 1.8.1.1.347.g9591fcc), tried on 2 seperate systems:
> 
> 
> Program terminated with signal 11, Segmentation fault.
> #0  0x004b51e5 in parse_commit_header (context=0x769b6980) at 
> pretty.c:752
> 752 for (i = 0; msg[i]; i++) {
> [...]
> (gdb) l
> 747 static void parse_commit_header(struct format_commit_context *context)
> 748 {
> 749 const char *msg = context->message;
> 750 int i;
> 751 
> 752 for (i = 0; msg[i]; i++) {
> 753 int eol;
> 754 for (eol = i; msg[eol] && msg[eol] != '\n'; eol++)
> 755 ; /* do nothing */
> 756 
> (gdb) p msg
> $7 = 
> (gdb) p context->message
> $8 = 0x0

Yeah, that should definitely not be NULL. I can't reproduce here with a
few simple examples, though.

Does it fail with older versions of git? If so, can you bisect?

Is it possible for you to make your repo available?

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


segmentation fault (nullpointer) with git log --submodule -p

2013-01-23 Thread Armin
Hello dear git people.

I experience a reproducible segmentation fault on one of my repositories when 
doing a "git log --submodule -p", tested with newest version on Arch Linux (git 
version 1.8.1.1) and built fresh (git version 1.8.1.1.347.g9591fcc), tried on 2 
seperate systems:


Program terminated with signal 11, Segmentation fault.
#0  0x004b51e5 in parse_commit_header (context=0x769b6980) at 
pretty.c:752
752 for (i = 0; msg[i]; i++) {
(gdb) bt
#0  0x004b51e5 in parse_commit_header (context=0x769b6980) at 
pretty.c:752
#1  format_commit_one (context=, placeholder=0x526b1e "s", 
sb=0x769b6ad0) at pretty.c:1157
#2  format_commit_item (sb=0x769b6ad0, placeholder=0x526b1e "s", 
context=) at pretty.c:1224
#3  0x004dacd9 in strbuf_expand (sb=sb@entry=0x769b6ad0, 
format=0x526b1e "s", format@entry=0x526b18 "  %m %s", fn=fn@entry=0x4b4730 
, context=context@entry=0x769b6980)
at strbuf.c:247
#4  0x004b5816 in format_commit_message (commit=commit@entry=0x1ffafd8, 
format=format@entry=0x526b18 "  %m %s", sb=sb@entry=0x769b6ad0, 
pretty_ctx=pretty_ctx@entry=0x769b6af0) at pretty.c:1284
#5  0x004dde52 in print_submodule_summary (reset=0x754640 "\033[m", 
add=0x754708 "\033[32m", del=0x7546e0 "\033[31m", f=0x7f0685bac7a0, 
rev=0x769b6b40) at submodule.c:236
#6  show_submodule_summary (f=0x7f0685bac7a0, path=, 
one=one@entry=0x1ff2af0 
"\020\\vC\371\070\vJ\352\344\205\340\226u\273\021\372\330\234\004", 
two=two@entry=0x2030a60 "\301a(\350\371\372\340mb[խo_\272\301\223V˙", 
dirty_submodule=, meta=meta@entry=0x754690 "\033[1m", 
del=del@entry=0x7546e0 "\033[31m", add=0x754708 "\033[32m", 
reset=reset@entry=0x754640 "\033[m") at submodule.c:307
#7  0x0048dd1d in builtin_diff (name_a=name_a@entry=0x1ff2b50 
"Packages/Application/Amadeus.Somea.Dialog", name_b=name_b@entry=0x1ff2b50 
"Packages/Application/Amadeus.Somea.Dialog", 
one=one@entry=0x1ff2af0, two=two@entry=0x2030a60, xfrm_msg=0x2039a20 
"\033[1mindex 105c764..c16128e 16\033[m\n", 
must_show_header=must_show_header@entry=0, o=o@entry=0x769b7b88, 
complete_rewrite=complete_rewrite@entry=0) at diff.c:2267
#8  0x0048e60e in run_diff_cmd (pgm=pgm@entry=0x0, name=0x1ff2b50 
"Packages/Application/Amadeus.Somea.Dialog", other=, 
attr_path=attr_path@entry=0x1ff2b50 
"Packages/Application/Amadeus.Somea.Dialog", one=one@entry=0x1ff2af0, 
two=two@entry=0x2030a60, msg=msg@entry=0x769b74b0, 
o=o@entry=0x769b7b88, p=p@entry=0x20371b0)
at diff.c:3057

(gdb) bt
#0  0x004b51e5 in parse_commit_header (context=0x769b6980) at 
pretty.c:752
#1  format_commit_one (context=, placeholder=0x526b1e "s", 
sb=0x769b6ad0) at pretty.c:1157
#2  format_commit_item (sb=0x769b6ad0, placeholder=0x526b1e "s", 
context=) at pretty.c:1224
#3  0x004dacd9 in strbuf_expand (sb=sb@entry=0x769b6ad0, 
format=0x526b1e "s", format@entry=0x526b18 "  %m %s", fn=fn@entry=0x4b4730 
, context=context@entry=0x769b6980)
at strbuf.c:247
#4  0x004b5816 in format_commit_message (commit=commit@entry=0x1ffafd8, 
format=format@entry=0x526b18 "  %m %s", sb=sb@entry=0x769b6ad0, 
pretty_ctx=pretty_ctx@entry=0x769b6af0) at pretty.c:1284
#5  0x004dde52 in print_submodule_summary (reset=0x754640 "\033[m", 
add=0x754708 "\033[32m", del=0x7546e0 "\033[31m", f=0x7f0685bac7a0, 
rev=0x769b6b40) at submodule.c:236
#6  show_submodule_summary (f=0x7f0685bac7a0, path=, 
one=one@entry=0x1ff2af0 
"\020\\vC\371\070\vJ\352\344\205\340\226u\273\021\372\330\234\004", 
two=two@entry=0x2030a60 "\301a(\350\371\372\340mb[խo_\272\301\223V˙", 
dirty_submodule=, meta=meta@entry=0x754690 "\033[1m", 
del=del@entry=0x7546e0 "\033[31m", add=0x754708 "\033[32m", 
reset=reset@entry=0x754640 "\033[m") at submodule.c:307
#7  0x0048dd1d in builtin_diff (name_a=name_a@entry=0x1ff2b50 
"Packages/Application/Amadeus.Somea.Dialog", name_b=name_b@entry=0x1ff2b50 
"Packages/Application/Amadeus.Somea.Dialog", 
one=one@entry=0x1ff2af0, two=two@entry=0x2030a60, xfrm_msg=0x2039a20 
"\033[1mindex 105c764..c16128e 16\033[m\n", 
must_show_header=must_show_header@entry=0, o=o@entry=0x769b7b88, 
complete_rewrite=complete_rewrite@entry=0) at diff.c:2267
#8  0x0048e60e in run_diff_cmd (pgm=pgm@entry=0x0, name=0x1ff2b50 
"Packages/Application/Amadeus.Somea.Dialog", other=, 
attr_path=attr_path@entry=0x1ff2b50 
"Packages/Application/Amadeus.Somea.Dialog", one=one@entry=0x1ff2af0, 
two=two@entry=0x2030a60, msg=msg@entry=0x769b74b0, 
o=o@entry=0x769b7b88, p=p@entry=0x20371b0)
at diff.c:3057
#9  0x0048eb3d in run_diff (o=0x769b7b88, p=0x20371b0) at 
diff.c:3145
#10 diff_flush_patch (o=0x769b7b88, p=0x20371b0) at diff.c:3979
#11 diff_flush_patch (p=0x20371b0, o=0x769b7b88) at diff.c:3970
#12 0x0048f15f in diff_flush (options=options@entry=0x769b7b88) at 
diff.c:4500
#13 0x00