Re: [PATCH] Convert size datatype to size_t

2017-08-09 Thread Johannes Schindelin
Hi Martin,

On Tue, 8 Aug 2017, Martin Koegler wrote:

> From: Martin Koegler 
> 
> It changes the signature of the core object access function
> including any other functions to assure a clean compile if
> sizeof(size_t) != sizeof(unsigned long).
> 
> Signed-off-by: Martin Koegler 
> ---

Thank you for working on this!

Much appreciated,
Dscho


Re: [PATCH] Convert size datatype to size_t

2017-08-09 Thread Johannes Schindelin
Hi,

On Wed, 9 Aug 2017, Junio C Hamano wrote:

> Martin Koegler  writes:
> 
> > On Wed, Aug 09, 2017 at 09:19:06AM +0200, Martin Koegler wrote:
> >> On Tue, Aug 08, 2017 at 11:04:20PM -0700, Junio C Hamano wrote:
> >> > As https://travis-ci.org/git/git/jobs/262463159 shows, unfortunately
> >> > it turns out that things are not so simple X-<.  On Linux32, size_t
> >> > is uint, which is the same size as ulong, but "%lu" is not appropriate
> >> > for showing a size_t value.
> >> > 
> >> > So you are correct to say in the comment under three-dashes that
> >> > there is much more to change in the codebase.
> >> 
> >> I'll post a new version fixing the *printf issues.
> >
> > What is the correct format specifier for size_t?
> > Linux has %zu (C99). Is that OK for the GIT codebase?
> 
> I haven't double checked, but IIRC we cast to uintmax_t and use
> PRIuMAX.

That would be my preference, too. This is the patch that I need to make
`pu` compile for me again (note: there is one `unsigned long` -> `size_t`
change hiding between all the `%lu` -> `%PRIuMAX` changes):

-- snipsnap --
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index bc19e08a62d..10428cbeb0b 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -232,7 +232,7 @@ static void expand_atom(struct strbuf *sb, const char 
*atom, int len,
if (data->mark_query)
data->info.sizep = >size;
else
-   strbuf_addf(sb, "%lu", data->size);
+   strbuf_addf(sb, "%" PRIuMAX, (uintmax_t)data->size);
} else if (is_atom("objectsize:disk", atom, len)) {
if (data->mark_query)
data->info.disk_sizep = >disk_size;
diff --git a/builtin/grep.c b/builtin/grep.c
index 9be8f817f2f..e3e33f3baab 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -437,7 +437,7 @@ static int grep_submodule(struct grep_opt *opt, struct 
repository *superproject,
struct object *object;
struct tree_desc tree;
void *data;
-   unsigned long size;
+   size_t size;
struct strbuf base = STRBUF_INIT;
 
object = parse_object_or_die(oid, oid_to_hex(oid));
diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 3ab99c47a0c..b09d9cba7e4 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -99,7 +99,7 @@ static int show_tree(const unsigned char *sha1, struct strbuf 
*base,
  "BAD");
else
xsnprintf(size_text, sizeof(size_text),
- "%lu", size);
+ "%" PRIuMAX, (uintmax_t)size);
} else
xsnprintf(size_text, sizeof(size_text), "-");
printf("%06o %s %s %7s\t", mode, type,
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 126e7097bf6..d94fd170243 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1863,9 +1863,10 @@ static int try_delta(struct unpacked *trg, struct 
unpacked *src,
die("object %s cannot be read",
oid_to_hex(_entry->idx.oid));
if (sz != trg_size)
-   die("object %s inconsistent object length (%lu vs %lu)",
-   oid_to_hex(_entry->idx.oid), sz,
-   trg_size);
+   die("object %s inconsistent object length (%" PRIuMAX
+   " vs %" PRIuMAX ")",
+   oid_to_hex(_entry->idx.oid), (uintmax_t)sz,
+   (uintmax_t)trg_size);
*mem_usage += sz;
}
if (!src->data) {
@@ -1891,9 +1892,10 @@ static int try_delta(struct unpacked *trg, struct 
unpacked *src,
oid_to_hex(_entry->idx.oid));
}
if (sz != src_size)
-   die("object %s inconsistent object length (%lu vs %lu)",
-   oid_to_hex(_entry->idx.oid), sz,
-   src_size);
+   die("object %s inconsistent object length (%" PRIuMAX
+   " vs %" PRIuMAX ")",
+   oid_to_hex(_entry->idx.oid), (uintmax_t)sz,
+   (uintmax_t)src_size);
*mem_usage += sz;
}
if (!src->index) {
diff --git a/diff.c b/diff.c
index 6084487a2fc..1e815e87acb 100644
--- a/diff.c
+++ b/diff.c
@@ -2971,7 +2971,7 @@ static void emit_binary_diff_body(struct diff_options *o,
}
 
if (delta && delta_size < deflate_size) {
-   char *s = xstrfmt("%lu", orig_size);
+   char *s = xstrfmt("%" PRIuMAX, (uintmax_t)orig_size);
emit_diff_symbol(o, 

Re: [PATCH] Convert size datatype to size_t

2017-08-09 Thread Junio C Hamano
Martin Koegler  writes:

> On Wed, Aug 09, 2017 at 09:19:06AM +0200, Martin Koegler wrote:
>> On Tue, Aug 08, 2017 at 11:04:20PM -0700, Junio C Hamano wrote:
>> > As https://travis-ci.org/git/git/jobs/262463159 shows, unfortunately
>> > it turns out that things are not so simple X-<.  On Linux32, size_t
>> > is uint, which is the same size as ulong, but "%lu" is not appropriate
>> > for showing a size_t value.
>> > 
>> > So you are correct to say in the comment under three-dashes that
>> > there is much more to change in the codebase.
>> 
>> I'll post a new version fixing the *printf issues.
>
> What is the correct format specifier for size_t?
> Linux has %zu (C99). Is that OK for the GIT codebase?

I haven't double checked, but IIRC we cast to uintmax_t and use
PRIuMAX.


Re: [PATCH] Convert size datatype to size_t

2017-08-09 Thread Martin Koegler
On Wed, Aug 09, 2017 at 09:19:06AM +0200, Martin Koegler wrote:
> On Tue, Aug 08, 2017 at 11:04:20PM -0700, Junio C Hamano wrote:
> > As https://travis-ci.org/git/git/jobs/262463159 shows, unfortunately
> > it turns out that things are not so simple X-<.  On Linux32, size_t
> > is uint, which is the same size as ulong, but "%lu" is not appropriate
> > for showing a size_t value.
> > 
> > So you are correct to say in the comment under three-dashes that
> > there is much more to change in the codebase.
> 
> I'll post a new version fixing the *printf issues.

What is the correct format specifier for size_t?
Linux has %zu (C99). Is that OK for the GIT codebase?

Regards,
Martin


Re: [PATCH] Convert size datatype to size_t

2017-08-09 Thread Martin Koegler
On Tue, Aug 08, 2017 at 11:04:20PM -0700, Junio C Hamano wrote:
> Martin Koegler  writes:
> 
> > From: Martin Koegler 
> >
> > It changes the signature of the core object access function
> > including any other functions to assure a clean compile if
> > sizeof(size_t) != sizeof(unsigned long).
> 
> As https://travis-ci.org/git/git/jobs/262463159 shows, unfortunately
> it turns out that things are not so simple X-<.  On Linux32, size_t
> is uint, which is the same size as ulong, but "%lu" is not appropriate
> for showing a size_t value.
> 
> So you are correct to say in the comment under three-dashes that
> there is much more to change in the codebase.

My patch is based on V2 of my "Fix delta integer overflows" patch [also 
changing the variable types].

I'll post a new version fixing the *printf issues.

The patch can't be splitted without causing compile errors, if sizeof(size_t) 
!= sizeof(ulong).
Many variables are not changed to keep the patch small.

Regards,
Martin


Re: [PATCH] Convert size datatype to size_t

2017-08-09 Thread Junio C Hamano
Martin Koegler  writes:

> From: Martin Koegler 
>
> It changes the signature of the core object access function
> including any other functions to assure a clean compile if
> sizeof(size_t) != sizeof(unsigned long).

As https://travis-ci.org/git/git/jobs/262463159 shows, unfortunately
it turns out that things are not so simple X-<.  On Linux32, size_t
is uint, which is the same size as ulong, but "%lu" is not appropriate
for showing a size_t value.

So you are correct to say in the comment under three-dashes that
there is much more to change in the codebase.




Re: [PATCH] Convert size datatype to size_t

2017-08-08 Thread Junio C Hamano
Martin Koegler  writes:

> diff --git a/apply.c b/apply.c
> index f2d5991..ea97fd2 100644
> --- a/apply.c
> +++ b/apply.c
> @@ -3085,7 +3085,7 @@ static int apply_binary_fragment(struct apply_state 
> *state,
>struct patch *patch)
>  {
>   struct fragment *fragment = patch->fragments;
> - unsigned long len;
> + size_t len;
>   void *dst;
>  
>   if (!fragment)

This variable is made size_t because it receives the result size of
patch_delta().  And it later is assigned to img->len field, which
already is size_t before this patch.

Curiously, the patch_delta() invocation with this patch reads like
this:

dst = patch_delta(img->buf, img->len, fragment->patch,
  fragment->size, );

where patch_delta() is updated (correctly) to:

void *patch_delta(const void *src_buf, size_t src_size,
  const void *delta_buf, size_t delta_size,
  size_t *dst_size)

with this patch.  But "size" field in "struct fragment" is still
"int".  So we'd need to update it as well, not necessarily in this
patch (which is already too big) but as part of a larger whole.

> @@ -3174,7 +3174,7 @@ static int apply_binary(struct apply_state *state,
>   if (has_sha1_file(oid.hash)) {
>   /* We already have the postimage */
>   enum object_type type;
> - unsigned long size;
> + size_t size;
>   char *result;
>  
>   result = read_sha1_file(oid.hash, , );

This is to receive the resulting size from read_sha1_file().  It is
assigned to img->len, which is already size_t, so all is good here.

> @@ -3236,7 +3236,7 @@ static int read_blob_object(struct strbuf *buf, const 
> struct object_id *oid, uns
>   strbuf_addf(buf, "Subproject commit %s\n", oid_to_hex(oid));
>   } else {
>   enum object_type type;
> - unsigned long sz;
> + size_t sz;
>   char *result;
>  
>   result = read_sha1_file(oid->hash, , );

By reading the remainder of this function, this conversion also is
good.  sz that is now size_t is used as the size attached to an
existing strbuf like so:

result = read_sha1_file(oid->hash, , );
if (!result)
return -1;
/* XXX read_sha1_file NUL-terminates */
strbuf_attach(buf, result, sz, sz + 1);

in the part beyond the post context of this hunk.  In the longer
term, sz+1 we see here may want to become the overflow-safe variant
st_add().

As you said in the comment after three-dashes in the patch, a lot
more work is needed and your patch is a good starting point.

I am not sure if we can split the patch somehow to make it easier to
review.  The deceptively small part of your patch, i.e.

 apply.c  |  6 +++---

needs the above analysis to see if they are correct and what more
work is necessary, and there are 65 more files with ~190 lines
changed.



Re: [PATCH] Convert size datatype to size_t

2017-08-08 Thread Junio C Hamano
Martin Koegler  writes:

> From: Martin Koegler 
>
> It changes the signature of the core object access function
> including any other functions to assure a clean compile if
> sizeof(size_t) != sizeof(unsigned long).
>
> Signed-off-by: Martin Koegler 
> ---
> ...
>  66 files changed, 193 insertions(+), 191 deletions(-)

Wow, that's a lot of changes.  What version did you worked this
patch on?  The reason I ask is because...

> diff --git a/diff-delta.c b/diff-delta.c
> index cd238c8..3d5e1ef 100644

... I do not see any version of Git that had blob cd238c8 at that
path, so "git am -3" is having hard time applying this pach to allow
me reviewing it.


[PATCH] Convert size datatype to size_t

2017-08-08 Thread Martin Koegler
From: Martin Koegler 

It changes the signature of the core object access function
including any other functions to assure a clean compile if
sizeof(size_t) != sizeof(unsigned long).

Signed-off-by: Martin Koegler 
---

There is much more to change in the codebase. It JUST fixes the signature
of some functions.

 apply.c  |  6 +++---
 archive-tar.c|  4 ++--
 archive-zip.c|  2 +-
 archive.c|  2 +-
 archive.h|  2 +-
 blame.c  |  4 ++--
 blame.h  |  2 +-
 builtin/cat-file.c   | 10 +-
 builtin/difftool.c   |  2 +-
 builtin/fast-export.c|  6 +++---
 builtin/fmt-merge-msg.c  |  2 +-
 builtin/fsck.c   |  2 +-
 builtin/grep.c   |  6 +++---
 builtin/index-pack.c | 12 ++--
 builtin/log.c|  4 ++--
 builtin/ls-tree.c|  2 +-
 builtin/merge-tree.c |  6 +++---
 builtin/mktag.c  |  2 +-
 builtin/notes.c  |  6 +++---
 builtin/pack-objects.c   | 10 +-
 builtin/reflog.c |  2 +-
 builtin/tag.c|  4 ++--
 builtin/unpack-file.c|  2 +-
 builtin/unpack-objects.c | 14 +++---
 builtin/verify-commit.c  |  2 +-
 bundle.c |  2 +-
 cache.h  | 22 +++---
 combine-diff.c   |  9 +
 commit.c |  6 +++---
 config.c |  2 +-
 delta.h  | 26 +-
 diff-delta.c | 16 
 diff.c   | 14 +++---
 diff.h   |  2 +-
 diffcore.h   |  2 +-
 dir.c|  2 +-
 entry.c  |  4 ++--
 fast-import.c| 18 +-
 fsck.c   |  2 +-
 grep.h   |  2 +-
 http-push.c  |  2 +-
 mailmap.c|  2 +-
 match-trees.c|  4 ++--
 merge-blobs.c|  6 +++---
 merge-blobs.h|  2 +-
 merge-recursive.c|  4 ++--
 notes-cache.c|  2 +-
 notes-merge.c|  2 +-
 notes.c  |  6 +++---
 object.c |  2 +-
 pack-check.c |  2 +-
 pack-objects.h   |  6 +++---
 patch-delta.c| 11 ++-
 read-cache.c |  4 ++--
 ref-filter.c |  4 ++--
 remote-testsvn.c |  4 ++--
 rerere.c |  2 +-
 sha1_file.c  | 44 ++--
 streaming.c  |  8 
 streaming.h  |  2 +-
 submodule-config.c   |  2 +-
 t/helper/test-delta.c|  2 +-
 tag.c|  4 ++--
 tree-walk.c  |  8 
 tree.c   |  2 +-
 xdiff-interface.c|  2 +-
 66 files changed, 193 insertions(+), 191 deletions(-)

diff --git a/apply.c b/apply.c
index f2d5991..ea97fd2 100644
--- a/apply.c
+++ b/apply.c
@@ -3085,7 +3085,7 @@ static int apply_binary_fragment(struct apply_state 
*state,
 struct patch *patch)
 {
struct fragment *fragment = patch->fragments;
-   unsigned long len;
+   size_t len;
void *dst;
 
if (!fragment)
@@ -3174,7 +3174,7 @@ static int apply_binary(struct apply_state *state,
if (has_sha1_file(oid.hash)) {
/* We already have the postimage */
enum object_type type;
-   unsigned long size;
+   size_t size;
char *result;
 
result = read_sha1_file(oid.hash, , );
@@ -3236,7 +3236,7 @@ static int read_blob_object(struct strbuf *buf, const 
struct object_id *oid, uns
strbuf_addf(buf, "Subproject commit %s\n", oid_to_hex(oid));
} else {
enum object_type type;
-   unsigned long sz;
+   size_t sz;
char *result;
 
result = read_sha1_file(oid->hash, , );
diff --git a/archive-tar.c b/archive-tar.c
index c6ed96e..719673d 100644
--- a/archive-tar.c
+++ b/archive-tar.c
@@ -115,7 +115,7 @@ static int stream_blocked(const unsigned char *sha1)
 {
struct git_istream *st;
enum object_type type;
-   unsigned long sz;
+   size_t sz;
char buf[BLOCKSIZE];
ssize_t readlen;
 
@@ -240,7 +240,7 @@ static int write_tar_entry(struct archiver_args *args,
struct ustar_header header;
struct strbuf ext_header = STRBUF_INIT;
unsigned int old_mode = mode;
-   unsigned long size, size_in_header;
+   size_t size, size_in_header;
void *buffer;
int err = 0;
 
diff --git a/archive-zip.c b/archive-zip.c
index e8913e5..4492d64 100644
--- a/archive-zip.c
+++ b/archive-zip.c
@@ -295,7 +295,7 @@ static int write_zip_entry(struct archiver_args *args,
void *buffer;
struct git_istream *stream = NULL;
unsigned long flags = 0;
-