Re: [PATCH v4 3/5] archive-tar: write extended headers for file sizes >= 8GB

2016-07-14 Thread Torsten Bögershausen



On 07/14/2016 06:48 PM, Johannes Sixt wrote:

Am 30.06.2016 um 11:09 schrieb Jeff King:

+/*
+ * This is the max value that a ustar size header can specify, as it
is fixed
+ * at 11 octal digits. POSIX specifies that we switch to extended
headers at
+ * this size.
+ */
+#define USTAR_MAX_SIZE 0777UL


This is too large by one bit for our 32-bit unsigned long on Windows:

archive-tar.c: In function 'write_tar_entry':
archive-tar.c:295: warning: integer constant is too large for 'unsigned
long' type
archive-tar.c: In function 'write_global_extended_header':
archive-tar.c:332: warning: integer constant is too large for 'unsigned
long' type
archive-tar.c:335: warning: integer constant is too large for 'unsigned
long' type
archive-tar.c:335: warning: overflow in implicit constant conversion

-- Hannes

Similar problem on 32 Bit Linux:
 In function ‘write_global_extended_header’:
archive-tar.c:29:25: warning: overflow in implicit constant conversion 
[-Woverflow]

 #define USTAR_MAX_MTIME 0777UL
 ^
archive-tar.c:335:16: note: in expansion of macro ‘USTAR_MAX_MTIME’
   args->time = USTAR_MAX_MTIME;

--
I want to volunteer to do more tests on 32 bit Linux ;-)
Does this fix it and look as a good thing to change ?


--- a/archive-tar.c
+++ b/archive-tar.c
@@ -332,7 +332,7 @@ static void write_global_extended_header(struct 
archiver_args *args)

if (args->time > USTAR_MAX_MTIME) {
strbuf_append_ext_header_uint(_header, "mtime",
  args->time);
-   args->time = USTAR_MAX_MTIME;
+   args->time = (time_t)USTAR_MAX_MTIME;
}
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 3/5] archive-tar: write extended headers for file sizes >= 8GB

2016-07-14 Thread Jeff King
On Thu, Jul 14, 2016 at 10:11:18AM -0700, Junio C Hamano wrote:

> Johannes Sixt  writes:
> 
> > Am 30.06.2016 um 11:09 schrieb Jeff King:
> >> +/*
> >> + * This is the max value that a ustar size header can specify, as it is 
> >> fixed
> >> + * at 11 octal digits. POSIX specifies that we switch to extended headers 
> >> at
> >> + * this size.
> >> + */
> >> +#define USTAR_MAX_SIZE 0777UL
> >
> > This is too large by one bit for our 32-bit unsigned long on Windows:
> >
> > archive-tar.c: In function 'write_tar_entry':
> > archive-tar.c:295: warning: integer constant is too large for
> > unsigned long' type
> > archive-tar.c: In function 'write_global_extended_header':
> > archive-tar.c:332: warning: integer constant is too large for
> > unsigned long' type
> > archive-tar.c:335: warning: integer constant is too large for
> > unsigned long' type
> > archive-tar.c:335: warning: overflow in implicit constant conversion
> 
> Yikes.  I guess we need ULL here, and it cascade to all the
> variables used to interact with this constant, but not everybody has
> "long long", so

On 32-bit systems, I suspect the tar limits are a non-issue. For real
filesystem operations, tar on such a system would use off_t. But we use
"unsigned long" internally for object sizes, so I imagine that objects
larger than 4G are simply impossible on such systems.

So one option would be to simply "#if"-out these checks on 32-bit
systems.

I think it would also be OK to just set USTAR_MAX_SIZE to ULONG_MAX on
such a system, too (which effectively eliminates the check, but keeps
the conditional bits contained).

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 3/5] archive-tar: write extended headers for file sizes >= 8GB

2016-07-14 Thread Junio C Hamano
Johannes Sixt  writes:

> Am 30.06.2016 um 11:09 schrieb Jeff King:
>> +/*
>> + * This is the max value that a ustar size header can specify, as it is 
>> fixed
>> + * at 11 octal digits. POSIX specifies that we switch to extended headers at
>> + * this size.
>> + */
>> +#define USTAR_MAX_SIZE 0777UL
>
> This is too large by one bit for our 32-bit unsigned long on Windows:
>
> archive-tar.c: In function 'write_tar_entry':
> archive-tar.c:295: warning: integer constant is too large for
> unsigned long' type
> archive-tar.c: In function 'write_global_extended_header':
> archive-tar.c:332: warning: integer constant is too large for
> unsigned long' type
> archive-tar.c:335: warning: integer constant is too large for
> unsigned long' type
> archive-tar.c:335: warning: overflow in implicit constant conversion

Yikes.  I guess we need ULL here, and it cascade to all the
variables used to interact with this constant, but not everybody has
"long long", so

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 3/5] archive-tar: write extended headers for file sizes >= 8GB

2016-07-14 Thread Johannes Sixt

Am 30.06.2016 um 11:09 schrieb Jeff King:

+/*
+ * This is the max value that a ustar size header can specify, as it is fixed
+ * at 11 octal digits. POSIX specifies that we switch to extended headers at
+ * this size.
+ */
+#define USTAR_MAX_SIZE 0777UL


This is too large by one bit for our 32-bit unsigned long on Windows:

archive-tar.c: In function 'write_tar_entry':
archive-tar.c:295: warning: integer constant is too large for 'unsigned 
long' type

archive-tar.c: In function 'write_global_extended_header':
archive-tar.c:332: warning: integer constant is too large for 'unsigned 
long' type
archive-tar.c:335: warning: integer constant is too large for 'unsigned 
long' type

archive-tar.c:335: warning: overflow in implicit constant conversion

-- Hannes

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v4 3/5] archive-tar: write extended headers for file sizes >= 8GB

2016-06-30 Thread Jeff King
The ustar format has a fixed-length field for the size of
each file entry which is supposed to contain up to 11 bytes
of octal-formatted data plus a NUL or space terminator.

These means that the largest size we can represent is
0777, or 1 byte short of 8GB. The correct solution
for a larger file, according to POSIX.1-2001, is to add an
extended pax header, similar to how we handle long
filenames. This patch does that, and writes zero for the
size field in the ustar header (the last bit is not
mentioned by POSIX, but it matches how GNU tar behaves with
--format=pax).

This should be a strict improvement over the current
behavior, which is to die in xsnprintf with a "BUG".
However, there's some interesting history here.

Prior to f2f0267 (archive-tar: use xsnprintf for trivial
formatting, 2015-09-24), we silently overflowed the "size"
field. The extra bytes ended up in the "mtime" field of the
header, which was then immediately written itself,
overwriting our extra bytes. What that means depends on how
many bytes we wrote.

If the size was 64GB or greater, then we actually overflowed
digits into the mtime field, meaning our value was
effectively right-shifted by those lost octal digits. And
this patch is again a strict improvement over that.

But if the size was between 8GB and 64GB, then our 12-byte
field held all of the actual digits, and only our NUL
terminator overflowed. According to POSIX, there should be a
NUL or space at the end of the field. However, GNU tar seems
to be lenient here, and will correctly parse a size up 64GB
(minus one) from the field. So sizes in this range might
have just worked, depending on the implementation reading
the tarfile.

This patch is mostly still an improvement there, as the 8GB
limit is specifically mentioned in POSIX as the correct
limit. But it's possible that it could be a regression
(versus the pre-f2f0267 state) if all of the following are
true:

  1. You have a file between 8GB and 64GB.

  2. Your tar implementation _doesn't_ know about pax
 extended headers.

  3. Your tar implementation _does_ parse 12-byte sizes from
 the ustar header without a delimiter.

It's probably not worth worrying about such an obscure set
of conditions, but I'm documenting it here just in case.

Helped-by: René Scharfe 
Signed-off-by: Jeff King 
---
 archive-tar.c   | 31 +--
 t/t5000-tar-tree.sh |  4 ++--
 2 files changed, 31 insertions(+), 4 deletions(-)

diff --git a/archive-tar.c b/archive-tar.c
index cb99df2..57a1540 100644
--- a/archive-tar.c
+++ b/archive-tar.c
@@ -18,6 +18,13 @@ static int tar_umask = 002;
 static int write_tar_filter_archive(const struct archiver *ar,
struct archiver_args *args);
 
+/*
+ * This is the max value that a ustar size header can specify, as it is fixed
+ * at 11 octal digits. POSIX specifies that we switch to extended headers at
+ * this size.
+ */
+#define USTAR_MAX_SIZE 0777UL
+
 /* writes out the whole block, but only if it is full */
 static void write_if_needed(void)
 {
@@ -137,6 +144,20 @@ static void strbuf_append_ext_header(struct strbuf *sb, 
const char *keyword,
strbuf_addch(sb, '\n');
 }
 
+/*
+ * Like strbuf_append_ext_header, but for numeric values.
+ */
+static void strbuf_append_ext_header_uint(struct strbuf *sb,
+ const char *keyword,
+ uintmax_t value)
+{
+   char buf[40]; /* big enough for 2^128 in decimal, plus NUL */
+   int len;
+
+   len = xsnprintf(buf, sizeof(buf), "%"PRIuMAX, value);
+   strbuf_append_ext_header(sb, keyword, buf, len);
+}
+
 static unsigned int ustar_header_chksum(const struct ustar_header *header)
 {
const unsigned char *p = (const unsigned char *)header;
@@ -208,7 +229,7 @@ static int write_tar_entry(struct archiver_args *args,
struct ustar_header header;
struct strbuf ext_header = STRBUF_INIT;
unsigned int old_mode = mode;
-   unsigned long size;
+   unsigned long size, size_in_header;
void *buffer;
int err = 0;
 
@@ -267,7 +288,13 @@ static int write_tar_entry(struct archiver_args *args,
memcpy(header.linkname, buffer, size);
}
 
-   prepare_header(args, , mode, size);
+   size_in_header = size;
+   if (S_ISREG(mode) && size > USTAR_MAX_SIZE) {
+   size_in_header = 0;
+   strbuf_append_ext_header_uint(_header, "size", size);
+   }
+
+   prepare_header(args, , mode, size_in_header);
 
if (ext_header.len > 0) {
err = write_extended_header(args, sha1, ext_header.buf,
diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh
index 950bdd3..93c2d34 100755
--- a/t/t5000-tar-tree.sh
+++ b/t/t5000-tar-tree.sh
@@ -360,7 +360,7 @@ test_expect_success 'set up repository with huge blob' '
 
 # We expect git to die with SIGPIPE here (otherwise