Re: pax: GNU tar base-256 size field support

2023-04-22 Thread Marc Espie
On Tue, Apr 18, 2023 at 08:10:06PM -0600, Todd C. Miller wrote:
> I recently ran into a problem with busybox tar generating archives
> where the size field is base-256 encoded for files larger than 8GB.
> Apparently this is a GNU tar extension.
> 
> Do we want to support this in pax?  Below is an initial diff that
> at least produces the correct results when listing the archive.  I
> have not yet verified that it gets extracted correctly.  If there
> is interest I will do some more testing.

It's likely we will encounter this, sooner or later, in ports.
Either in source archives, or in data files for a game or
something.

It's also likely obscure enough that it won't be diagnosed
correctly by the people running into it, so we may end up with
another port requiring GNU tar as a dependency.

I'd say that yes, at least supporting proper diagnostics for that
would be a good idea.
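
For illustration only, a diagnostic of this kind could start from a
check on the first byte of the size field, since the base-256 form is
flagged by its high bit. This is a minimal sketch, not code from pax:
the helper name size_field_is_base256 is hypothetical, and it assumes
the 12-byte size field of the ustar header.

```c
#include <assert.h>
#include <stddef.h>

#define SIZE_FIELD_LEN	12	/* assumed width of the ustar size field */

/*
 * Hypothetical helper, not part of pax: report whether a size field
 * uses the GNU base-256 encoding.  Plain octal fields hold ASCII
 * digits, blanks or NUL, so their first byte never has bit 7 set.
 */
static int
size_field_is_base256(const char *field)
{
	assert(field != NULL);
	return ((unsigned char)field[0] & 0x80) != 0;
}
```

A caller that does not decode the value could use this to warn that
the archive uses a GNU extension instead of silently reading the size
as 0.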



pax: GNU tar base-256 size field support

2023-04-18 Thread Todd C. Miller
I recently ran into a problem with busybox tar generating archives
where the size field is base-256 encoded for files larger than 8GB.
Apparently this is a GNU tar extension.

Do we want to support this in pax?  Below is an initial diff that
at least produces the correct results when listing the archive.  I
have not yet verified that it gets extracted correctly.  If there
is interest I will do some more testing.

 - todd

Index: bin/pax/gen_subs.c
===================================================================
RCS file: /cvs/src/bin/pax/gen_subs.c,v
retrieving revision 1.32
diff -u -p -u -r1.32 gen_subs.c
--- bin/pax/gen_subs.c  26 Aug 2016 05:06:14 -  1.32
+++ bin/pax/gen_subs.c  19 Apr 2023 02:05:19 -
@@ -45,6 +45,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "pax.h"
 #include "extern.h"
@@ -279,9 +280,10 @@ ul_asc(u_long val, char *str, int len, i
 
 /*
  * asc_ull()
- * Convert hex/octal character string into a unsigned long long.
- * We do not have to check for overflow!  (The headers in all
- * supported formats are not large enough to create an overflow).
+ * Convert hex/octal/base-256 character string into a unsigned long long.
+ * We only have to check for overflow when parsing base-256.
+ * The headers in all supported formats are not large enough to create
+ * an overflow for hex or octal.
  * NOTE: strings passed to us are NOT TERMINATED.
  * Return:
  * unsigned long long value
@@ -296,9 +298,9 @@ asc_ull(char *str, int len, int base)
 	stop = str + len;
 
 	/*
-	 * skip over leading blanks and zeros
+	 * skip over leading blanks
 	 */
-	while ((str < stop) && ((*str == ' ') || (*str == '0')))
+	while ((str < stop) && (*str == ' '))
 		++str;
 
 	/*
@@ -316,7 +318,17 @@ asc_ull(char *str, int len, int base)
 			else
 				break;
 		}
+	} else if (*str == '\200') {
+		/* base-256 encoding, GNU tar extension */
+		while (++str < stop) {
+			if (tval + (unsigned char)*str > ULLONG_MAX >> 8) {
+				/* overflow */
+				return(-1);
+			}
+			tval = (tval << 8) + (unsigned char)*str;
+		}
 	} else {
+		/* octal */
 		while ((str < stop) && (*str >= '0') && (*str <= '7'))
 			tval = (tval << 3) + (*str++ - '0');
 	}
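
As a standalone illustration of the base-256 branch in the diff, the
sketch below decodes one field outside of pax. It is not the code
under review: the name decode_base256 and its out-parameter interface
are hypothetical. Like the diff, it only accepts a first byte of
exactly 0x80. Unlike the diff, it reports failure through the return
value rather than returning -1 from an unsigned function, and it tests
tval alone before shifting. Both are choices made for this sketch.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of pax: decode a base-256 field whose
 * first byte is the 0x80 marker.  Returns 0 and stores the value in
 * *out on success, or -1 if the field is not base-256 or the value
 * does not fit in an unsigned long long.
 */
static int
decode_base256(const char *field, size_t len, unsigned long long *out)
{
	unsigned long long tval = 0;
	size_t i;

	assert(field != NULL && out != NULL);
	if (len == 0 || (unsigned char)field[0] != 0x80)
		return -1;
	for (i = 1; i < len; i++) {
		/* shifting left by 8 must not drop any set bits */
		if (tval > (ULLONG_MAX >> 8))
			return -1;
		tval = (tval << 8) | (unsigned char)field[i];
	}
	*out = tval;
	return 0;
}
```

With a 12-byte field holding 0x80 followed by eleven value bytes, a
0x02 in byte 7 decodes to 2 * 256^4, which is the 8GB boundary
mentioned at the top of the thread.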