Hi!

I had a look at the 4GB per packaged file limit. The current cpio format [1] uses 8 bytes to encode each 32-bit integer as a hexadecimal ASCII string, so a file size can never exceed 2^32 - 1 bytes. There is no way of fixing this problem while staying compatible with the cpio format (and keeping rpm2cpio working).
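To illustrate where the limit comes from, here is a minimal sketch of writing one numeric header field. The helper name newc_put_field and the NEWC_FIELD_LEN constant are made up for this example and are not taken from rpm or cpio; only the "8 hexadecimal digits per field" layout comes from the format description in [1].

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define NEWC_FIELD_LEN 8   /* each numeric field is 8 hex digits, no NUL */

/* Hypothetical helper: store `value` as 8 hexadecimal ASCII digits.
 * Returns 0 on success, -1 if the value does not fit into 32 bits. */
static int newc_put_field(char field[NEWC_FIELD_LEN], uint64_t value)
{
    char buf[NEWC_FIELD_LEN + 1];

    if (value > UINT32_MAX)
        return -1;                      /* 8 hex digits cannot hold it */
    snprintf(buf, sizeof(buf), "%08" PRIx32, (uint32_t)value);
    memcpy(field, buf, NEWC_FIELD_LEN); /* copy digits without the NUL */
    return 0;
}
```

The largest size that still fits is 0xffffffff, one byte short of 4GB; anything bigger has no representation in the field.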

Having looked at the tar formats, I do not believe that switching to tar is a real option. The format is just horrible (GNU tar needs over 200 lines to read an integer out of a header field) and full of hacks to remain backward compatible (header in header + extensions). All of this would not be that bad if there were a nice little library we could link against...

My favorite solution would be to use no payload format at all and just rely on the metadata we ship in the header anyway. While this would surely be possible, it requires redoing the hard link handling (hard links are treated specially in the payload, e.g. the content is shipped just once) and modifying the next upper layer within rpm (fsm.c), which is probably the most horrible place in the whole code base. Volunteers welcome!

A much simpler alternative would be to use a slightly modified cpio format. With a new magic for large files we could just put a binary integer into the c_filesize field (or into all integer fields). Another solution would be to keep the hexadecimal encoding and double the width of c_filesize, or of some more integer fields as well. Either change renders the payload incompatible with cpio, but only if it contains large files.
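A rough sketch of the two encodings, to make the comparison concrete. All function names are invented for this example, and the byte-shifting code is just one portable way to get a little-endian layout without relying on endian.h; treat it as an illustration under those assumptions, not as a proposal for the final on-disk format.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Option 1: keep the 8-byte field, but store the size as a
 * little-endian binary integer instead of hex digits. */
static void put_size_binary(unsigned char field[8], uint64_t size)
{
    for (int i = 0; i < 8; i++)
        field[i] = (unsigned char)(size >> (8 * i));
}

static uint64_t get_size_binary(const unsigned char field[8])
{
    uint64_t size = 0;

    for (int i = 0; i < 8; i++)
        size |= (uint64_t)field[i] << (8 * i);
    return size;
}

/* Option 2: keep hexadecimal ASCII, but widen the field to 16 digits.
 * This changes the header size, so every later field moves. */
static void put_size_hex16(char field[16], uint64_t size)
{
    char buf[17];

    snprintf(buf, sizeof(buf), "%016" PRIx64, size);
    memcpy(field, buf, 16);             /* copy digits without the NUL */
}
```

Option 1 leaves the header length untouched, which is why it needs so little code on the reading side. Option 2 stays human readable but shifts the offsets of all following fields.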

I did not yet ask cpio upstream or our cpio package maintainer about accepting patches to at least read such archives...

The attached patch uses a binary integer for large file sizes. It is untested and assumes that everything else that deals with file sizes is already 64-bit safe.

Comments? Ideas? Panic?

Florian

[1] http://people.freebsd.org/~kientzle/libarchive/man/cpio.5.txt (see "New ASCII Format")


From 4804fcca8a2ee33e60d8a7a042c01443e35b2ae9 Mon Sep 17 00:00:00 2001
From: Florian Festi <ffe...@redhat.com>
Date: Wed, 9 Sep 2009 12:06:54 +0200
Subject: [PATCH] Add support for 64 bit file sizes to cpio

---
 lib/cpio.c |   27 +++++++++++++++++++++------
 lib/cpio.h |    3 ++-
 2 files changed, 23 insertions(+), 7 deletions(-)

diff --git a/lib/cpio.c b/lib/cpio.c
index 9003494..db6acbf 100644
--- a/lib/cpio.c
+++ b/lib/cpio.c
@@ -8,7 +8,7 @@
  */
 
 #include "system.h"
-
+#include <endian.h>
 #include <rpm/rpmio.h>
 #include <rpm/rpmlog.h>
 
@@ -87,15 +87,21 @@ int cpioHeaderWrite(FSM_t fsm, struct stat * st)
     dev_t dev;
     int rc = 0;
 
-    memcpy(hdr->magic, CPIO_NEWC_MAGIC, sizeof(hdr->magic));
     SET_NUM_FIELD(hdr->inode, st->st_ino, field);
     SET_NUM_FIELD(hdr->mode, st->st_mode, field);
     SET_NUM_FIELD(hdr->uid, st->st_uid, field);
     SET_NUM_FIELD(hdr->gid, st->st_gid, field);
     SET_NUM_FIELD(hdr->nlink, st->st_nlink, field);
     SET_NUM_FIELD(hdr->mtime, st->st_mtime, field);
-    SET_NUM_FIELD(hdr->filesize, st->st_size, field);
-
+    if (st->st_size <= UINT32_MAX) {
+       SET_NUM_FIELD(hdr->filesize, st->st_size, field);
+       memcpy(hdr->magic, CPIO_NEWC_MAGIC, sizeof(hdr->magic));
+
+    } else {
+       uint64_t fsize = htole64(st->st_size);
+       memcpy(hdr->filesize, &fsize, sizeof(fsize));
+       memcpy(hdr->magic, CPIO_LARGE_MAGIC, sizeof(hdr->magic));
+    }
     dev = major(st->st_dev); SET_NUM_FIELD(hdr->devMajor, dev, field);
     dev = minor(st->st_dev); SET_NUM_FIELD(hdr->devMinor, dev, field);
     dev = major(st->st_rdev); SET_NUM_FIELD(hdr->rdevMajor, dev, field);
@@ -122,6 +128,8 @@ int cpioHeaderRead(FSM_t fsm, struct stat * st)
     char * end;
     unsigned int major, minor;
     int rc = 0;
+    /* Nonzero unless the header carries CPIO_LARGE_MAGIC; set after the read. */
+    int small = 1;
 
     fsm->wrlen = PHYS_HDR_SIZE;
     rc = fsmNext(fsm, FSM_DREAD);
@@ -131,7 +139,8 @@ int cpioHeaderRead(FSM_t fsm, struct stat * st)
     memcpy(&hdr, fsm->wrbuf, fsm->rdnb);
 
     if (strncmp(CPIO_CRC_MAGIC, hdr.magic, sizeof(CPIO_CRC_MAGIC)-1) &&
-       strncmp(CPIO_NEWC_MAGIC, hdr.magic, sizeof(CPIO_NEWC_MAGIC)-1))
+       strncmp(CPIO_NEWC_MAGIC, hdr.magic, sizeof(CPIO_NEWC_MAGIC)-1) &&
+       (small = strncmp(CPIO_LARGE_MAGIC, hdr.magic, sizeof(CPIO_LARGE_MAGIC)-1)))
        return CPIOERR_BAD_MAGIC;
 
     GET_NUM_FIELD(hdr.inode, st->st_ino);
@@ -140,7 +149,13 @@ int cpioHeaderRead(FSM_t fsm, struct stat * st)
     GET_NUM_FIELD(hdr.gid, st->st_gid);
     GET_NUM_FIELD(hdr.nlink, st->st_nlink);
     GET_NUM_FIELD(hdr.mtime, st->st_mtime);
-    GET_NUM_FIELD(hdr.filesize, st->st_size);
+    if (small) {
+       GET_NUM_FIELD(hdr.filesize, st->st_size);
+    } else {
+       uint64_t fsize;
+       memcpy(&fsize, hdr.filesize, sizeof(fsize));
+       st->st_size = le64toh(fsize);
+    }
 
     GET_NUM_FIELD(hdr.devMajor, major);
     GET_NUM_FIELD(hdr.devMinor, minor);
diff --git a/lib/cpio.h b/lib/cpio.h
index 9453184..faef98a 100644
--- a/lib/cpio.h
+++ b/lib/cpio.h
@@ -60,10 +60,11 @@ enum cpioErrorReturns {
  * The max size of the entire archive is unlimited from cpio POV,
  * but subject to filesystem limitations.
  */
-#define CPIO_FILESIZE_MAX UINT32_MAX
+#define CPIO_FILESIZE_MAX UINT64_MAX
 
 #define CPIO_NEWC_MAGIC        "070701"
 #define CPIO_CRC_MAGIC "070702"
+#define CPIO_LARGE_MAGIC       "07070X"
 #define CPIO_TRAILER   "TRAILER!!!"
 
 /** \ingroup payload
-- 
1.6.2.5
