On Mon, 2011-09-26 at 15:22 -0700, Fox, Kevin M wrote:
> On Mon, 2011-09-26 at 14:48 -0700, Sergey Poznyakoff wrote:
> > Kevin Fox <[email protected]> wrote:
> >
> > > Tar starts padding out the file
> > > with 0's up until the file stat size and returns a message like:
> > > NET.csv: File shrank by 45427965 bytes; padding with zeros
> >
> > This means that a read returned 0 bytes,
>
> Nope. It's clearly returning a short read, not a 0.
>
> I see a read of 8704 bytes (10240 minus the tar headers, I'm assuming),
> then 3275 reads of 10240 bytes each,
> then the short read of 9728 bytes (sums to exactly 32 MiB, i.e. 33554432
> bytes). After that, tar never issues another read on that descriptor.
>
> > but not all file contents have
> > been read.
> >
> > > I believe in this case, tar should try to fill its remaining buffer
> > > space with another read
> >
> > That's what it actually does, but hits EOF.
>
> I found the code in question and wrote a rough fix that works for me.
> I'll submit a patch to the mailing list as soon as I can get it through
> legal.
Wow. Legal was really quick today. Please see the attached patch. It's
very rough, but illustrates the problem and a potential solution.
Thanks,
Kevin
>
> In the meantime, if you want to have a look:
>
> Search for safe_read in src/create.c. There is only one call to it in
> that file. The code gets count, checks for an error, then bails if
> count != bufsize, never retrying in the short read case. I haven't looked
> further to see if there are any other code paths that have similar bugs.
>
> Thanks,
> Kevin
>
> > Are you sure it's not a bug
> > of the underlying filesystem?
> >
> > Regards,
> > Sergey
>
>
>
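For anyone who wants to see the retry idea without the rest of tar around
it, here is a minimal standalone sketch. It is not the tar code: read_full
is a hypothetical helper name, and it calls plain read() rather than
gnulib's safe_read. It only shows the general pattern of treating a short
read as "ask again for the rest" and treating a 0-byte read as the only
real EOF.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not part of tar: keep reading until the buffer is
   full, EOF is reached, or a real error occurs.  Returns the number of
   bytes stored in buf, or -1 on error with errno set by read().  */
static ssize_t
read_full (int fd, char *buf, size_t bufsize)
{
  size_t total = 0;

  assert (buf != NULL);
  while (total < bufsize)
    {
      ssize_t n = read (fd, buf + total, bufsize - total);
      if (n < 0)
        {
          if (errno == EINTR)
            continue;           /* interrupted before any data: retry */
          return -1;
        }
      if (n == 0)
        break;                  /* genuine EOF: the file really is shorter */
      total += (size_t) n;      /* short read: loop and ask for the rest */
    }
  return (ssize_t) total;
}
```

With a helper along these lines, a caller would only conclude that the
file shrank when the returned total is less than bufsize, since at that
point a 0-byte read has actually been seen.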
--- src/create.c.orig 2011-09-26 11:26:52.000000000 -0700
+++ src/create.c 2011-09-26 12:01:27.000000000 -0700
@@ -1014,6 +1014,7 @@
off_t size_left = st->stat.st_size;
off_t block_ordinal;
union block *blk;
+ char *buffer;
block_ordinal = current_block_ordinal ();
blk = start_header (st);
@@ -1029,7 +1030,7 @@
mv_begin_write (st->file_name, st->stat.st_size, st->stat.st_size);
while (size_left > 0)
{
- size_t bufsize, count;
+ size_t bufsize, count, prevcount;
blk = find_next_block ();
@@ -1044,7 +1045,10 @@
memset (blk->buffer + size_left, 0, BLOCKSIZE - count);
}
- count = (fd <= 0) ? bufsize : safe_read (fd, blk->buffer, bufsize);
+for(prevcount=0;;)
+{
+ buffer = blk->buffer + prevcount;
+ count = (fd <= 0) ? bufsize : safe_read (fd, buffer, bufsize - prevcount);
if (count == SAFE_READ_ERROR)
{
read_diag_details (st->orig_file_name,
@@ -1053,6 +1057,13 @@
return dump_status_short;
}
size_left -= count;
+ if (prevcount + count == bufsize || count == 0)
+ {
+ count += prevcount;
+ break;
+ }
+ prevcount += count;
+}
set_next_block_after (blk + (bufsize - 1) / BLOCKSIZE);
if (count != bufsize)