retitle 1010024 pristine-tar: fails to handle paths with non-ASCII in
quit

Hi,

I spent some more time looking at this. The cause of this particular issue is that unquote_filename() doesn't handle the escaped high-bit characters properly, so the \[0-7]{3} escapes end up being treated literally.

The attached tarball (rclone_1.60.0.orig.tar.gz) demonstrates the problem.

A minimal works-for-me patch is:
diff --git a/pristine-tar b/pristine-tar
index 081dca1..4810215 100755
--- a/pristine-tar
+++ b/pristine-tar
@@ -370,6 +370,7 @@ sub unquote_filename {
   $filename =~ s/\\t/\t/g;
   $filename =~ s/\\v/\x11/g;
   $filename =~ s/\\\\/\\/g;
+  $filename =~ s/\\([0-7]{3})/chr oct $1/eg;

   return $filename;
 }
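
To illustrate what the added substitution does, here is a rough Python sketch (not pristine-tar code; the function name decode_octal_escapes is made up for this example). When tar escapes a high-bit byte it lists it as a backslash followed by three octal digits, so a UTF-8 "é" would show up as \303\251. Decoding maps each such escape back to a single byte:

```python
import re

def decode_octal_escapes(quoted: bytes) -> bytes:
    """Replace each backslash + three octal digits with the byte it encodes."""
    # Restricting the first digit to [0-3] keeps the value within 0..255.
    return re.sub(
        rb"\\([0-3][0-7]{2})",
        lambda m: bytes([int(m.group(1), 8)]),
        quoted,
    )

# \303\251 is the UTF-8 encoding of U+00E9 written as octal escapes.
print(decode_octal_escapes(rb"caf\303\251.txt"))  # b'caf\xc3\xa9.txt'
```

This covers only the octal escapes; the other escapes (\n, \t, \\ and so on) still need their own handling.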

...but actually there's a deeper problem here, as alluded to in https://salsa.debian.org/debian/pristine-tar/-/merge_requests/4

which is that pristine-tar is now somewhat confused as to whether the manifest entries are meant to be quoted or not (which results in patches like the fix for #933031). The problem is that neither naive approach works:

i) if you use quoted paths, you then cannot use --verbatim-files-from (since it doesn't unquote them) and lose on paths starting with -

ii) if you use unquoted paths, then you need to use --verbatim-files-from (otherwise you get stuck on paths starting with -) and then lose on paths containing a newline
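
The newline half of that dilemma can be seen without tar at all. The following Python sketch (purely illustrative; the variable names are made up) writes one hazardous name into a newline-delimited manifest and into a NUL-delimited one, then splits each the way a reader of that format would:

```python
# One member name that starts with "-" and contains a newline and a backslash.
name = b"-bar/test\nnewline\\x2foo"

newline_manifest = name + b"\n"
nul_manifest = name + b"\0"

# A newline-delimited reader sees two bogus records...
newline_records = newline_manifest.split(b"\n")[:-1]
# ...while a NUL-delimited reader recovers the single original name.
nul_records = nul_manifest.split(b"\0")[:-1]

print(newline_records)
print(nul_records)
```

A NUL byte cannot appear in a path, which is why it is the only separator that is safe for arbitrary names.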

Instead, what pristine-tar needs to do is to take quoted paths, unquote them and put \0 between records (rather than \n), and then use the resulting manifest with tar --null -T

To demonstrate, see the attached hazard.tar.gz, which contains one file:
-bar/test\nnewline\\x2foo

There is no argument to tar -tf that produces a manifest file you can feed to tar -T [0]; a transcript demonstrating this is attached (paste_1259177.txt).

If, however, you follow my quoted -> unquote -> NUL-separate approach, it works; the attached mangle.pl (taking the fixed unquote_filename from https://salsa.debian.org/debian/pristine-tar/-/merge_requests/4 and genmanifest from pristine-tar, modified to put NUL between records) makes a manifest "haz_zero" which you can then use with:

tar -xf hazard.tar.gz --null -T haz_zero
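
For readers who want the shape of the quoted -> unquote -> NUL-separate steps without opening the attachment, here is a rough Python sketch. It is not the attached mangle.pl; the names unquote and null_manifest, and the assumption that the escape table below covers everything tar emits, are made up for illustration:

```python
import re

# Single-character escapes assumed to appear in tar's quoted listing.
_SIMPLE = {
    b"a": b"\a", b"b": b"\b", b"f": b"\f", b"n": b"\n",
    b"r": b"\r", b"t": b"\t", b"v": b"\v", b"\\": b"\\",
}
# Either three octal digits or one of the single-character escapes.
_ESCAPE = re.compile(rb"\\([0-3][0-7]{2}|[abfnrtv\\])")

def unquote(quoted: bytes) -> bytes:
    """Decode every escape in a single left-to-right pass."""
    def repl(m):
        body = m.group(1)
        if len(body) == 3:
            return bytes([int(body, 8)])
        return _SIMPLE[body]
    return _ESCAPE.sub(repl, quoted)

def null_manifest(listing: bytes) -> bytes:
    """Turn newline-separated quoted names into NUL-terminated raw names."""
    names = [unquote(line) for line in listing.split(b"\n") if line]
    return b"".join(name + b"\0" for name in names)

# The quoted listing line for the single file in hazard.tar.gz.
listing = rb"-bar/test\nnewline\\x2foo" + b"\n"
print(null_manifest(listing))
```

Decoding all escapes in one regex pass avoids unescaping the output of an earlier substitution a second time: the listing entry \\101 has to come out as \101, not as A. Writing the result to a file in binary mode gives a manifest of the kind passed to tar --null -T above.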

So I think this is the approach that pristine-tar needs to take; I think that would also mean we can remove the slightly hacky fix for #851286.

[I will look at trying to update the above-mentioned MR]

HTH,

Matthew

[0] with old enough tar (1.29 or earlier) this is not quite true, as --verbatim-files-from used to unescape; that was "fixed" for 1.30

Attachment: rclone_1.60.0.orig.tar.gz
Description: application/gzip

matthew@tsk:~/hazard$ ls
hazard.tar.gz
matthew@tsk:~/hazard$ tar -tf hazard.tar.gz 
-bar/test\nnewline\\x2foo
matthew@tsk:~/hazard$ tar -xf hazard.tar.gz 
matthew@tsk:~/hazard$ ls -- -bar
'test'$'\n''newline\x2foo'
matthew@tsk:~/hazard$ rm -rf -- -bar
matthew@tsk:~/hazard$ tar -tf hazard.tar.gz > haz_quoted
matthew@tsk:~/hazard$ tar -tf hazard.tar.gz --quoting-style=literal > haz_unquoted
matthew@tsk:~/hazard$ cat -vet haz_quoted 
-bar/test\nnewline\\x2foo$
matthew@tsk:~/hazard$ cat -vet haz_unquoted 
-bar/test$
newline\x2foo$
matthew@tsk:~/hazard$ cat haz_quoted 
-bar/test\nnewline\\x2foo
matthew@tsk:~/hazard$ cat haz_unquoted 
-bar/test
newline\x2foo
matthew@tsk:~/hazard$ tar -xf hazard.tar.gz -T haz_quoted 
tar: haz_quoted:1: unrecognized option
tar: Exiting with failure status due to previous errors
matthew@tsk:~/hazard$ tar -xf hazard.tar.gz --verbatim-files-from -T haz_quoted 
tar: -bar/test\\nnewline\\\\x2foo: Not found in archive
tar: Exiting with failure status due to previous errors
matthew@tsk:~/hazard$ tar -xf hazard.tar.gz -T haz_unquoted 
tar: haz_unquoted:1: unrecognized option
tar: newline\\x2foo: Not found in archive
tar: Exiting with failure status due to previous errors
matthew@tsk:~/hazard$ tar -xf hazard.tar.gz --verbatim-files-from -T haz_unquoted
tar: -bar/test: Not found in archive
tar: newline\\x2foo: Not found in archive
tar: Exiting with failure status due to previous errors

Attachment: mangle.pl
Description: Perl program

Attachment: hazard.tar.gz
Description: application/gzip
