Re: 2.01.01a>=40: multi.c

2008-07-24 Thread Andy Polyakov

Fix for multi-extent bug in multi.c has [at least] two bugs:


These two bugs were discovered by merely looking at the code. Actual 
initial testings revealed [at least] 3 more bugs, see below.


1. If directory entries constituting multi-extent file end up in 
different sectors read_merging_directory ends up in endless loop. 
Problematic loop is denoted in attached diff along with possible solution.


Coiuld you please explant your claim?


How are 2KB sectors in directory table filled with directory entries? If 
there is no place for next directory entry in current 2KB sector the 
latter gets padded with 0s and next directory entry is written to next 
2KB sector. Now all you have to imagine is sequence of directory entries 
constituting same multi-extent file being laid down in two adjacent 
sectors with 0 padding between them. Having this image in mind, how 
would following loop handle this padding?


while (i2 < len) {
idr2 = (struct iso_directory_record *) &dirbuff[i2];
...
i2 += idr2->length[0];
}

It would spin endlessly, wouldn't it? Well, actual loop had an extra 
termination criteria:


while (i2 < len) {
idr2 = (struct iso_directory_record *) &dirbuff[i2];
if ((idr2->flags[0]&ISO_MULTIEXTENT)==0) break;
i2 += idr2->length[0];
}

So that there are two possible scenarios:

1. idr2->flags[0] overlaps with 0 padding, therefore condition is met 
and loop gets terminated *prematurely*. This results in calculated 
s_entry->size being *less* than it should (run mkisofs -M under debugger 
and see for yourself).


2. If padding is small enough idr2->flags[0] reads byte of *some* data 
from next sector and if this byte happens to have ISO_MULTIEXTENT bit 
set, then the loop spins endlessly.


Well, my initial report was admittedly incomplete, I failed to mention 
case #1, but #2 is as real. Attached image (produced with mkisofs 
2.01.01a43) triggers this case. The image contains single 40GB large 
file with long name, so that this file's extents' directory entries span 
over two 2KB blocks.



The code you removed is needed and you did not provide a working replacement.


Yes, I did! What is the purpose of the loop in question? To calculate 
size field in "super" directory entry preceding actual extent entries, 
isn't it? In next "if" I added following line:


(*pnt)->mxroot->size += (*pnt)->size;

which takes proper care of this very field without any extra loop.

2. Given that mkisofs reserves for file system layout generated by 
another iso9660 formatter, alternative partitioning of file to multiple 
extents (for example larger amount of smaller extents) for now would 
result in effective data corruption. Point is that it's not enough to 



This is a claim that would need a prove...


Given last experience I can't even imagine that it would be possible for 
me to provide "proof" you would accept. You just have to experience it 
yourself:


1. mkdir test
   touch test/file
   perl -e 'truncate("test/file",8*1024*1024)'

   this gives you test directory with 8MB file;

2. modify 2.01.01a43 mkisofs/tree.c and redefine LARGE_EXTENT and 
MAX_EXTENT to say 0x10, 1MB, recompile;


3. mkisofs -R -iso-level 3 test > test.iso

   this gives you an image with 8 extents generated by "alternative 
iso9660 formatter."


4. burn in to CD;

5. modify mkisofs/tree.c and redefine LARGE_EXTENT and MAX_EXTENT to say 
0x20, 2MB, recompile;


6. collect multi-session parameters from cdrecord and simply "refresh" 
directory structure:


   mkisofs -C xx,yy -M xxx -R -iso-level 3 test > test2.iso

7. burn test2.iso as second session, mount it and tell us what you see, 
run 'wc /cdrom/test/file' and tell us what is the result.



Now to further bugs.

Bug #3. If previous session has multi-extent file[s] 
read_merging_directory effectively omits last directory entry[-ies] in 
corresponding subdirectory, which naturally results in file[s] 
disappearing from multi-session recordings. This is because of 
following. Beside pointer to directory_entry* array procedure in 
question returns number of entries in this array in *nentp. Now amount 
of allocated and filled entries is nent+nmult, while value returned in 
nentp is nent alone. This means that if for example there is 
multi-extent file in current directory followed by single-extent one, 
latter disappears from multi-session recording, because consumer of 
returned array would never process it. One should return nent+nmult, see 
reference to line #591 in attached diff.


Bug #4. check_prev_session cleans up only first extent on multi-extent 
file. As result remaining extents are attempted to be merged, which in 
turn results in failure to sort merged directory. This happens because 
loop index in clean-up is never incremented, see reference to line #1099 
in attached diff.


Bug #5. Whenever there is an iso-name clash with previous sess

Re: 2.01.01a>=40: multi.c

2008-07-23 Thread Joerg Schilling
Andy Polyakov <[EMAIL PROTECTED]> wrote:

> Fix for multi-extent bug in multi.c has [at least] two bugs:
>
> 1. If directory entries constituting multi-extent file end up in 
> different sectors read_merging_directory ends up in endless loop. 
> Problematic loop is denoted in attached diff along with possible solution.

Coiuld you please explant your claim?

The code you removed is needed and you did not provide a working replacement.

> 2. Given that mkisofs reserves for file system layout generated by 
> another iso9660 formatter, alternative partitioning of file to multiple 
> extents (for example larger amount of smaller extents) for now would 
> result in effective data corruption. Point is that it's not enough to 


This is a claim that would need a prove...

> copy extents' isorec.extent fields alone, one has to copy at least 
> isorec.size fields as well. As for "at least." *Formally* (i.e. knowing 
> nothing about alternative iso9660 formatting and assuming "worst") one 
> should copy even isorec.ext_attr_length, [some bits from] flags, 
> file_unit_size and interleave. Attached diff denotes problematic code by 
> simply copying all of above mentioned fields in single memcpy (along 
> with isorec.date field, which is checked to be same as in curr_entry, so 
> copying should not cause side effects).

As your patch does not handle the file correctly anymore, we would need to 
discuss it. 

> Couple of comments not related to above file-system formatting bugs.
>
> There is inconsistency in directory_entry.mxpart declaration and its 
> usage. Declaration is "guarded" by #ifdef USE_LARGEFILES, while its 
> usage is not. Formally multi-extent files don't have to be "large," i.e. 
> "small" files are allowed to be multi-extent too (actually original 
> application for multi-extent was *CD* packet writing). Therefore it, 
> multi-extent support, should not be associated with USE_LARGEFILES and 
> mxpart declaration should not be depend on corresponding macro.

The problem with multi extent files in ISO-9660 is that it allows you to any 
kind of nonsense.

> In check_prev_session one can see free(ptr[i]); and in copy_mult_ext one 
> can see free(sex);. These are pointers to directory_entry structure, 
> which in turn can contain pointers to other dynamically allocated 
> objects, most notably rr_attributes and name. Freeing an object without 
> freeing other dynamically allocated objects it alone refers to used to 
> constitute memory leaks.
>
> As for copy_mult_extent. As mentioned above it copies pieces of 
> information from original extent records. As alternative to this field 
> copying would it be appropriate to consider simply using original extent 
> records, i.e. removing them from orig_contents array and linking them 
> into this_dir? A.
>
> P.S. As usual I make no claims on correctness of provided patch, it 
> serves primarily educational purposes.

I am currently at OSCON and don't have the time for a closer look.
Let us continue next week.

Jörg

-- 
 EMail:[EMAIL PROTECTED] (home) Jörg Schilling D-13353 Berlin
   [EMAIL PROTECTED](uni)  
   [EMAIL PROTECTED] (work) Blog: http://schily.blogspot.com/
 URL:  http://cdrecord.berlios.de/old/private/ ftp://ftp.berlios.de/pub/schily


-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED]
with a subject of "unsubscribe". Trouble? Contact [EMAIL PROTECTED]



2.01.01a>=40: multi.c

2008-07-23 Thread Andy Polyakov

Fix for multi-extent bug in multi.c has [at least] two bugs:

1. If directory entries constituting multi-extent file end up in 
different sectors read_merging_directory ends up in endless loop. 
Problematic loop is denoted in attached diff along with possible solution.


2. Given that mkisofs reserves for file system layout generated by 
another iso9660 formatter, alternative partitioning of file to multiple 
extents (for example larger amount of smaller extents) for now would 
result in effective data corruption. Point is that it's not enough to 
copy extents' isorec.extent fields alone, one has to copy at least 
isorec.size fields as well. As for "at least." *Formally* (i.e. knowing 
nothing about alternative iso9660 formatting and assuming "worst") one 
should copy even isorec.ext_attr_length, [some bits from] flags, 
file_unit_size and interleave. Attached diff denotes problematic code by 
simply copying all of above mentioned fields in single memcpy (along 
with isorec.date field, which is checked to be same as in curr_entry, so 
copying should not cause side effects).



Couple of comments not related to above file-system formatting bugs.

There is inconsistency in directory_entry.mxpart declaration and its 
usage. Declaration is "guarded" by #ifdef USE_LARGEFILES, while its 
usage is not. Formally multi-extent files don't have to be "large," i.e. 
"small" files are allowed to be multi-extent too (actually original 
application for multi-extent was *CD* packet writing). Therefore it, 
multi-extent support, should not be associated with USE_LARGEFILES and 
mxpart declaration should not be depend on corresponding macro.


In check_prev_session one can see free(ptr[i]); and in copy_mult_ext one 
can see free(sex);. These are pointers to directory_entry structure, 
which in turn can contain pointers to other dynamically allocated 
objects, most notably rr_attributes and name. Freeing an object without 
freeing other dynamically allocated objects it alone refers to used to 
constitute memory leaks.


As for copy_mult_extent. As mentioned above it copies pieces of 
information from original extent records. As alternative to this field 
copying would it be appropriate to consider simply using original extent 
records, i.e. removing them from orig_contents array and linking them 
into this_dir? A.


P.S. As usual I make no claims on correctness of provided patch, it 
serves primarily educational purposes.
--- ./multi.c.orig	2008-06-13 22:40:10.0 +0200
+++ ./multi.c	2008-07-23 10:42:42.0 +0200
@@ -761,26 +761,11 @@
 		if ((mx & ISO_MULTIEXTENT) == 0 &&
 		(idr->flags[0] & ISO_MULTIEXTENT) != 0) {
 			struct directory_entry		*s_entry;
-			struct iso_directory_record	*idr2 = idr;
-			inti2 = i;
-			off_ttsize = 0;
-
-			/*
-			 * Sum up the total file size for the multi extent file
-			 */
-			while (i2 < len) {
-idr2 = (struct iso_directory_record *) &dirbuff[i2];
-
-tsize += get_733(idr2->size);
-if ((idr2->flags[0] & ISO_MULTIEXTENT) == 0)
-	break;
-i2 += idr2->length[0];
-			}
 
 			s_entry = dup_directory_entry(*pnt);	/* dup first for mxroot */
 			s_entry->de_flags |= MULTI_EXTENT;
 			s_entry->de_flags |= INHIBIT_ISO9660_ENTRY|INHIBIT_JOLIET_ENTRY;
-			s_entry->size = tsize;
+			s_entry->size = 0;	/* size is calculated below */
 			s_entry->starting_block = (*pnt)->starting_block;
 			s_entry->mxroot = s_entry;
 			s_entry->mxpart = 0;
@@ -796,6 +781,7 @@
 			(pnt[-1])->next = *pnt;
 			(*pnt)->mxroot = (pnt[-1])->mxroot;
 			(*pnt)->mxpart = (pnt[-1])->mxpart + 1;
+			(*pnt)->mxroot->size += (*pnt)->size;	/* accumulate mxroot's size */
 		}
 		pnt++;
 		mx = idr->flags[0];
@@ -1184,7 +1170,16 @@
 			se1->next = sex;
 			len1++;
 		}
-		memcpy(se1->isorec.extent, se2->isorec.extent, 8);
+		/* 27 bytes memcpy covers following fields:
+			ext_attr_length,
+			extent,
+			size,
+			date,
+			flags, 
+			file_unit_size,
+			interleave
+		*/
+		memcpy(se1->isorec.ext_attr_length,se2->isorec.ext_attr_length,27);
 		se1->starting_block = get_733(se2->isorec.extent);
 		se1->de_flags |= SAFE_TO_REUSE_TABLE_ENTRY;
 		se1->de_flags |= MULTI_EXTENT;
@@ -1195,7 +1190,7 @@
 		se1 = se1->next;
 		se2 = se2->next;
 	}
-	memcpy(se1->isorec.extent, se2->isorec.extent, 8);
+	memcpy(se1->isorec.ext_attr_length,se2->isorec.ext_attr_length,27);
 	se1->starting_block = get_733(se2->isorec.extent);
 	se1->de_flags |= SAFE_TO_REUSE_TABLE_ENTRY;
 	se1->isorec.flags[0] &= ~ISO_MULTIEXTENT;	/* Last entry */