Hi, Ben Kohler wrote: > I've found that for an iso > created with (cdrtools) mkisofs -J -iso-level 3, libcdio tools like > iso-info and iso-read are not able to handle multi-extent files.
It's a pair of bugs in function _iso9660_dir_to_statbuf() of libcdio source file lib/iso9660/iso9660_fs.c . ----------------------------------------------------------------------- Symptoms: I can reproduce this with a xorriso-made ISO too, which has Rock Ridge additionally to Joliet. (mkisofs with -R doesn't help either. genisoimage is safe because it does not obey -iso-level 3 and aborts already on image production.) A workaround for iso-info seems to be option --no-joliet man iso-read does not mention this option. A run of iso-info without --no-joliet complains: ++ WARN: Non consecutive multiextent file parts for '' ----------------------------------------------------------------------- Diagnosis: The message comes from _iso9660_dir_to_statbuf() in iso9660_fs.c. It is supposed to report a filename in the quotation marks. In order to get there, a re-used p_stat has to be submitted by the caller. The first bug is this: When iso9660_ifs_readdir() calls _iso9660_dir_to_statbuf() with the first directory entry of the multi-extent file it gets back an empty string as file name, because the conversion from UCS-2 to UTF-8 is missing if the ISO_MULTIEXTENT bit is set in the directory record. This is probably my fault due to the assumption that libcdio already deals with C strings at that point of processing. See https://git.savannah.gnu.org/cgit/libcdio.git/commit/?id=d758fa2253cca062e92ad0754a64c15a854db4ff "Use the plain ISO-9660 name when dealing with a multiextent file part" The second bug is similar: The test which leads to the warning message compares the C string which holds the previously read file name with the UCS-2 array which holds the name of the follow-up directory record. So even if the UTF-8 conversion missing in above bug is done, this test still detects that the string length does not match the array byte size and thus throws the warning and prevents the proper merging of both directory records. See https://git.savannah.gnu.org/cgit/libcdio.git/commit/?id=d758fa2253cca062e92ad0754a64c15a854db4ff "Only resolve the full filename when we're not dealing with extent" ----------------------------------------------------------------------- Code to test: I now have an ugly contraption by which iso-info recognizes a multi-extent file even if Joliet is present. The base is an outdated repo clone. I will have to get a fresh git state and try to find my cheat sheet which describes how to submit changes to libcdio. The current diff (minus all my debugging printfs) is: diff --git a/lib/iso9660/iso9660_fs.c b/lib/iso9660/iso9660_fs.c index be693c7..a84b9fc 100644 --- a/lib/iso9660/iso9660_fs.c +++ b/lib/iso9660/iso9660_fs.c @@ -866,8 +866,25 @@ _iso9660_dir_to_statbuf (iso9660_dir_t *p_iso9660_dir, if ((p_iso9660_dir->file_flags & ISO_MULTIEXTENT) == 0) { /* Check if this is the last part of a multiextent file */ if (!first_extent) { - if (strlen(p_stat->filename) != i_fname || - strncmp(p_stat->filename, &p_iso9660_dir->filename.str[1], i_fname) != 0) { + cdio_utf8_t *p_psz_out = NULL; + int bad_multi; + +#ifdef HAVE_JOLIET + if (u_joliet_level) { + int i_inlen = i_fname; + if (!cdio_charset_to_utf8(&p_iso9660_dir->filename.str[1], i_inlen, + &p_psz_out, "UCS-2BE")) { + goto fail; + } + } else +#endif /*HAVE_JOLIET*/ + { + p_psz_out = calloc(1, i_fname + 1); + strncpy (p_psz_out, &p_iso9660_dir->filename.str[1], i_fname); + } + bad_multi = (strcmp(p_stat->filename, p_psz_out) != 0); + free(p_psz_out); + if (bad_multi) { cdio_warn("Non consecutive multiextent file parts for '%s'", p_stat->filename); goto fail; @@ -916,7 +933,22 @@ _iso9660_dir_to_statbuf (iso9660_dir_t *p_iso9660_dir, } } else { /* Use the plain ISO-9660 name when dealing with a multiextent file part */ - strncpy(p_stat->filename, &p_iso9660_dir->filename.str[1], i_fname); +#ifdef HAVE_JOLIET + if (u_joliet_level) { + int i_inlen = i_fname; + cdio_utf8_t *p_psz_out = NULL; + if (cdio_charset_to_utf8(&p_iso9660_dir->filename.str[1], i_inlen, + &p_psz_out, "UCS-2BE")) { + strncpy(p_stat->filename, p_psz_out, i_fname); + free(p_psz_out); + } else { + goto fail; + } + } else +#endif /*HAVE_JOLIET*/ + { + strncpy (p_stat->filename, &p_iso9660_dir->filename.str[1], i_fname); + } } iso9660_get_dtime(&(p_iso9660_dir->recording_time), true, &(p_stat->tm)); -------------------------------------------------------------------------- @ Ben Kohler: You'd do me a big favor if you could test this with your use cases. @ All: I am unhappy with the code repetition in above patch (plus the original conversion code which is already _iso9660_dir_to_statbuf()). Abstraction proposals to unify the now three cdio_charset_to_utf8() occasions in _iso9660_dir_to_statbuf() would be welcome. Have a nice day :) Thomas