Hi,

I'm not a NetBSD expert, but I had reason to look into its cd9660 code while developing two PRs (48808, 48959).

Robert Elz:
> At this point it looks to be some kind of setup problem on amd64, when it
> reads nested directories, building some data struct that resuts in EINVAL
> from the copyout

I did not see any architecture-specific code in cd9660.

Given that you located the origin of EINVAL outside of cd9660 (or rather underneath it), I expect that ISO 9660 structure aspects are only indirectly involved. In particular, the difference between i386 and amd64 can hardly be explained by ISO 9660 aspects.

The connection between ISO 9660 and data file content blocks is made by cd9660_bmap() as the implementation of VOP_BMAP(9).

> diff -r of the mount points of the real DVD (/cdrom) and the
> mounted vnd0a (/mnt) - that completed without error.
> What's more, after that, tar had no problem reading the DVD either!

It is unlikely that the DVD delivers varying data blocks from the ISO 9660 directory files, which would lure cd9660_bmap() into requesting invalid data block addresses for file content.

My favorite suspect would now be the code underneath bread(9), and especially the mechanism which looks up the buffer and decides whether a physical read operation on the DVD is needed.

> If anyone has any suggestiions for
> possible sources, I'll happily mangle my kernel

How about a wrapper around bread(9) which checks for error returns of bread(9) and, if there is one, prints a message which tells the failed block number? Then use that wrapper instead of all the bread(9) calls in cd9660. (This should be possible with a few vi commands on the few cd9660*.c files.)

This would make clear whether the problem is indeed underneath bread(9) and whether it gets an implausible block number from cd9660.

If the block number of a failure is within the size range of the ISO image, then its content could be extracted with dd and made human readable with od or the like. Directory records show some characteristic redundancy, so there is hope that we can tell whether it is metadata or data file content.
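
To make the "characteristic redundancy" a bit more concrete, here is a minimal userland sketch in C. It assumes that the suspect block was already extracted (e.g. with dd) into a 2048 byte buffer. It relies on the ISO 9660 (ECMA-119) rule that a directory record stores the extent location and the data length each in both byte orders, little-endian first. The function names are made up for this illustration and are not taken from the cd9660 sources.

```c
#include <stddef.h>
#include <stdint.h>

#define ISO_BLOCK_SIZE 2048   /* usual ISO 9660 logical block size */
#define ISO_MIN_RECORD 34     /* 33 fixed bytes plus at least 1 name byte */

/* Decode a 32-bit little-endian number. */
uint32_t le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode a 32-bit big-endian number. */
uint32_t be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Hypothetical helper: return 1 if the bytes at rec could be an
 * ISO 9660 directory record, 0 if not. avail is the number of bytes
 * left in the block, counted from rec.
 */
int looks_like_dir_record(const unsigned char *rec, size_t avail)
{
    unsigned int len, name_len;

    if (avail < ISO_MIN_RECORD)
        return 0;
    len = rec[0];        /* byte 0: length of the directory record */
    name_len = rec[32];  /* byte 32: length of the file identifier */
    if (len < ISO_MIN_RECORD || len > avail)
        return 0;
    if (name_len == 0 || 33 + name_len > len)
        return 0;
    /* Bytes 2-9: extent location, little-endian then big-endian. */
    if (le32(rec + 2) != be32(rec + 6))
        return 0;
    /* Bytes 10-17: data length, little-endian then big-endian. */
    if (le32(rec + 10) != be32(rec + 14))
        return 0;
    return 1;
}

/*
 * Hypothetical helper: walk the block as a chain of directory records.
 * Returns the number of records found before the zero padding, or 0 if
 * the block is empty or any record in the chain is implausible.
 */
int count_dir_records(const unsigned char *block)
{
    size_t off = 0;
    int n = 0;

    while (off + ISO_MIN_RECORD <= ISO_BLOCK_SIZE && block[off] != 0) {
        if (!looks_like_dir_record(block + off, ISO_BLOCK_SIZE - off))
            return 0;
        off += block[off];
        n++;
    }
    return n;
}
```

A positive count is a strong hint that the block belongs to a directory file. A result of 0 is inconclusive rather than proof of data file content, because an all-zero block or a damaged directory block would yield 0, too.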

If the content is inconclusive, then I'd need the whole ISO image in order to tell what the affected block is supposed to be.

As an alternative to inspecting the ISO image, or if the block number is implausible, you could equip the wrapper and its calls with a string parameter which identifies the wrapper's caller. That way we could make a connection to particular cd9660 operations and see whether it is always the same cd9660 operation that fails.

> ps: I can make the 2.1GiB .iso image available,

If you have an image which never shows problems and one that reliably shows problems (e.g. on the first tar), then we could look at the block addresses of the data files and directory files which are involved. As said above, it would be interesting to look up the blocks of failing bread(9) calls.

Have a nice day :)

Thomas
