Hi again,

Richard W.M. Jones wrote:
> On Wed, Apr 06, 2016 at 06:01:31PM +0200, Jean-Pierre André wrote:
>> There are four bad file names in the system32 directory,
>> they have a similar name with a bad surrogate pair
>> followed by ".log". A fifth one has a similar name, but
>> it has a valid surrogate pair.
>>
>> They are small files (108 or 132 bytes) created at different
>> dates. One of them is recent, maybe the customer can
>> remember something specific being started on Monday
>> (mornings in the US).
>>
>> Here are the creation dates :
>>
>> Thu Feb 18 10:47:30 2016 UTC
>> Mon Aug 18 14:08:27 2014 UTC
>> Mon Oct 5 13:25:12 2015 UTC
>> Mon Jun 4 11:43:29 2012 UTC
>> Mon Aug 18 14:02:22 2014 UTC
>
> Thanks for your detailed analysis. I will ask the reporter if they
> know anything about this.
>
>> As I said surrogate pairs are present, which make them
>> unlikely to have been created by Windows XP. The pairs
>> are :
>>
>> da5c dc93 (this is the valid one)
>> dc5c dc93
>> dd5c dc93
>> de24 dc93
>> de5c dc93
>
> So if I understand what's going on, surrogate pairs are not in general
> bad, but these particular ones are invalid (except the first) because
> the first word in the pair >= 0xdc00.

Exactly. Surrogate pairs are used for encoding Unicode code
points from U+10000 upwards into two 16-bit words.

>>> Plus, it'd be nice if ntfs-3g could ignore (or at least not give a
>>> hard error) in these cases. It's actually the getdents(2) system call
>>> which fails, so any access at all to the directory returns -EILSEQ.
>>
>> This will mean (optional) cheating with the translations
>> so that bad Unicode characters can translate to utf8 and
>> back to bad Unicode.
>>
>>> We were trying to read a few files from \Windows\System32, it's most
>>> likely that the "corrupt" file is not a file that we care about.
>>
>> I can provide disk patches if you want to delete them.
>
> The problem is not this particular disk image. The problem is that
> when we use virt-v2v to convert 1000s of Windows guests we don't want
> to hit this problem with some guest. virt-v2v examines a few files in
> \Windows\system32, but when it hits a guest like this one it will die,
> even though the corrupt name has nothing to do with any file that
> virt-v2v cares about nor is trying to open.

What exactly do you need? If you only need to access some
specific files, and not to fail when reading directories,
you can use Erik's proposal. This is even enough if you use
lowntfs-3g and you need to read the files with bad names
(because lowntfs-3g requests are done by inode numbers
instead of file names).

> I'll have a look at the code and see if there's a way to add a mount
> option to be less picky.

You need a reversible translation for renaming, deleting,
or linking files (even with lowntfs-3g).

Regards

Jean-Pierre

------------------------------------------------------------------------------
_______________________________________________
ntfs-3g-devel mailing list
ntfs-3g-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ntfs-3g-devel
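
For illustration only, here is a minimal Python sketch of the two ideas discussed above: checking whether a pair of 16-bit words is a valid surrogate pair, and a reversible ("cheating") translation of bad names to utf8 and back. This is not ntfs-3g code. All function names are hypothetical, and Python's "surrogatepass" error handler is used as a stand-in for whatever lenient translation ntfs-3g might adopt.

```python
import struct

def is_high_surrogate(word):
    # First word of a valid pair: 0xD800..0xDBFF
    return 0xD800 <= word <= 0xDBFF

def is_low_surrogate(word):
    # Second word of a valid pair: 0xDC00..0xDFFF
    return 0xDC00 <= word <= 0xDFFF

def pair_is_valid(first, second):
    # A pair is valid only as (high surrogate, low surrogate).
    return is_high_surrogate(first) and is_low_surrogate(second)

def decode_pair(first, second):
    # Combine a valid pair into one code point at or above U+10000.
    return 0x10000 + ((first - 0xD800) << 10) + (second - 0xDC00)

# The five pairs reported in this thread.
PAIRS = [(0xDA5C, 0xDC93), (0xDC5C, 0xDC93), (0xDD5C, 0xDC93),
         (0xDE24, 0xDC93), (0xDE5C, 0xDC93)]

def utf16_words_to_lenient_utf8(words):
    # Assumption: lone surrogates are passed through as 3-byte
    # sequences instead of raising an error (Python "surrogatepass").
    raw = struct.pack("<%dH" % len(words), *words)
    text = raw.decode("utf-16-le", "surrogatepass")
    return text.encode("utf-8", "surrogatepass")

def lenient_utf8_to_utf16_words(data):
    # Inverse of the function above, so the original (bad) words
    # are recovered exactly.
    text = data.decode("utf-8", "surrogatepass")
    raw = text.encode("utf-16-le", "surrogatepass")
    return list(struct.unpack("<%dH" % (len(raw) // 2), raw))

if __name__ == "__main__":
    for first, second in PAIRS:
        status = "valid" if pair_is_valid(first, second) else "invalid"
        print("%04x %04x %s" % (first, second, status))

    # A bad name similar to the ones found: invalid pair plus ".log"
    bad_name = [0xDC5C, 0xDC93] + [ord(c) for c in ".log"]
    encoded = utf16_words_to_lenient_utf8(bad_name)
    assert lenient_utf8_to_utf16_words(encoded) == bad_name
```

The round trip matters because, as noted above, renaming, deleting, or linking needs to map the utf8 name back to exactly the same 16-bit words stored in the directory. The lenient output is deliberately not strict utf8, so a strict decoder would still reject it.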