Hi again,

Richard W.M. Jones wrote:
> On Wed, Apr 06, 2016 at 06:01:31PM +0200, Jean-Pierre André wrote:
>> There are four bad file names in the system32 directory,
>> they have a similar name with a bad surrogate pair
>> followed by ".log". A fifth one has a similar name, but
>> it has a valid surrogate pair.
>>
>> They are small files (108 or 132 bytes) created at different
>> dates. One of them is recent, maybe the customer can
>> remember something specific being started on Monday
>> (mornings in the US).
>>
>> Here are the creation dates :
>>
>> Thu Feb 18 10:47:30 2016 UTC
>> Mon Aug 18 14:08:27 2014 UTC
>> Mon Oct  5 13:25:12 2015 UTC
>> Mon Jun  4 11:43:29 2012 UTC
>> Mon Aug 18 14:02:22 2014 UTC
>
> Thanks for your detailed analysis.  I will ask the reporter if they
> know anything about this.
>
>> As I said surrogate pairs are present, which make them
>> unlikely to have been created by Windows XP. The pairs
>> are :
>>
>> da5c dc93 (this is the valid one)
>> dc5c dc93
>> dd5c dc93
>> de24 dc93
>> de5c dc93
>
> So if I understand what's going on, surrogate pairs are not in general
> bad, but these particular ones are invalid (except the first) because
> the first word in the pair >= 0xdc00.

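Exactly. Surrogate pairs encode Unicode code points from
U+10000 upwards as two 16-bit words: the first word must
be in the range 0xd800-0xdbff and the second one in the
range 0xdc00-0xdfff.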
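
As an illustration only, here is a minimal Python sketch of
that check, applied to the five pairs quoted above. The
function name decode_surrogate_pair is made up for this
example, and this is not the code used by ntfs-3g.

```python
def decode_surrogate_pair(first, second):
    """Return the code point encoded by two UTF-16 words,
    or None if they do not form a valid surrogate pair."""
    is_high = 0xD800 <= first <= 0xDBFF    # leading (high) surrogate
    is_low = 0xDC00 <= second <= 0xDFFF    # trailing (low) surrogate
    if not (is_high and is_low):
        return None
    return 0x10000 + ((first - 0xD800) << 10) + (second - 0xDC00)


# The five pairs found in the bad file names
pairs = [
    (0xDA5C, 0xDC93),
    (0xDC5C, 0xDC93),
    (0xDD5C, 0xDC93),
    (0xDE24, 0xDC93),
    (0xDE5C, 0xDC93),
]

for first, second in pairs:
    code_point = decode_surrogate_pair(first, second)
    if code_point is None:
        status = "invalid"
    else:
        status = "valid, U+%X" % code_point
    print("%04x %04x: %s" % (first, second, status))
```

Only the first pair passes, the four others begin with a
word which is itself a trailing surrogate.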

>>> Plus, it'd be nice if ntfs-3g could ignore (or at least not give a
>>> hard error) in these cases.  It's actually the getdents(2) system call
>>> which fails, so any access at all to the directory returns -EILSEQ.
>>
>> This will mean (optional) cheating with the translations
>> so that bad Unicode characters can translate to utf8 and
>> back to bad Unicode.
>>
>>> We were trying to read a few files from \Windows\System32, it's most
>>> likely that the "corrupt" file is not a file that we care about.
>>
>> I can provide disk patches if you want to delete them.
>
> The problem is not this particular disk image.  The problem is that
> when we use virt-v2v to convert 1000s of Windows guests we don't want
> to hit this problem with some guest.  virt-v2v examines a few files in
> \Windows\system32, but when it hits a guest like this one it will die,
> even though the corrupt name has nothing to do with any file that
> virt-v2v cares about nor is trying to open.

What exactly do you need?

If you need to access some specific files, and to not
crash on reading directories, you can use Erik's proposal.

This is even enough if you use lowntfs-3g and you need
to read the files with bad names (because lowntfs-3g
requests are done by inode numbers instead of file names).

> I'll have a look at the code and see if there's a way to add a mount
> option to be less picky.

You need a reversible translation for renaming, deleting,
or linking files (even with lowntfs-3g).
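
To show what "reversible" means here, below is a rough
Python sketch which relies on the "surrogatepass" error
handler of the Python codecs. It only illustrates the idea
of a lossless round trip, it is not what ntfs-3g does or
would do. The helper names and the sample file name (a bad
pair followed by ".log") are assumptions made for the
example.

```python
import struct


def words_to_utf8(words):
    """Translate UTF-16 words to UTF-8-like bytes, letting
    unpaired surrogates through instead of failing."""
    raw = struct.pack("<%dH" % len(words), *words)
    text = raw.decode("utf-16-le", "surrogatepass")
    return text.encode("utf-8", "surrogatepass")


def utf8_to_words(data):
    """Inverse translation, back to the original UTF-16 words."""
    text = data.decode("utf-8", "surrogatepass")
    raw = text.encode("utf-16-le", "surrogatepass")
    return list(struct.unpack("<%dH" % (len(raw) // 2), raw))


# Hypothetical name shaped like the bad ones: a bad pair plus ".log"
bad_name = [0xDC5C, 0xDC93] + [ord(c) for c in ".log"]

# A strict translation rejects the name
try:
    struct.pack("<%dH" % len(bad_name), *bad_name).decode("utf-16-le")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

encoded = words_to_utf8(bad_name)
print("strict translation accepted the name:", strict_ok)
print("round trip is lossless:", utf8_to_words(encoded) == bad_name)
```

The bytes produced for the bad words are not valid UTF-8 in
the strict sense, which is the "cheating" mentioned earlier,
but the original name can be rebuilt from them exactly.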

Regards

Jean-Pierre


