Hey,

Thanks a lot. Worked like a charm.

On Mon, Jun 19, 2017 at 10:22 AM, Jean-Pierre André
<jean-pierre.an...@wanadoo.fr> wrote:
> Hi,
>
> The fix is available at :
> http://jp-andre.pagesperso-orange.fr/redo-vcn.zip
>
> Jean-Pierre
>
>
> Jean-Pierre André wrote:
>>
>> Hi,
>>
>> Gil Barash via ntfs-3g-devel wrote:
>>>
>>> Hey,
>>>
>>> Downloading the latest did the trick. Thanks.
>>>
>>> I agree that the data replacement operations are easier, but the
>>> bigger problems are when an attribute is added/deleted, because those
>>> operations shift the data for the following redo operations.
>>> I would love to hear the rationale behind this extremely
>>> complicated journal scheme from Microsoft, but that will probably
>>> never happen :-)
>>>
>>> I asked this because I am trying to debug a partition I created when
>>> crashing a Windows10 machine (you can find it here:
>>>
>>> https://s3-eu-west-1.amazonaws.com/gilbucket1/ntfs-disks/ntfs_shutdown_win10.raw).
>>>
>>>
>>> You can see that operation 239b94 fails the recovery process.
>>>
>>> The undo data of operation 239b94 is "0f" which differs from the "a0"
>>> found at 0x360. I think maybe the operation was targeting the "0f" at
>>> 0x3d8.
>>
>>
>> Yes, this was meant to replace "0f" by "1f" to record a
>> fifth cluster being added to the index.
>>
>>> Note that an earlier operation for the same inode (45), operation
>>> 229f2a (DeleteIndexEntryRoot), is not executed because the undo data
>>> doesn't match. It has a length of 0x78, which is exactly 0x3d8-0x360.
>>>
>>> I tried to trace back, to see why operation 229f2a finds the "wrong"
>>> data - perhaps an earlier operation also failed to run - but I didn't
>>> find anything definitive yet.
>>
>>
>> Yes. The offending earlier operation is 0x22705f which
>> puts the vcn at a wrong location. In redo_update_root_vcn()
>> there is some logic which remains to be understood.
>>
>> In the situations I met so far, I had to add 16 to the
>> offset in order to get correct behavior, but for this
>> specific action the correct value to add is 0x70.
>> When forcing this value, the full log can be processed
>> thus restoring the partition to a consistent state.
>>
>> Comparing two different situations leads to a possible
>> explanation : the attribute_offset (0x40) tells where
>> the index entry begins, and the vcn to insert is the last
>> field of the entry, so just add the length of entry (0x78)
>> minus 8.
>>
>> I need some time to check this theory (which probably also
>> applies to redo_update_vcn()).
>>
>> Jean-Pierre
>>
>>>
>>> Regards,
>>> Gil
>>>
>>> On Fri, Jun 16, 2017 at 10:59 AM, Jean-Pierre André
>>> <jean-pierre.an...@wanadoo.fr> wrote:
>>>>
>>>>
>>>> Hi,
>>>>
>>>> Update...
>>>>
>>>> Jean-Pierre André wrote:
>>>>>
>>>>>
>>>>> Gil Barash via ntfs-3g-devel wrote:
>>>>>
>>>>>> Hey,
>>>>>>
>>>>>> I have two follow up issues:
>>>>
>>>>
>>>>
>>>> [...]
>>>>
>>>>>>
>>>>>> --- 2 --
>>>>>> General question about redo process:
>>>>>> In change_resident_expect (called by redo_update_root_vcn, for
>>>>>> example) we chose not to apply the redo data if the current buffer
>>>>>> state doesn't match the undo data. Note that the operation is
>>>>>> considered successful in this case.
>>>>>
>>>>>
>>>>>
>>>>> I think the reasoning is as follows:
>>>>>
>>>>> - the old state was A
>>>>> - an update is made, the state is now B
>>>>> - a second update is made, the state is now C
>>>>> - C is synced to disk, but a failure occurs before
>>>>>     the syncing is recorded in the log.
>>>>>
>>>>> When restarting, both updates have to be replayed,
>>>>> and when applying the first one, the state on disk
>>>>> is not the undo state (which is A), so it is correct
>>>>> to apply the update (whose result will be overwritten
>>>>> by the second update).
>>>>>
>>>>> Avoiding the updates if the undo data does not match
>>>>> the state on disk probably leads to the same result,
>>>>> but IMHO it is safer to make sure the final state is
>>>>> what the redo data says. Other updates which overlap
>>>>> could be intertwined and they would be processed
>>>>> incorrectly when not applied to the same state as in
>>>>> the initial execution.
>>>>
>>>>
>>>>
>>>> Well, actually the current code is doing the opposite
>>>> (that is applying the update if the current state matches
>>>> the undo data).
>>>>
>>>> I have run my tests again after reversing the rule (so
>>>> applying the update if the current state does not match
>>>> the redo data), and the tests fail when the second
>>>> update destroys the attribute (e.g. deleting a file).
>>>> In this situation there is nothing the redo data can
>>>> be compared against (more exactly the current state is
>>>> meaningless and should not be compared to redo data).
>>>> There is not enough information to rebuild the intermediate
>>>> state, the only possible action is doing nothing, and
>>>> some criterion is needed to go this way.
>>>>
>>>> Jean-Pierre
>>>>
>>>>
>>>>>> Can you please provide some small explanation or direct me to some
>>>>>> form of documentation as to why it is OK to skip an operation.
>>>>>
>>>>>
>>>>>
>>>>> I would also be interested to get some form of
>>>>> documentation...
>>>>>
>>>>> Regards
>>>>>
>>>>> Jean-Pierre
>>>>>
>>>>>> I suspect it might cause some kind of "chain-reaction" where future
>>>>>> operations on this "cluster" would also be skipped because they expect
>>>>>> to see the data as it should have been after applying the redo
>>>>>> operation.
>>>>>>
>>>>>> Thanks,
>>>>>> Gil
>>
>>
>>
>
>

_______________________________________________
ntfs-3g-devel mailing list
ntfs-3g-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ntfs-3g-devel
