Hi,

Can you also post the md5 (or sha1, or ...) checksum of the big
file? The connection is frequently interrupted, and I
cannot rely on the downloaded file without a check.
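For instance, something along these lines on your side (the file
name below is only a placeholder) :

md5sum name-of-the-big-file

I can then run the same command on my downloaded copy; matching
hashes would confirm the transfer was complete.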

Jean-Pierre

Jelle de Jong wrote:
> Hi Jean-Pierre,
>
> Thank you!
>
> The reparse-tags.gz file:
> https://powermail.nu/nextcloud/index.php/s/fS6Y6bpzoMgPiZ0
>
> Generated by running: getfattr -e hex -n system.ntfs_reparse_data -R
> /mnt/sr7-sdb2/ 2> /dev/null | grep ntfs_reparse_data | gzip >
> /root/reparse-tags.gz
>
> Kind regards,
>
> Jelle de Jong
>
> On 23/02/17 12:07, Jean-Pierre André wrote:
>> Hi,
>>
>> Jelle de Jong wrote:
>>> Dear Jean-Pierre,
>>>
>>> I thought version 1.2.1 of the plug-in was working, so I took it further
>>> into production, but during backups with rdiff-backup and guestmount it
>>> created a 100% CPU load in the qemu processes, which stayed there for
>>> days until I killed them; I tested this twice. So I went back to using
>>> xpart and mount -t ntfs, and found more "Bad stream for offset" errors;
>>> the /sbin/mount.ntfs-3g command was running at 100% CPU load and hung
>>> there.
>>
>> Too bad.
>>
>>> I have added the whole Stream directory here: (1.1GB)
>>> https://powermail.nu/nextcloud/index.php/s/vbq85qZ2wcVYxrG
>>>
>>> Separate stream file: stream.data.full.000c0000.00020001.gz
>>> https://powermail.nu/nextcloud/index.php/s/QinV51XE4jrAH7a
>>>
>>> All the commands I used:
>>> http://paste.debian.net/plainh/c0ea5950
>>>
>>> I do not know how to get the reparse tags of all the files; maybe you
>>> can help me collect all the information you need.
>>
>> Just use option -R on the base directory :
>>
>> getfattr -e hex -n system.ntfs_reparse_data -R base-dir
>>
>> Notes :
>> 1) files with no reparse tags (those which are not deduplicated)
>> will throw an error
>> 2) this will output the file names, which you might not want
>> to disclose. Fortunately I do not need them for now.
>>
>> So you may append to the above command :
>>
>> 2> /dev/null | grep ntfs_reparse_data | gzip > reparse-tags.gz
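>> i.e., assembled into a single command (base-dir being the mount
>> point of the deduplicated partition) :
>>
>> getfattr -e hex -n system.ntfs_reparse_data -R base-dir \
>>     2> /dev/null | grep ntfs_reparse_data | gzip > reparse-tags.gz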
>>
>> With that, I will be able to build a configuration similar
>> to yours... apart from the files themselves.
>>
>> Regards
>>
>> Jean-Pierre
>>
>>>
>>> Thank you for your help!
>>>
>>> Kind regards,
>>>
>>> Jelle de Jong
>>>
>>> On 14/02/17 15:55, Jean-Pierre André wrote:
>>>> Hi,
>>>>
>>>> Jelle de Jong wrote:
>>>>> Hi Jean-Pierre,
>>>>>
>>>>> If we have to switch to Windows 2012 to get an environment
>>>>> similar to yours, then we can switch to another Windows version.
>>>>
>>>> I do not have any Windows Server, and my analysis
>>>> and tests are based on an unofficial deduplication
>>>> package which was adapted to Windows 10 Pro.
>>>>
>>>> A few months ago, following a bug report, I had to
>>>> make changes for Windows Server 2012, which uses an
>>>> older data format, and my only experience with this
>>>> format comes from that report. So switching to
>>>> Windows 2012 is not guaranteed to make debugging easier.
>>>>
>>>>> We are running out of disk space here, so if switching Windows versions
>>>>> makes the process of getting data deduplication working easier, then let
>>>>> me know.
>>>>
>>>> I have not yet analyzed your latest report, but it
>>>> would probably be useful if I could build a full copy of the
>>>> non-user data from your partition :
>>>> - the reparse tags of all your files,
>>>> - all the "*.ccc" files in the Stream directory
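>>>> For the second item, something along these lines may do (I am
>>>> assuming the Stream directory sits below the ChunkStore of your
>>>> partition; adjust the path to wherever it actually is) :
>>>>
>>>> find /mnt/sr7-sdb2 -path '*/ChunkStore/*/Stream/*.ccc' -print0 \
>>>>     | tar czf stream-ccc.tar.gz --null -T -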
>>>>
>>>> Do not do it now; I must first dig into the data you
>>>> posted.
>>>>
>>>> Regards
>>>>
>>>> Jean-Pierre
>>>>
>>>>
>>>>> Kind regards,
>>>>>
>>>>> Jelle de Jong
>>>>>
>>>>> On 09/02/17 13:46, Jelle de Jong wrote:
>>>>>> Hi Jean-Pierre,
>>>>>>
>>>>>> In case you are wondering:
>>>>>>
>>>>>> I am using data deduplication in Windows Server 2016; my test
>>>>>> environment iso:
>>>>>> SW_DVD9_Win_Svr_STD_Core_and_DataCtr_Core_2016_64Bit_English_-2_MLF_X21-22843.ISO
>>>>>>
>>>>>> Kind regards,
>>>>>>
>>>>>> Jelle de Jong
>>>>>>
>>>>>> On 09/02/17 11:41, Jean-Pierre André wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> Jelle de Jong wrote:
>>>>>>>> Hi Jean-Pierre,
>>>>>>>>
>>>>>>>> Thank you!
>>>>>>>>
>>>>>>>> The new plug-in seems to work for now; I am moving it into the
>>>>>>>> testing phase within our production back-up scripts.
>>>>>>>
>>>>>>> Please wait a few hours; I have found a bug which
>>>>>>> I have fixed. I am currently inserting your data
>>>>>>> into my test base in order to rerun all my tests.
>>>>>>>
>>>>>>>> Will you release the source code eventually? I would like to write
>>>>>>>> a blog post about how to add the support.
>>>>>>>
>>>>>>> What exactly do you mean ? If it is about how to
>>>>>>> collect the data in an unsupported condition, it is
>>>>>>> difficult, because unsupported generally means
>>>>>>> unknown territory...
>>>>>>>
>>>>>>>> What do you think the changes are of the plug-in stop working
>>>>>>>> again?
>>>>>>>
>>>>>>> (assuming a typo changes -> chances)
>>>>>>> Your files were in a condition not met before : data
>>>>>>> has been relocated according to a logic I do not fully
>>>>>>> understand. Maybe this is an intermediate step in the
>>>>>>> process of updating the files; anyway, this can happen.
>>>>>>>
>>>>>>> The situation I am facing is that I have a single
>>>>>>> example from which it is difficult to derive the rules.
>>>>>>> So yes, the plugin may stop working again.
>>>>>>>
>>>>>>> Note : there are strict consistency checks in the plugin,
>>>>>>> so it is unlikely that you would read invalid data. Moreover, if
>>>>>>> you only mount read-only you cannot damage the deduplicated
>>>>>>> partition.
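>>>>>>> For instance (the device name below is only a placeholder for
>>>>>>> whatever device or mapping you actually mount) :
>>>>>>>
>>>>>>> mount -t ntfs-3g -o ro /dev/XXX /mnt/sr7-sdb2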
>>>>>>>
>>>>>>>> We do not have an automatic test running to verify the back-ups at
>>>>>>>> this moment _yet_, so if the plug-in stops working, empty files will
>>>>>>>> slowly creep into the incremental file-based back-ups this way :|
>>>>>>>
>>>>>>> Usually a deduplicated partition is only used for backups,
>>>>>>> and reading from backups is only for recovering former
>>>>>>> versions of files (on demand).
>>>>>>>
>>>>>>> If you access deduplicated files with no human control,
>>>>>>> you have to insert your own checks in the process. I
>>>>>>> would at least check whether the size of the recovered
>>>>>>> file is the same as the deduplicated one (also grep for
>>>>>>> messages in the syslog).
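>>>>>>> A minimal sketch of such a check (the paths are placeholders,
>>>>>>> and the syslog location depends on your distribution) :
>>>>>>>
>>>>>>> src=$(stat -c %s /mnt/dedup/path/to/file)
>>>>>>> dst=$(stat -c %s /backup/path/to/file)
>>>>>>> [ "$src" = "$dst" ] || echo "size mismatch: path/to/file" >&2
>>>>>>> grep -i ntfs /var/log/syslog | tail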
>>>>>>>
>>>>>>> Regards
>>>>>>>
>>>>>>> Jean-Pierre
>>>>>>>
>>>>>>>> Again thank you for all your help so far!
>>>>>>>>
>>>>>>>> Kind regards,
>>>>>>>>
>>>>>>>> Jelle de Jong
>>>>>>>>
>>>>>>>>
>>>>>>>> On 08/02/17 15:59, Jean-Pierre André wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> Can you please make a try with :
>>>>>>>>> http://jp-andre.pagesperso-orange.fr/dedup120-beta.zip
>>>>>>>>>
>>>>>>>>> This is experimental and based on assumptions which have
>>>>>>>>> to be clarified, but it should work in your environment.
>>>>>>>>>
>>>>>>>>> Regards
>>>>>>>>>
>>>>>>>>> Jean-Pierre
>>
>>
>



_______________________________________________
ntfs-3g-devel mailing list
ntfs-3g-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ntfs-3g-devel
