[ Repeating, forgot to cc to the list ]

Jean-Pierre André wrote:
> Hi,
>
> There was a bug in the index location, which in bad conditions
> could lead to an endless loop in the indexed search. So I have
> fixed the bug, and protected against a corrupted index leading
> to a similar loop.
>
> With the posted data, I can access the first byte of 865,675
> dummy files similar to yours... Of course they are not your
> actual files and there is still room for problems.
>
> Could you try :
>
> http://jp-andre.pagesperso-orange.fr/dedup122-beta.zip
>
> Regards
>
> Jean-Pierre
>
> Jelle de Jong wrote:
>> Hi Jean-Pierre,
>>
>> # output: md5sum *.gz
>> https://powermail.nu/nextcloud/index.php/s/jxler2rZOqBdpr2
>>
>> 879499f9187b0f590ae92460f4949dfd  stream.data.full.dir.tar.gz
>> a8fc902613486e332898f92aba26c61f  reparse-tags.gz
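
The two checksum lines above can be used to check a download that may have been interrupted. A minimal sketch, assuming GNU coreutils md5sum is available; the helper name verify_md5 is hypothetical and does not come from the thread:

```shell
#!/bin/sh
# verify_md5 EXPECTED FILE
# Succeeds only if the MD5 digest of FILE equals EXPECTED.
# (Hypothetical helper, shown for illustration only.)
verify_md5() {
    expected=$1
    file=$2
    # Reading from stdin makes md5sum print "<digest>  -", so the
    # first space-separated field is always the bare digest.
    actual=$(md5sum < "$file" | cut -d' ' -f1)
    [ "$actual" = "$expected" ]
}

# Example call, using a digest posted above:
# verify_md5 879499f9187b0f590ae92460f4949dfd stream.data.full.dir.tar.gz \
#     && echo "download looks complete"
```

A mismatch only shows that the local copy differs from the posted one; it does not say where the transfer broke.
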
>>
>> Kind regards,
>>
>> Jelle de Jong
>>
>> On 23/02/17 14:24, Jean-Pierre André wrote:
>>> Hi,
>>>
>>> Can you also post the md5 (or sha1, or ...) of the big
>>> file? The connection is frequently interrupted, and I
>>> cannot rely on the downloaded file without a check.
>>>
>>> Jean-Pierre
>>>
>>> Jelle de Jong wrote:
>>>> Hi Jean-Pierre,
>>>>
>>>> Thank you!
>>>>
>>>> The reparse-tags.gz file:
>>>> https://powermail.nu/nextcloud/index.php/s/fS6Y6bpzoMgPiZ0
>>>>
>>>> Generated by running: getfattr -e hex -n system.ntfs_reparse_data -R
>>>> /mnt/sr7-sdb2/ 2> /dev/null | grep ntfs_reparse_data | gzip >
>>>> /root/reparse-tags.gz
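
The file produced by the command above can be summarized before uploading it. A rough sketch, assuming each kept line looks like "system.ntfs_reparse_data=0x..." and that the first four bytes of the value are the reparse tag in little-endian order (so a leading 13000080 would be tag 0x80000013, which Microsoft documents as the deduplication tag). The function name count_reparse_tags is hypothetical:

```shell
#!/bin/sh
# count_reparse_tags FILE.gz
# Prints one line per distinct reparse tag: "<tag> <number of files>".
# Assumption: the first 4 bytes of each hex value are the tag, stored
# little-endian, so the byte pairs are reversed to get the usual notation.
count_reparse_tags() {
    gzip -dc "$1" |
        sed -n 's/^system\.ntfs_reparse_data=0x\(..\)\(..\)\(..\)\(..\).*/0x\4\3\2\1/p' |
        LC_ALL=C sort |
        uniq -c |
        awk '{ print $2, $1 }'
}

# Example call:
# count_reparse_tags /root/reparse-tags.gz
```

Lines that do not match the expected pattern are ignored, so the counts only cover entries that carry reparse data.
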
>>>>
>>>> Kind regards,
>>>>
>>>> Jelle de Jong
>>>>
>>>> On 23/02/17 12:07, Jean-Pierre André wrote:
>>>>> Hi,
>>>>>
>>>>> Jelle de Jong wrote:
>>>>>> Dear Jean-Pierre,
>>>>>>
>>>>>> I thought version 1.2.1 of the plug-in was working, so I took it
>>>>>> further into production. But during backups with rdiff-backup and
>>>>>> guestmount it created a 100% CPU load in the qemu process that
>>>>>> stayed there for days until I killed it. I tested this twice. So I
>>>>>> went back to an xpart/mount -t ntfs command and found more "Bad
>>>>>> stream for offset" messages, and found that the
>>>>>> /sbin/mount.ntfs-3g command was running at 100% CPU load and hung
>>>>>> there.
>>>>>
>>>>> Too bad.
>>>>>
>>>>>> I have added the whole Stream directory here: (1.1GB)
>>>>>> https://powermail.nu/nextcloud/index.php/s/vbq85qZ2wcVYxrG
>>>>>>
>>>>>> Separate stream file: stream.data.full.000c0000.00020001.gz
>>>>>> https://powermail.nu/nextcloud/index.php/s/QinV51XE4jrAH7a
>>>>>>
>>>>>> All the commands I used:
>>>>>> http://paste.debian.net/plainh/c0ea5950
>>>>>>
>>>>>> I do not know how to get the reparse tags of all the files, maybe you
>>>>>> can help me how to get all the information you need.
>>>>>
>>>>> Just use option -R on the base directory :
>>>>>
>>>>> getfattr -e hex -n system.ntfs_reparse_data -R base-dir
>>>>>
>>>>> Notes :
>>>>> 1) files with no reparse tags (those which are not deduplicated)
>>>>> will throw an error
>>>>> 2) this will output the file names, which you might not want
>>>>> to disclose. Fortunately I do not need them for now.
>>>>>
>>>>> So you may append to the above command :
>>>>>
>>>>> 2> /dev/null | grep ntfs_reparse_data | gzip > reparse-tags.gz
>>>>>
>>>>> With that, I will be able to build a configuration similar
>>>>> to yours... apart from the files themselves.
>>>>>
>>>>> Regards
>>>>>
>>>>> Jean-Pierre
>>>>>
>>>>>>
>>>>>> Thank you for your help!
>>>>>>
>>>>>> Kind regards,
>>>>>>
>>>>>> Jelle de Jong
>>>>>>
>>>>>> On 14/02/17 15:55, Jean-Pierre André wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> Jelle de Jong wrote:
>>>>>>>> Hi Jean-Pierre,
>>>>>>>>
>>>>>>>> If we have to switch to Windows 2012 to get an environment
>>>>>>>> similar to yours, then we can switch to another Windows version.
>>>>>>>
>>>>>>> I do not have any Windows Server, and my analysis
>>>>>>> and tests are based on an unofficial deduplication
>>>>>>> package which was adapted to Windows 10 Pro.
>>>>>>>
>>>>>>> A few months ago, following a bug report, I had to
>>>>>>> make changes for Windows Server 2012 which uses an
>>>>>>> older data format, and my only experience about this
>>>>>>> format is related to this report. So switching to
>>>>>>> Windows 2012 is not guaranteed to make debugging easier.
>>>>>>>
>>>>>>>> We are running out of disk space here, so if switching Windows
>>>>>>>> versions makes it easier to get data deduplication working, then
>>>>>>>> let me know.
>>>>>>>
>>>>>>> I have not yet analyzed your latest report, but it
>>>>>>> would probably be useful for me to build a full copy of
>>>>>>> the non-user data from your partition :
>>>>>>> - the reparse tags of all your files,
>>>>>>> - all the "*.ccc" files in the Stream directory
>>>>>>>
>>>>>>> Do not do it now, I must first dig into the data you
>>>>>>> posted.
>>>>>>>
>>>>>>> Regards
>>>>>>>
>>>>>>> Jean-Pierre
>>>>>>>
>>>>>>>
>>>>>>>> Kind regards,
>>>>>>>>
>>>>>>>> Jelle de Jong
>>>>>>>>
>>>>>>>> On 09/02/17 13:46, Jelle de Jong wrote:
>>>>>>>>> Hi Jean-Pierre,
>>>>>>>>>
>>>>>>>>> In case you are wondering:
>>>>>>>>>
>>>>>>>>> I am using data deduplication in Windows 2016 for my test
>>>>>>>>> environment, ISO:
>>>>>>>>> SW_DVD9_Win_Svr_STD_Core_and_DataCtr_Core_2016_64Bit_English_-2_MLF_X21-22843.ISO
>>>>>>>>>
>>>>>>>>> Kind regards,
>>>>>>>>>
>>>>>>>>> Jelle de Jong
>>>>>>>>>
>>>>>>>>> On 09/02/17 11:41, Jean-Pierre André wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> Jelle de Jong wrote:
>>>>>>>>>>> Hi Jean-Pierre,
>>>>>>>>>>>
>>>>>>>>>>> Thank you!
>>>>>>>>>>>
>>>>>>>>>>> The new plug-in seems to work for now. I am moving it into the
>>>>>>>>>>> testing phase within our production back-up scripts.
>>>>>>>>>>
>>>>>>>>>> Please wait a few hours, I have found a bug which
>>>>>>>>>> I have fixed. I am currently inserting your data
>>>>>>>>>> into my test base in order to rerun all my tests.
>>>>>>>>>>
>>>>>>>>>>> Will you release the source code eventually? I would like to
>>>>>>>>>>> write a blog post about how to add the support.
>>>>>>>>>>
>>>>>>>>>> What exactly do you mean ? If it is about how to
>>>>>>>>>> collect the data in an unsupported condition, it is
>>>>>>>>>> difficult, because unsupported generally means
>>>>>>>>>> unknown territory...
>>>>>>>>>>
>>>>>>>>>>> What do you think the changes are of the plug-in stop working
>>>>>>>>>>> again?
>>>>>>>>>>
>>>>>>>>>> (assuming a typo changes -> chances)
>>>>>>>>>> Your files were in a condition not met before : data
>>>>>>>>>> has been relocated according to a logic I do not fully
>>>>>>>>>> understand. Maybe this is an intermediate step in the
>>>>>>>>>> process of updating the files, anyway this can happen.
>>>>>>>>>>
>>>>>>>>>> The situation I am facing is that I have a single
>>>>>>>>>> example from which it is difficult to derive the rules.
>>>>>>>>>> So yes, the plugin may stop working again.
>>>>>>>>>>
>>>>>>>>>> Note : there are strict consistency checks in the plugin,
>>>>>>>>>> so it is unlikely you read invalid data. Moreover if
>>>>>>>>>> you only mount read-only you cannot damage the deduplicated
>>>>>>>>>> partition.
>>>>>>>>>>
>>>>>>>>>>> We do not have an automatic test running to verify the
>>>>>>>>>>> back-ups at this moment _yet_, so if the plug-in stops
>>>>>>>>>>> working, incremental file-based back-ups with empty files
>>>>>>>>>>> will slowly get into the back-ups this way :|
>>>>>>>>>>
>>>>>>>>>> Usually a deduplicated partition is only used for backups,
>>>>>>>>>> and reading from backups is only for recovering former
>>>>>>>>>> versions of files (on demand).
>>>>>>>>>>
>>>>>>>>>> If you access deduplicated files with no human control,
>>>>>>>>>> you have to insert your own checks in the process. I
>>>>>>>>>> would at least check whether the size of the recovered
>>>>>>>>>> file is the same as the deduplicated one (also grep for
>>>>>>>>>> messages in the syslog).
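
The two checks suggested above could be scripted roughly as follows. This is only a sketch under stated assumptions: it relies on GNU stat (the -c %s option), the function names same_size and count_bad_streams are hypothetical, and the log file path in the example is a guess that depends on the distribution. The search string is the "Bad stream for offset" message reported elsewhere in this thread.

```shell
#!/bin/sh
# same_size SRC COPY
# Succeeds only if both files exist and report the same size in bytes.
# Assumes GNU stat; other platforms use a different option syntax.
same_size() {
    a=$(stat -c %s "$1") || return 1
    b=$(stat -c %s "$2") || return 1
    [ "$a" = "$b" ]
}

# count_bad_streams LOGFILE
# Prints how many lines of LOGFILE mention the plug-in error message.
count_bad_streams() {
    grep -c 'Bad stream for offset' "$1"
}

# Example calls (paths are placeholders, not taken from the thread):
# same_size /mnt/ntfs/some/file /backup/some/file || echo "size mismatch"
# count_bad_streams /var/log/syslog
```

A size comparison catches the empty-file case, but it does not prove that the content is correct; comparing checksums against a known good copy would be stronger where one exists.
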
>>>>>>>>>>
>>>>>>>>>> Regards
>>>>>>>>>>
>>>>>>>>>> Jean-Pierre
>>>>>>>>>>
>>>>>>>>>>> Again thank you for all your help so far!
>>>>>>>>>>>
>>>>>>>>>>> Kind regards,
>>>>>>>>>>>
>>>>>>>>>>> Jelle de Jong
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 08/02/17 15:59, Jean-Pierre André wrote:
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> Can you please make a try with :
>>>>>>>>>>>> http://jp-andre.pagesperso-orange.fr/dedup120-beta.zip
>>>>>>>>>>>>
>>>>>>>>>>>> This is experimental and based on assumptions which have
>>>>>>>>>>>> to be clarified, but it should work in your environment.
>>>>>>>>>>>>
>>>>>>>>>>>> Regards
>>>>>>>>>>>>
>>>>>>>>>>>> Jean-Pierre



_______________________________________________
ntfs-3g-devel mailing list
ntfs-3g-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ntfs-3g-devel
