[ Repeating, forgot to cc to the list ]

Jean-Pierre André wrote:
> Hi,
>
> There was a bug in the index location, which in bad conditions
> could lead to an endless loop in the indexed search. So I have
> fixed the bug, and protected against a corrupted index leading
> to a similar loop.
>
> With the posted data, I can access the first byte of 865,675
> dummy files similar to yours... Of course they are not your
> actual files and there is still room for problems.
>
> Could you try :
>
> http://jp-andre.pagesperso-orange.fr/dedup122-beta.zip
>
> Regards
>
> Jean-Pierre
>
> Jelle de Jong wrote:
>> Hi Jean-Pierre,
>>
>> # output: md5sum *.gz
>> https://powermail.nu/nextcloud/index.php/s/jxler2rZOqBdpr2
>>
>> 879499f9187b0f590ae92460f4949dfd stream.data.full.dir.tar.gz
>> a8fc902613486e332898f92aba26c61f reparse-tags.gz
>>
>> Kind regards,
>>
>> Jelle de Jong
>>
>> On 23/02/17 14:24, Jean-Pierre André wrote:
>>> Hi,
>>>
>>> Can you also post the md5 (or sha1, or ...) of the big
>>> file. The connection is frequently interrupted, and I
>>> cannot rely on the downloaded file without a check.
>>>
>>> Jean-Pierre
>>>
>>> Jelle de Jong wrote:
>>>> Hi Jean-Pierre,
>>>>
>>>> Thank you!
>>>>
>>>> The reparse-tags.gz file:
>>>> https://powermail.nu/nextcloud/index.php/s/fS6Y6bpzoMgPiZ0
>>>>
>>>> Generated by running:
>>>> getfattr -e hex -n system.ntfs_reparse_data -R /mnt/sr7-sdb2/ 2> /dev/null | grep ntfs_reparse_data | gzip > /root/reparse-tags.gz
>>>>
>>>> Kind regards,
>>>>
>>>> Jelle de Jong
>>>>
>>>> On 23/02/17 12:07, Jean-Pierre André wrote:
>>>>> Hi,
>>>>>
>>>>> Jelle de Jong wrote:
>>>>>> Dear Jean-Pierre,
>>>>>>
>>>>>> I thought version 1.2.1 of the plug-in was working so I took it
>>>>>> further into production, but during backups with rdiff-backup and
>>>>>> guestmount it created a 100% cpu load in qemu process that stayed
>>>>>> there for days until I killed them, I tested this twice.
>>>>>> So I went back to a xpart/mount -t ntfs command and found more
>>>>>> "Bad stream for offset" and found that the /sbin/mount.ntfs-3g
>>>>>> command was running at 100% cpu load and hung there.
>>>>>
>>>>> Too bad.
>>>>>
>>>>>> I have added the whole Stream directory here: (1.1GB)
>>>>>> https://powermail.nu/nextcloud/index.php/s/vbq85qZ2wcVYxrG
>>>>>>
>>>>>> Separate stream file: stream.data.full.000c0000.00020001.gz
>>>>>> https://powermail.nu/nextcloud/index.php/s/QinV51XE4jrAH7a
>>>>>>
>>>>>> All the commands I used:
>>>>>> http://paste.debian.net/plainh/c0ea5950
>>>>>>
>>>>>> I do not know how to get the reparse tags of all the files, maybe
>>>>>> you can help me how to get all the information you need.
>>>>>
>>>>> Just use option -R on the base directory :
>>>>>
>>>>> getfattr -e hex -n system.ntfs_reparse_data -R base-dir
>>>>>
>>>>> Notes :
>>>>> 1) files with no reparse tags (those which are not deduplicated)
>>>>> will throw an error
>>>>> 2) this will output the file names, which you might not want
>>>>> to disclose. Fortunately I do not need them for now.
>>>>>
>>>>> So you may append to the above command :
>>>>>
>>>>> 2> /dev/null | grep ntfs_reparse_data | gzip > reparse-tags.gz
>>>>>
>>>>> With that, I will be able to build a configuration similar
>>>>> to yours... apart from the files themselves.
>>>>>
>>>>> Regards
>>>>>
>>>>> Jean-Pierre
>>>>>
>>>>>>
>>>>>> Thank you for your help!
>>>>>>
>>>>>> Kind regards,
>>>>>>
>>>>>> Jelle de Jong
>>>>>>
>>>>>> On 14/02/17 15:55, Jean-Pierre André wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> Jelle de Jong wrote:
>>>>>>>> Hi Jean-Pierre,
>>>>>>>>
>>>>>>>> If we have to switch to Windows 2012 and thereby having an
>>>>>>>> environment similar to yours then we can switch to another
>>>>>>>> Windows version.
>>>>>>>
>>>>>>> I do not have any Windows Server, and my analysis
>>>>>>> and tests are based on an unofficial deduplication
>>>>>>> package which was adapted to Windows 10 Pro.
>>>>>>>
>>>>>>> A few months ago, following a bug report, I had to
>>>>>>> make changes for Windows Server 2012 which uses an
>>>>>>> older data format, and my only experience about this
>>>>>>> format is related to this report. So switching to
>>>>>>> Windows 2012 is not guaranteed to make debugging easier.
>>>>>>>
>>>>>>>> We are running out of disk space here so if switching Windows
>>>>>>>> versions makes the process of having data deduplication working
>>>>>>>> easier then let me know.
>>>>>>>
>>>>>>> I have not yet analyzed your latest report, but it
>>>>>>> would probably be useful if I build a full copy of
>>>>>>> non-user data from your partition :
>>>>>>> - the reparse tags of all your files,
>>>>>>> - all the "*.ccc" files in the Stream directory
>>>>>>>
>>>>>>> Do not do it now, I must first dig into the data you
>>>>>>> posted.
>>>>>>>
>>>>>>> Regards
>>>>>>>
>>>>>>> Jean-Pierre
>>>>>>>
>>>>>>>> Kind regards,
>>>>>>>>
>>>>>>>> Jelle de Jong
>>>>>>>>
>>>>>>>> On 09/02/17 13:46, Jelle de Jong wrote:
>>>>>>>>> Hi Jean-Pierre,
>>>>>>>>>
>>>>>>>>> In case you are wondering:
>>>>>>>>>
>>>>>>>>> I am using data deduplication in Windows 2016 for my test
>>>>>>>>> environment iso:
>>>>>>>>> SW_DVD9_Win_Svr_STD_Core_and_DataCtr_Core_2016_64Bit_English_-2_MLF_X21-22843.ISO
>>>>>>>>>
>>>>>>>>> Kind regards,
>>>>>>>>>
>>>>>>>>> Jelle de Jong
>>>>>>>>>
>>>>>>>>> On 09/02/17 11:41, Jean-Pierre André wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> Jelle de Jong wrote:
>>>>>>>>>>> Hi Jean-Pierre,
>>>>>>>>>>>
>>>>>>>>>>> Thank you!
>>>>>>>>>>>
>>>>>>>>>>> The new plug-in seems to work for now, I am moving it into
>>>>>>>>>>> testing phase within our production back-up scripts.
>>>>>>>>>>
>>>>>>>>>> Please wait a few hours, I have found a bug which
>>>>>>>>>> I have fixed.
>>>>>>>>>> I am currently inserting your data
>>>>>>>>>> into my test base in order to rerun all my tests.
>>>>>>>>>>
>>>>>>>>>>> Will you release the source code eventually? I would like to
>>>>>>>>>>> write a blog post about how to add the support.
>>>>>>>>>>
>>>>>>>>>> What exactly do you mean ? If it is about how to
>>>>>>>>>> collect the data in an unsupported condition, it is
>>>>>>>>>> difficult, because unsupported generally means
>>>>>>>>>> unknown territory...
>>>>>>>>>>
>>>>>>>>>>> What do you think the changes are of the plug-in stop working
>>>>>>>>>>> again?
>>>>>>>>>>
>>>>>>>>>> (assuming a typo changes -> chances)
>>>>>>>>>> Your files were in a condition not met before : data
>>>>>>>>>> has been relocated according to a logic I do not fully
>>>>>>>>>> understand. Maybe this is an intermediate step in the
>>>>>>>>>> process of updating the files, anyway this can happen.
>>>>>>>>>>
>>>>>>>>>> The situation I am facing is that I have a single
>>>>>>>>>> example from which it is difficult to derive the rules.
>>>>>>>>>> So yes, the plugin may stop working again.
>>>>>>>>>>
>>>>>>>>>> Note : there are strict consistency checks in the plugin,
>>>>>>>>>> so it is unlikely you read invalid data. Moreover if
>>>>>>>>>> you only mount read-only you cannot damage the deduplicated
>>>>>>>>>> partition.
>>>>>>>>>>
>>>>>>>>>>> We do not have an automatic test running to verify the
>>>>>>>>>>> back-ups at this moment _yet_, so if the plug-in stops
>>>>>>>>>>> working, incremental file-based back-ups with empty files
>>>>>>>>>>> will slowly get in the back-ups this way :|
>>>>>>>>>>
>>>>>>>>>> Usually a deduplicated partition is only used for backups,
>>>>>>>>>> and reading from backups is only for recovering former
>>>>>>>>>> versions of files (on demand).
>>>>>>>>>>
>>>>>>>>>> If you access deduplicated files with no human control,
>>>>>>>>>> you have to insert your own checks in the process.
>>>>>>>>>> I would at least check whether the size of the recovered
>>>>>>>>>> file is the same as the deduplicated one (also grep for
>>>>>>>>>> messages in the syslog).
>>>>>>>>>>
>>>>>>>>>> Regards
>>>>>>>>>>
>>>>>>>>>> Jean-Pierre
>>>>>>>>>>
>>>>>>>>>>> Again thank you for all your help so far!
>>>>>>>>>>>
>>>>>>>>>>> Kind regards,
>>>>>>>>>>>
>>>>>>>>>>> Jelle de Jong
>>>>>>>>>>>
>>>>>>>>>>> On 08/02/17 15:59, Jean-Pierre André wrote:
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> Can you please make a try with :
>>>>>>>>>>>> http://jp-andre.pagesperso-orange.fr/dedup120-beta.zip
>>>>>>>>>>>>
>>>>>>>>>>>> This is experimental and based on assumptions which have
>>>>>>>>>>>> to be clarified, but it should work in your environment.
>>>>>>>>>>>>
>>>>>>>>>>>> Regards
>>>>>>>>>>>>
>>>>>>>>>>>> Jean-Pierre
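
The check suggested in the thread (comparing the size of each recovered file with the deduplicated one) could be scripted roughly as below. This is only a sketch under assumptions: the function name check_sizes and its two directory arguments are made up for illustration, it relies on GNU stat (the -c %s option) and on file names that contain no newline, and it is not part of the plugin.

```sh
# Hypothetical helper, not part of ntfs-3g or the dedup plugin.
# Usage: check_sizes MOUNTED_DIR BACKUP_DIR
# Compares the size of every regular file under MOUNTED_DIR with the file
# of the same relative path under BACKUP_DIR. Prints one line per problem
# and returns 1 if any problem was found, 0 otherwise.
# Assumes GNU stat (-c %s) and file names without newlines.
check_sizes() {
    src=${1%/}   # drop a trailing slash so relative paths are computed cleanly
    dst=${2%/}
    problems=$(
        find "$src" -type f | while IFS= read -r file; do
            rel=${file#"$src"/}
            if [ ! -f "$dst/$rel" ]; then
                echo "MISSING $rel"
            elif [ "$(stat -c %s "$file")" != "$(stat -c %s "$dst/$rel")" ]; then
                echo "SIZE MISMATCH $rel"
            fi
        done
    )
    # No output from the loop means every file matched.
    [ -z "$problems" ] && return 0
    printf '%s\n' "$problems"
    return 1
}
```

A backup script might call check_sizes on the mounted partition and the backup target after each run and stop on a non-zero return. For the syslog part of the advice, the log path and the exact messages depend on the system; "Bad stream for offset" is the one message quoted in this thread.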
_______________________________________________
ntfs-3g-devel mailing list
ntfs-3g-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ntfs-3g-devel