Hi Jean-Pierre,

https://powermail.nu/nextcloud/index.php/s/jxler2rZOqBdpr2

# output: md5sum *.gz
879499f9187b0f590ae92460f4949dfd  stream.data.full.dir.tar.gz
a8fc902613486e332898f92aba26c61f  reparse-tags.gz
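To check the downloads on your side, the two sums above can be saved to a
file and verified with md5sum -c; a small sketch (the file name
dedup-md5sums.txt is only an example):

# put the two lines above into dedup-md5sums.txt, keeping two spaces
# between each hash and its file name, then run in the download directory:
md5sum -c dedup-md5sums.txt
# md5sum prints "OK" per file and exits non-zero if a download is corrupt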
Kind regards,

Jelle de Jong

On 23/02/17 14:24, Jean-Pierre André wrote:
> Hi,
>
> Can you also post the md5 (or sha1, or ...) of the big
> file. The connection is frequently interrupted, and I
> cannot rely on the downloaded file without a check.
>
> Jean-Pierre
>
> Jelle de Jong wrote:
>> Hi Jean-Pierre,
>>
>> Thank you!
>>
>> The reparse-tags.gz file:
>> https://powermail.nu/nextcloud/index.php/s/fS6Y6bpzoMgPiZ0
>>
>> Generated by running:
>> getfattr -e hex -n system.ntfs_reparse_data -R /mnt/sr7-sdb2/ \
>>   2> /dev/null | grep ntfs_reparse_data | gzip > /root/reparse-tags.gz
>>
>> Kind regards,
>>
>> Jelle de Jong
>>
>> On 23/02/17 12:07, Jean-Pierre André wrote:
>>> Hi,
>>>
>>> Jelle de Jong wrote:
>>>> Dear Jean-Pierre,
>>>>
>>>> I thought version 1.2.1 of the plug-in was working, so I took it
>>>> further into production, but during backups with rdiff-backup and
>>>> guestmount it created a 100% CPU load in the qemu process that
>>>> stayed there for days until I killed them; I tested this twice. So
>>>> I went back to an xpart/mount -t ntfs command, found more "Bad
>>>> stream for offset" errors, and found that the /sbin/mount.ntfs-3g
>>>> command was running at 100% CPU load and hung there.
>>>
>>> Too bad.
>>>
>>>> I have added the whole Stream directory here (1.1 GB):
>>>> https://powermail.nu/nextcloud/index.php/s/vbq85qZ2wcVYxrG
>>>>
>>>> Separate stream file: stream.data.full.000c0000.00020001.gz
>>>> https://powermail.nu/nextcloud/index.php/s/QinV51XE4jrAH7a
>>>>
>>>> All the commands I used:
>>>> http://paste.debian.net/plainh/c0ea5950
>>>>
>>>> I do not know how to get the reparse tags of all the files; maybe
>>>> you can help me get all the information you need.
>>>
>>> Just use option -R on the base directory:
>>>
>>> getfattr -e hex -n system.ntfs_reparse_data -R base-dir
>>>
>>> Notes:
>>> 1) files with no reparse tags (those which are not deduplicated)
>>>    will throw an error
>>> 2) this will output the file names, which you might not want
>>>    to disclose. Fortunately I do not need them for now.
>>>
>>> So you may append to the above command:
>>>
>>> 2> /dev/null | grep ntfs_reparse_data | gzip > reparse-tags.gz
>>>
>>> With that, I will be able to build a configuration similar
>>> to yours... apart from the files themselves.
>>>
>>> Regards
>>>
>>> Jean-Pierre
>>>
>>>> Thank you for your help!
>>>>
>>>> Kind regards,
>>>>
>>>> Jelle de Jong
>>>>
>>>> On 14/02/17 15:55, Jean-Pierre André wrote:
>>>>> Hi,
>>>>>
>>>>> Jelle de Jong wrote:
>>>>>> Hi Jean-Pierre,
>>>>>>
>>>>>> If we have to switch to Windows 2012, and thereby have an
>>>>>> environment similar to yours, then we can switch to another
>>>>>> Windows version.
>>>>>
>>>>> I do not have any Windows Server, and my analysis
>>>>> and tests are based on an unofficial deduplication
>>>>> package which was adapted to Windows 10 Pro.
>>>>>
>>>>> A few months ago, following a bug report, I had to
>>>>> make changes for Windows Server 2012, which uses an
>>>>> older data format, and my only experience with this
>>>>> format is related to that report. So switching to
>>>>> Windows 2012 is not guaranteed to make debugging easier.
>>>>>
>>>>>> We are running out of disk space here, so if switching Windows
>>>>>> versions makes the process of getting data deduplication working
>>>>>> easier, then let me know.
>>>>>
>>>>> I have not yet analyzed your latest report, but it
>>>>> would probably be useful if I build a full copy of
>>>>> the non-user data from your partition:
>>>>> - the reparse tags of all your files,
>>>>> - all the "*.ccc" files in the Stream directory
>>>>>
>>>>> Do not do it now, I must first dig into the data you
>>>>> posted.
>>>>>
>>>>> Regards
>>>>>
>>>>> Jean-Pierre
>>>>>
>>>>>> Kind regards,
>>>>>>
>>>>>> Jelle de Jong
>>>>>>
>>>>>> On 09/02/17 13:46, Jelle de Jong wrote:
>>>>>>> Hi Jean-Pierre,
>>>>>>>
>>>>>>> In case you are wondering:
>>>>>>>
>>>>>>> I am using data deduplication in Windows 2016 for my test
>>>>>>> environment, iso:
>>>>>>> SW_DVD9_Win_Svr_STD_Core_and_DataCtr_Core_2016_64Bit_English_-2_MLF_X21-22843.ISO
>>>>>>>
>>>>>>> Kind regards,
>>>>>>>
>>>>>>> Jelle de Jong
>>>>>>>
>>>>>>> On 09/02/17 11:41, Jean-Pierre André wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Jelle de Jong wrote:
>>>>>>>>> Hi Jean-Pierre,
>>>>>>>>>
>>>>>>>>> Thank you!
>>>>>>>>>
>>>>>>>>> The new plug-in seems to work for now; I am moving it into the
>>>>>>>>> testing phase within our production back-up scripts.
>>>>>>>>
>>>>>>>> Please wait a few hours, I have found a bug which
>>>>>>>> I have fixed. I am currently inserting your data
>>>>>>>> into my test base in order to rerun all my tests.
>>>>>>>>
>>>>>>>>> Will you release the source code eventually? I would like to
>>>>>>>>> write a blog post about how to add the support.
>>>>>>>>
>>>>>>>> What exactly do you mean? If it is about how to
>>>>>>>> collect the data in an unsupported condition, it is
>>>>>>>> difficult, because unsupported generally means
>>>>>>>> unknown territory...
>>>>>>>>
>>>>>>>>> What do you think the changes are of the plug-in stop working
>>>>>>>>> again?
>>>>>>>>
>>>>>>>> (assuming a typo: changes -> chances)
>>>>>>>> Your files were in a condition not met before: data
>>>>>>>> has been relocated according to a logic I do not fully
>>>>>>>> understand. Maybe this is an intermediate step in the
>>>>>>>> process of updating the files; anyway, this can happen.
>>>>>>>>
>>>>>>>> The situation I am facing is that I have a single
>>>>>>>> example from which it is difficult to derive the rules.
>>>>>>>> So yes, the plugin may stop working again.
>>>>>>>>
>>>>>>>> Note: there are strict consistency checks in the plugin,
>>>>>>>> so it is unlikely you read invalid data. Moreover, if
>>>>>>>> you only mount read-only you cannot damage the deduplicated
>>>>>>>> partition.
>>>>>>>>
>>>>>>>>> We do not have an automatic test running to verify the back-ups
>>>>>>>>> at this moment _yet_, so if the plug-in stops working,
>>>>>>>>> incremental file-based back-ups with empty files will slowly
>>>>>>>>> get into the back-ups this way :|
>>>>>>>>
>>>>>>>> Usually a deduplicated partition is only used for backups,
>>>>>>>> and reading from backups is only for recovering former
>>>>>>>> versions of files (on demand).
>>>>>>>>
>>>>>>>> If you access deduplicated files with no human control,
>>>>>>>> you have to insert your own checks in the process. I
>>>>>>>> would at least check whether the size of the recovered
>>>>>>>> file is the same as the deduplicated one (also grep for
>>>>>>>> messages in the syslog).
>>>>>>>>
>>>>>>>> Regards
>>>>>>>>
>>>>>>>> Jean-Pierre
>>>>>>>>
>>>>>>>>> Again thank you for all your help so far!
>>>>>>>>>
>>>>>>>>> Kind regards,
>>>>>>>>>
>>>>>>>>> Jelle de Jong
>>>>>>>>>
>>>>>>>>> On 08/02/17 15:59, Jean-Pierre André wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> Can you please make a try with:
>>>>>>>>>> http://jp-andre.pagesperso-orange.fr/dedup120-beta.zip
>>>>>>>>>>
>>>>>>>>>> This is experimental and based on assumptions which have
>>>>>>>>>> to be clarified, but it should work in your environment.
>>>>>>>>>>
>>>>>>>>>> Regards
>>>>>>>>>>
>>>>>>>>>> Jean-Pierre
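A minimal sketch of the kind of unattended check Jean-Pierre suggests above
(mount read-only, compare sizes, grep the syslog); the device name, mount
point, file paths and syslog location below are placeholders, not taken from
this thread:

# mount the deduplicated NTFS partition read-only so it cannot be damaged
mount -t ntfs-3g -o ro /dev/sdb2 /mnt/dedup

# compare the size of the recovered copy against the deduplicated original;
# a mismatch suggests the plugin returned truncated or empty data
src=/mnt/dedup/path/to/file      # placeholder path
copy=/backup/path/to/file        # placeholder path
if [ "$(stat -c %s "$src")" != "$(stat -c %s "$copy")" ]; then
    echo "size mismatch: $src" >&2
fi

# also check the syslog for ntfs-3g messages such as "Bad stream for offset"
grep ntfs /var/log/syslog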