On Sun, Aug 10, 2014 at 11:18:49AM +0200, Jean-Pierre André wrote:
> Hi,
>
> Did you compare with the Microsoft implementation ?
>
> I have only checked the biggest file in IE7 update for WinXP
> (WINDOWS/ie7updates/KB963027-IE7/ieframe.dll) with
> cluster size 4096 :
>
> Original size             6066688
> Microsoft implementation  3883008 (64.0%)
> current implementation    3682304 (60.7%)
> proposed implementation   3710976 (61.2%)

I have not done any comparisons with the Microsoft implementation yet. Is there a more precise way to test it than actually copying a file to an NTFS volume from Windows?

I'm not surprised that the Microsoft implementation apparently produces a worse compression ratio than NTFS-3G. Although it's impossible to know for sure what their algorithm does, my expectation is that they use hash chains (similar to my proposal, perhaps with a slightly less exhaustive search) but use "greedy" parsing rather than "lazy" parsing.

If even more speed is desired, then "greedy" parsing is the way to go. But it will degrade the compression ratio, maybe placing it closer to that of the Microsoft implementation.

Eric
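
To illustrate the greedy versus lazy distinction discussed above, here is a minimal sketch of a generic LZ77-style parser that finds matches through hash chains. It is not the NTFS-3G code, the proposed patch, or the Microsoft algorithm. The names `MIN_MATCH`, `MAX_CHAIN`, `longest_match`, `insert`, `parse` and `decode` are hypothetical, the token format is a toy one rather than the on-disk NTFS format, and the token count is only a rough stand-in for compressed size.

```python
from collections import defaultdict

MIN_MATCH = 3    # assumed minimum useful match length (illustrative)
MAX_CHAIN = 16   # assumed limit on candidates examined per position

def longest_match(data, pos, chains):
    """Return (length, offset) of the longest earlier match for data[pos:]."""
    best_len, best_off = 0, 0
    if pos + MIN_MATCH > len(data):
        return best_len, best_off
    key = data[pos:pos + MIN_MATCH]
    # Walk the chain from the most recent candidate backwards.
    for cand in reversed(chains[key][-MAX_CHAIN:]):
        n = 0
        while pos + n < len(data) and data[cand + n] == data[pos + n]:
            n += 1
        if n > best_len:
            best_len, best_off = n, pos - cand
    return best_len, best_off

def insert(data, pos, chains):
    """Record position pos under the key formed by its first MIN_MATCH bytes."""
    if pos + MIN_MATCH <= len(data):
        chains[data[pos:pos + MIN_MATCH]].append(pos)

def parse(data, lazy):
    """Split data into literal and match tokens, greedily or lazily."""
    chains = defaultdict(list)
    tokens = []
    pos = 0
    while pos < len(data):
        length, off = longest_match(data, pos, chains)
        insert(data, pos, chains)
        if lazy and length >= MIN_MATCH:
            # Lazy parsing: peek one byte ahead; if a longer match starts
            # there, emit a literal now and take that match next time.
            next_len, _ = longest_match(data, pos + 1, chains)
            if next_len > length:
                length = 0
        if length >= MIN_MATCH:
            tokens.append(("match", off, length))
            for p in range(pos + 1, pos + length):
                insert(data, p, chains)
            pos += length
        else:
            tokens.append(("lit", data[pos]))
            pos += 1
    return tokens

def decode(tokens):
    """Rebuild the original bytes from the token list."""
    out = bytearray()
    for tok in tokens:
        if tok[0] == "lit":
            out.append(tok[1])
        else:
            _, off, length = tok
            for _ in range(length):
                out.append(out[-off])
    return bytes(out)

sample = b"abcXbcdeYabcde"
greedy_tokens = parse(sample, lazy=False)
lazy_tokens = parse(sample, lazy=True)
```

On this sample the greedy parser takes the 3-byte match "abc" as soon as it sees it and is left with two trailing literals, while the lazy parser emits "a" as a literal and then takes the 4-byte match "bcde", so it ends up with one token fewer. The extra lookup at `pos + 1` is the cost that greedy parsing avoids.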
