Hi, Eric, always good to hear from you,

On Mon, Apr 10, 2017 at 4:21 PM, Eric Auer <e.a...@jpberlin.de> wrote:
>
>> BSUM (by Mateusz Viste) : 6.0s (100%)
>> CRC32 (by Joe Forster) : 8.5s (70%)
>
>> MD5 (by Colin Plumb) : 52.9s (11%)
>> SHA1 (by Colin Plumb) : 85.7s (7%)
>
> Entertaining :-) Still you need to find a good balance
> between speed and collision risk. If you want to find
> duplicate files, you can first check simply the sizes.
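(For reference, the size-first idea could be sketched roughly like this in Python. This is only an illustration: the function names are made up for this mail, and SHA-256 is just a stand-in for whichever checksum you prefer from the list above.)

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=65536):
    """Hash a file in chunks so large files need little RAM."""
    h = hashlib.sha256()  # assumption: any checksum could be swapped in here
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Return sorted lists of paths under root that share size and digest."""
    # Pass 1: group by size, which only needs directory metadata.
    by_size = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            by_size[os.path.getsize(path)].append(path)

    # Pass 2: hash only the files that share a size with another file.
    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups
```

Files with a unique size are never read at all, which is where the time savings would come from.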

Check sizes? Okay, but some files still have bogus data at the end
that is (largely) ignored. Well, maybe .ZIP comments aren't quite the
literal "end", but I did find a .ZIP recently that had a bunch of 0x1A
(EOF) markers appended (for some obscure reason, yes I know about
CP/M's reasoning, but why would that carry over to a DOS-only .ZIP
???).

And I've seen .ZIPs with the same exact files but using different
internal compression methods. Same with OS-specific "extra fields". So
even if the outside container is "slightly" different, the internals
are 100% the same. There are no guarantees for 100% "byte exact",
usually only "close enough".

I am not a mathematician, and I'm out of the loop, but I feel like the
risk of (accidental) collision is still fairly low. Call me naive.

Besides, don't forget that .ZIP (and .ARJ and who else, ZOO ??) still
uses CRC32 internally, and .ZIP is still overwhelmingly used for
downloads (despite more efficient solutions). Even .7z and .xz have
been criticized for flaws, so nothing is perfect.

Similarly, it's not as easy as it sounds to replicate 100% "byte
exact" executables. Even the slightest detail can alter the checksum,
even if 100% equivalent functionality, even if using the exact same
tools. Honestly, most things (software, data, et al.) just aren't
meant to be "byte exact" (match identical).

> [Skein] is even faster than Groestl but only on modern 64-bit CPU.

"Modern"? AMD64 (with mandatory SSE2) appeared in 2003, Intel cloned
it in Xeons in 2004 and Core 2 in 2006. It's been around quite a
while, in various iterations. I think "modern" probably implies
AVX(es) or newer Haswell-era / Skylake instructions.

Heck, AMD's newfangled Ryzen supports the following (quoting from
Wikipedia):

  AMD64/x86-64, MMX(+), SSE1, SSE2, SSE3, SSSE3, SSE4a, SSE4.1,
  SSE4.2, AES, CLMUL, AVX, AVX2, FMA3, CVT16/F16C, ABM, BMI1, BMI2, SHA

(Note that CLMUL also says it can do ultra-fast CRCs, see the relevant
Intel PDF linked from Wikipedia.)
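
(Back on the .ZIP point for a second: here is a small Python sketch of "container differs, internals identical". It builds the same two files into a stored and a deflated archive in memory and compares the per-member CRC32s that the format keeps internally. The helper names and sample data are made up for illustration; `zipfile`, `ZipInfo.CRC` and `ZipInfo.file_size` are from the standard library.)

```python
import io
import zipfile


def make_zip(files, method):
    """Build an in-memory .ZIP from {name: bytes} using the given method."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=method) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()


def member_fingerprint(zip_bytes):
    """Sorted (name, uncompressed size, CRC32) per member.

    Deliberately ignores the compression method, timestamps and
    extra fields, so only the unpacked contents are compared.
    """
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        return sorted((i.filename, i.file_size, i.CRC) for i in zf.infolist())


# Hypothetical sample contents, just for the demonstration.
files = {"README.TXT": b"hello DOS " * 100, "DATA.BIN": bytes(range(256)) * 4}
stored = make_zip(files, zipfile.ZIP_STORED)
deflated = make_zip(files, zipfile.ZIP_DEFLATED)

# The containers are not byte-identical ...
print("containers equal:", stored == deflated)
# ... but the member names, sizes and CRC32s match.
print("members equal:  ",
      member_fingerprint(stored) == member_fingerprint(deflated))
```

Of course this only trusts the CRC32s recorded in the archive, so it is exactly as collision-resistant as CRC32 itself.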

> If you feel like trying a new DOS project: It would be a
> very fancy thing to have a disk-backed TEA encrypted disk
> image based "disk" or a disk-backed COMPRESSED disk image
> based "disk" driver with some very minimalistic compression
> algorithm.

Regrettably, there hasn't been a lot of interest in DOS file systems
work. Not that I blame them, it's not easy for any OS.

I assume you vaguely remember (or are familiar with) an old DOS
compression program called "DIET", which had an optional TSR mode.
Probably not quite what you meant, but I'm just reminding you anyways.
;-)

ftp://ftp.sac.sk/pub/sac/pack/diet145f.zip

> A tiny-amount-of-RAM compression algorithm would be for
> example run length encoding. LZ variants such as LZO can
> decompress without needing extra RAM outside the unpack
> buffer itself.

"mini LZO" is very small, (allegedly) very easy to use / embed in new
projects. It was also updated last month:

http://www.oberhumer.com/opensource/lzo/

> LZO and LZ4 are simple enough to even be used in Linux zram
> which can swap out RAM to a compressed RAM disk on the fly.

https://en.wikipedia.org/wiki/Zram

"zram was merged into the Linux kernel mainline in kernel version
3.14, released on March 30, 2014." ... "Google uses zram in Chrome OS
since 2013 and in Android since its version 4.4. Lubuntu also started
using zram in its version 13.10."

But I had read somewhere that it only saves a relatively small amount
of RAM (a dozen or so MB). Better than nothing, but not exactly
life-saving / earth-shattering.
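
(P.S. on the run length encoding idea quoted above: a minimal sketch in Python, just to show how little state it needs. The (count, value) byte-pair layout is my own assumption for illustration, not what DIET, LZO or any real driver uses, and this naive form doubles the size of data without repeats.)

```python
def rle_encode(data: bytes) -> bytes:
    """Encode data as (count, value) byte pairs, with count in 1..255."""
    out = bytearray()
    i = 0
    while i < len(data):
        value = data[i]
        run = 1
        # Extend the run while the next byte matches, capped at 255.
        while run < 255 and i + run < len(data) and data[i + run] == value:
            run += 1
        out += bytes((run, value))
        i += run
    return bytes(out)


def rle_decode(packed: bytes) -> bytes:
    """Expand (count, value) pairs; the only state is the output buffer."""
    if len(packed) % 2:
        raise ValueError("truncated RLE stream")
    out = bytearray()
    for j in range(0, len(packed), 2):
        count, value = packed[j], packed[j + 1]
        out += bytes([value]) * count
    return bytes(out)


# A mostly-empty "sector": 500 zero bytes followed by 12 bytes of text.
sector = b"\x00" * 500 + b"FREEDOS BOOT"
packed = rle_encode(sector)
print(len(sector), "->", len(packed), "bytes")
print("round trip ok:", rle_decode(packed) == sector)
```

The decoder never looks back at earlier output, which is why it would fit a tiny-RAM driver; an LZ variant would compress far better but needs access to the already-unpacked data.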