PDF compresses text with deflate (the algorithm zip uses) and stores images in an already compressed form, such as JPEG data embedded as is or deflate-compressed pixel data as in PNG. Sure, the compression could be improved a little, but that would break compatibility with lots of software and printers, so it's unlikely to happen. ZPAQ is already free if they want to use it.
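To illustrate why there is only a little left to gain, here is a minimal sketch using Python's zlib module, which implements deflate. The word list and sizes are made up for the demo and have nothing to do with any real PDF. It compresses some text once, then feeds the already compressed bytes through deflate again, which is roughly the situation with images that arrive compressed.

```python
import random
import zlib

# Hypothetical stand-in for document text: seeded pseudo-random words
# drawn from a small made-up vocabulary (an assumption for this demo).
random.seed(0)
VOCAB = [
    "archive", "backup", "model", "context", "mixing", "image", "scan",
    "printer", "format", "stream", "filter", "object", "page", "font",
    "deflate", "header",
]
text = " ".join(random.choice(VOCAB) for _ in range(20000)).encode("ascii")

# First pass: deflate on plain text, at the highest compression level.
once = zlib.compress(text, 9)

# Second pass: deflate on data that is already deflate-compressed.
twice = zlib.compress(once, 9)

print(f"original:          {len(text)} bytes")
print(f"deflate once:      {len(once)} bytes")
print(f"deflate twice:     {len(twice)} bytes")
```

The first pass shrinks the text a lot; the second pass has almost nothing to work with, because the output of a decent compressor looks close to random. A stronger model like ZPAQ's could still beat deflate on the text itself, but it would have to replace the existing filter rather than sit on top of it, which is where the compatibility problem comes from.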
On Mon, Jul 29, 2019, 12:31 PM Costi Dumitrescu <[email protected]> wrote:

> Sell it to PDF software makers if it's any good with images and scans
>
>
> On 29.07.2019 19:15, Matt Mahoney wrote:
> > The ZPAQ Linux packages are an older version. The latest is here.
> > http://www.mattmahoney.net/dc/zpaq.html
> >
> > There are 5 compression levels you can try. The newer versions focused
> > on fast incremental backup functionality, dedupe, speed, and rollback
> > capability, but the compression ratio is still good. I haven't updated
> > it since I retired from Dell except to change the license from GPL to
> > public domain.
> >
> > My initial goal was forward compatibility. The archive is self
> > describing, meaning it runs an embedded program in a virtual machine
> > to decompress. This allows for custom models. For example I have an
> > archive that compresses 1M digits of pi to a few hundred bytes by
> > writing a program that computes pi. The new version auto detects the
> > hardware and compiles the embedded extractor to x86 to run twice as fast.
> >
> > The compressor normally uses LZ77, BWT, or context mixing (PAQ)
> > depending on the compression level selected and the file type. It does
> > incremental backups by testing if the last modified date changed, then
> > comparing SHA1 hashes to see if the file still needs to be added or
> > just renamed. The archive is append-only so you can roll it back by
> > truncating it. It also lets you encrypt the archive.
> >
> > The initial ZPAQ versions were my attempt to put PAQ in a format that
> > didn't break compatibility between versions. PAQ was a series of about
> > 200 experimental programs with high compression ratio. PAQ derivatives
> > still top all the benchmarks.
> >
> > On Mon, Jul 29, 2019, 6:29 AM Stefan Reich via AGI
> > <[email protected] <mailto:[email protected]>> wrote:
> >
> >     Greetings earthlings
> >
> >     I just found out that Matt Mahoney's ZPAQ is available as a Debian
> >     package for my system. Didn't even make the connection before
> >     between his two hats!
> >
> >     I took a database dump of agi.blue (a 4.1 MB text file) and ran it
> >     through various compressors:
> >
> >     ls -lSrh
> >     total 9,8M
> >     -rw-rw-r-- 1 stefan stefan  580K Jul 29 02:20 concepts.structure.zpaq
> >     -rw-rw-r-- 1 stefan stefan  712K Jul 29 02:01 concepts.structure.7z
> >     -rw-rw-r-- 1 stefan stefan  731K Jul 29 02:08 concepts.structure.lrz
> >     -rw-rw-r-- 1 stefan stefan  817K Jul 29 02:08 concepts.structure.rz
> >     -rw-rw-r-- 1 stefan stefan  996K Jul 29 02:22 concepts.structure.gz9
> >     -rw-rw-r-- 1 stefan stefan  996K Jul 29 02:22 concepts.structure.zip
> >     -rw-r--r-- 1 stefan stefan 1002K Jul 29 02:00 concepts.structure.gz
> >     -rw-rw-r-- 1 stefan stefan  4,1M Jul 29 02:08 concepts.structure
> >
> >     As you can see, ZPAQ wins /by far/. Compressing the file with it
> >     took a handful of seconds.
> >
> >     It seems the king of compression is among us :-)
> >
> >     @Matt: Is there any simple way to get even better compression than
> >     with ZPAQ 1.0/1.10 (those are the two versions I have) with
> >     default options? Also, have there been any attempts to run ZPAQ on
> >     a GPU? Would it help?
> >
> >     Stefan
> >
> >     --
> >     Stefan Reich
> >     BotCompany.de // Java-based operating systems
> >
> > Artificial General Intelligence List
> > <https://agi.topicbox.com/latest> / AGI / see discussions
> > <https://agi.topicbox.com/groups/agi> + participants
> > <https://agi.topicbox.com/groups/agi/members> + delivery options
> > <https://agi.topicbox.com/groups/agi/subscription> Permalink
> > <https://agi.topicbox.com/groups/agi/Tda53259f8b6994b6-M22dcdb44a65bf50975e55b8c>

------------------------------------------
Artificial General Intelligence List: AGI
Permalink: https://agi.topicbox.com/groups/agi/Tda53259f8b6994b6-Mae63d3a3fc2f1feb7b2598dc
Delivery options: https://agi.topicbox.com/groups/agi/subscription
