I haven't tested. With JPEG you trade off size vs quality. A PDF scan would just embed a JPEG with the same tradeoffs.
PAQ includes a model that compresses JPEG losslessly by about 30% (to about 70% of the original size) by using better (but slower) modeling of the DCT coefficients and arithmetic coding. I didn't include these models in ZPAQ because the code is complex and I would have to rewrite it from C++ to ZPAQL assembler. It would be possible to do, though. ZPAQ analyzes the file statistics, and if it detects that the data is random (meaning it is already compressed or encrypted), it just stores the file with no compression.

On Mon, Jul 29, 2019, 2:49 PM Costi Dumitrescu <[email protected]> wrote:

> New software scanning to PDF with the smartphone camera
>
> What is the ratio to a JPEG?
>
> On 29.07.2019 21:35, Matt Mahoney wrote:
> > PDF compresses text with deflate (zip) and stores images in their
> > original already compressed format like JPEG or PNG. Sure, the
> > compression could be improved a little but that would break
> > compatibility with lots of software and printers, so that's unlikely
> > to happen. ZPAQ is already free if they want to use it.
> >
> > On Mon, Jul 29, 2019, 12:31 PM Costi Dumitrescu
> > <[email protected]> wrote:
> >
> > > Sell it to PDF software makers if it's any good with images and
> > > scans
> > >
> > > On 29.07.2019 19:15, Matt Mahoney wrote:
> > > > The ZPAQ Linux packages are an older version. The latest is here.
> > > > http://www.mattmahoney.net/dc/zpaq.html
> > > >
> > > > There are 5 compression levels you can try. The newer versions
> > > > focused on fast incremental backup functionality, dedupe, speed,
> > > > and rollback capability, but the compression ratio is still good.
> > > > I haven't updated it since I retired from Dell except to change
> > > > the license from GPL to public domain.
> > > >
> > > > My initial goal was forward compatibility. The archive is
> > > > self-describing, meaning it runs an embedded program in a virtual
> > > > machine to decompress. This allows for custom models.
> > > > For example I have an archive that compresses 1M digits of pi to
> > > > a few hundred bytes by writing a program that computes pi. The
> > > > new version auto detects the hardware and compiles the embedded
> > > > extractor to x86 to run twice as fast.
> > > >
> > > > The compressor normally uses LZ77, BWT, or context mixing (PAQ)
> > > > depending on the compression level selected and the file type. It
> > > > does incremental backups by testing if the last modified date
> > > > changed, then comparing SHA1 hashes to see if the file still
> > > > needs to be added or just renamed. The archive is append-only so
> > > > you can roll it back by truncating it. It also lets you encrypt
> > > > the archive.
> > > >
> > > > The initial ZPAQ versions were my attempt to put PAQ in a format
> > > > that didn't break compatibility between versions. PAQ was a
> > > > series of about 200 experimental programs with high compression
> > > > ratio. PAQ derivatives still top all the benchmarks.
> > > >
> > > > On Mon, Jul 29, 2019, 6:29 AM Stefan Reich via AGI
> > > > <[email protected]> wrote:
> > > >
> > > > > Greetings earthlings
> > > > >
> > > > > I just found out that Matt Mahoney's ZPAQ is available as a
> > > > > Debian package for my system. Didn't even make the connection
> > > > > before between his two hats!
> > > > >
> > > > > I took a database dump of agi.blue (a 4.1 MB text file) and ran
> > > > > it through various compressors:
> > > > >
> > > > > ls -lSrh
> > > > > total 9,8M
> > > > > -rw-rw-r-- 1 stefan stefan  580K Jul 29 02:20 concepts.structure.zpaq
> > > > > -rw-rw-r-- 1 stefan stefan  712K Jul 29 02:01 concepts.structure.7z
> > > > > -rw-rw-r-- 1 stefan stefan  731K Jul 29 02:08 concepts.structure.lrz
> > > > > -rw-rw-r-- 1 stefan stefan  817K Jul 29 02:08 concepts.structure.rz
> > > > > -rw-rw-r-- 1 stefan stefan  996K Jul 29 02:22 concepts.structure.gz9
> > > > > -rw-rw-r-- 1 stefan stefan  996K Jul 29 02:22 concepts.structure.zip
> > > > > -rw-r--r-- 1 stefan stefan 1002K Jul 29 02:00 concepts.structure.gz
> > > > > -rw-rw-r-- 1 stefan stefan  4,1M Jul 29 02:08 concepts.structure
> > > > >
> > > > > As you can see, ZPAQ wins /by far/. Compressing the file with
> > > > > it took a handful of seconds.
> > > > >
> > > > > It seems the king of compression is among us :-)
> > > > >
> > > > > @Matt: Is there any simple way to get even better compression
> > > > > than with ZPAQ 1.0/1.10 (those are the two versions I have)
> > > > > with default options? Also, have there been any attempts to run
> > > > > ZPAQ on a GPU? Would it help?
> > > > >
> > > > > Stefan
> > > > >
> > > > > --
> > > > > Stefan Reich
> > > > > BotCompany.de // Java-based operating systems

------------------------------------------
Artificial General Intelligence List: AGI
Permalink: https://agi.topicbox.com/groups/agi/Tda53259f8b6994b6-M7d40683063e086d47f55dc3a
Delivery options: https://agi.topicbox.com/groups/agi/subscription
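
On the point in the top reply that ZPAQ stores data it judges to be random: the thread does not say which statistics ZPAQ looks at, so the following is only a minimal Python sketch of the general idea, not ZPAQ's actual test. It assumes an order-0 byte entropy estimate, a made-up threshold of 7.9 bits per byte, and zlib as a stand-in compressor. The names `byte_entropy` and `pack` are hypothetical.

```python
import math
import os
import zlib
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Order-0 Shannon entropy of the byte histogram, in bits per byte (0 to 8)."""
    if not data:
        return 0.0
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in Counter(data).values())


def pack(data: bytes, threshold: float = 7.9):
    """Return (mode, payload).

    Assumption for illustration: input whose byte entropy is at or above
    `threshold` is treated as already compressed or encrypted and stored
    unchanged; anything else goes through zlib, which stands in for a real
    compressor here.
    """
    if byte_entropy(data) >= threshold:
        return "stored", data
    return "deflate", zlib.compress(data, 9)


# Small demo: repetitive text versus bytes from the OS random source.
sample_text = b"the quick brown fox jumps over the lazy dog " * 1000
sample_random = os.urandom(1 << 16)

text_mode, text_out = pack(sample_text)
random_mode, random_out = pack(sample_random)

print(text_mode, len(sample_text), "->", len(text_out))
print(random_mode, len(sample_random), "->", len(random_out))
```

Already-compressed input such as JPEG data tends to score close to 8 bits per byte on this measure, which is why a generic compressor gains little from it and a specialized model like the PAQ one mentioned above is needed to do better.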

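The incremental backup check described in the quoted message (compare the last-modified date first, then SHA-1 hashes to decide whether a file must be added or was just renamed) might look roughly like this. It is a sketch under assumptions: the index layout, the `classify` function and its three result labels are invented for illustration, and it hashes whole files, which is a simplification and not a description of how ZPAQ itself deduplicates.

```python
import hashlib
import os
import tempfile


def sha1_of(path: str) -> str:
    """SHA-1 hex digest of a file, read in 64 KiB chunks."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            h.update(chunk)
    return h.hexdigest()


def classify(path: str, index: dict, known_hashes: set) -> str:
    """Decide what a backup run would do with one file.

    index:        path -> (mtime_ns, sha1) from the previous run (hypothetical layout)
    known_hashes: SHA-1 digests of content already in the archive
    Returns "unchanged", "known-content" (e.g. a rename), or "add".
    """
    mtime = os.stat(path).st_mtime_ns
    old = index.get(path)
    if old is not None and old[0] == mtime:
        return "unchanged"          # cheap test: date did not change, skip hashing
    digest = sha1_of(path)
    index[path] = (mtime, digest)
    if digest in known_hashes:
        return "known-content"      # same bytes already stored under another name
    known_hashes.add(digest)
    return "add"                    # new content, must be appended to the archive


# Demo in a throwaway directory: first sighting, unchanged rerun, then a rename.
index, known = {}, set()
results = []
with tempfile.TemporaryDirectory() as d:
    a = os.path.join(d, "a.txt")
    with open(a, "wb") as f:
        f.write(b"hello")
    results.append(classify(a, index, known))
    results.append(classify(a, index, known))
    b = os.path.join(d, "b.txt")
    os.rename(a, b)
    results.append(classify(b, index, known))

print(results)
```

Checking the date before hashing keeps a rerun over an unchanged tree cheap, since only files whose timestamp moved are read at all.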