PNG claims "lossless" compression. The question is: Is that relevant?
If, say, you scan at 600 DPI, and use a high-enough-quality JPG
compression, I would expect that you can get better quality at
I did a quick smoke test of that kind of hypothesis: how well does
tesseract fare with JPG-compressed images?
It turned out to make MANY more errors, even though the human eye wouldn't
see any difference. I guess the compression artifacts are throwing off
tesseract's algorithms.
I experimented a bit to find the compression level at which OCR quality
starts to degrade seriously, and found that I'd need a setting
that gives me not much better compression than PNG, so I thought "screw
it, stick with the original bits, at least I don't lose info that way".
This was with JPGs from 300-dpi scans.
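
For anyone who wants to repeat this kind of smoke test, here is a rough
sketch in Python. It is not the script I used, just one way to do it, and
it rests on a few assumptions: Pillow is installed, the tesseract command
is on the PATH, and the OCR output of the original PNG is taken as the
reference (so it measures how far the JPG result drifts from the PNG
result, not absolute accuracy). The function names and the list of quality
settings are made up for illustration.

```python
import subprocess
import tempfile
from pathlib import Path


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum inserts, deletes and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute (free if equal)
        prev = cur
    return prev[-1]


def char_error_rate(reference: str, ocr_text: str) -> float:
    """Edit distance divided by the length of the reference text."""
    if not reference:
        return 0.0 if not ocr_text else 1.0
    return edit_distance(reference, ocr_text) / len(reference)


def ocr(image_path) -> str:
    """Run the tesseract CLI and return the recognized text from stdout."""
    result = subprocess.run(["tesseract", str(image_path), "stdout"],
                            capture_output=True, text=True, check=True)
    return result.stdout


def compare_qualities(png_path, qualities=(95, 85, 75, 50)):
    """Re-encode a PNG scan as JPG at several quality settings.

    Returns the PNG file size and, per quality setting, a tuple of
    (JPG file size, error rate relative to the PNG's OCR output).
    """
    # Assumption: Pillow is available. Imported here so the helpers
    # above can be used without it.
    from PIL import Image

    png_path = Path(png_path)
    baseline = ocr(png_path)  # assumed reference: OCR of the lossless scan
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        for q in qualities:
            jpg_path = Path(tmp) / f"q{q}.jpg"
            Image.open(png_path).convert("RGB").save(jpg_path, "JPEG", quality=q)
            results[q] = (jpg_path.stat().st_size,
                          char_error_rate(baseline, ocr(jpg_path)))
    return png_path.stat().st_size, results
```

Calling compare_qualities("page.png") on a scan (the file name is
hypothetical) gives the file sizes next to the error rates, which is the
trade-off I was eyeballing above.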
I haven't tried with 600 dpi because tesseract docs tell me that it's
geared towards 300 dpi scans, and ebook docs tell me that everything is
preconfigured for 300 dpi scans as well.
I might still try and check what I can get out of a 600-dpi scan OCR-wise.