Glenn Linderman wrote...
| > DjVu divides a single image into many different images, then
| > compresses them separately. To create a DjVu file, the initial image
| > is first separated into three images: a background image, a foreground
| > image, and a mask image. The background and foreground images are
| > typically lower-resolution color images (e.g., 100dpi); the mask image
| > is a high-resolution bilevel image (e.g., 300dpi) and is typically
| > where the text is stored. The background and foreground images are
| > then compressed using a wavelet-based compression algorithm named
| > IW44. The mask image is compressed using a method called JB2 (similar
| > to JBIG2). The JB2 encoding method identifies nearly-identical shapes
| > on the page, such as multiple occurrences of a particular character in
| > a given font, style, and size. It compresses the bitmap of each unique
| > shape separately, and then encodes the locations where each shape
| > appears on the page. Thus, instead of compressing a letter "e" in a
| > given font multiple times, it compresses the letter "e" once (as a
| > compressed bit image) and then records every place on the page it
| > occurs.
| >
| > I should also mention two more things: the algorithms behind DJVU are
| > *heavily* patent-encumbered; and some of DJVU's concepts made it into
| > JPEG2000, and into PDF.
|
|
| Thanks Ross, for the info.
|
| Ah, so it isn't necessarily quite doing OCR (although I see it can have
| an OCRed layer), but "blob-let comparisons", so it could work for any
| character set, font, language, etc., that reuses characters. So the
| more text, the more repeated characters, the better the compression...
|
| After playing with the software...
| ...
| Interesting format.
|
While IM does not read or write DjVu images at this time, it sounds
like a great image format to be able to use.
Just the algorithms it uses to divide an image into common sub-images
and character 'glyphs', without actually doing the OCR stuff, would
make a wonderful addition to the IM core library.
I just recently wrote an external script to subdivide images into
separate vertical sections (line sub-division), without going into
further sub-divisions.
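To make the idea of line sub-division concrete, here is a minimal sketch
of one common way to do it (a row projection over a bilevel image). It is
not the script mentioned above; the function name and the list-of-rows
bitmap representation are assumptions made purely for illustration.

```python
def split_into_lines(bitmap):
    """Find horizontal bands of rows that contain ink.

    bitmap is a list of rows, each a list of 0 (background) or 1 (ink).
    Returns (top, bottom) row ranges, with bottom exclusive.
    """
    bands = []
    start = None  # first row of the band currently being scanned
    for y, row in enumerate(bitmap):
        has_ink = any(row)
        if has_ink and start is None:
            start = y                   # a new band begins
        elif not has_ink and start is not None:
            bands.append((start, y))    # a blank row closes the band
            start = None
    if start is not None:               # band runs to the bottom edge
        bands.append((start, len(bitmap)))
    return bands


# A tiny 6-row "page" with two lines of ink separated by blank rows.
page = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 0],
]
line_bands = split_into_lines(page)
print(line_bands)
```

Running the same scan over columns within each band would be the obvious
further sub-division into individual shapes.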
The next step of noting which sub-images are then 'similar', and
creating the 'merged' common image, are also good algorithms that
should be made available too.
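As a rough illustration of that next step, the sketch below groups
same-sized bilevel sub-images whose pixels differ in at most a few places,
records where each one occurred, and builds a 'merged' image by majority
vote. This is only an assumed, simplified approach; it is not how JB2
actually matches shapes, and every name and threshold here is hypothetical.

```python
def hamming(a, b):
    """Count the pixels that differ between two same-sized bitmaps."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def group_similar(glyphs, max_diff=1):
    """Group (position, bitmap) pairs by similarity to a group's first member."""
    groups = []
    for pos, bmp in glyphs:
        for g in groups:
            proto = g["proto"]
            same_size = len(proto) == len(bmp) and len(proto[0]) == len(bmp[0])
            if same_size and hamming(proto, bmp) <= max_diff:
                g["members"].append((pos, bmp))
                break
        else:
            # Nothing close enough: this bitmap starts a new group.
            groups.append({"proto": bmp, "members": [(pos, bmp)]})
    return groups


def merge(bitmaps):
    """Build a common image: a pixel is ink if most members have ink there."""
    n = len(bitmaps)
    height, width = len(bitmaps[0]), len(bitmaps[0][0])
    return [
        [1 if 2 * sum(b[y][x] for b in bitmaps) > n else 0 for x in range(width)]
        for y in range(height)
    ]


# Three near-identical shapes and one different shape, with page positions.
glyphs = [
    ((0, 0), [[1, 1], [1, 0]]),
    ((3, 0), [[0, 0], [0, 1]]),
    ((5, 0), [[1, 1], [1, 0]]),
    ((9, 0), [[1, 1], [1, 1]]),  # one noisy pixel
]
groups = group_similar(glyphs)
positions = [[pos for pos, _ in g["members"]] for g in groups]
merged = merge([bmp for _, bmp in groups[0]["members"]])
print(positions)
print(merged)
```

Storing one merged bitmap per group plus the list of positions is the same
"compress the shape once, record every place it occurs" idea described in
the quoted text.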
However, getting back to the matter at hand: reading and writing DjVu
format specifically.
If there is software on the computer that can convert from or to this
image format from some other format IM knows about, it is very possible to
create a 'Delegate' entry to do the conversion automatically via that
program. Reading Postscript is done in this manner, though at a much
higher complexity than standard delegation.
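In essence a delegate is a command template in which placeholders for the
input and output files are filled in before an external program is run.
The sketch below mimics that expansion for the %i (input) and %o (output)
placeholders. The helper name is hypothetical, and the ddjvu command line
is an assumption about the DjVuLibre decoder that should be checked against
whatever DjVu software is actually installed.

```python
import shlex


def expand_delegate(template, infile, outfile):
    """Expand %i/%o placeholders and return an argv list.

    The template is split first, so file names containing spaces
    still end up as single arguments.
    """
    argv = []
    for token in shlex.split(template):
        token = token.replace("%i", infile).replace("%o", outfile)
        argv.append(token)
    return argv


# Assumed decode command: convert DjVu to TIFF, a format IM already reads.
DJVU_DECODE = "ddjvu -format=tiff %i %o"

argv = expand_delegate(DJVU_DECODE, "scan page.djvu", "/tmp/magick-tmp.tiff")
print(argv)
```

Passing the resulting list to subprocess.run(argv, check=True) would perform
the conversion, after which the intermediate TIFF can be read as normal.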
You are also right in that a 'coder' module could be created to read
DjVu image format, especially if an API library is available to do most
of the work, as it is for JPEG, PNG, and TIFF. But the patent issues may
make it impossible to add direct write capabilities.
The question then comes down to... Who is going to do the work?
Cristy has his hands full and a huge todo list. I myself have been busy
adding things in the past with animation handling (still things to do),
distortion quality improvements, especially for perspective with
infinite tiling, and a huge backlog of more distortion functions
still to be added (the fun part).
So are you willing to raise your hand and create a coder module?
Or a simple delegate entry, for use when DjVu software is available?
(The latter is not difficult; the former needs more programming.)
Anthony Thyssen ( System Programmer ) <[EMAIL PROTECTED]>
-----------------------------------------------------------------------------
"Virtual machines?" asked Moira.
"That's like a computer that isn't there." Jerry said helpfully
-- Rick Cook, "The Wizardry Cursed"
-----------------------------------------------------------------------------
Anthony's Home is his Castle http://www.cit.gu.edu.au/~anthony/
_______________________________________________
Magick-users mailing list
[email protected]
http://studio.imagemagick.org/mailman/listinfo/magick-users