On 01/03/21 12:11, Nuno Silva wrote:
> On 2021-03-01, Wols Lists wrote:
> 
>> I've got a bunch of scans, let's assume they're text documents. And
>> they're rather big ... I want to email them.
>>
>> How on earth do I convert them to TRUE b&w documents? At the moment they
>> are jpegs that weigh in at 3MB, and I guess they're using about 5 bytes
>> to store all the colour, luminance, whatever, per pixel. But actually,
>> there's only ONE BIT of information there - whether that pixel is black
>> or white.
>>
>> I'm using imagemagick, but so far all my attempts to strip out the
>> surplus information have resulted in INcreasing the file size ???
>>
>> So basically, how do I save an image as "one bit per pixel" like you'd
>> think you'd send to a B&W printer?
>>
>> Even at 300dpi, I make that 300*300/8 ~= 10KB/in^2 or 800KB of
>> uncompressed info for a page of A4, not 3MB.
>>
>> Cheers,
>> Wol
> 
> Somebody else might have a better suggestion, or perhaps a better
> understanding of the JPEG format and of what needs to be tuned, but, for
> example:
> 
>     convert origin.jpg -threshold 70% -monochrome result.jpg
> 
> (And adjust the "-threshold percent" if needed. It might be that you
> don't need thresholding at all, but if you do, it apparently must go
> before "-monochrome".)
> 
> (Depending on the receiving end, you could also explore other
> formats. Here, if the scanned document can be stored in monochrome, I
> usually use djvu.)
> 
Thanks, but no, I've already tried that. It makes matters worse!

I've messed about with the scanner, so it is now creating 800KB images,
but I don't want to rescan everything I've done.

The problem is that it is clearly saving the images as greyscale, not as
black&white. And when I search for help, what I want is swamped by all
the false positives for greyscale.

Oh - and for Nuno - sorry, tesseract is no use; they are NOT text. That's
why I used the word "assume" - to make it clear that I want a
1-bit/pixel palette, not a 5-byte/pixel greyscale.
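
To pin down what "1-bit/pixel" means on disk, here is a minimal Python
sketch, not anything ImageMagick itself does. It works out the
uncompressed size of a bilevel A4 page at 300dpi, and packs thresholded
grey values into a binary PBM (P4) file, which is about the simplest
real one-bit-per-pixel format. The function names, the threshold of 128
and the assumption that the input is rows of 8-bit grey values are all
mine, purely for illustration.

```python
import math


def bilevel_size_bytes(width_in, height_in, dpi):
    """Uncompressed size of a 1-bit/pixel page, each row padded to whole bytes."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return math.ceil(width_px / 8) * height_px


def pack_pbm(gray_rows, threshold=128):
    """Threshold rows of 8-bit grey values and pack them as binary PBM (P4).

    In PBM a set bit means black, bits are stored most significant first,
    and every row is padded up to a whole number of bytes.
    The threshold of 128 is an illustrative assumption.
    """
    height = len(gray_rows)
    width = len(gray_rows[0])
    out = bytearray(f"P4\n{width} {height}\n".encode("ascii"))
    for row in gray_rows:
        byte = 0
        nbits = 0
        for value in row:
            # Shift in one bit per pixel: 1 for dark pixels, 0 for light ones.
            byte = (byte << 1) | (1 if value < threshold else 0)
            nbits += 1
            if nbits == 8:
                out.append(byte)
                byte = 0
                nbits = 0
        if nbits:
            # Pad the final partial byte of the row with zero (white) bits.
            out.append(byte << (8 - nbits))
    return bytes(out)


# A4 is 210mm x 297mm; 25.4mm to the inch.
A4_BYTES = bilevel_size_bytes(210 / 25.4, 297 / 25.4, 300)
```

That is the raw figure before any compression. As far as I know JPEG has
no one-bit mode at all, so a bilevel container (PBM, PNG, TIFF, djvu) is
needed to actually get down to one bit per pixel.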

Cheers,
Wol
