Andrew Lentvorski wrote:
> David Looney wrote:
>> So something about your scanned PDF (way image is stored ?) must be
>> different.
> 
> Yes, it is.  And you're going to laugh.
> 
> The stupid scanner took a 600 dpi gray scan...
> 
> And then dithered it to black and white.  ARGGGHHHH!

That doesn't sound like a very user-friendly thing to do ....

> All the conversions actually work *flawlessly*.  The problem is in the
> final rendering for display or print.

I've found a bug in convert/gs, however, with sufficiently large images
in PDF files that have a CropBox specified. Running "convert
something.jpg something.pdf" works, but then running "convert
something.pdf somethingelse.jpg" does not give back the original image.
Instead you get a background of the original size with only a small
image in the lower left corner, at the size given in the PDF's CropBox
entry.
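A quick way to see whether a given PDF is likely to hit this is to compare
its /MediaBox and /CropBox entries. The following is only a rough sketch:
the helper names (find_boxes, crop_is_smaller) are made up for
illustration, and it assumes the boxes are written literally as
"[x0 y0 x1 y1]" in an uncompressed part of the file, which will not hold
for every PDF.

```python
import re

# One PDF number: optional sign, digits, optional decimal point.
NUM = rb"([-+\d.]+)"


def find_boxes(buf, key):
    """Return every '/<key> [x0 y0 x1 y1]' rectangle in buf as float tuples.

    Assumption: the box is a literal array, not an indirect reference,
    and is not hidden inside a compressed object stream.
    """
    pattern = (rb"/" + re.escape(key) + rb"\s*\[\s*" +
               rb"\s+".join([NUM] * 4) + rb"\s*\]")
    return [tuple(float(v) for v in m) for m in re.findall(pattern, buf)]


def box_size(box):
    """Width and height of a (x0, y0, x1, y1) rectangle."""
    x0, y0, x1, y1 = box
    return (x1 - x0, y1 - y0)


def crop_is_smaller(buf):
    """True if the first CropBox is smaller than the first MediaBox."""
    media = find_boxes(buf, b"MediaBox")
    crop = find_boxes(buf, b"CropBox")
    if not media or not crop:
        return False
    mw, mh = box_size(media[0])
    cw, ch = box_size(crop[0])
    return cw < mw or ch < mh
```

Feed it the bytes of the file, e.g. crop_is_smaller(open(fn, 'rb').read()).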

There is of course the direct approach (snarfed from the net):

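#!/usr/bin/env python3
# Needs Pillow for the FlateDecode branch.
import os
import re
import sys
import zlib

from PIL import Image


def stripImages(fn):
    with open(fn, 'rb') as f:
        buf = f.read()
    fnS = os.path.splitext(fn)[0]
    # Each match is (image dictionary, raw stream data).
    s = re.findall(
        rb"(?s)/XObject\s+/Subtype\s+/Image(.*?)stream\s*\012(.*?)endstream",
        buf)
    print(len(s))
    for i in s:
        try:
            name = re.findall(rb"(?i)/name\s+/(\w+)", i[0])[0].decode()
            width = int(re.findall(rb"(?i)/Width\s+(\d+)", i[0])[0])
            height = int(re.findall(rb"(?i)/Height\s+(\d+)", i[0])[0])
            filt = re.findall(rb"(?i)/filter\s+/(\w+)", i[0])[0].decode()
            colorSpace = re.findall(
                rb"(?i)/ColorSpace\s+/(\w+)", i[0])[0].decode()
        except IndexError:
            # The dictionary lacks one of the keys we rely on.
            print("Skip:", i[0])
            continue

        print("Found:", name, width, height, filt, colorSpace)

        if filt == "FlateDecode":
            # Assumes 8-bit RGB samples; decompressobj ignores the
            # end-of-line bytes left in front of "endstream".
            im = zlib.decompressobj().decompress(i[1])
            im = Image.frombytes("RGB", (width, height), im)
            im.save("%s_%s.jpg" % (fnS, name))
        elif filt == "DCTDecode":
            # DCT streams are already JPEG data; write them out as-is.
            with open("%s_%s.jpg" % (fnS, name), 'wb') as out:
                out.write(i[1])


if __name__ == "__main__":
    stripImages(sys.argv[1])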

Which will yank that jpeg right out of that pdf.
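
One caveat: the FlateDecode branch treats every image as 8-bit RGB, which
a scan dithered to black and white will not be. A rough sketch of a sanity
check follows. The helper names (expected_size, pil_mode, check_flate) are
invented for this example, the /BitsPerComponent value would have to be
parsed from the image dictionary (the script above does not do that), and
/DecodeParms predictors are ignored, so treat it as a starting point only.

```python
import zlib

# Components per pixel for the device colour spaces. Other spaces
# (Indexed, ICCBased, ...) are not covered by this sketch.
COMPONENTS = {"DeviceGray": 1, "DeviceRGB": 3, "DeviceCMYK": 4}


def expected_size(width, height, color_space, bits_per_component=8):
    """Byte count of the raw samples; each row is padded to whole bytes."""
    bits_per_row = width * COMPONENTS[color_space] * bits_per_component
    bytes_per_row = (bits_per_row + 7) // 8
    return bytes_per_row * height


def pil_mode(color_space, bits_per_component=8):
    """Guess a PIL mode string for the simple cases, else None."""
    if color_space == "DeviceGray":
        return {1: "1", 8: "L"}.get(bits_per_component)
    if bits_per_component == 8:
        return {"DeviceRGB": "RGB", "DeviceCMYK": "CMYK"}.get(color_space)
    return None


def check_flate(data, width, height, color_space, bits_per_component=8):
    """Inflate data and verify its length before building an image."""
    raw = zlib.decompressobj().decompress(data)
    want = expected_size(width, height, color_space, bits_per_component)
    if len(raw) != want:
        raise ValueError("got %d bytes, expected %d" % (len(raw), want))
    return raw
```

If check_flate raises, the image is probably not in the layout the script
assumes, and handing the bytes to PIL as "RGB" would fail or give garbage.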

David Looney

-- 
And I sincerely believe, with you, that banking establishments are more
dangerous than standing armies.... Thomas Jefferson (letter to John
Taylor, 5/28/1816)

