I don't want to convert my pages into images.

What I want is to extract the text of my PDF, which is easy.

In the same pass, I want to extract all the images from the PDF
and run OCR on them to extract the text of each image.

So I really need to get all the images out of the PDF.
I have managed to get the raw bytes of an image.
Now I need to turn them into a valid java.awt.Image or BufferedImage.

Mike Marchywka-2 wrote:
> 
> You can always use the command-line tools in pdf toolkit or xpdf;
> I can't remember which, but there is something like
> pdf2image, similar to the pdf2text tool that extracts text.
> 
> ----------------------------------------
>> Date: Tue, 23 Feb 2010 12:43:28 -0800
>> From: fernandogomes...@hotmail.com
>> To: itext-questions@lists.sourceforge.net
>> Subject: Re: [iText-questions] Using Images extracted from a pdf
>>
>>
>> I'm going crazy with this. As you can see, I have never manipulated
>> images at such a low level, and I don't have much sense of how things
>> work. I have been searching for days to finish my solution, and I'm
>> already getting stressed. I keep testing methods: I try one, and when
>> it fails I move on to another. -.-
>>
>> Can you give me some more assistance on how I can turn this array of
>> bytes back into an image?
>>
>> Couldn't the API just have one class that does it? :P
>>
>> Pdfimages buf = new Pdfimages(myRawImageByteArray);
>> buf.getAsBufferedImage();
>>
>> :P
>>
>> If you say you cannot help me, all right, but can you point me to some
>> material I can rely on to get this done?
>>
>> Thanks.
>>
>>
>> Leonard Rosenthol-3 wrote:
>>>
>>> The image is decompressed and then "injected" into the PDF. Same with
>>> EVERY TYPE of image EXCEPT JPEG.
>>>
>>> -----Original Message-----
>>> From: Fernando Gomes [mailto:fernandogomes...@hotmail.com]
>>> Sent: Tuesday, February 23, 2010 3:21 PM
>>> To: itext-questions@lists.sourceforge.net
>>> Subject: Re: [iText-questions] Using Images extracted from a pdf
>>>
>>>
>>> Thank you.
>>>
>>> I have a question: when I insert an image that is not a JPEG,
>>> what exactly happens to it?
>>>
>>> Say it is a PNG: is it decompressed before being "injected" into the PDF?
>>>
>>> Or does it keep its PNG format, with the bytes merely encoded with
>>> FlateDecode, so that it is just a matter of finding the filter and
>>> decoding to get it back?
>>>
>>> And if the image is decompressed before being inserted into the PDF,
>>> how do I know which encoding the image uses?
>>>
>>>
>>> Leonard Rosenthol-3 wrote:
>>>>
>>>> Bits per pixel is the BitsPerComponent value in the image object
>>>>
>>>> Pixels per line (POR LINHA) is _NOT_ Width * bits. It's Width *
>>>> NumComponents, where NumComponents is based on the colorspace in
>>>> question
>>>> (eg. RGB == 3, CMYK == 4).
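
To make the arithmetic above concrete, here is a minimal sketch in plain Java. The class and method names are hypothetical (they are not part of iText), and the sketch assumes that every row of samples is padded to a whole byte, which is how PDF image data is laid out. With 8 bits per component the result is simply Width * NumComponents.

```java
// Hypothetical helper, not part of iText: computes how many bytes one image row occupies.
final class RowStride {
    private RowStride() {}

    /**
     * Bytes per row for an image with the given width, number of colour
     * components (e.g. Gray = 1, RGB = 3, CMYK = 4) and BitsPerComponent.
     * Assumption: each row starts on a byte boundary, so we round up.
     */
    static int rowBytes(int width, int numComponents, int bitsPerComponent) {
        int bitsPerRow = width * numComponents * bitsPerComponent;
        return (bitsPerRow + 7) / 8;
    }

    public static void main(String[] args) {
        System.out.println(rowBytes(16, 3, 8)); // 16 RGB pixels at 8 bits per component
        System.out.println(rowBytes(10, 1, 1)); // 10 bilevel pixels, padded to whole bytes
    }
}
```

This value is what would be passed as the scanline stride when building a byte-based raster for 8-bit images.
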
>>>>
>>>> -----Original Message-----
>>>> From: Fernando Gomes [mailto:fernandogomes...@hotmail.com]
>>>> Sent: Tuesday, February 23, 2010 2:00 PM
>>>> To: itext-questions@lists.sourceforge.net
>>>> Subject: Re: [iText-questions] Using Images extracted from a pdf
>>>>
>>>>
>>>>
>>>>
>>>>> public static BufferedImage createBufferedImageFromRawBytes(byte[] bytes,
>>>>>         int width, int height, int bits)
>>>>>         throws BadElementException, MalformedURLException, IOException {
>>>>>     com.lowagie.text.Image img = com.lowagie.text.Image.getInstance(bytes);
>>>>>
>>>>>     DataBuffer db = new DataBufferByte(img.getRawData(), img.getRawData().length);
>>>>>
>>>>>     WritableRaster raster = Raster.createPackedRaster(db, // data buffer
>>>>>             width,        // width (LARGURA)
>>>>>             height,       // height (ALTURA)
>>>>>             width * bits, // width * bits per pixel = pixels per line (POR LINHA) -> scanlineStride
>>>>>             // bits,      // bits per pixel -> pixelStride
>>>>>             new int[] {bits},
>>>>>             null);
>>>>>
>>>>>     ColorSpace cs = ColorSpace.getInstance(img.getColorspace());
>>>>>     ColorModel cm = new ComponentColorModel(cs, false, false,
>>>>>             Transparency.OPAQUE, db.getDataType());
>>>>>     BufferedImage bi = new BufferedImage(cm, raster, false, null);
>>>>>     return null;
>>>>> }
>>>>>
>>>>>
>>>>
>>>> This code is as far as I could get, but there are parameters needed
>>>> to build the BufferedImage that I don't know. Could someone please
>>>> check whether I'm on the right track, or whether I wrote something wrong?
>>>>
>>>>
>>>>
>>>> Fernando Gomes wrote:
>>>>>
>>>>> Can anyone help me one more time?
>>>>> I don't know what to do.
>>>>>
>>>>> I need to get the image bytes, this time decoded:
>>>>>
>>>>> String colorSpace = pdfStrem.get(PdfName.COLORSPACE).toString();
>>>>> String filter = pdfStrem.get(PdfName.FILTER).toString();
>>>>> int bits = Integer.valueOf(pdfStrem.get(PdfName.BITSPERCOMPONENT).toString());
>>>>> int width = Integer.valueOf(pdfStrem.get(PdfName.WIDTH).toString());
>>>>> int height = Integer.valueOf(pdfStrem.get(PdfName.HEIGHT).toString());
>>>>> PdfDictionary param = (PdfDictionary) pdfStrem.get(PdfName.DECODEPARMS);
>>>>> int colors = Integer.valueOf(param.get(PdfName.COLORS).toString());
>>>>> int predictor = Integer.valueOf(param.get(PdfName.PREDICTOR).toString());
>>>>> int colums = Integer.valueOf(param.get(PdfName.COLUMNS).toString());
>>>>> if (filter.equals("/FlateDecode")) {
>>>>>     byte[] bytesDecod = PdfReader.FlateDecode(bytes);
>>>>> }
>>>>>
>>>>> This is all the information I can pull out of the PDF.
>>>>>
>>>>> What do I have to do to build my image from it?
>>>>> I'm trying to do it, or to learn how, but it is hard; all my attempts
>>>>> have failed.
>>>>> Thank you.
>>>>>
>>>>>
>>>>> Fernando Gomes wrote:
>>>>>>
>>>>>> Sirs, I'm really sorry for the duplicate posts. Can the other topics be deleted?
>>>>>> So sorry. :blush:
>>>>>>
>>>>>> Many thanks for the help,
>>>>>> and for such a fast reply.
>>>>>> I will study more.
>>>>>>
>>>>>>
>>>>>> Leonard Rosenthol-3 wrote:
>>>>>>>
>>>>>>> You are assuming that PDF maintains the PNG nature of the image;
>>>>>>> that is NOT the case. PDF only supports two kinds of images: JPEG
>>>>>>> (which is why this works) and "raw bitmaps" (aka an array of bits).
>>>>>>> So in your case, with the PNG, it is transcoded into the latter, and
>>>>>>> if you want it back you will need to reverse the process on your end.
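
As a rough illustration of the first step of "reversing the process" for a FlateDecode stream, here is a sketch that uses only java.util.zip. The class and method names are hypothetical; the deflate helper exists only so the example can run without a PDF. It assumes the stream is ordinary zlib data and does not undo a /Predictor from DecodeParms, which would need a further pass over the inflated bytes.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical sketch: inflating the raw bytes of a FlateDecode stream with the JDK only.
final class FlateSketch {
    private FlateSketch() {}

    /** Inflates zlib-compressed bytes; a predictor, if any, is NOT reversed here. */
    static byte[] inflate(byte[] compressed) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    break; // truncated input or preset dictionary: stop instead of looping forever
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("not a valid zlib stream", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    /** Demo helper only: compresses bytes the same way, to exercise inflate(). */
    static byte[] deflate(byte[] raw) {
        Deflater deflater = new Deflater();
        deflater.setInput(raw);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!deflater.finished()) {
            out.write(buf, 0, deflater.deflate(buf));
        }
        deflater.end();
        return out.toByteArray();
    }
}
```

The inflated bytes are still just samples; the width, height, bits per component and color space from the image dictionary are needed to interpret them.
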
>>>>>>>
>>>>>>
>>>>>>
>>>>>> My reply to the other response is in this same email. :blush:
>>>>>> The quote from "1T3XT info" is below.
>>>>>>
>>>>>> Many thanks. I must have missed the relevance of the chapter you
>>>>>> mentioned; I will read it again, very carefully. My English is very
>>>>>> weak, so it is very difficult to read.
>>>>>>
>>>>>> You are very funny, I laughed a lot. I know I deserved the scolding.
>>>>>> Thanks again for your help. I will test and then come back to post
>>>>>> the result.
>>>>>> Thank you!
>>>>>>
>>>>>>
>>>>>> 1T3XT info wrote:
>>>>>>>
>>>>>>> Fernando Henrique Gomes wrote:
>>>>>>>> The problem is when I insert an image in PNG format and then try to
>>>>>>>> get the same image back...
>>>>>>>
>>>>>>> OK, we're talking about a PNG.
>>>>>>> If you've read chapter 10 of the 2nd edition of "iText in Action",
>>>>>>> you know that PNGs are transformed into zipped pixels.
>>>>>>> If you didn't know, you should read the book!
>>>>>>>
>>>>>>>> Here is where I try to get that image:
>>>>>>>>
>>>>>>>> [code]
>>>>>>>> int XrefIndex = ((PRIndirectReference) obj).getNumber();
>>>>>>>> PdfObject pdfObj = pdf.getPdfObject(XrefIndex);
>>>>>>>> PdfStream pdfStrem = (PdfStream) pdfObj;
>>>>>>>> byte[] bytes = PdfReader.getStreamBytesRaw((PRStream) pdfStrem);
>>>>>>>> if (bytes != null) {
>>>>>>>>     String fileName = "Image_P" + pageNumber + "_";
>>>>>>>>     File file = new File(fileName);
>>>>>>>>     FileOutputStream fw = new FileOutputStream(file);
>>>>>>>>     fw.write(bytes);
>>>>>>>>     fw.flush();
>>>>>>>>     fw.close();
>>>>>>>>     BufferedImage img2 = ImageIO.read(file);
>>>>>>>>     com.lowagie.text.Image img =
>>>>>>>>             com.lowagie.text.Image.getInstance(file.toURL());
>>>>>>>> }
>>>>>>>> [/code]
>>>>>>>>
>>>>>>>> img2 comes back null!
>>>>>>>
>>>>>>> Of course, why do you think that would work???
>>>>>>>
>>>>>>>> And the img line throws an exception:
>>>>>>>> "Image_P1_ is not a recognized imageformat"
>>>>>>>
>>>>>>> Of course, you're sending iText a bunch of pixels,
>>>>>>> but: what are the dimensions of the image,
>>>>>>> how many bits are there per component?
>>>>>>>
>>>>>>>> When I try this:
>>>>>>>> [code]
>>>>>>>> Image image = Toolkit.getDefaultToolkit().createImage(bytes);
>>>>>>>> [/code]
>>>>>>>>
>>>>>>>> and then create a new image from it, using the width and height
>>>>>>>> from my PdfStream (I create a BufferedImage and draw the image onto
>>>>>>>> it), and then write it to a file and view it, the image is a fucking
>>>>>>>> black picture .. all black -.-
>>>>>>>
>>>>>>> It's because you don't have a fucking clue about what you're doing
>>>>>>> :P
>>>>>>> Hehe, I was waiting for an occasion to use the F* word on the list.
>>>>>>> Thanks!
>>>>>>>
>>>>>>>> If I use JPEG encoding for my images, all three of my approaches
>>>>>>>> work fine.
>>>>>>>
>>>>>>> Well, that's because iText stores JPEGs literally as a JPEG without
>>>>>>> changing any of the bytes. If you look inside, you'll see that the
>>>>>>> filter is DCTDecode (Discrete Cosine Transform).
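
Since a DCTDecode stream is a complete JPEG file, its raw bytes can be handed straight to ImageIO. The sketch below is only an illustration with hypothetical class and method names; sampleJpeg exists purely so the example runs without a PDF. It also shows the symptom reported earlier in the thread: ImageIO.read returns null when no registered reader recognises the bytes, which is what happens with bare pixel data.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.imageio.ImageIO;

// Hypothetical sketch: decoding bytes that already form a complete JPEG file.
final class JpegStreamSketch {
    private JpegStreamSketch() {}

    /** Returns the decoded image, or null if no ImageIO reader recognises the bytes. */
    static BufferedImage decodeJpeg(byte[] jpegBytes) {
        ImageIO.setUseCache(false); // keep everything in memory, no temp files
        try {
            return ImageIO.read(new ByteArrayInputStream(jpegBytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Demo helper only: encodes a blank RGB image as JPEG to stand in for a PDF stream. */
    static byte[] sampleJpeg(int width, int height) {
        ImageIO.setUseCache(false);
        BufferedImage img = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            ImageIO.write(img, "jpg", out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```
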
>>>>>>>
>>>>>>>> I can view my images exactly as I created them, perfect.
>>>>>>>> But if I change from JPEG to any other encoding, it no longer
>>>>>>>> works.
>>>>>>>
>>>>>>> No idea what you're saying here, but you also need to study images.
>>>>>>>
>>>>>>>> can any help-me plz ?
>>>>>>>
>>>>>>> This example doesn't involve iText, but explains what you're
>>>>>>> missing.
>>>>>>>
>>>>>>> Let's create an image byte per byte:
>>>>>>>
>>>>>>> byte b[] = new byte[256 * 3];
>>>>>>> for (int i = 0; i < 256; i++) {
>>>>>>> b[i * 3] = (byte) (255 - i);
>>>>>>> b[i * 3 + 1] = (byte) (255 - i);
>>>>>>> b[i * 3 + 2] = (byte) i;
>>>>>>> }
>>>>>>>
>>>>>>> This is how a PNG, GIF, and some other image types are stored
>>>>>>> in a PDF, but in zipped format (FlateDecode). These bytes don't
>>>>>>> make any sense if you don't know the bpc, color space and
>>>>>>> dimensions.
>>>>>>>
>>>>>>> If you want to create an image from these bytes, you could do this:
>>>>>>>
>>>>>>> DataBuffer db = new DataBufferByte(b, b.length);
>>>>>>> WritableRaster raster = Raster.createInterleavedRaster(
>>>>>>> db, 16, 16, 48, 3, new int[]{0,1,2}, null);
>>>>>>> ColorSpace cs = ColorSpace.getInstance(ColorSpace.CS_sRGB);
>>>>>>> ColorModel cm = new ComponentColorModel(
>>>>>>> cs, false, false, Transparency.OPAQUE, DataBuffer.TYPE_BYTE);
>>>>>>> BufferedImage bi = new BufferedImage(cm, raster, false, null);
>>>>>>> ImageIO.write(bi, "bmp", new File("hello.bmp"));
>>>>>>>
>>>>>>> In this example, I treat the image as 16 x 16 pixels, using RGB,
>>>>>>> and converting it to a Bitmap. It's up to you to adapt the example
>>>>>>> if your image is a GrayScale or CMYK image, or if you want another
>>>>>>> format.
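
As one possible way to adapt the example above, here is a sketch that takes the dimensions and the number of components as parameters and handles 8-bit grayscale as well as 8-bit RGB. The class and method names are hypothetical, and it assumes already-decompressed, interleaved samples with 8 bits per component and no row padding; CMYK and other bit depths are left out.

```java
import java.awt.Transparency;
import java.awt.color.ColorSpace;
import java.awt.image.BufferedImage;
import java.awt.image.ColorModel;
import java.awt.image.ComponentColorModel;
import java.awt.image.DataBuffer;
import java.awt.image.DataBufferByte;
import java.awt.image.Raster;
import java.awt.image.WritableRaster;

// Hypothetical sketch: wraps raw 8-bit samples in a BufferedImage.
final class RawPixelsSketch {
    private RawPixelsSketch() {}

    /** numComponents must be 1 (grayscale) or 3 (RGB); samples are interleaved, 8 bits each. */
    static BufferedImage fromRaw(byte[] samples, int width, int height, int numComponents) {
        if (numComponents != 1 && numComponents != 3) {
            throw new IllegalArgumentException("only 1 (gray) or 3 (RGB) components handled");
        }
        if (samples.length < width * height * numComponents) {
            throw new IllegalArgumentException("not enough sample data for the given size");
        }
        // One band per component, stored one after another within each pixel.
        int[] bandOffsets = new int[numComponents];
        for (int i = 0; i < numComponents; i++) {
            bandOffsets[i] = i;
        }
        DataBuffer db = new DataBufferByte(samples, samples.length);
        WritableRaster raster = Raster.createInterleavedRaster(
                db, width, height,
                width * numComponents, // scanline stride in bytes
                numComponents,         // pixel stride in bytes
                bandOffsets, null);
        ColorSpace cs = ColorSpace.getInstance(
                numComponents == 1 ? ColorSpace.CS_GRAY : ColorSpace.CS_sRGB);
        ColorModel cm = new ComponentColorModel(
                cs, false, false, Transparency.OPAQUE, DataBuffer.TYPE_BYTE);
        return new BufferedImage(cm, raster, false, null);
    }
}
```

The resulting image can then be written out with ImageIO.write, as in the example above, or passed to an OCR step.
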
>>>>>>>
>>>>>>> (And please don't post the same question multiple times!!!)
>>>>>>> --
>>>>>>> This answer is provided by 1T3XT BVBA
>>>>>>> http://www.1t3xt.com/ - http://www.1t3xt.info
>>>>>>>
>>>>>>> ------------------------------------------------------------------------------
>>>>>>> Download Intel® Parallel Studio Eval
>>>>>>> Try the new software tools for yourself. Speed compiling, find bugs
>>>>>>> proactively, and fine-tune applications for parallel performance.
>>>>>>> See why Intel Parallel Studio got high marks during beta.
>>>>>>> http://p.sf.net/sfu/intel-sw-dev
>>>>>>> _______________________________________________
>>>>>>> iText-questions mailing list
>>>>>>> iText-questions@lists.sourceforge.net
>>>>>>> https://lists.sourceforge.net/lists/listinfo/itext-questions
>>>>>>>
>>>>>>> Buy the iText book: http://www.1t3xt.com/docs/book.php
>>>>>>> Check the site with examples before you ask questions:
>>>>>>> http://www.1t3xt.info/examples/
>>>>>>> You can also search the keywords list:
>>>>>>> http://1t3xt.info/tutorials/keywords/
>>>>
>>>> --
>>>> View this message in context:
>>>> http://old.nabble.com/Using-Images-extracted-from-a-pdf-tp27693711p27708516.html
>>>> Sent from the iText - General mailing list archive at Nabble.com.
>>>
>>> --
>>> View this message in context:
>>> http://old.nabble.com/Using-Images-extracted-from-a-pdf-tp27693711p27709815.html
>>> Sent from the iText - General mailing list archive at Nabble.com.
>>
>> --
>> View this message in context:
>> http://old.nabble.com/Using-Images-extracted-from-a-pdf-tp27693711p27710159.html
>> Sent from the iText - General mailing list archive at Nabble.com.

-- 
View this message in context: 
http://old.nabble.com/Using-Images-extracted-from-a-pdf-tp27693711p27710517.html
Sent from the iText - General mailing list archive at Nabble.com.


