The text pasted from the PDF to the clipboard is correct. This will probably
require more investigation, but it is not related to iText.

Paulo
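
A side observation on the sample quoted further down in this thread: the pasted characters appear to follow a simple pattern, where each character in the range U+00A2..U+00FF stands for the ASCII character with code 0x120 minus its own code. This is inferred from the pasted sample alone. It is not stated anywhere in the thread and is not an iText feature, and the class and method names below are hypothetical. A minimal sketch under that assumption:

```java
public class MirroredTextDecoder {

    // Assumption: a garbled character g in U+00A2..U+00FF represents the
    // ASCII character (0x120 - g). Everything else (spaces, line breaks)
    // is passed through unchanged.
    static String decode(String garbled) {
        StringBuilder out = new StringBuilder(garbled.length());
        for (int i = 0; i < garbled.length(); i++) {
            char g = garbled.charAt(i);
            if (g >= 0xA2 && g <= 0xFF) {
                out.append((char) (0x120 - g));
            } else {
                out.append(g);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Two short lines from the pasted sample, written as Unicode
        // escapes so the source file encoding does not matter.
        String line1 = "\u00AB\u00AC\u00B7\u00B4\u00B7\u00AC\u00A7";
        String line2 = "\u00D4\u00D1\u00DD\u00DF\u00F2";
        System.out.println(decode(line1)); // prints "utility"
        System.out.println(decode(line2)); // prints "LOCA."
    }
}
```

If the pattern holds for a given file, running the pasted text through `decode` should give readable English. That would suggest the hidden text layer is present and the garbling comes from how the font's character codes are mapped when copying, rather than from missing content.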

----- Original Message ----- 
From: "Jake C" <[EMAIL PROTECTED]>
To: <[email protected]>
Sent: Tuesday, March 20, 2007 6:33 PM
Subject: Re: [iText-questions] Can't view text segments in certain PDF files


> No, there is actually text there now, but not a single character is
> alphanumeric.
> I'm pasting in the text that I copied/pasted into notepad:
>
> ¿½½·¼»²¬ó·²·¬·¿¬·²¹
> «¬·´·¬§
> ª»²¼±®ó­«°°´·»¼ ¬¸»®³¿´ó¸§¼®¿«´·½­ ½¿´½«´¿ó
> ׬
> ·²
> °´¿²¬
> ±³·¬¬»¼ô
> ¼»ª»´±°»¼
> ¼·®»½¬´§
> °®»½»¼·²¹
> «­«¿´´§
> ¾§
> ª»®§
> °´¿²¬ ¼»­·¹²
> ¿½½·¼»²¬ó·²·¬·¿¬·²¹
> »²¹·²»»®·²¹
> ·­
> ª¿´«¿¾´»
> ®·­µó¿­­»­­³»²¬ °®±½»­­ô
> ¿
> ©±«´¼
> ¿ º±®³¿´´§ ¼±½«³»²¬»¼
> ß²¿´§­·­
> Ú«²½¬·±² Ûª»²¬
> °´¿²¬
> ³·¬·¹¿¬·²¹
> ·²·¬·¿¬·²¹
> ¬®¿²­´¿¬»¼
> °»®º±®³·²¹
> °®±ª·¼»­
> °®»°¿®·²¹
> ³±®»
> ¼»¬¿·´»¼
> Ú«²½¬·±²
> ¿ ¼·­¬·²½¬´§
> °´¿²¬
> »ª»²¬
> °®±ª·¼»­ ¿
> ¾¿­»´·²»
> °»®³·¬­ ¿
> ¿°°®±¿½¸
> ¾»¬©»»²
> ³·¬·¹¿¬·²¹
> °´¿²¬
> ¿
> »ª»²¬«¿´´§ ¼»½±³°±­»¼
> ±®
> «²¿ª¿·´¿¾·´·¬§
> ¯«¿²¬·¬¿¬·ª»´§ ³»¿­«®»¼ò
> ½±²­¬®«½¬·²¹
> °®»ª»²¬
> ½±²­»¯«»²½»­ô
> ®»´¿¬·±²­¸·°­
> ¾»¬©»»²
> ·²ª»²¬±®§
> ³¿·²ó
> ¬¿·²»¼ô
> ¿½½±³°´·­¸»¼ò
> ·²
> »´·³·²¿¬·²¹
> ¸»¿¬ó®»³±ª¿´
> ­«½½»­­º«´´§ ³¿·²¬¿·²»¼ò
> ¿
> ¿
> ÔÑÝßò
> ½±²­·¼»®»¼
> ïò
> îò ݱ²¬¿·²³»²¬ ±ª»®°®»­­«®»
> ¾´±©¼±©² ¾§
>
>
>>From: "Paulo Soares" <[EMAIL PROTECTED]>
>>Reply-To: Post all your questions about iText here
>><[email protected]>
>>To: "Post all your questions about iText here"
>><[email protected]>
>>Subject: Re: [iText-questions] Can't view text segments in certain PDF
>>files
>>Date: Tue, 20 Mar 2007 18:02:04 -0000
>>
>>See if it works now.
>>
>>Paulo
>>
>>----- Original Message -----
>>From: "Jake C" <[EMAIL PROTECTED]>
>>To: <[email protected]>
>>Sent: Tuesday, March 20, 2007 4:44 PM
>>Subject: [iText-questions] Can't view text segments in certain PDF files
>>
>>
>>>We use an OCR product to generate a PDF from a TIF with the original
>>>image plus hidden text, so that you can search/select the text, but only
>>>see the originally scanned image. We then use Adobe FlashPaper 2 to turn
>>>it into a SWF that can be embedded in a web page. However, the hidden
>>>text is being stripped out of the final SWF, so that it is no longer
>>>searchable. Adobe considers this a "limitation" (we consider it a "bug").
>>>Most other OCR software has the same problem as the platform we chose,
>>>but there is one that seems to convert to SWF just fine. In an attempt to
>>>find out what the difference was between the two files, I tried to use
>>>the Tree Viewer from iText to examine the contents of the files. However,
>>>when I select the Content node of the one that gets the text stripped
>>>out, I don't see anything. If I use the API to try to extract the Stream
>>>directly, I get a NullPointerException.
>>>
>>>So I guess I really have two questions.
>>>
>>>1) Is there something wrong with how the PDF is constructed, such that we
>>>cannot examine the text content with iText, or is there a bug in iText?
>>>
>>>2) Is there a way we can manipulate the PDF from the OCR software we
>>>chose to make it structurally look like the one that actually keeps the
>>>text when converted to SWF?
>>>
>>>I'm attaching a copy of the two files in a zip file:
>>>0112_094_no_text_select.pdf from our selected OCR product, whose text
>>>content we cannot view, and 0112_094_text_select.pdf from the other
>>>product, whose text content we CAN view and which actually keeps the text
>>>in the SWF.
>>>
>>>OK, it seems I can't attach a file, or the message gets refused. I've
>>>uploaded it to http://www.sharebigfile.com/file/116699/0112-094-zip.html
>>>
>>>_______________________________________________
>>>iText-questions mailing list
>>>[email protected]
>>>https://lists.sourceforge.net/lists/listinfo/itext-questions
>>>Buy the iText book: http://itext.ugent.be/itext-in-action/
>>>
>
>
>><< 0112_094_no_text_select_mod.pdf >>
