The text pasted from the PDF to the clipboard is correct. It will probably require more investigation, but the problem is not related to iText.
Paulo

----- Original Message -----
From: "Jake C" <[EMAIL PROTECTED]>
To: <[email protected]>
Sent: Tuesday, March 20, 2007 6:33 PM
Subject: Re: [iText-questions] Can't view text segments in certain PDF files

> No, there is actually text there now, but not a single one is
> alphanumeric.
> I'm pasting in the text that I copied/pasted into notepad:
>
> ¿½½·¼»²¬ó·²·¬·¿¬·²¹
> «¬·´·¬§
> ª»²¼±®ó«°°´·»¼ ¬¸»®³¿´ó¸§¼®¿«´·½ ½¿´½«´¿ó
> ׬
> ·²
> °´¿²¬
> ±³·¬¬»¼ô
> ¼»ª»´±°»¼
> ¼·®»½¬´§
> °®»½»¼·²¹
> ««¿´´§
> ¾§
> ª»®§
> °´¿²¬ ¼»·¹²
> ¿½½·¼»²¬ó·²·¬·¿¬·²¹
> »²¹·²»»®·²¹
> ·
> ª¿´«¿¾´»
> ®·µó¿»³»²¬ °®±½»ô
> ¿
> ©±«´¼
> ¿ º±®³¿´´§ ¼±½«³»²¬»¼
> ß²¿´§·
> Ú«²½¬·±² Ûª»²¬
> °´¿²¬
> ³·¬·¹¿¬·²¹
> ·²·¬·¿¬·²¹
> ¬®¿²´¿¬»¼
> °»®º±®³·²¹
> °®±ª·¼»
> °®»°¿®·²¹
> ³±®»
> ¼»¬¿·´»¼
> Ú«²½¬·±²
> ¿ ¼·¬·²½¬´§
> °´¿²¬
> »ª»²¬
> °®±ª·¼» ¿
> ¾¿»´·²»
> °»®³·¬ ¿
> ¿°°®±¿½¸
> ¾»¬©»»²
> ³·¬·¹¿¬·²¹
> °´¿²¬
> ¿
> »ª»²¬«¿´´§ ¼»½±³°±»¼
> ±®
> «²¿ª¿·´¿¾·´·¬§
> ¯«¿²¬·¬¿¬·ª»´§ ³»¿«®»¼ò
> ½±²¬®«½¬·²¹
> °®»ª»²¬
> ½±²»¯«»²½»ô
> ®»´¿¬·±²¸·°
> ¾»¬©»»²
> ·²ª»²¬±®§
> ³¿·²ó
> ¬¿·²»¼ô
> ¿½½±³°´·¸»¼ò
> ·²
> »´·³·²¿¬·²¹
> ¸»¿¬ó®»³±ª¿´
> «½½»º«´´§ ³¿·²¬¿·²»¼ò
> ¿
> ¿
> ÔÑÝßò
> ½±²·¼»®»¼
> ïò
> îò ݱ²¬¿·²³»²¬ ±ª»®°®»«®»
> ¾´±©¼±©² ¾§
>
>
>>From: "Paulo Soares" <[EMAIL PROTECTED]>
>>Reply-To: Post all your questions about iText here
>><[email protected]>
>>To: "Post all your questions about iText here"
>><[email protected]>
>>Subject: Re: [iText-questions] Can't view text segments in certain PDF
>>files
>>Date: Tue, 20 Mar 2007 18:02:04 -0000
>>
>>See if it works now.
>>
>>Paulo
>>
>>----- Original Message ----- From: "Jake C" <[EMAIL PROTECTED]>
>>To: <[email protected]>
>>Sent: Tuesday, March 20, 2007 4:44 PM
>>Subject: [iText-questions] Can't view text segments in certain PDF files
>>
>>
>>>We use an OCR product to generate a PDF from a TIF with the original image
>>>plus hidden text, so that you can search/select the text, but only see the
>>>originally scanned image.
>>>We then use Adobe FlashPaper 2 to turn it into a
>>>SWF that can be imbedded in a web page. However, the hidden text is being
>>>stripped out of the final SWF, so that it is no longer searchable. Adobe
>>>considers this a "limitation" (we consider it a "bug"). Most other OCR
>>>software has the same problem as the platform we chose, but there is one
>>>that seems to convert to SWF just fine. In an attempt to find out what the
>>>difference was between the two files, I tried to use the Tree Viewer from
>>>iText to examine the contents of the files. However, when I select the
>>>Content node of the one that gets the text stripped out, I don't see
>>>anything. If I use the API to try to extract the Stream directly, I get a
>>>NullPointerException.
>>>
>>>So I guess I really have two questions.
>>>
>>>1) Is there something wrong with how the PDF is constructed that we cannot
>>>examine the text content with iText, or is there a bug in iText?
>>>
>>>2) Is there a way we can manipulate the PDF from the OCR software we chose
>>>to make it structurally look like the one that actually keeps the text when
>>>converted to SWF?
>>>
>>>I'm attaching a copy of the two files (0112_094_no_text_select.pdf from our
>>>selected OCR product, which we cannot view the text content, and
>>>0112_094_text_select.pdf from the other product, which we CAN view the text
>>>content, and actually keeps the text in the SWF) in a zip file.
>>>
>>>OK, it seems I can't attach a file, or the message gets refused. I've
>>>uploaded it to http://www.sharebigfile.com/file/116699/0112-094-zip.html
>><< 0112_094_no_text_select_mod.pdf >>

_______________________________________________
iText-questions mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/itext-questions
Buy the iText book: http://itext.ugent.be/itext-in-action/
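
A note on the pasted sample quoted above: the characters look like a simple substitution rather than random noise. Checking a few words by hand, each character's code point appears to equal 288 minus the intended ASCII code; for example, the first pasted word decodes to "accident-initiating". This is only an inference from that one sample, not something confirmed in the thread or by iText. The `decode` helper and the `OFFSET` constant below are hypothetical names chosen for illustration. A minimal Python sketch of the reverse mapping, under that assumption:

```python
# Hypothetical helper: undo the substitution observed in the pasted sample,
# where each visible character seems to have code point 288 - (ASCII code).
OFFSET = 288  # assumption inferred from the sample, not read from the PDF


def decode(garbled: str) -> str:
    """Map characters in U+00A2..U+00FF back to ASCII 0x21..0x7E."""
    out = []
    for ch in garbled:
        code = ord(ch)
        if 0xA2 <= code <= 0xFF:
            out.append(chr(OFFSET - code))
        else:
            # Spaces, newlines and anything outside the range pass through.
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    # Second word of the pasted sample, written as escapes.
    sample = "\u00ab\u00ac\u00b7\u00b4\u00b7\u00ac\u00a7"
    print(decode(sample))  # prints "utility"
```

Under the same assumption, the letter "s" (code 115) would map to U+00AD, a soft hyphen that many editors do not display. That would be consistent with decoded words such as "uually" and "deign" missing their "s". If so, the text layer probably relies on a custom font encoding, which fits Paulo's remark that the problem is not related to iText.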
