You are asking a FlashPaper question. First you have to find out what PDF structure FlashPaper requires in order to keep the text; only then can a tool be used to create or recreate that structure. FlashPaper is the limitation here. This isn't about whether iText can read the text blocks, which it may or may not do; you have a long way to go before getting there.
Paulo

----- Original Message -----
From: "Jake C" <[EMAIL PROTECTED]>
To: <[email protected]>
Sent: Tuesday, March 20, 2007 11:53 PM
Subject: Re: [iText-questions] Can't view text segments in certain PDF files

> As to that last question, I wasn't asking a FlashPaper question. I wanted to
> modify one PDF structurally to look like another PDF. However, since iText
> isn't capable of reading all text blocks, I guess the question is moot.
>
>
>>From: "Paulo Soares" <[EMAIL PROTECTED]>
>>Reply-To: Post all your questions about iText here
>><[email protected]>
>>To: "Post all your questions about iText here"
>><[email protected]>
>>Subject: Re: [iText-questions] Can't view text segments in certain PDF files
>>Date: Tue, 20 Mar 2007 22:58:19 -0000
>>
>>
>>----- Original Message -----
>>From: "Jake C" <[EMAIL PROTECTED]>
>>To: <[email protected]>
>>Sent: Tuesday, March 20, 2007 7:41 PM
>>Subject: Re: [iText-questions] Can't view text segments in certain PDF files
>>
>>
>> > What did you do to the original document to create your modified version?
>>
>>Removed the invisible text rendering.
>>
>> > Why can't the TreeViewPDF tool view the Content of either my original
>> > version or your modified version? Is it possible to make the structure of
>>
>>It probably has limitations. You should look at
>>http://www.windjack.com/products/pdfcanopener.html.
>>
>> > one that doesn't convert to FlashPaper to look like the one that DOES
>> > convert to FlashPaper using iText?
>> >
>>
>>That's something that can't be done without the Flash environment and it
>>goes somewhat above the scope of this mailing list.
>>
>>Paulo
>>
>> >>From: "Paulo Soares" <[EMAIL PROTECTED]>
>> >>Reply-To: Post all your questions about iText here
>> >><[email protected]>
>> >>To: "Post all your questions about iText here"
>> >><[email protected]>
>> >>Subject: Re: [iText-questions] Can't view text segments in certain PDF files
>> >>Date: Tue, 20 Mar 2007 18:50:46 -0000
>> >>
>> >>The text pasted from the PDF to the clipboard is correct. It will probably
>> >>require more investigation but not related to iText.
>> >>
>> >>Paulo
>> >>
>> >>----- Original Message -----
>> >>From: "Jake C" <[EMAIL PROTECTED]>
>> >>To: <[email protected]>
>> >>Sent: Tuesday, March 20, 2007 6:33 PM
>> >>Subject: Re: [iText-questions] Can't view text segments in certain PDF files
>> >>
>> >>
>> >> > No, there is actually text there now, but not a single one is alphanumeric.
>> >> > I'm pasting in the text that I copied/pasted into notepad:
>> >> >
>> >> > ¿½½·¼»²¬ó·²·¬·¿¬·²¹
>> >> > «¬·´·¬§
>> >> > ª»²¼±®ó«°°´·»¼ ¬¸»®³¿´ó¸§¼®¿«´·½ ½¿´½«´¿ó
>> >> > ׬
>> >> > ·²
>> >> > °´¿²¬
>> >> > ±³·¬¬»¼ô
>> >> > ¼»ª»´±°»¼
>> >> > ¼·®»½¬´§
>> >> > °®»½»¼·²¹
>> >> > ««¿´´§
>> >> > ¾§
>> >> > ª»®§
>> >> > °´¿²¬ ¼»·¹²
>> >> > ¿½½·¼»²¬ó·²·¬·¿¬·²¹
>> >> > »²¹·²»»®·²¹
>> >> > ·
>> >> > ª¿´«¿¾´»
>> >> > ®·µó¿»³»²¬ °®±½»ô
>> >> > ¿
>> >> > ©±«´¼
>> >> > ¿ º±®³¿´´§ ¼±½«³»²¬»¼
>> >> > ß²¿´§·
>> >> > Ú«²½¬·±² Ûª»²¬
>> >> > °´¿²¬
>> >> > ³·¬·¹¿¬·²¹
>> >> > ·²·¬·¿¬·²¹
>> >> > ¬®¿²´¿¬»¼
>> >> > °»®º±®³·²¹
>> >> > °®±ª·¼»
>> >> > °®»°¿®·²¹
>> >> > ³±®»
>> >> > ¼»¬¿·´»¼
>> >> > Ú«²½¬·±²
>> >> > ¿ ¼·¬·²½¬´§
>> >> > °´¿²¬
>> >> > »ª»²¬
>> >> > °®±ª·¼» ¿
>> >> > ¾¿»´·²»
>> >> > °»®³·¬ ¿
>> >> > ¿°°®±¿½¸
>> >> > ¾»¬©»»²
>> >> > ³·¬·¹¿¬·²¹
>> >> > °´¿²¬
>> >> > ¿
>> >> > »ª»²¬«¿´´§ ¼»½±³°±»¼
>> >> > ±®
>> >> > «²¿ª¿·´¿¾·´·¬§
>> >> > ¯«¿²¬·¬¿¬·ª»´§ ³»¿«®»¼ò
>> >> > ½±²¬®«½¬·²¹
>> >> > °®»ª»²¬
>> >> > ½±²»¯«»²½»ô
>> >> > ®»´¿¬·±²¸·°
>> >> > ¾»¬©»»²
>> >> > ·²ª»²¬±®§
>> >> > ³¿·²ó
>> >> > ¬¿·²»¼ô
>> >> > ¿½½±³°´·¸»¼ò
>> >> > ·²
>> >> > »´·³·²¿¬·²¹
>> >> > ¸»¿¬ó®»³±ª¿´
>> >> > «½½»º«´´§ ³¿·²¬¿·²»¼ò
>> >> > ¿
>> >> > ¿
>> >> > ÔÑÝßò
>> >> > ½±²·¼»®»¼
>> >> > ïò
>> >> > îò ݱ²¬¿·²³»²¬ ±ª»®°®»«®»
>> >> > ¾´±©¼±©² ¾§
>> >> >
>> >> >
>> >> >>From: "Paulo Soares" <[EMAIL PROTECTED]>
>> >> >>Reply-To: Post all your questions about iText here
>> >> >><[email protected]>
>> >> >>To: "Post all your questions about iText here"
>> >> >><[email protected]>
>> >> >>Subject: Re: [iText-questions] Can't view text segments in certain PDF files
>> >> >>Date: Tue, 20 Mar 2007 18:02:04 -0000
>> >> >>
>> >> >>See if it works now.
>> >> >>
>> >> >>Paulo
>> >> >>
>> >> >>----- Original Message -----
>> >> >>From: "Jake C" <[EMAIL PROTECTED]>
>> >> >>To: <[email protected]>
>> >> >>Sent: Tuesday, March 20, 2007 4:44 PM
>> >> >>Subject: [iText-questions] Can't view text segments in certain PDF files
>> >> >>
>> >> >>
>> >> >>>We use an OCR product to generate a PDF from a TIF with the original image
>> >> >>>plus hidden text, so that you can search/select the text, but only see the
>> >> >>>originally scanned image. We then use Adobe FlashPaper 2 to turn it into a
>> >> >>>SWF that can be imbedded in a web page. However, the hidden text is being
>> >> >>>stripped out of the final SWF, so that it is no longer searchable. Adobe
>> >> >>>considers this a "limitation" (we consider it a "bug"). Most other OCR
>> >> >>>software has the same problem as the platform we chose, but there is one
>> >> >>>that seems to convert to SWF just fine. In an attempt to find out what the
>> >> >>>difference was between the two files, I tried to use the Tree Viewer from
>> >> >>>iText to examine the contents of the files. However, when I select the
>> >> >>>Content node of the one that gets the text stripped out, I don't see
>> >> >>>anything.
>> >> >>>If I use the API to try to extract the Stream directly, I get a
>> >> >>>NullPointerException.
>> >> >>>
>> >> >>>So I guess I really have two questions.
>> >> >>>
>> >> >>>1) Is there something wrong with how the PDF is constructed that we cannot
>> >> >>>examine the text content with iText, or is there a bug in iText?
>> >> >>>
>> >> >>>2) Is there a way we can manipulate the PDF from the OCR software we chose
>> >> >>>to make it structurally look like the one that actually keeps the text when
>> >> >>>converted to SWF?
>> >> >>>
>> >> >>>I'm attaching a copy of the two files (0112_094_no_text_select.pdf from our
>> >> >>>selected OCR product, which we cannot view the text content, and
>> >> >>>0112_094_text_select.pdf from the other product, which we CAN view the text
>> >> >>>content, and actually keeps the text in the SWF) in a zip file.
>> >> >>>
>> >> >>>OK, it seems I can't attach a file, or the message gets refused. I've
>> >> >>>uploaded it to http://www.sharebigfile.com/file/116699/0112-094-zip.html
>> >> >>
>> >> >><< 0112_094_no_text_select_mod.pdf >>

_______________________________________________
iText-questions mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/itext-questions
Buy the iText book: http://itext.ugent.be/itext-in-action/
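
A note on the pasted text quoted in the thread above: it does not look random. In the words that can be checked, each pasted character's code point plus the code point of the expected letter adds up to 0x120 (for example `¿` U+00BF for `a` U+0061, `ó` U+00F3 for `-` U+002D). That is consistent with a font whose glyph codes have no usable Unicode mapping, though nothing in the thread confirms the cause. Below is a minimal Python sketch of that substitution. The constant `MIRROR`, the function name `decode`, and the restriction to U+00A1..U+00FF are assumptions inferred from a handful of words, not anything iText or FlashPaper provides. Under the same assumption the letter `s` would correspond to U+00AD (soft hyphen), which is not visible in the pasted text, so words such as "design" come out with the `s` missing.

```python
# Assumed constant: pasted code point + original code point == 0x120.
# Inferred from sample words in the thread; not confirmed by the posters.
MIRROR = 0x120


def decode(garbled: str) -> str:
    """Undo the assumed mirror substitution for characters in U+00A1..U+00FF.

    Characters outside that range (spaces, plain ASCII) are left unchanged.
    """
    out = []
    for ch in garbled:
        cp = ord(ch)
        if 0xA1 <= cp <= 0xFF:
            out.append(chr(MIRROR - cp))
        else:
            out.append(ch)
    return "".join(out)


# First pasted line from the thread, written with escapes to avoid
# any encoding ambiguity in this source file.
sample = (
    "\u00bf\u00bd\u00bd\u00b7\u00bc\u00bb\u00b2\u00ac\u00f3"
    "\u00b7\u00b2\u00b7\u00ac\u00b7\u00bf\u00ac\u00b7\u00b2\u00b9"
)
print(decode(sample))  # prints "accident-initiating"
```

If the mapping holds, this only makes the pasted sample readable for inspection; it does not change the PDF or what FlashPaper does with it.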
