Re: Newbie question
On 4 Sept. 2009, at 03:34, Dola Woolfe wrote:

> I'm trying to put together several elements to build a PDF translator.
> 1. Load a PDF in a foreign language (???)
> 2. Translate the content (Google Translate)
> 3. Output the translated PDF (FOP)
> So I'm guessing step 1 is not part of FOP. Can you perhaps recommend what I can use for 1.? Thanks again!

I think you should try iText. You will find an explanation of what you need near the end of iText in Action, the authoritative book by Bruno Lowagie, the guy who designed iText in the first place.

And before proceeding with your project you *should* read the caveats in his book: extracting text content from an existing PDF may not be as straightforward as you think, and in certain situations it may be close to meaningless. A PDF API will give you the text content in the order in which it was technically generated, which may not be the reading order (the order in which you read the elements in a book).

My own experience, on top of this, is that it is very difficult to extract text content from non-European or large fonts (the CID-keyed fonts; roughly speaking, those that have more than the WinAnsi or ISO-8859-1 characters).

HTH,
Jean-François

- To unsubscribe, e-mail: fop-users-unsubscr...@xmlgraphics.apache.org
For additional commands, e-mail: fop-users-h...@xmlgraphics.apache.org
Re: Newbie question
John Burgess wrote:

> It doesn't!

That isn't 100% accurate. FOP can parse a PDF included as an external graphic using an extension developed by Jeremias Maerki. Further details and a download link for the extension can be found here: http://wiki.apache.org/xmlgraphics-fop/HowTo/EmbeddedPdf

I realise that isn't quite what the OP asked for; I mention it just for the sake of accuracy for the archives.

Thanks,
Chris

> -- John Burgess, Risk Decisions Limited
>
> ----- Original Message -----
> From: Dola Woolfe dolac...@yahoo.com
> To: fop-users@xmlgraphics.apache.org
> Sent: 04/09/2009 1:02:50 AM +0100
> Subject: Newbie question
>
> I did my homework, but this does not appear to be an FAQ! How does FOP read PDFs? Many thanks in advance, Dola
Re: Newbie question
Thank you. (Sounds like more than the 1 hour I was allocating for it.)

----- Original Message -----
From: Jean-François El Fouly jean-franc...@elfouly.fr
To: fop-users@xmlgraphics.apache.org
Sent: Friday, September 4, 2009 3:44:55 AM
Subject: Re: Newbie question

> I think you should try iText. [...]
Re: Newbie question
On 04.09.2009 15:22, Dola Woolfe wrote:

> (Sounds like more than the 1 hour I was allocating for it.)

PDF as a format isn't meant to be parsed for advanced text processing; it was designed for presentation. PDF generators could make your job of parsing text out of the file arbitrarily hard. As an extreme (and rather theoretical) example, a PDF could contain two text streams, "Tiset" and "hsiatx", with embedded positioning commands, which read on the screen as "This is a text".

In any case, even putting up reasonable guards against running into out-of-order text blocks will take a few days, unless you find a ready-to-use library for this task (no, I don't have pointers).

If you can, try to get your source text in a more processing-friendly format, like DocBook XML.

J.Pietschmann
Newbie question
I did my homework, but this does not appear to be an FAQ! How does FOP read PDFs?

Many thanks in advance,
Dola
RE: Newbie question
Hey Dola,

Try http://xmlgraphics.apache.org/fop/faq.html

What do you mean by 'read PDFs'? FOP is predominately for generating formats such as PDF, PCL, PostScript and AFP.

Thanks
Martin.

-----Original Message-----
From: Dola Woolfe [mailto:dolac...@yahoo.com]
Sent: Friday, 4 September 2009 10:03 AM
To: fop-users@xmlgraphics.apache.org
Subject: Newbie question

> I did my homework, but this does not appear to be an FAQ! How does FOP read PDFs? Many thanks in advance, Dola
Re: Newbie question
Hello Dola,

> I did my homework, but this does not appear to be an FAQ! How does FOP read PDFs?

It doesn't, at least not to my knowledge. It reads Formatting Objects files (typical extension: .fo) and *produces* PDF and other formats.

Paul Vinkenoog
RE: Newbie question
and predominately is spelt predominantly (whoops)...

-----Original Message-----
From: Martin Edge [mailto:martin.e...@asmorphic.net.au]
Sent: Friday, 4 September 2009 10:05 AM
To: fop-users@xmlgraphics.apache.org
Subject: RE: Newbie question

> Hey Dola, Try http://xmlgraphics.apache.org/fop/faq.html
> What do you mean 'read PDFs' FOP is predominately for generating formats such as PDF, PCL, Postscript and AFP
> Thanks Martin.
Re: Newbie question
I'm trying to put together several elements to build a PDF translator.

1. Load a PDF in a foreign language (???)
2. Translate the content (Google Translate)
3. Output the translated PDF (FOP)

So I'm guessing step 1 is not part of FOP. Can you perhaps recommend what I can use for 1.? Thanks again!

----- Original Message -----
From: Martin Edge martin.e...@asmorphic.net.au
To: fop-users@xmlgraphics.apache.org
Sent: Thursday, September 3, 2009 8:08:28 PM
Subject: RE: Newbie question

> and predominately is spelt predominantly (whoops)...
> [...]
Newbie question for incrementing a number
Hi Guys,

This is a newbie question. I have an XSL stylesheet which reads some of its values from XML using the document() function and some from xsl:param. I'm using fo:block to display the content in a block. Simply, what I want to do is create a counter that will increment for each fo:block, i.e.

    <fo:block>display counter value <xsl:value-of select="$param1"/></fo:block>
    <fo:block>display counter value <xsl:value-of select="document('somexml.xml')/note1"/></fo:block>
    etc.

Please help?

I've tried the code below but it doesn't work:

    <xsl:param name="counter" select="1"/>
    <some template>
      <fo:block>
        <xsl:with-param name="counter" select="$counter + 1"/>
        <xsl:value-of select="$counter"/>   (this returns 1 instead of 2)
    </some template>

-- View this message in context: http://www.nabble.com/Newbie-question-for-incrementing-a-number-tf4075495.html#a11582913
Sent from the FOP - Users mailing list archive at Nabble.com.
Re: Newbie question for incrementing a number
Check the archives for this group or try: http://www.dpawson.co.uk/xsl/sect2/N4806.html

Oh, and try not to cross post.

-Lou

alphamanic wrote on 07/13/2007 12:29:33 PM:

> Hi Guys, This is a newbie question. [...]
Re: Newbie question for incrementing a number
You cannot increment a number using (x = x + 1) in XSLT as you can in most languages, because a variable or parameter cannot be changed once it is bound. However, you can use the position() function as a counter in many cases. For example:

    <xsl:value-of select="position()"/>

The alternative to this is to use a recursive template, which is a bit more tricky.

Hope that helps,
Trevor.

alphamanic wrote:

> Hi Guys, This is a newbie question. [...]

-- Trevor Keast, Client Server Specialists Inc.
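A minimal sketch of both of Trevor's suggestions, under stated assumptions: the input document is a hypothetical `<notes>` element containing `<note>` children, and the template name `count-up` and its `limit` parameter are made up for illustration; none of these names come from the original thread. The stylesheet only emits an fo:flow, so the surrounding fo:root and page-master scaffolding would still be needed to get a complete FO file.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <!-- Assumed (hypothetical) input: <notes><note>a</note><note>b</note></notes> -->
  <xsl:template match="/notes">
    <fo:flow flow-name="xsl-region-body">

      <!-- Option 1: position() is the 1-based index of the current node
           within the node list selected by xsl:for-each. -->
      <xsl:for-each select="note">
        <fo:block>
          <xsl:value-of select="position()"/>
          <xsl:text>. </xsl:text>
          <xsl:value-of select="."/>
        </fo:block>
      </xsl:for-each>

      <!-- Option 2: start the recursive counter defined below. -->
      <xsl:call-template name="count-up">
        <xsl:with-param name="limit" select="3"/>
      </xsl:call-template>

    </fo:flow>
  </xsl:template>

  <!-- Recursive template: each call emits one block, then calls itself
       with counter + 1. No variable is ever reassigned; every call gets
       its own new binding of the counter parameter. -->
  <xsl:template name="count-up">
    <xsl:param name="counter" select="1"/>
    <xsl:param name="limit" select="1"/>
    <xsl:if test="$counter &lt;= $limit">
      <fo:block>
        <xsl:value-of select="$counter"/>
      </fo:block>
      <xsl:call-template name="count-up">
        <xsl:with-param name="counter" select="$counter + 1"/>
        <xsl:with-param name="limit" select="$limit"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

The position() variant fits when the blocks come from one node list; the recursive variant is the usual fallback when the count is not tied to the position of input nodes. Note that xsl:with-param is only valid as a child of xsl:call-template or xsl:apply-templates, which is one reason the attempt quoted above cannot work as written.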
RE: XSL FO newbie question - Hide overflow page content
Hi,

IMHO, if an fo:page-sequence refers to an fo:page-sequence-master that does not provide enough pages to render the content, then the pages are repeated (I have not found the expected behaviour for this case in the XSL-FO 1.0 Recommendation).

If you want to truncate the content to get a single page, you should try putting your content in an absolutely positioned fo:block-container. The logs should then warn you when content overflows.

HTH,
Pascal

-----Original Message-----
From: Peter
Sent: Thursday, 18 January 2007 08:48

> Hello, I tried the following with fop 0.93 [...]
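A minimal sketch of the absolutely positioned fo:block-container approach, reusing the 200pt by 50pt page from Peter's test file. The container dimensions and the overflow="hidden" setting are assumptions chosen for illustration; how a given FOP version clips and reports the overflow is not confirmed in this thread, so treat it as a starting point and check the result against your own FOP release.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:layout-master-set>
    <fo:simple-page-master master-name="page"
                           page-height="50pt" page-width="200pt">
      <fo:region-body/>
    </fo:simple-page-master>
  </fo:layout-master-set>
  <fo:page-sequence master-reference="page">
    <fo:flow flow-name="xsl-region-body">
      <!-- The container has a fixed size equal to the page. Being
           absolutely positioned, it cannot break onto further pages;
           content that does not fit is meant to be clipped. -->
      <fo:block-container absolute-position="absolute"
                          top="0pt" left="0pt"
                          width="200pt" height="50pt"
                          overflow="hidden">
        <fo:block font-size="28pt" linefeed-treatment="preserve">Line
Line
Line</fo:block>
      </fo:block-container>
    </fo:flow>
  </fo:page-sequence>
</fo:root>
```

The design choice here is to move the size limit from the page sequence, where the processor is allowed to recover by adding pages, to a single area with a fixed height, where it has nowhere else to place the excess lines.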
RE: XSL FO newbie question - Hide overflow page content
Thanks for all the help. block-container it will be then.

Peter

-----Original Message-----
From: Vincent Hennebert
Sent: Thursday, January 18, 2007 9:42 AM
To: fop-users@xmlgraphics.apache.org
Subject: Re: XSL FO newbie question - Hide overflow page content

> Pascal Sancho wrote:
>
> > Hi, IMHO, if fo:page-sequence refers to fo:page-sequence-master with not enough page to render the content, then the pages are repeated (I have not found expected behaviour in REC XSL-FO 1.0 for such case).
>
> That's in section 6.4.7 fo:page-sequence-master of the 1.0 spec:
>
> "It is an error if the entire sequence of sub-sequence-specifiers children is exhausted while some areas returned by an fo:flow are not placed. Implementations may recover, if possible, by re-using the sub-sequence-specifier that was last used to generate a page."
>
> <snip/>
>
> Vincent
XSL FO newbie question - Hide overflow page content
Hello, Upfront apologies for the 101 nature of this question. I want to make sure an xsl-fo stylesheet will only generate a single page. Possible overflow should be hidden with a warning or, if possible, an error should be generated. I found how fo(p) can hide overflow with a block-container object, but I would rather not use this if not needed. Thanks, Peter
Re: XSL FO newbie question - Hide overflow page content
This is pretty easy, though it won't get you any warnings. Make a 1-length page-sequence-master. That is, make a regular page-sequence-master and then use it in a repeatable-page-master, but only with 1 repetition. That will force the output to be one page.

-- View this message in context: http://www.nabble.com/XSL-FO-newbie-question---Hide-overflow-page-content-tf3031234.html#a8422453
Sent from the FOP - Users mailing list archive at Nabble.com.
RE: XSL FO newbie question - Hide overflow page content
Hello,

I tried the following with fop 0.93:

    <fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
      <fo:layout-master-set>
        <fo:simple-page-master master-name="page" page-height="50pt" page-width="200pt">
          <fo:region-body></fo:region-body>
        </fo:simple-page-master>
        <fo:page-sequence-master master-name="single">
          <fo:repeatable-page-master-reference master-reference="page" maximum-repeats="1"></fo:repeatable-page-master-reference>
        </fo:page-sequence-master>
      </fo:layout-master-set>
      <fo:page-sequence master-reference="single">
        <fo:flow flow-name="xsl-region-body">
          <fo:block font-size="28pt" linefeed-treatment="preserve">Line
    Line
    Line</fo:block>
        </fo:flow>
      </fo:page-sequence>
    </fo:root>

Which results in:

    fop -fo c:\temp\t.fo -pdf c:\temp\t.pdf
    Jan 18, 2007 8:34:17 AM org.apache.fop.fo.pagination.PageSequenceMaster getNextSimplePageMaster
    WARNING: subsequences exhausted in page-sequence-master 'single', using previous subsequence
    Jan 18, 2007 8:34:17 AM org.apache.fop.fo.pagination.PageSequenceMaster getNextSimplePageMaster
    WARNING: subsequences exhausted in page-sequence-master 'single', using previous subsequence

And 3 pages in t.pdf. Anyone any thoughts on what I am doing wrong?

Not sure what it tells, but XEP 4.5 results in:

    XEP 4.5 build 20060313
    (document [system-id file:/C:/DOCUME~1/pc/LOCALS~1/Temp/pro3B6.xml]
    (validate [validation OK])
    (compile (masters (sequence-master [master-name page]) (sequence-master [master-name single])) (sequence [master-reference single] (flow [flow-name xsl-region-body])))
    (format (sequence [master-reference single] (flow [1]
    [error] com.renderx.xep.cmp.NoPageMasterException: state: rest filled even
    ) (static-content [1])))
    (generate [output-format pdf][1]))

And a single-page PDF.

All suggestions or guidance warmly welcomed!

Thanks,
Peter

-----Original Message-----
From: Nicol Bolas
Sent: Thursday, January 18, 2007 12:56 AM
To: fop-users@xmlgraphics.apache.org
Subject: Re: XSL FO newbie question - Hide overflow page content

> This is pretty easy, though it won't get you any warnings. Make a 1-length page-sequence-master. That is, make a regular page-sequence-master and then use it in a repeatable-page-master, but only with 1 repetition. That will force the output to be one page.