Re: Newbie question

2009-09-04 Thread Jean-François El Fouly


On 4 Sept. 2009 at 03:34, Dola Woolfe wrote:


I'm trying to put together several elements to build a PDF translator.

1. Load a PDF in a foreign language (???)
2. Translate the content (Google Translate)
3. Output the translated PDF (FOP)

So I'm guessing step 1 is not part of FOP. Can you perhaps recommend  
what I can use for 1.?


Thanks again!


I think you should try iText. You will find an explanation of what you
need near the end of "iText in Action", the authoritative book by
Bruno Lowagie, who designed iText in the first place. Before
proceeding with your project you *should* read the caveats in his
book: extracting text content from an existing PDF may not be as
straightforward as you think, and in certain situations it is almost
meaningless. A PDF API will give you the text content in the
order in which it was technically generated, which may not be the
reading order (the order in which you would read the elements in a book).
My own experience, on top of this, is that it is very difficult to
extract text content from non-European or large fonts (the CID-keyed
fonts, roughly speaking, those that cover more than the WinAnsi or
ISO-8859-1 characters).


HTH,

Jean-François
-
To unsubscribe, e-mail: fop-users-unsubscr...@xmlgraphics.apache.org
For additional commands, e-mail: fop-users-h...@xmlgraphics.apache.org



Re: Newbie question

2009-09-04 Thread Chris Bowditch

John Burgess wrote:

It doesn't!


That isn't 100% accurate. FOP can parse a PDF included as an external 
graphic using an extension developed by Jeremias Maerki. Further details 
and download link to the extension can be found here: 
http://wiki.apache.org/xmlgraphics-fop/HowTo/EmbeddedPdf


Although I realise that isn't quite what the OP asked for. I mention 
this just for the sake of accuracy for the archives.
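
For the archives, a minimal sketch of what including a PDF as an external
graphic might look like. It assumes the PDF image extension mentioned above is
installed on FOP's classpath; the file name "existing.pdf" and the page
dimensions are made up for illustration. Note that this only places the PDF
page as an image and does not give access to its text.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: assumes the PDF image extension is on FOP's classpath.
     "existing.pdf" is a hypothetical file name. -->
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:layout-master-set>
    <fo:simple-page-master master-name="A4"
        page-height="297mm" page-width="210mm">
      <fo:region-body margin="20mm"/>
    </fo:simple-page-master>
  </fo:layout-master-set>
  <fo:page-sequence master-reference="A4">
    <fo:flow flow-name="xsl-region-body">
      <fo:block>An existing PDF, placed like any other image:</fo:block>
      <fo:block>
        <!-- The PDF page is embedded as a graphic; its text is not made
             available for further processing such as translation. -->
        <fo:external-graphic src="url(existing.pdf)"
            content-width="scale-to-fit" width="100%"/>
      </fo:block>
    </fo:flow>
  </fo:page-sequence>
</fo:root>
```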


Thanks,

Chris





Re: Newbie question

2009-09-04 Thread Dola Woolfe


Thank you.

(Sounds like more than the 1 hour  I was allocating for it.)






Re: Newbie question

2009-09-04 Thread J.Pietschmann

On 04.09.2009 15:22, Dola Woolfe wrote:

(Sounds like more than the 1 hour  I was allocating for it.)


PDF as a format isn't meant to be parsed for advanced text processing;
it was designed for presentation. PDF generators can make your job
of parsing text out of the file arbitrarily hard. As an extreme (and
rather theoretical) example, a PDF could contain two text streams,

"Tiset" and "hsiatx", with embedded positioning commands, which together
read on the screen as "This is a text". In any case, even putting
up reasonable guards against running into out-of-order text blocks
will take a few days, unless you find a ready-to-use library for
this task (no, I don't have pointers).

If you can, try to get your source text in a more processing-friendly
format, like DocBook XML.

J.Pietschmann




Newbie question

2009-09-03 Thread Dola Woolfe
I did my homework, but this does not appear to be an FAQ!

How does FOP read PDF's?

Many thanks in advance,

Dola


  




RE: Newbie question

2009-09-03 Thread Martin Edge
Hey Dola,

Try http://xmlgraphics.apache.org/fop/faq.html 

What do you mean by 'read PDFs'? FOP is predominately for generating formats
such as PDF, PCL, PostScript and AFP.

Thanks
Martin.





Re: Newbie question

2009-09-03 Thread Paul Vinkenoog
Hello Dola,

 I did my homework, but this does not appear to be an FAQ!

 How does FOP read PDF's?

It doesn't, at least not to my knowledge. It reads Formatting Objects
files (typical extension: .fo) and *produces* PDF and other formats.
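
To illustrate the direction FOP works in, here is a minimal sketch of a
Formatting Objects file. The page size, margin and text are arbitrary choices
made for this example, not anything FOP requires.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: page size, margin and text are arbitrary. -->
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:layout-master-set>
    <!-- Describes the page geometry. -->
    <fo:simple-page-master master-name="page"
        page-height="297mm" page-width="210mm">
      <fo:region-body margin="20mm"/>
    </fo:simple-page-master>
  </fo:layout-master-set>
  <!-- The content that is flowed into pages of that geometry. -->
  <fo:page-sequence master-reference="page">
    <fo:flow flow-name="xsl-region-body">
      <fo:block font-size="14pt">Hello from a Formatting Objects file.</fo:block>
    </fo:flow>
  </fo:page-sequence>
</fo:root>
```

Saved as hello.fo (a hypothetical name), it could be rendered with
"fop -fo hello.fo -pdf hello.pdf".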


Paul Vinkenoog




RE: Newbie question

2009-09-03 Thread Martin Edge
... and "predominately" is spelt "predominantly" (whoops)...




Re: Newbie question

2009-09-03 Thread Dola Woolfe
I'm trying to put together several elements to build a PDF translator.

1. Load a PDF in a foreign language (???)
2. Translate the content (Google Translate)
3. Output the translated PDF (FOP)

So I'm guessing step 1 is not part of FOP. Can you perhaps recommend what I can 
use for 1.?

Thanks again!





Newbie question for incrementing a number

2007-07-13 Thread alphamanic

Hi Guys,

This is a newbie question.

I have an XSL stylesheet that reads some values from XML using the document()
function and some from xsl:param. I'm using fo:block to display the
content in a block.
Simply put, what I want is a counter that increments for each
fo:block, i.e.

<fo:block>
  display counter value
  <xsl:value-of select="$param1"/>
</fo:block>

<fo:block>
  display counter value
  <xsl:value-of select="document('somexml.xml')/note1"/>
</fo:block>

etc.

Please help? I've tried the code below, but it doesn't work:
<xsl:param name="counter" select="1"/>
<some template>
<fo:block>
<xsl:with-param name="counter" select="$counter + 1"/>
<xsl:value-of select="$counter"/> <-- this returns 1 instead of 2
</some template>



Re: Newbie question for incrementing a number

2007-07-13 Thread Louis . Masters
Check the archives for this group or try: 
http://www.dpawson.co.uk/xsl/sect2/N4806.html

Oh, and try not to cross post.

-Lou



Re: Newbie question for incrementing a number

2007-07-13 Thread Trevor Keast
You cannot increment a number using (x = x + 1) as you can in most
languages, because an XSLT variable or parameter cannot be changed once
it is bound. However, you can use the position() function as a counter in
many cases.


For example:

<xsl:value-of select="position()"/>

The alternative is to use a recursive template, which is a bit
trickier.
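
A minimal sketch of both approaches in one XSLT 1.0 stylesheet. The input
element names (notes, note), the template name "numbered-blocks" and the upper
bound of 3 are hypothetical, and the result is an FO fragment rather than a
complete document.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: element names, template name and bounds are hypothetical. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <xsl:template match="/notes">
    <fo:block>
      <!-- Approach 1: position() gives the index of the current node
           within the node set being processed. -->
      <xsl:for-each select="note">
        <fo:block>
          <xsl:value-of select="position()"/>
          <xsl:text>. </xsl:text>
          <xsl:value-of select="."/>
        </fo:block>
      </xsl:for-each>

      <!-- Approach 2: start a recursive template with counter = 1. -->
      <xsl:call-template name="numbered-blocks">
        <xsl:with-param name="counter" select="1"/>
        <xsl:with-param name="last" select="3"/>
      </xsl:call-template>
    </fo:block>
  </xsl:template>

  <!-- The counter travels as a parameter. "Incrementing" it means calling
       the template again with a new value; nothing is reassigned. -->
  <xsl:template name="numbered-blocks">
    <xsl:param name="counter"/>
    <xsl:param name="last"/>
    <xsl:if test="$counter &lt;= $last">
      <fo:block>
        <xsl:value-of select="$counter"/>
      </fo:block>
      <xsl:call-template name="numbered-blocks">
        <xsl:with-param name="counter" select="$counter + 1"/>
        <xsl:with-param name="last" select="$last"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

In this sketch xsl:with-param appears only as a child of xsl:call-template,
which is where XSLT allows it (xsl:apply-templates is the other place); it does
not change the value of $counter in the calling template.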


Hope that helps,

Trevor.



--
Trevor Keast
Client Server Specialists Inc.




RE: XSL FO newbie question - Hide overflow page content

2007-01-18 Thread Pascal Sancho
Hi,

IMHO, if fo:page-sequence refers to an fo:page-sequence-master that does not
provide enough pages to render the content, then the pages are repeated (I have
not found the expected behaviour for this case in the XSL-FO 1.0 Recommendation).

If you want to truncate the content to get a single page, you should try putting
your content in an fo:block-container with absolute positioning.

The logs should inform you that there is overflowing content.
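
A minimal sketch of that suggestion, reusing the 50pt by 200pt page from the
test case in this thread. The container dimensions simply mirror the page size,
and overflow="hidden" is one possible choice. Whether a warning is logged for
the clipped content may depend on the FOP version.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: container size mirrors the page; values are illustrative. -->
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:layout-master-set>
    <fo:simple-page-master master-name="page"
        page-height="50pt" page-width="200pt">
      <fo:region-body/>
    </fo:simple-page-master>
  </fo:layout-master-set>
  <fo:page-sequence master-reference="page">
    <fo:flow flow-name="xsl-region-body">
      <!-- An absolutely positioned container has a fixed size and is not
           broken across pages; overflow="hidden" clips what does not fit. -->
      <fo:block-container absolute-position="absolute"
          top="0pt" left="0pt" width="200pt" height="50pt"
          overflow="hidden">
        <fo:block font-size="28pt" linefeed-treatment="preserve">Line
Line
Line</fo:block>
      </fo:block-container>
    </fo:flow>
  </fo:page-sequence>
</fo:root>
```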

HTH,

Pascal




RE: XSL FO newbie question - Hide overflow page content

2007-01-18 Thread Peter
Thanks for all the help. block-container it will be then.

Peter

 -Original Message-
 From: Vincent Hennebert [mailto:[EMAIL PROTECTED]
 Sent: Thursday, January 18, 2007 9:42 AM
 To: fop-users@xmlgraphics.apache.org
 Subject: Re: XSL FO newbie question - Hide overflow page content
 
 Pascal Sancho wrote:
  Hi,
 
  IMHO, if fo:page-sequence refers to fo:page-sequence-master with not
 enough page to render the content, then the pages are repeated (I have not
 found expected behaviour in REC XSL-FO 1.0 for such case).
 
 That's in section 6.4.7 "fo:page-sequence-master" of the 1.0 spec:
 "It is an error if the entire sequence of sub-sequence-specifiers
 children is exhausted while some areas returned by an fo:flow are not
 placed. Implementations may recover, if possible, by re-using the
 sub-sequence-specifier that was last used to generate a page."
 
 <snip/>
 
 
 Vincent
 
 



XSL FO newbie question - Hide overflow page content

2007-01-17 Thread Peter
Hello,

Upfront apologies for the 101 nature of this question.

I want to make sure an xsl-fo stylesheet will only generate a single page.
Possible overflow should be hidden with a warning or, if possible, an error
should be generated.

I found how fo(p) can hide overflow with a block-container object, but I
would rather not use this if not needed.

Thanks,

Peter



Re: XSL FO newbie question - Hide overflow page content

2007-01-17 Thread Nicol Bolas


This is pretty easy, though it won't get you any warnings.

Make a 1-length page-sequence-master. That is, make a regular
simple-page-master and then reference it from a
repeatable-page-master-reference inside the page-sequence-master, but only
with 1 repetition (maximum-repeats="1"). That will force the output to be
one page.



RE: XSL FO newbie question - Hide overflow page content

2007-01-17 Thread Peter
Hello,

I tried the following with fop 0.93 

<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:layout-master-set>
    <fo:simple-page-master master-name="page"
        page-height="50pt" page-width="200pt">
      <fo:region-body></fo:region-body>
    </fo:simple-page-master>
    <fo:page-sequence-master master-name="single">
      <fo:repeatable-page-master-reference master-reference="page"
          maximum-repeats="1"></fo:repeatable-page-master-reference>
    </fo:page-sequence-master>
  </fo:layout-master-set>
  <fo:page-sequence master-reference="single">
    <fo:flow flow-name="xsl-region-body">
      <fo:block font-size="28pt" linefeed-treatment="preserve">Line
Line
Line</fo:block>
    </fo:flow>
  </fo:page-sequence>
</fo:root>


Which results in 

fop -fo c:\temp\t.fo -pdf c:\temp\t.pdf
Jan 18, 2007 8:34:17 AM org.apache.fop.fo.pagination.PageSequenceMaster
getNextSimplePageMaster
WARNING: subsequences exhausted in page-sequence-master 'single', using
previous subsequence
Jan 18, 2007 8:34:17 AM org.apache.fop.fo.pagination.PageSequenceMaster
getNextSimplePageMaster
WARNING: subsequences exhausted in page-sequence-master 'single', using
previous subsequence

And 3 pages in t.pdf.

Anyone any thoughts on what I am doing wrong?


Not sure what it tells but XEP 4.5 results in 

XEP 4.5 build 20060313
(document [system-id file:/C:/DOCUME~1/pc/LOCALS~1/Temp/pro3B6.xml]
  (validate [validation OK])
  (compile 
(masters 
  (sequence-master [master-name page])
  (sequence-master [master-name single]))
(sequence [master-reference single]
  (flow [flow-name xsl-region-body])))
  (format 
(sequence [master-reference single]
  (flow [1]
[error] com.renderx.xep.cmp.NoPageMasterException: state: rest
filled even
  )
  (static-content [1])))
  (generate [output-format pdf][1]))


And a single page pdf

All suggestions or guidance warmly welcomed!


Thanks,

Peter


