date:20110212

[iText-questions] Page resize with iText

2011-02-12 Thread Valentin Boiadjiev

Hi,

I am trying to resize pages so that I can add a header and a footer.

However, when I use PdfWriter with GetImportedPage and AddTemplate, I lose
all annotations and interactive content.

If I use PdfCopy, I don't see a way to specify a new page size (new
Document(size) has no effect).

So, my question is: can you help me resolve this?

Thank you,

Valentin

--
The ultimate all-in-one performance toolkit: Intel(R) Parallel Studio XE:
Pinpoint memory and threading errors before they happen.
Find and fix more than 250 security defects in the development cycle.
Locate bottlenecks in serial and parallel code that limit performance.
http://p.sf.net/sfu/intel-dev2devfeb
___
iText-questions mailing list
iText-questions@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/itext-questions

Many questions posted to this list can (and will) be answered with a reference 
to the iText book: http://www.itextpdf.com/book/
Please check the keywords list before you ask for examples: 
http://itextpdf.com/themes/keywords.php

[iText-questions] PDFReader getPageContent() method returning weird escape codes

2011-02-12 Thread Wyatt Biker

I have a PDF whose contents I am reading. I didn't make this PDF, but in the
text I get the following two escape sequences: \222 and \036

\222 seems to be the single quote (') and \036 seems to have something to do
with the letter f.

These codes appear in several places, yet Acrobat Reader displays the text
correctly. Here are some partial examples.

Example 1:
...  /Span /MCID 947 BDC  /T1_1 1 Tf ( )Tj EMC  /Span /MCID 948
BDC  -13.716 -1.6 Td [(complex of)-14(fers eight scaled-down ball
\036elds replicated from famous )]TJ EMC  /Span /MCID 949 BDC  T*
..

Example 2:
.. /Span /MCID 950 BDC  T* [(Y)110(ankee Stadium. And if
you\222re interested in \036nding ice in the middl\   ..

The above line reads: Yankee Stadium. And if you're interested in finding ice
in the middl

I thought these were supposed to be ASCII octal codes, but they don't match
ASCII. Is there a different way of decoding them?
Here is the code I use to read the page:

PdfReader reader = new PdfReader(filein);
byte[] streamBytes = reader.getPageContent(1);
StringBuffer buf = new StringBuffer();
String contentStream = new String(streamBytes);

Any idea what this is? Do I need to post the whole PDF?

Thanks

Re: [iText-questions] Page resize with iText

2011-02-12 Thread 1T3XT BVBA


Hello,
you have posted a mail to itext-questions@lists.sourceforge.net but you 
weren't subscribed.

You are receiving this answer because I've added your mail address in Bcc:
I will do this only once! Further answers will be sent to the 
mailing-list only (you won't receive them if you don't subscribe). 
Further questions to the mailing-list will be rejected unless you 
subscribe. Further questions sent to the 1t3xt address will be ignored. 
Please understand that, as long as you don't subscribe, somebody has to 
MANUALLY approve your mail among a huge load of SPAM. You can help us 
avoid this boring administrative job by following the rules: 
http://itextpdf.com/support.php


Valentin Boiadjiev wrote:

 Hi,

 I am trying to resize pages so that I can add a header and a footer.

 However, when I use PdfWriter with GetImportedPage and AddTemplate, I
 lose all annotations and interactive content.

 If I use PdfCopy, I don't see a way to specify a new page size (new
 Document(size) has no effect).

 So, my question is: can you help me resolve this?

Your problems are explained in chapter 6 of iText in Action - Second 
Edition.

That chapter can be downloaded for free if you go to the following page:
http://affiliate.manning.com/idevaffiliate.php?id=223_212
(See the right column with title Downloads.)

If you want to use PdfWriter + PdfImportedPage, you need to copy all 
annotations separately, and scale all the dimensions. This isn't 
impossible, but it's plenty of work. I would advise against it.


If you want to add a header and a footer, why don't you just change the 
MediaBox (and CropBox if any)? That way, you don't have to scale the 
content, you just provide more space.
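
The MediaBox suggestion above comes down to simple rectangle arithmetic. The following is a minimal sketch in plain Java, not iText code: the class and method names are hypothetical, and the A4 size and 36-unit margins are assumed values chosen for illustration. A PDF rectangle is given as [llx lly urx ury]; making room for a footer lowers the bottom edge and making room for a header raises the top edge, while the existing content keeps its coordinates. In iText 5 the resulting values would presumably be written into each page dictionary (PdfReader.getPageN, PdfName.MEDIABOX) before stamping; that part is left out here and should be checked against chapter 6 of the book.

```java
import java.util.Arrays;

class MediaBoxSketch {

    // box is [llx, lly, urx, ury] in user space units (1/72 inch by default).
    // Returns a new rectangle with extra room below (footer) and above (header).
    // The page content is untouched, so nothing needs to be scaled.
    static float[] enlarge(float[] box, float headerHeight, float footerHeight) {
        return new float[] {
            box[0],
            box[1] - footerHeight,  // move the bottom edge down
            box[2],
            box[3] + headerHeight   // move the top edge up
        };
    }

    public static void main(String[] args) {
        float[] a4 = { 0f, 0f, 595f, 842f };       // assumed original MediaBox
        float[] bigger = enlarge(a4, 36f, 36f);    // half an inch at each end
        System.out.println(Arrays.toString(bigger));
    }
}
```

If the page also has a CropBox, the same adjustment would have to be applied to it, otherwise viewers keep showing the old visible area.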

Re: [iText-questions] PDFReader getPageContent() method returning weird escape codes

2011-02-12 Thread 1T3XT BVBA

Wyatt Biker wrote:
  I have a PDF whose contents I am reading. I didn't make this PDF, but in
  the text I get the following two escape sequences: \222 and \036

Those are indeed octals.

  \222 seems to be the single quote (') and \036 seems to have something to
  do with the letter f.

That's possible, although the actual glyph depends on the encoding.

  These codes appear in several places, yet Acrobat Reader displays the text
  correctly. Here are some partial examples.

OK, so there's no problem.

  I thought these were supposed to be ASCII octal codes, but they don't
  match ASCII. Is there a different way of decoding them?

In your code snippet, I see: /T1_1 1 Tf
/T1_1 is a reference to a font dictionary. You can find the object 
number of that font in the /Resources of the /Page dictionary.
If you look at the font dictionary, you'll find the encoding that is 
needed, for example MacRomanEncoding, MacExpertEncoding, WinAnsiEncoding,...
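
To make the role of the encoding concrete, here is a rough sketch in plain Java (no iText). It resolves the octal escapes of a PDF literal string into byte codes and then maps each code to text. Two assumptions are made purely for illustration: the base encoding is approximated by the Java charset windows-1252 (on which WinAnsiEncoding is based), and code 036 is mapped to the "fi" ligature through a hand-written table, which is a guess drawn from the examples in the question. The real mapping has to be read from the font dictionary. All class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.util.HashMap;
import java.util.Map;

class PdfStringDecodeSketch {

    // Turns the body of a PDF literal string (the text between the
    // parentheses) into raw character codes. Octal escapes of up to three
    // digits are resolved; any other escaped character is copied as is,
    // so escapes such as \n are NOT interpreted in this sketch.
    static byte[] unescape(String literal) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < literal.length()) {
            char c = literal.charAt(i++);
            if (c != '\\' || i >= literal.length()) {
                out.write(c);
                continue;
            }
            char next = literal.charAt(i);
            if (next >= '0' && next <= '7') {
                int value = 0;
                int digits = 0;
                while (i < literal.length() && digits < 3
                        && literal.charAt(i) >= '0' && literal.charAt(i) <= '7') {
                    value = value * 8 + (literal.charAt(i) - '0');
                    i++;
                    digits++;
                }
                out.write(value & 0xFF);
            } else {
                out.write(next);
                i++;
            }
        }
        return out.toByteArray();
    }

    // Maps each code to text: first through the per-font overrides
    // (standing in for a /Differences array), then through the base charset.
    static String decode(byte[] codes, Charset base, Map<Integer, String> differences) {
        StringBuilder sb = new StringBuilder();
        for (byte b : codes) {
            String override = differences.get(b & 0xFF);
            sb.append(override != null ? override : new String(new byte[] { b }, base));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<Integer, String> differences = new HashMap<>();
        differences.put(0x1E, "fi"); // assumed mapping for octal 036
        Charset base = Charset.forName("windows-1252");
        System.out.println(decode(unescape("you\\222re"), base, differences));
        System.out.println(decode(unescape("\\036nding"), base, differences));
    }
}
```

Octal 222 is decimal 146 (0x92), which windows-1252 maps to the right single quotation mark; that is why the code does not look like ASCII. Octal 036 is a control code in the standard encodings, so a font that shows a ligature there must redefine it.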

  Here is the code I use to read.
 
   PdfReader reader = new PdfReader(filein);
   byte[] streamBytes = reader.getPageContent(1);
   StringBuffer buf = new StringBuffer();
   String contentStream = new String(streamBytes);

Are you going to parse the PDF syntax yourself?
If so, how come you don't know about font dictionaries?
Did you try the com.itextpdf.text.pdf.parser classes?
If so, did they generate the correct output?


Re: [iText-questions] Question on inline image parse exception

2011-02-12 Thread 1T3XT BVBA


On 12/02/2011 1:39, Bharathi Kongara wrote:

 Hi guys,

 I'm fairly new to iText and have looked for ways to get around the
 following error, but couldn't find much help. All I'm trying to do is
 extract text from the first page of a (valid) PDF. I tried both the
 PdfContentStreamProcessor.processContent method and the
 PdfTextExtractor.getTextFromPage method, but it looks like the latter
 uses the former underneath anyway. I would appreciate any help!

 ExceptionConverter:
 com.itextpdf.text.pdf.parser.InlineImageUtils$InlineImageParseException:
 EI not found after end of image data


Normally, images are added to a page using an external object: an Image
XObject. This reduces the file size: the bytes of an image that is used
on different pages are added to the file only once.
In the case of inline images, the bytes are added in the content stream
of a page. So if an image is added to the content stream of two different
pages, its bytes are in the PDF twice.
You can recognize inline images in the content stream because they sit
between two operators: BI (begin image) and EI (end image). The
exception is telling you that iText found a BI, but not an EI. There could
be several reasons for this, because parsing inline images is error-prone.
Maybe you aren't using the latest iText with the fixes that catch some of
these errors. Maybe you've found a PDF that reveals a bug in iText. You
haven't told us which iText version you're using, nor have you provided
the PDF, so we can't help you any further.
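
The BI/EI structure described above can be illustrated with a deliberately naive scanner. This is a sketch in plain Java and not the way iText's InlineImageUtils works; the class and method names are hypothetical. It looks for the BI, ID (start of image data) and EI operators as whitespace-delimited tokens and fails with a similar message when EI is missing.

```java
import java.nio.charset.StandardCharsets;

class InlineImageScanSketch {

    // PDF whitespace characters: NUL, TAB, LF, FF, CR, SPACE.
    static boolean isWhite(int b) {
        return b == 0 || b == 9 || b == 10 || b == 12 || b == 13 || b == 32;
    }

    // Index of the first two-letter operator `op` at or after `from` that is
    // delimited by whitespace or by the stream boundaries; -1 if none.
    static int findOperator(byte[] s, String op, int from) {
        byte first = (byte) op.charAt(0);
        byte second = (byte) op.charAt(1);
        for (int i = from; i + 1 < s.length; i++) {
            boolean startOk = i == 0 || isWhite(s[i - 1]);
            boolean endOk = i + 2 == s.length || isWhite(s[i + 2]);
            if (s[i] == first && s[i + 1] == second && startOk && endOk) {
                return i;
            }
        }
        return -1;
    }

    // Returns {index of BI, index just past EI} for the first inline image,
    // or null when the stream contains no BI operator at all.
    static int[] firstInlineImage(byte[] stream) {
        int bi = findOperator(stream, "BI", 0);
        if (bi < 0) {
            return null;
        }
        int id = findOperator(stream, "ID", bi + 2);
        if (id < 0) {
            throw new IllegalStateException("ID not found after BI");
        }
        int ei = findOperator(stream, "EI", id + 2);
        if (ei < 0) {
            throw new IllegalStateException("EI not found after end of image data");
        }
        return new int[] { bi, ei + 2 };
    }

    public static void main(String[] args) {
        byte[] stream = "q BI /W 1 /H 1 /BPC 8 /CS /G ID \u00ff EI Q"
                .getBytes(StandardCharsets.ISO_8859_1);
        int[] span = firstInlineImage(stream);
        System.out.println(span[0] + " .. " + span[1]);
    }
}
```

The weakness of this approach shows why the parsing is error-prone: the image data is binary, so the bytes for whitespace, "EI", whitespace can occur inside it by accident, and a producer that omits the whitespace before EI defeats the search entirely.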

Re: [iText-questions] suggestion about Paragraph and Phrase

2011-02-12 Thread Bruno Lowagie

On 11/02/2011 18:05, Michael Niedermair wrote:
 Hi Bruno,

 I have a suggestion about Paragraph and Phrase.
 It would be nice to have a new constructor that sets the initial size of
 the ArrayList.

This is a suggestion that should be posted to the mailing list. 
Personally, I don't see the value of such an extra constructor. When I 
use Phrase or Paragraph, there's absolutely no way to know the size of 
the List in advance. If you have a controlled environment where you do, 
you could extend the classes and use that specific subclass.

 e.g.
 public Paragraph(int initialCapacity) {
     super(initialCapacity);
 }

 public Phrase(int initialCapacity) {
     super(initialCapacity);
 }

 // change
 public Phrase() {
     this(16.0f);
 }

 The problem is that the default initial size is 10. Each time I reach
 the limit, the size is increased:

 newCapacity = (oldCapacity * 3)/2 + 1
 elementData = Arrays.copyOf(elementData, newCapacity);

 10 .. copy array
 16 .. copy array
 25 .. copy array
 38 .. copy array
 58 .. copy array
 and so on.

 If I add a lot of text (e.g. from a file, line by line), the ArrayList
 copies the array again and again.
 An initial size set by the user would avoid these array copies.

 Bye,
 Michael
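
The growth sequence quoted in the message can be reproduced with a few lines of plain Java. This is only a sketch of the arithmetic, using the formula exactly as quoted; newer JDKs grow the backing array with a slightly different rule, so the numbers apply to the implementation discussed here. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class GrowthSketch {

    // Lists the capacities a backing array passes through, starting at
    // `initial`, until it can hold `needed` elements. Every entry after the
    // first corresponds to one Arrays.copyOf call.
    static List<Integer> capacitiesUntil(int initial, int needed) {
        List<Integer> capacities = new ArrayList<>();
        int capacity = initial;
        capacities.add(capacity);
        while (capacity < needed) {
            capacity = (capacity * 3) / 2 + 1; // rule quoted in the message
            capacities.add(capacity);
        }
        return capacities;
    }

    public static void main(String[] args) {
        List<Integer> steps = capacitiesUntil(10, 1000);
        System.out.println(steps + " -> " + (steps.size() - 1) + " copies");
    }
}
```

Because the capacity grows geometrically, the number of copies rises only logarithmically with the number of elements, which may be why the reply sees little value in an extra constructor. In a controlled environment, a subclass could reserve the space up front, as suggested above.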







Re: [iText-questions] Question on inline image parse exception

2011-02-12 Thread Kevin Day


First, try out the very latest code from SVN and see if it fixes your
problem.  I added code a week ago to work around improperly implemented
inline images in files generated by a large financial institution.

If you still have problems after that, I'd suggest that you open a ticket
and attach a *small* PDF that demonstrates the problem (i.e. a single page
PDF).
-- 
View this message in context: 
http://itext-general.2136553.n4.nabble.com/Question-on-inline-image-parse-exception-tp3302271p3302728.html
Sent from the iText - General mailing list archive at Nabble.com.


Re: [iText-questions] PDFReader getPageContent() method returning weird escape codes

2011-02-12 Thread Wyatt Biker

I am planning to do something conceptually simple: find the ? and ?
enclosing tags in a user-designed PDF and replace them with input from a
database while keeping kerning and tracking intact. Whatever is inside
? ? will be of a single font and single tracking.

Can this be done easily with the parser classes?

I ordered the book. I hope it has some good examples.


Your help is appreciated.





Re: [iText-questions] PDFReader getPageContent() method returning weird escape codes

2011-02-12 Thread mkl

Wyatt,

Wyatt Biker wrote:

I am planning to do something conceptually simple: find the ? and ?
enclosing tags in a user-designed PDF and replace them with input from a
database while keeping kerning and tracking intact. Whatever is inside
? ? will be of a single font and single tracking.

Considering that PDF is essentially a format that describes where individual
glyphs or small groups of glyphs shall appear on screen or on paper, I don't
consider that simple. If the replacements from your database aren't
guaranteed to have the same length as your placeholders, you're out of luck.
If they are, a generic solution is merely difficult. That you seem unaware of
ligatures doesn't really help.

Regards, Michael.
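
The point about small groups of glyphs can be seen in the TJ arrays quoted earlier in this thread, where a line of text is split into several strings separated by kerning adjustments. The sketch below, in plain Java with hypothetical names, joins the string operands of one TJ array and drops the numbers. It only illustrates why searching the raw content stream for a placeholder is unreliable; it is not a replacement strategy, and escapes are kept as written rather than decoded.

```java
class TjArraySketch {

    // Concatenates the literal strings inside a TJ array such as
    // [(Y)110(ankee)] and skips the numeric kerning adjustments between them.
    // Backslash escapes are copied unchanged; balanced nested parentheses
    // are treated as part of the string.
    static String joinStrings(String tjArray) {
        StringBuilder text = new StringBuilder();
        int depth = 0;
        for (int i = 0; i < tjArray.length(); i++) {
            char c = tjArray.charAt(i);
            if (c == '\\' && depth > 0 && i + 1 < tjArray.length()) {
                text.append(c).append(tjArray.charAt(++i)); // keep escape as written
            } else if (c == '(') {
                if (depth++ > 0) {
                    text.append(c); // nested parenthesis is literal
                }
            } else if (c == ')' && depth > 0) {
                if (--depth > 0) {
                    text.append(c);
                }
            } else if (depth > 0) {
                text.append(c);
            }
        }
        return text.toString();
    }

    public static void main(String[] args) {
        String tj = "[(Y)110(ankee Stadium. And if you\\222re)]";
        System.out.println(tj.contains("Yankee")); // raw search on the stream text
        System.out.println(joinStrings(tj));       // text after joining operands
    }
}
```

Even after joining, the word "finding" would still not be found in Example 2, because its first two letters are stored as a single ligature code.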
--
View this message in context:
http://itext-general.2136553.n4.nabble.com/PDFReader-getPageContent-method-returning-weird-escape-codes-tp3302481p3303248.html
Sent from the iText - General mailing list archive at Nabble.com.


Re: [iText-questions] Change producer metadata

2011-02-12 Thread qplace

1T3XT BVBA info at 1t3xt.info writes:

 On 10/02/2011 23:40, qplace wrote:
  I am trying to change producer information in existing pdf using the example
  ...
  What is the right way to change producer info?

  I have commercial iText license.
 Please use the mail address you've obtained when buying the commercial
 license.

I am using ad...@tradeplatform.us to post here, and it is also the email
address used in communications with Mr. Bradbury. I received all
license-related confirmations at that address.



