date:20160224

Re: Printing non english characters in a PDF with PDFBox 1.8.10

2016-02-24 Thread Tilman Hausherr


Then don't use that character.

Looking further, you used a standard 14 font. These have only 255 (or 
even fewer) characters. That is why I told you to look at the examples. 
See EmbeddedFonts.java:


String dir = "../pdfbox/src/main/resources/org/apache/pdfbox/resources/ttf/";
PDType0Font font = PDType0Font.load(document, new File(dir + "LiberationSans-Regular.ttf"));

PDPageContentStream stream = new PDPageContentStream(document, page);


stream.beginText();
stream.setFont(font, 12);
stream.setLeading(12 * 1.2);

stream.newLineAtOffset(50, 600);
stream.showText("PDFBox's Unicode with Embedded TrueType Font");
stream.newLine();

stream.showText("Supports full Unicode text ☺");
stream.newLine();

stream.showText("English русский язык Tiếng Việt");
stream.newLine();

// ligature
stream.showText("Ligatures: \uFB01lm \uFB02ood");

stream.endText();
stream.close();
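
The exception reported in this thread (U+FFFD is not available in WinAnsiEncoding) comes from the same limitation of the standard 14 fonts. As a rough pre-check before measuring or drawing text with such a font, a JDK-only sketch like the following could list the characters that will not fit. It assumes that Java's windows-1252 charset is a close stand-in for PDF's WinAnsiEncoding (the two are not identical), and the class and method names are hypothetical, not part of PDFBox.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.util.ArrayList;
import java.util.List;

class WinAnsiPrecheck {
    // Assumption: windows-1252 approximates WinAnsiEncoding; it is not an exact match.
    private static final Charset CP1252 = Charset.forName("windows-1252");

    /** Returns the code points of text that the stand-in charset cannot encode. */
    static List<Integer> unencodable(String text) {
        CharsetEncoder encoder = CP1252.newEncoder();
        List<Integer> bad = new ArrayList<>();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            // Test one code point at a time so surrogate pairs are handled as a unit.
            if (!encoder.canEncode(new String(Character.toChars(cp)))) {
                bad.add(cp);
            }
            i += Character.charCount(cp);
        }
        return bad;
    }

    /** True if every character is likely to be usable with a standard 14 font. */
    static boolean fitsStandard14(String text) {
        return unencodable(text).isEmpty();
    }
}
```

A caller could use fitsStandard14(subString) to decide whether to keep the standard 14 font or switch to an embedded TrueType font loaded as shown above.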


Tilman


Re: Printing non english characters in a PDF with PDFBox 1.8.10

2016-02-24 Thread Sunrita Bagchi Basu

Thanks for the quick tip. I moved to 2.0.0-RC3. Now I'm getting the
following exception for the following line of code:

float size = CreatePDF.fontSize * PDType1Font.HELVETICA_BOLD.getStringWidth(subString) / 1000;


java.lang.IllegalArgumentException: U+FFFD ('.notdef') is not available in this font's encoding: WinAnsiEncoding
at org.apache.pdfbox.pdmodel.font.PDType1Font.encode(PDType1Font.java:344)
at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:285)
at org.apache.pdfbox.pdmodel.font.PDFont.getStringWidth(PDFont.java:314)
at com.evolv.dataflow.compliance.pdf.PDFColumn.createContent(PDFColumn.java:74)
at com.evolv.dataflow.compliance.pdf.PDFColumn.<init>(PDFColumn.java:40)
at com.evolv.dataflow.compliance.pdf.CreateUserPDF.createPDFRow(CreateUserPDF.java:89)
at com.evolv.dataflow.compliance.pdf.CreateUserPDF.createPDFRow(CreateUserPDF.java:1)
at com.evolv.dataflow.compliance.pdf.CreatePDF.lambda$0(CreatePDF.java:234)
at com.evolv.dataflow.compliance.pdf.CreatePDF$$Lambda$1/992768706.accept(Unknown Source)
at java.util.ArrayList.forEach(ArrayList.java:1249)
at com.evolv.dataflow.compliance.pdf.CreatePDF.CreatePDFDocument(CreatePDF.java:234)
at com.evolv.dataflow.compliance.pdf.TestPDF.testUserPDF(TestPDF.java:59)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:459)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:675)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:382)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:192)


On Thu, Feb 25, 2016 at 10:43 AM, Tilman Hausherr 
wrote:

> Hi,
>
> You can't. Use the 2.0RC3 version. The API to create PDF is slightly
> different, see the examples in the source download.
>
> Tilman
>
>
> On 25.02.2016 at 05:13, Sunrita Bagchi Basu wrote:
>
>> Hi All,
>>
>> I am creating a PDF document with a simple table where I write my data.
>> Most of my data is plain English, but sometimes there are non-English
>> characters too, such as French or Japanese. Whenever these characters
>> appear, the font style (that I have chosen for my PDF) changes! There are
>> junk characters with lots of white space in between. I tried the following:
>>
>> 1. removing the non-English characters
>> 2. extracting the data using the UTF-8 charset
>>
>> but none of it works! Attached is a sample of the font change.
>> How to tackle this?
>>
>>
>> -
>> To unsubscribe, e-mail: users-unsubscr...@pdfbox.apache.org
>> For additional commands, e-mail: users-h...@pdfbox.apache.org
>>
>
>

Re: Printing non english characters in a PDF with PDFBox 1.8.10

2016-02-24 Thread Tilman Hausherr


Hi,

You can't. Use the 2.0RC3 version. The API to create PDF is slightly 
different, see the examples in the source download.


Tilman


Printing non english characters in a PDF with PDFBox 1.8.10

2016-02-24 Thread Sunrita Bagchi Basu

Hi All,

I am creating a PDF document with a simple table where I write my data.
Most of my data is plain English, but sometimes there are non-English
characters too, such as French or Japanese. Whenever these characters
appear, the font style (that I have chosen for my PDF) changes! There are
junk characters with lots of white space in between. I tried the following:

1. removing the non-English characters
2. extracting the data using the UTF-8 charset

but none of it works! Attached is a sample of the font change.
How to tackle this?


Re: Bad text extraction result

2016-02-24 Thread Francisco Andrés Fernández

Many thanks Tilman.
I'll try to find a workaround in the meantime.
Cheers,

Francisco

On Wed, Feb 24, 2016 at 17:47, Tilman Hausherr <thaush...@t-online.de> wrote:

> I'll create an issue in JIRA later or tomorrow, but don't expect that
> this will be fixed quickly (unless I missed something obvious). We want
> to release 2.0 before doing any "big" work on text extraction.
>
> Tilman
>
>
>

Re: PDAnnotationMarkup.getInReplyTo

2016-02-24 Thread Tilman Hausherr


On 24.02.2016 at 23:38, David Lattimore wrote:

On Wed, Feb 24, 2016 at 6:39 PM, Tilman Hausherr wrote:

>On 24.02.2016 at 06:21, David Lattimore wrote:
>

>>I'm trying to read annotations from PDFs and am having trouble matching up
>>replies with the annotations they're in reply to.
>>
>>PDPage.getAnnotations() returns a list of PDAnnotation. When I have
>>a PDAnnotationMarkup, I try to call getInReplyTo to get the previous
>>annotation in the thread. But I have two problems:
>>
>>1) getInReplyTo() crashes if the annotation isn't a reply. It'd be nice if
>>it just returned null. I can work around this by getting the COSDictionary
>>and checking for an IRT entry first.
>>

>
>Please post a stack trace

java.io.IOException: Error: Unknown annotation type null
at
org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotation.createAnnotation(PDAnnotation.java:167)
at
org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotationMarkup.getInReplyTo(PDAnnotationMarkup.java:225)

The following code can be used to reproduce this:
new PDAnnotationMarkup().getInReplyTo();



So it is an IOException, which isn't as bad as a NullPointerException. 
But it is kind of weird indeed; usually we just return null when 
something doesn't exist. (I know that this is bad design, see 
http://www.yegor256.com/2014/05/13/why-null-is-bad.html, but it's too 
late now.) I'll sleep on it and then probably change it.


Tilman





Re: PDAnnotationMarkup.getInReplyTo

2016-02-24 Thread David Lattimore

On Wed, Feb 24, 2016 at 6:39 PM, Tilman Hausherr 
wrote:

> On 24.02.2016 at 06:21, David Lattimore wrote:
>
>> I'm trying to read annotations from PDFs and am having trouble matching up
>> replies with the annotations they're in reply to.
>>
>> PDPage.getAnnotations() returns a list of PDAnnotation. When I have
>> a PDAnnotationMarkup, I try to call getInReplyTo to get the previous
>> annotation in the thread. But I have two problems:
>>
>> 1) getInReplyTo() crashes if the annotation isn't a reply. It'd be nice if
>> it just returned null. I can work around this by getting the COSDictionary
>> and checking for an IRT entry first.
>>
>
> Please post a stack trace


java.io.IOException: Error: Unknown annotation type null
at
org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotation.createAnnotation(PDAnnotation.java:167)
at
org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotationMarkup.getInReplyTo(PDAnnotationMarkup.java:225)

The following code can be used to reproduce this:
new PDAnnotationMarkup().getInReplyTo();



>> 2) The PDAnnotation returned by getInReplyTo() isn't one of the annotations
>> returned by PDPage.getAnnotations() and I can't see how I can match them
>> up. Ideally I'd like to get the object ID for each, but the PDAnnotation
>> doesn't seem to know its object ID as far as I can see. I could match them
>> by keying on various attributes of the annotation like the text content and
>> the timestamp, but this feels pretty hacky.
>>
>
> try getCOSObject(), this should work.


Ah, of course. That does indeed work. Sorry I didn't think of that myself
and thank you.

Re: Bad text extraction result

2016-02-24 Thread Tilman Hausherr

I'll create an issue in JIRA later or tomorrow, but don't expect that 
this will be fixed quickly (unless I missed something obvious). We want 
to release 2.0 before doing any "big" work on text extraction.


Tilman


Re: Bad text extraction result

2016-02-24 Thread Tilman Hausherr

I tried all the settings and was unsuccessful. I was unable to extract 
"Cada frasco ampolla" which looked pretty obvious, it always appeared as 
"Ca da fras co ampo lla".


Then I looked into the content stream and found this:

6 0 1.058 6 122.0924 312.51 Tm
(Ca) Tj
/Span << /ActualText (\376\377\000\255) >> BDC
  ( ) Tj
EMC
[ (da ) -301 (fras) ] TJ
/Span << /ActualText (\376\377\000\255) >> BDC
  ( ) Tj
EMC
[ (co ) -301 (ampo) ] TJ
/Span << /ActualText (\376\377\000\255) >> BDC
  ( ) Tj
EMC
[ (lla ) -301 (con) ] TJ

So there are really spaces there, and we keep them. Adobe is smarter, 
and ignores them because they are overwritten thanks to the "-301" you 
see (that is a positioning).


This /ActualText thing might be of some help, but I don't think we 
process this.
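
For reference, the /ActualText value in the content stream above, (\376\377\000\255), can be decoded by hand: \376\377 is the UTF-16BE byte order mark and \000\255 is U+00AD, the soft hyphen. A small JDK-only sketch of that decoding follows. It is not a complete PDF string parser (it handles only octal escapes and plain characters), the ISO-8859-1 fallback is only an approximation of PDFDocEncoding, and the class and method names are hypothetical rather than PDFBox API.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

class ActualTextDecoder {
    private static boolean isOctal(char c) {
        return c >= '0' && c <= '7';
    }

    /** Turns the inside of a PDF literal string into raw bytes (octal escapes only). */
    static byte[] unescape(String literal) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < literal.length()) {
            char c = literal.charAt(i);
            if (c == '\\' && i + 1 < literal.length() && isOctal(literal.charAt(i + 1))) {
                int value = 0;
                int digits = 0;
                i++; // skip the backslash
                while (i < literal.length() && digits < 3 && isOctal(literal.charAt(i))) {
                    value = value * 8 + (literal.charAt(i) - '0');
                    i++;
                    digits++;
                }
                out.write(value & 0xFF);
            } else {
                out.write(c & 0xFF);
                i++;
            }
        }
        return out.toByteArray();
    }

    /** Decodes a text string: UTF-16BE if it starts with the FE FF marker, else a Latin-1 approximation. */
    static String decode(String literal) {
        byte[] bytes = unescape(literal);
        if (bytes.length >= 2 && (bytes[0] & 0xFF) == 0xFE && (bytes[1] & 0xFF) == 0xFF) {
            return new String(bytes, 2, bytes.length - 2, StandardCharsets.UTF_16BE);
        }
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }
}
```

Calling decode on the value from the stream yields a single soft hyphen character, which suggests the marked spaces stand for optional hyphenation points rather than real word breaks.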


Tilman



Re: Bad text extraction result

2016-02-24 Thread Francisco Andrés Fernández

Hi Tilman, many thanks for your answer.
I didn't find any configuration file to tweak this.
I'm sending you the link to the PDF file so you can see whether you can
figure out what the problem is.
https://drive.google.com/file/d/0B0PMZsHkpcJRSEpBSWhtQndKZTg/view?usp=sharing
Many thanks in advance,

Francisco

On Wed, Feb 24, 2016 at 16:29, Tilman Hausherr <thaush...@t-online.de> wrote:

> On 24.02.2016 at 20:17, Francisco Andrés Fernández wrote:
> > Hi all,
> > I'm extracting some text from PDF files through Tika in Solr. As a result,
> > some important words end up with spaces between their characters.
> > For example, I could have the word "Subtitle" that I want to detect,
> > written like "S u b t i t l e".
>
> You could try to modify spacingTolerance or averageCharTolerance in
> PDFTextStripper (find out if TIKA supports this), but it is likely that
> if spaces are ignored, they would be ignored at other places where you
> don't want it.
>
> If possible, please upload your file somewhere.
>
> Tilman
>
> > How could I make PdfBox detect this type of word occurrence?
> > Many thanks,
> >
> > Francisco
> >
>
>
>
>

Re: Bad text extraction result

2016-02-24 Thread Tilman Hausherr


On 24.02.2016 at 20:17, Francisco Andrés Fernández wrote:

Hi all,
I'm extracting some text from PDF files through Tika in Solr. As a result,
some important words end up with spaces between their characters.
For example, I could have the word "Subtitle" that I want to detect,
written like "S u b t i t l e".


You could try to modify spacingTolerance or averageCharTolerance in 
PDFTextStripper (find out if TIKA supports this), but it is likely that 
if spaces are ignored, they would be ignored at other places where you 
don't want it.


If possible, please upload your file somewhere.

Tilman


How could I make PdfBox detect this type of word occurrence?
Many thanks,

Francisco





Bad text extraction result

2016-02-24 Thread Francisco Andrés Fernández

Hi all,
I'm extracting some text from PDF files through Tika in Solr. As a result,
some important words end up with spaces between their characters.
For example, I could have the word "Subtitle" that I want to detect,
written like "S u b t i t l e".
How could I make PDFBox detect this type of word occurrence?
Many thanks,

Francisco

Re: Rotating a new annotation to match the page's rotation

2016-02-24 Thread Tilman Hausherr


On 24.02.2016 at 09:34, Gilad Denneboom wrote:

No one has any ideas? ...


I don't know of a utility. I assume you need to use AffineTransform math 
to calculate the coordinates, i.e. a rotation and a translation on one 
axis. Note that rotations always go around the origin (0, 0); that's why 
you need the translation.
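
The rotation plus translation described here could look roughly like the following sketch, which uses only java.awt.geom. It rests on several assumptions that are not from this thread: the page rotation is a clockwise multiple of 90 degrees, the media box starts at (0, 0), and the "visual" coordinates have their origin at the lower left of the page as displayed, with y pointing up (coordinates taken from a text stripper may use a different, y-down convention and would need an extra flip first). The class and method names are hypothetical; this is not a PDFBox utility.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.NoninvertibleTransformException;
import java.awt.geom.Rectangle2D;

class PageRotationSketch {
    /** Maps unrotated user space to the page as displayed (width/height of the unrotated page). */
    static AffineTransform userToVisual(double width, double height, int rotation) {
        int r = ((rotation % 360) + 360) % 360;
        AffineTransform t = new AffineTransform();
        // Concatenated operations reach the points in reverse order:
        // the rotation added below is applied first, then this translation.
        switch (r) {
            case 0:   break;
            case 90:  t.translate(0, width);      break;
            case 180: t.translate(width, height); break;
            case 270: t.translate(height, 0);     break;
            default:
                throw new IllegalArgumentException("rotation must be a multiple of 90");
        }
        // Negative quadrants are clockwise quarter turns in a y-up coordinate system.
        t.quadrantRotate(-(r / 90));
        return t;
    }

    /** Converts a rectangle given in displayed-page coordinates back to unrotated user space. */
    static Rectangle2D visualToUser(Rectangle2D visual, double width, double height, int rotation) {
        try {
            AffineTransform inverse = userToVisual(width, height, rotation).createInverse();
            return inverse.createTransformedShape(visual).getBounds2D();
        } catch (NoninvertibleTransformException e) {
            // Cannot happen for pure rotations and translations, but the API declares it.
            throw new IllegalStateException(e);
        }
    }
}
```

The rectangle returned by visualToUser would be the candidate for the annotation's Rect; quad points could be pushed through the same inverse transform one point at a time.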



Tilman



Re: (pdffile) does not allow extracting content

2016-02-24 Thread Brzrk One

Oh... I am re-encrypting the loadNonSeq() document by calling
document.openProtection(). I'll stop doing that.
Thanks!

On Wed, Feb 24, 2016 at 2:02 AM, Tilman Hausherr 
wrote:

> On 24.02.2016 at 00:27, Brzrk One wrote:
>
>> Yea, I think that's it.
>> Comparing the input pdf to the loadNonSeq() output, I see objects that
>> have the same content.
>> This means that the loadNonSeq() output is encrypted - like the input -
>> while the load() output is not. However, the loadNonSeq() output has no
>> /Encrypt dictionary.
>>
>> I am using this on both paths:
>> StandardDecryptionMaterial sdm = new StandardDecryptionMaterial("");
>> document.openProtection(sdm);
>>
>
> You shouldn't use this on loadNonSeq, or in 2.0 (it isn't available there
> anyway).
>
> You only need it with load() in 1.8.
>
>
>> without error.
>> Is this a feature of loadNonSeq() in the face of
>> AccessPermission.canExtractContent() == true?
>> Or did I do something wrong here?
>>
>
> You need openProtection() only with load() in 1.8 and only if the file is
> encrypted. (Yours is)
>
> Tilman
>

Re: Rotating a new annotation to match the page's rotation

2016-02-24 Thread Karl Heinz Kremer

I ended up with a transformation that I applied to the coordinates of my
unrotated annotation.

Re: Rotating a new annotation to match the page's rotation

2016-02-24 Thread Gilad Denneboom

Did you create a generic converter, or did you just hard-code the results
you got from manually rotating the page?
I'm sure it's possible to abstract this transformation to a formula that
could be applied to the quads to rotate them so that they fit the page's
rotation. I just can't seem to figure it out...
I actually think that the Matrix2D object in Acrobat's JavaScript does
that, but it's too opaque for me to figure out how it works, exactly. I can
post the source code for it, if anyone is interested.


Re: Rotating a new annotation to match the page's rotation

2016-02-24 Thread Karl Heinz Kremer

I came up with a solution for such a problem in a previous job (which means
I don't have access to the code anymore). The process I used was to create
test documents in all rotations, with annotations in all four rotations in
Adobe Acrobat. I then analyzed the files and came up with a transformation
for every case. I then just used a switch statement for the four different
page rotations to place my annotation in the correct spot.

Hope that helps,

Karl Heinz


Re: Rotating a new annotation to match the page's rotation

2016-02-24 Thread Gilad Denneboom

It's a highlight.

Let me give you the background. I'm using a variation on PrintTextLocation
to find the locations I want to highlight and then add them. As I wrote, it
works very well in pages with zero rotation, but the results are skewed
when the pages are rotated.
What I mean by skewed is that they appear at the location on the page where
the text would have been had it not been rotated, if that makes sense.
I'll try to set up a simple example that demonstrates this issue and share
it.


Re: Rotating a new annotation to match the page's rotation

2016-02-24 Thread Maruan Sahyoun

Hi,

What type of annotation are you trying to put on the page? I could create a 
little sample placing an annotation at the (visual) upper left corner of a 
portrait and landscape page.

BR
Maruan

> On 24.02.2016 at 09:34, Gilad Denneboom wrote:
> 
> No one has any ideas? ...
> 
> On Sun, Feb 21, 2016 at 12:30 AM, Gilad Denneboom wrote:
> 
>> Hi all,
>> 
>> Hoping someone can help me with this issue...
>> I have a tool that adds new highlight annotations to a page. It works very
>> well, except for when the page is rotated. I know I need to apply a
>> transformation to my rect and/or quads to get them to match the rotated
>> user space, but I just can't get it to work.
>> Is there a utility in PDFBox (I'm using 1.8.11 at the moment) that can
>> help me perform this transformation so I can place my annotations at the
>> right location on these pages?
>> 
>> Thanks a lot in advance for any helpful tips...
>> 
>> Gilad
>> 



Re: Rotating a new annotation to match the page's rotation

2016-02-24 Thread Andreas Lehmkühler

Hi,

> On 24 February 2016 at 09:34, Gilad Denneboom wrote:
> 
> No one has any ideas? ...
> 
> On Sun, Feb 21, 2016 at 12:30 AM, Gilad Denneboom wrote:
> 
> > Hi all,
> >
> > Hoping someone can help me with this issue...
> > I have a tool that adds new highlight annotations to a page. It works very
> > well, except for when the page is rotated. I know I need to apply a
> > transformation to my rect and/or quads to get them to match the rotated
> > user space, but I just can't get it to work.
> > Is there a utility in PDFBox (I'm using 1.8.11 at the moment) that can
> > help me perform this transformation so I can place my annotations at the
> > right location on these pages?
> >
> > Thanks a lot in advance for any helpful tips...
I'm not an annotation expert, but according to the spec both the Rect and the
QuadPoints values are specified in default user space, which doesn't include any
rotation or scaling. But I have no clue where to put this information instead.
Can you create a sample PDF with such an annotation using Acrobat or something
similar so that we can see what it looks like?

> > Gilad

BR
Andreas


Re: Rotating a new annotation to match the page's rotation

2016-02-24 Thread Gilad Denneboom

No one has any ideas? ...

On Sun, Feb 21, 2016 at 12:30 AM, Gilad Denneboom  wrote:

> Hi all,
>
> Hoping someone can help me with this issue...
> I have a tool that adds new highlight annotations to a page. It works very
> well, except for when the page is rotated. I know I need to apply a
> transformation to my rect and/or quads to get them to match the rotated
> user space, but I just can't get it to work.
> Is there a utility in PDFBox (I'm using 1.8.11 at the moment) that can
> help me perform this transformation so I can place my annotations at the
> right location on these pages?
>
> Thanks a lot in advance for any helpful tips...
>
> Gilad
>
