Re: Printing non english characters in a PDF with PDFBox 1.8.10
Then don't use that character.

Looking further, you used a standard 14 font. These have only 255 (or even fewer) characters. That is why I told you to look at the examples. See here in EmbeddedFonts.java:

    String dir = "../pdfbox/src/main/resources/org/apache/pdfbox/resources/ttf/";
    PDType0Font font = PDType0Font.load(document, new File(dir + "LiberationSans-Regular.ttf"));

    PDPageContentStream stream = new PDPageContentStream(document, page);
    stream.beginText();
    stream.setFont(font, 12);
    stream.setLeading(12 * 1.2);
    stream.newLineAtOffset(50, 600);
    stream.showText("PDFBox's Unicode with Embedded TrueType Font");
    stream.newLine();
    stream.showText("Supports full Unicode text ☺");
    stream.newLine();
    stream.showText("English русский язык Tiếng Việt");
    stream.newLine();
    // ligature
    stream.showText("Ligatures: \uFB01lm \uFB02ood");
    stream.endText();
    stream.close();

Tilman

On 25.02.2016 at 06:47, Sunrita Bagchi Basu wrote:
> Thanks for the quick tip. I moved to 2.0.0-RC3. Now I'm getting the
> following exception for the following line of code:
>
> float size = CreatePDF.fontSize * PDType1Font.HELVETICA_BOLD.getStringWidth(subString) / 1000;
>
> java.lang.IllegalArgumentException: U+FFFD ('.notdef') is not available
> in this font's encoding: WinAnsiEncoding
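As a rough sketch of how the IllegalArgumentException could be avoided when a standard 14 font has to stay, the text might be filtered before it reaches getStringWidth() or showText(). The class WinAnsiSanitizer below is hypothetical (it is not part of PDFBox), and it assumes that Java's windows-1252 charset is a close enough stand-in for WinAnsiEncoding. The two are not guaranteed to agree for every character, so embedding a TrueType font as in EmbeddedFonts.java is probably the more robust route.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

/** Hypothetical helper: replaces characters a WinAnsi-encoded font probably cannot show. */
final class WinAnsiSanitizer {

    // Assumption: windows-1252 approximates PDF's WinAnsiEncoding.
    private static final Charset CP1252 = Charset.forName("windows-1252");

    /** Returns the text with each unsupported code point replaced by the substitute. */
    static String sanitize(String text, char substitute) {
        CharsetEncoder encoder = CP1252.newEncoder();
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            int length = Character.charCount(codePoint);
            // Control characters (including line breaks) have no glyph, so they are replaced too.
            boolean supported = length == 1
                    && !Character.isISOControl(codePoint)
                    && encoder.canEncode((char) codePoint);
            out.append(supported ? (char) codePoint : substitute);
            i += length;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // U+FFFD and the two Japanese characters are replaced, the accented e is kept.
        System.out.println(sanitize("caf\u00E9 \uFFFD \u65E5\u672C", '?'));
    }
}
```

The sanitized string could then be passed to the width calculation instead of the raw value. This hides the unsupported characters rather than printing them, so it only fits cases where losing them is acceptable.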
Re: Printing non english characters in a PDF with PDFBox 1.8.10
Thanks for the quick tip. I moved to 2.0.0-RC3. Now I'm getting the following exception for the following line of code:

float size = CreatePDF.fontSize * PDType1Font.HELVETICA_BOLD.getStringWidth(subString) / 1000;

java.lang.IllegalArgumentException: U+FFFD ('.notdef') is not available in this font's encoding: WinAnsiEncoding
    at org.apache.pdfbox.pdmodel.font.PDType1Font.encode(PDType1Font.java:344)
    at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:285)
    at org.apache.pdfbox.pdmodel.font.PDFont.getStringWidth(PDFont.java:314)
    at com.evolv.dataflow.compliance.pdf.PDFColumn.createContent(PDFColumn.java:74)
    at com.evolv.dataflow.compliance.pdf.PDFColumn.<init>(PDFColumn.java:40)
    at com.evolv.dataflow.compliance.pdf.CreateUserPDF.createPDFRow(CreateUserPDF.java:89)
    at com.evolv.dataflow.compliance.pdf.CreateUserPDF.createPDFRow(CreateUserPDF.java:1)
    at com.evolv.dataflow.compliance.pdf.CreatePDF.lambda$0(CreatePDF.java:234)
    at com.evolv.dataflow.compliance.pdf.CreatePDF$$Lambda$1/992768706.accept(Unknown Source)
    at java.util.ArrayList.forEach(ArrayList.java:1249)
    at com.evolv.dataflow.compliance.pdf.CreatePDF.CreatePDFDocument(CreatePDF.java:234)
    at com.evolv.dataflow.compliance.pdf.TestPDF.testUserPDF(TestPDF.java:59)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:483)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
    at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
    at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:459)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:675)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:382)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:192)

On Thu, Feb 25, 2016 at 10:43 AM, Tilman Hausherr wrote:
> Hi,
>
> You can't. Use the 2.0RC3 version. The API to create PDF is slightly
> different, see the examples in the source download.
>
> Tilman
Re: Printing non english characters in a PDF with PDFBox 1.8.10
Hi,

You can't. Use the 2.0 RC3 version. The API to create PDFs is slightly different; see the examples in the source download.

Tilman

On 25.02.2016 at 05:13, Sunrita Bagchi Basu wrote:
> Whenever these characters appear, the font style (that I have chosen
> for my PDF) changes! [...] How to tackle this?

-
To unsubscribe, e-mail: users-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: users-h...@pdfbox.apache.org
Printing non english characters in a PDF with PDFBox 1.8.10
Hi All,

I am creating a PDF document with a simple table where I write my data. Most of my data is plain English, but sometimes there are non-English characters too, such as French or Japanese. Whenever these characters appear, the font style (that I have chosen for my PDF) changes! There are junk characters with lots of white space in between. I tried the following:

1. removing the non-English characters
2. extracting the data using the UTF-8 charset

but none of it works! Attached is a sample of the font change. How can I tackle this?
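One possible cause of junk characters, and of the U+FFFD reported in the follow-up to this message, is decoding bytes with a charset other than the one they were written in. The sketch below is only an illustration under that assumption (the thread does not confirm how the data was read); the class name ReplacementCharDemo is made up. It shows that Java silently substitutes U+FFFD when Latin-1 bytes are decoded as UTF-8.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical demo: a charset mismatch while decoding produces U+FFFD. */
final class ReplacementCharDemo {

    /** Decodes raw bytes as UTF-8; malformed sequences become U+FFFD. */
    static String decodeAsUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    /** Reports whether the string contains the Unicode replacement character. */
    static boolean hasReplacementChar(String s) {
        return s.indexOf('\uFFFD') >= 0;
    }

    public static void main(String[] args) {
        // "café" stored as ISO-8859-1: the accented e is the single byte 0xE9.
        byte[] latin1 = "caf\u00E9".getBytes(StandardCharsets.ISO_8859_1);

        String wrong = decodeAsUtf8(latin1);
        String right = new String(latin1, StandardCharsets.ISO_8859_1);

        System.out.println("decoded as UTF-8 contains U+FFFD: " + hasReplacementChar(wrong));
        System.out.println("decoded as ISO-8859-1 contains U+FFFD: " + hasReplacementChar(right));
    }
}
```

If a check like hasReplacementChar() fires on the data before any PDF is written, the problem is likely upstream of PDFBox, in how the source text was read.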
Re: Bad text extraction result
Many thanks Tilman. I'll try to find a workaround in the meantime.

Cheers,
Francisco

On Wed, Feb 24, 2016 at 17:47, Tilman Hausherr <thaush...@t-online.de> wrote:
> I'll create an issue in JIRA later or tomorrow, but don't expect that
> this will be fixed quickly (unless I missed something obvious). We want
> to release 2.0 before doing any "big" work on text extraction.
>
> Tilman
Re: PDAnnotationMarkup.getInReplyTo
On 24.02.2016 at 23:38, David Lattimore wrote:
> On Wed, Feb 24, 2016 at 6:39 PM, Tilman Hausherr wrote:
>> On 24.02.2016 at 06:21, David Lattimore wrote:
>>> I'm trying to read annotations from PDFs and am having trouble matching up
>>> replies with the annotations they're in reply to.
>>>
>>> PDPage.getAnnotations() returns a list of PDAnnotation. When I have
>>> a PDAnnotationMarkup, I try to call getInReplyTo to get the previous
>>> annotation in the thread. But I have two problems:
>>>
>>> 1) getInReplyTo() crashes if the annotation isn't a reply. It'd be nice if
>>> it just returned null. I can work around this by getting the COSDictionary
>>> and checking for an IRT entry first.
>>
>> Please post a stack trace
>
> java.io.IOException: Error: Unknown annotation type null
>     at org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotation.createAnnotation(PDAnnotation.java:167)
>     at org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotationMarkup.getInReplyTo(PDAnnotationMarkup.java:225)
>
> The following code can be used to reproduce this:
>
> new PDAnnotationMarkup().getInReplyTo();

So it is an IOException; this isn't as bad as a NullPointerException. But it is kind of weird indeed; usually we just return null when something doesn't exist. (I know that this is bad design, http://www.yegor256.com/2014/05/13/why-null-is-bad.html, but it's too late now.) I'll sleep on it and then probably change it.

Tilman
Re: PDAnnotationMarkup.getInReplyTo
On Wed, Feb 24, 2016 at 6:39 PM, Tilman Hausherr wrote:
> On 24.02.2016 at 06:21, David Lattimore wrote:
>> I'm trying to read annotations from PDFs and am having trouble matching up
>> replies with the annotations they're in reply to.
>>
>> PDPage.getAnnotations() returns a list of PDAnnotation. When I have
>> a PDAnnotationMarkup, I try to call getInReplyTo to get the previous
>> annotation in the thread. But I have two problems:
>>
>> 1) getInReplyTo() crashes if the annotation isn't a reply. It'd be nice if
>> it just returned null. I can work around this by getting the COSDictionary
>> and checking for an IRT entry first.
>
> Please post a stack trace

java.io.IOException: Error: Unknown annotation type null
    at org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotation.createAnnotation(PDAnnotation.java:167)
    at org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotationMarkup.getInReplyTo(PDAnnotationMarkup.java:225)

The following code can be used to reproduce this:

new PDAnnotationMarkup().getInReplyTo();

>> 2) The PDAnnotation returned by getInReplyTo() isn't one of the annotations
>> returned by PDPage.getAnnotations() and I can't see how I can match them
>> up. Ideally I'd like to get the object ID for each, but the PDAnnotation
>> doesn't seem to know its object ID as far as I can see. I could match them
>> by keying on various attributes of the annotation like the text content and
>> the timestamp, but this feels pretty hacky.
>
> try getCOSObject(), this should work.

Ah, of course. That does indeed work. Sorry I didn't think of that myself and thank you.
Re: Bad text extraction result
I'll create an issue in JIRA later or tomorrow, but don't expect that this will be fixed quickly (unless I missed something obvious). We want to release 2.0 before doing any "big" work on text extraction.

Tilman
Re: Bad text extraction result
I tried all the settings and was unsuccessful. I was unable to extract "Cada frasco ampolla", which looked pretty obvious; it always appeared as "Ca da fras co ampo lla". Then I looked into the content stream and found this:

    6 0 1.058 6 122.0924 312.51 Tm
    (Ca) Tj
    /Span << /ActualText (\376\377\000\255) >> BDC
    ( ) Tj
    EMC
    [ (da ) -301 (fras) ] TJ
    /Span << /ActualText (\376\377\000\255) >> BDC
    ( ) Tj
    EMC
    [ (co ) -301 (ampo) ] TJ
    /Span << /ActualText (\376\377\000\255) >> BDC
    ( ) Tj
    EMC
    [ (lla ) -301 (con) ] TJ

So there are really spaces there, and we keep them. Adobe is smarter and ignores them because they are overwritten thanks to the "-301" you see (that is a positioning adjustment). This /ActualText thing might be of some help, but I don't think we process it.

Tilman

On 24.02.2016 at 20:47, Francisco Andrés Fernández wrote:
> I send you the link to the pdf file to see if you could figure an idea
> about what the problem is.
>
> https://drive.google.com/file/d/0B0PMZsHkpcJRSEpBSWhtQndKZTg/view?usp=sharing
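To see what the /ActualText entries in the stream above actually contain, the literal string can be decoded by hand. The sketch below is a minimal illustration, not PDFBox code: the class ActualTextDecoder is hypothetical, it handles only plain characters and octal escapes (not the other PDF escape sequences), and it treats strings without a byte order mark as Latin-1, which is only an approximation of PDFDocEncoding. For (\376\377\000\255) the bytes are FE FF 00 AD, that is, a UTF-16BE byte order mark followed by U+00AD, the soft hyphen.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: decodes a PDF literal string that uses octal escapes. */
final class ActualTextDecoder {

    private static boolean isOctal(char c) {
        return c >= '0' && c <= '7';
    }

    /** Turns the body of a literal string (without parentheses) into raw bytes. */
    static byte[] unescapeOctal(String literal) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < literal.length()) {
            char c = literal.charAt(i);
            if (c == '\\' && i + 1 < literal.length() && isOctal(literal.charAt(i + 1))) {
                // A backslash followed by one to three octal digits encodes one byte.
                int value = 0;
                int digits = 0;
                i++;
                while (i < literal.length() && digits < 3 && isOctal(literal.charAt(i))) {
                    value = value * 8 + (literal.charAt(i) - '0');
                    i++;
                    digits++;
                }
                out.write(value & 0xFF);
            } else {
                out.write(c & 0xFF);
                i++;
            }
        }
        return out.toByteArray();
    }

    /** Decodes a PDF text string: UTF-16BE if it starts with FE FF, else Latin-1 (assumption). */
    static String decodeTextString(byte[] bytes) {
        if (bytes.length >= 2 && (bytes[0] & 0xFF) == 0xFE && (bytes[1] & 0xFF) == 0xFF) {
            return new String(bytes, 2, bytes.length - 2, StandardCharsets.UTF_16BE);
        }
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String text = decodeTextString(unescapeOctal("\\376\\377\\000\\255"));
        for (int i = 0; i < text.length(); i++) {
            System.out.printf("U+%04X%n", (int) text.charAt(i));
        }
    }
}
```

Each space drawn inside a /Span is therefore marked as standing for a soft hyphen. A text extractor that honored /ActualText could, in principle, emit that instead of a space, which may be why other viewers return the words unbroken.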
Re: Bad text extraction result
Hi Tilman,

many thanks for your answer. I didn't find any configuration file to tweak this. Here is the link to the PDF file, in case you can work out what the problem is.

https://drive.google.com/file/d/0B0PMZsHkpcJRSEpBSWhtQndKZTg/view?usp=sharing

Many thanks in advance,
Francisco

On Wed, Feb 24, 2016 at 16:29, Tilman Hausherr <thaush...@t-online.de> wrote:
> You could try to modify spacingTolerance or averageCharTolerance in
> PDFTextStripper (find out if TIKA supports this), but it is likely that
> if spaces are ignored, they would be ignored at other places where you
> don't want it.
>
> If possible, please upload your file somewhere.
>
> Tilman
Re: Bad text extraction result
On 24.02.2016 at 20:17, Francisco Andrés Fernández wrote:
> Hi all,
> I'm extracting some text from pdf, through Tika in Solr. As result, some
> important words end with spaces between characters.
> For example, I could have the word "Subtitle" that I want to detect,
> written like "S u b t i t l e".

You could try to modify spacingTolerance or averageCharTolerance in PDFTextStripper (find out if TIKA supports this), but it is likely that if spaces are ignored, they would also be ignored in other places where you don't want that.

If possible, please upload your file somewhere.

Tilman

> How could I make PdfBox detect this type of word occurrence?
> Many thanks,
>
> Francisco
Bad text extraction result
Hi all,

I'm extracting some text from PDF files, through Tika in Solr. As a result, some important words end up with spaces between their characters. For example, I could have the word "Subtitle" that I want to detect, written like "S u b t i t l e".

How could I make PDFBox detect this type of word occurrence?

Many thanks,
Francisco
Re: Rotating a new annotation to match the page's rotation
On 24.02.2016 at 09:34, Gilad Denneboom wrote:
> No one has any ideas? ...

I don't know of a utility. I assume you need to use AffineTransform math to calculate the coordinates, i.e. a rotation and a translation on one axis. Note that rotations always go around the origin (0, 0); that's why you need the translation.

Tilman
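The rotation-plus-translation idea can be sketched with java.awt.geom.AffineTransform. This is only an illustration under stated assumptions, not a PDFBox utility: the class PageRotationMapper is made up, the page box is assumed to start at (0, 0), the input coordinates are assumed to be measured on the page as displayed (origin at its lower-left corner, y pointing up), and the page's Rotate value is taken as a clockwise multiple of 90 degrees. Coordinates that come from a text stripper may use a top-left origin instead, in which case y would have to be flipped first.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Rectangle2D;

/** Hypothetical helper: maps coordinates on the displayed (rotated) page to unrotated user space. */
final class PageRotationMapper {

    /**
     * @param rotation   page rotation in degrees, clockwise, a multiple of 90
     * @param pageWidth  width of the unrotated page
     * @param pageHeight height of the unrotated page
     */
    static AffineTransform displayToUserSpace(int rotation, double pageWidth, double pageHeight) {
        AffineTransform at = new AffineTransform();
        // The translation is concatenated first, so points are rotated and then shifted.
        switch (((rotation % 360) + 360) % 360) {
            case 0:
                break;
            case 90:
                at.translate(pageWidth, 0);
                at.quadrantRotate(1);
                break;
            case 180:
                at.translate(pageWidth, pageHeight);
                at.quadrantRotate(2);
                break;
            case 270:
                at.translate(0, pageHeight);
                at.quadrantRotate(3);
                break;
            default:
                throw new IllegalArgumentException("rotation must be a multiple of 90: " + rotation);
        }
        return at;
    }

    /** Transforms a rectangle and returns its axis-aligned bounds in user space. */
    static Rectangle2D toUserSpace(Rectangle2D displayRect, AffineTransform at) {
        return at.createTransformedShape(displayRect).getBounds2D();
    }

    public static void main(String[] args) {
        AffineTransform at = displayToUserSpace(90, 612, 792);
        Rectangle2D highlight = new Rectangle2D.Double(10, 20, 100, 12);
        System.out.println(toUserSpace(highlight, at));
    }
}
```

The same transform could be applied to each corner of a quad with AffineTransform.transform(), so a highlight's rectangle and its quad points stay consistent with each other.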
Re: (pdffile) does not allow extracting content
Oh... I am re-encrypting the loadNonSeq() document by using document.openProtection(). I'll stop that...

Thanks!

On Wed, Feb 24, 2016 at 2:02 AM, Tilman Hausherr wrote:
> On 24.02.2016 at 00:27, Brzrk One wrote:
>> Yea, I think that's it.
>> Comparing the input pdf to the loadNonSeq() output, I see objects that
>> have the same content.
>> This means that the loadNonSeq() output is encrypted - like the input -
>> while the load() output is not. However, the loadNonSeq() output has no
>> /Encrypt dictionary.
>>
>> I am using this on both paths:
>> StandardDecryptionMaterial sdm = new StandardDecryptionMaterial("");
>> document.openProtection(sdm);
>
> You shouldn't use this on loadNonSeq, or in 2.0 (it isn't available there
> anyway).
>
> You only need it with load() in 1.8.
>
>> without error.
>> Is this a feature of loadNonSeq() in the face of
>> AccessPermission.canExtractContent() == true?
>> Or did I do something wrong here?
>
> You need openProtection() only with load() in 1.8 and only if the file is
> encrypted. (Yours is)
>
> Tilman
Re: Rotating a new annotation to match the page's rotation
I ended up with a transformation that I applied to the coordinates of my unrotated annotation.
Re: Rotating a new annotation to match the page's rotation
Did you create a generic converter, or did you just hard-code the results you got from manually rotating the page?

I'm sure it's possible to abstract this transformation to a formula that could be applied to the quads to rotate them so that they fit the page's rotation. I just can't seem to figure it out...

I actually think that the Matrix2D object in Acrobat's JavaScript does that, but it's too opaque for me to figure out how it works, exactly. I can post the source code for it, if anyone is interested.
Re: Rotating a new annotation to match the page's rotation
I came up with a solution for such a problem in a previous job (which means I don't have access to the code anymore). The process I used was to create test documents in all rotations, with annotations in all four rotations in Adobe Acrobat. I then analyzed the files and came up with a transformation for every case. I then just used a switch statement for the four different page rotations to place my annotation in the correct spot.

Hope that helps,

Karl Heinz
Re: Rotating a new annotation to match the page's rotation
It's a highlight.

Let me give you the background. I'm using a variation on PrintTextLocation to find the locations I want to highlight and then add them. As I wrote, it works very well on pages with zero rotation, but the results are skewed when the pages are rotated. What I mean by skewed is that they appear at the location on the page where the text would have been had it not been rotated, if that makes sense.

I'll try to set up a simple example that demonstrates this issue and share it.

On Wed, Feb 24, 2016 at 1:34 PM, Maruan Sahyoun wrote:
> what type of annotation are you trying to put on the page. I could create
> a little sample placing an annotation at the (visual) upper left corner of
> a portrait and landscape page.
Re: Rotating a new annotation to match the page's rotation
Hi,

what type of annotation are you trying to put on the page? I could create a little sample placing an annotation at the (visual) upper left corner of a portrait and a landscape page.

BR
Maruan
Re: Rotating a new annotation to match the page's rotation
Hi,

On 24 February 2016 at 09:34, Gilad Denneboom wrote:
> I have a tool that adds new highlight annotations to a page. It works very
> well, except for when the page is rotated. I know I need to apply a
> transformation to my rect and/or quads to get them to match the rotated
> user space, but I just can't get it to work.

I'm not an annotation expert, but according to the spec both the Rect and the QuadPoints values are specified in default user space, which doesn't include any rotation or scaling. But I have no clue where to put this information instead.

Can you create a sample PDF with such an annotation using Acrobat or something similar, so that we can have a look at what it looks like?

BR
Andreas
Re: Rotating a new annotation to match the page's rotation
No one has any ideas? ...

On Sun, Feb 21, 2016 at 12:30 AM, Gilad Denneboom wrote:
> Hi all,
>
> Hoping someone can help me with this issue...
> I have a tool that adds new highlight annotations to a page. It works very
> well, except for when the page is rotated. I know I need to apply a
> transformation to my rect and/or quads to get them to match the rotated
> user space, but I just can't get it to work.
> Is there a utility in PDFBox (I'm using 1.8.11 at the moment) that can
> help me perform this transformation so I can place my annotations at the
> right location on these pages?
>
> Thanks a lot in advance for any helpful tips...
>
> Gilad