Thanks a lot for all the answers. I managed to get it working for the relevant cases.

Regards,
Lukas

-----Original Message-----
From: Peter Murray-Rust <peter.murray.r...@googlemail.com.INVALID>
Sent: Friday, 20 September 2019 18:58
To: users@pdfbox.apache.org
Subject: Re: AW: Finding a Box containing text

I do a lot of this and there is no generic way. The rect might be a rect or 4 lines or a polyline 3 or 4 (or 5 for overlaps). It might be drawn twice for emphasis. I have some heuristics for creating probable rects in http://github.com/petermr/ami3
If you are serious and doing a *lot*, I can show you where to find them.

On Fri, 20 Sep 2019, 15:41 PDF Developer, <pdf...@yahoo.com.invalid> wrote:

> Lukas.
> Quick answer:
> I looked at the page content stream using the PDFBox Debugger, and
> appendRectangle isn't triggering because there isn't a rectangle in
> the page content stream. What is rendered is made up of a move and
> lines. I also had to handle this in my project. One way would be to
> override the other methods so that you catch moveTo, lineTo, closePath,
> strokeAndFill etc. and store the points, then check when closePath is
> called whether they form a rectangle.
>
> If I have time (I am about to go on a business trip), I will see if I
> can cut down my code to illustrate this.
>
> PDFDev
>
> On Friday, September 20, 2019, 2:58:58 PM GMT+1, STAMPF Lukas <lukas.sta...@bat.at> wrote:
>
> Hello,
>
> Thanks for the input.
> https://filebox.batmen.at/index.php/s/R2PA4HB6eIXkc8c
>
> It seems I can't use the appendRectangle method. It does not trigger.
>
> Regards,
> Lukas
>
> -----Original Message-----
> From: PDF Developer <pdf...@yahoo.com.INVALID>
> Sent: Friday, 20 September 2019 11:02
> To: users@pdfbox.apache.org
> Subject: Re: Finding a Box containing text
>
> Hello Lukas,
> This mailing list doesn't accept attachments; you probably want to use
> a hosting site instead.
>
> I am currently working on a project that needs to identify text on a
> page within a rectangle.
>
> This may or may not be appropriate, but to do this I subclass
> "PDFGraphicsStreamEngine", which has a method appendRectangle. If your
> PDF creation application is well behaved, you can just use that. That
> said, in the real world a rectangle can be made up of lines and moves,
> so you may have a bit more work to do. If you have the coordinates of
> the start of the string, then you could enumerate the rectangles to
> see if the point is in a rectangle. Or you could do things
> slightly in reverse: take the bounds of the rectangle, use
> PDFTextStripperByArea to get the text in the rectangle, and check
> whether the string is what you are looking for.
> Unfortunately I can't share my project code, but if you can find
> somewhere to host the PDF, I will see if I can use it as a test for my
> code and, if that is successful, provide something by way of a
> slimmed-down example.
> PDFDev
>
> On Friday, September 20, 2019, 9:07:20 AM GMT+1, STAMPF Lukas <lukas.sta...@bat.at> wrote:
>
> Hello,
>
> I am trying to find (x, y, width, height) of a box containing a text
> within a PDF document. Locating the text by inheriting the
> TextPosition was pretty straightforward, but I had to realize that I
> don't know PDF operators well enough to locate the box.
>
> Can somebody please have a look at the PDF I attached and tell me
> which "q" - "Q" block represents my "FIND ME" box? Can I subclass
> PDFRenderer to get the box position?
>
> Regards,
> Lukas
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscr...@pdfbox.apache.org
> For additional commands, e-mail: users-h...@pdfbox.apache.org
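
For readers of the archive: the approach PDFDev describes in the thread (store the points from moveTo and lineTo, then check at closePath whether they form a rectangle, and test whether the text's start point lies inside) might look roughly like the sketch below. It is not code from anyone in this thread. PathRectDetector, its method names and the EPS tolerance are hypothetical, and it uses only java.awt.geom so that it runs without PDFBox. The assumption is that a PDFGraphicsStreamEngine subclass would forward its moveTo, lineTo and closePath callbacks to it. It only recognises axis-aligned rectangles drawn as one closed subpath, so none of the harder cases Peter mentions (separate lines, double strokes) are covered.

```java
import java.awt.geom.Point2D;
import java.awt.geom.Rectangle2D;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: collects path points and keeps closed axis-aligned rectangles. */
final class PathRectDetector {
    private static final double EPS = 0.01; // assumed tolerance, in page units

    private final List<Point2D> current = new ArrayList<>();
    private final List<Rectangle2D> rectangles = new ArrayList<>();

    // Intended to be called from the engine's moveTo: starts a new subpath.
    void moveTo(double x, double y) {
        current.clear();
        current.add(new Point2D.Double(x, y));
    }

    // Intended to be called from the engine's lineTo.
    void lineTo(double x, double y) {
        current.add(new Point2D.Double(x, y));
    }

    // Intended to be called from the engine's closePath: test the stored points.
    void closePath() {
        Rectangle2D r = asRectangle(current);
        if (r != null) {
            rectangles.add(r);
        }
        current.clear();
    }

    List<Rectangle2D> rectangles() {
        return rectangles;
    }

    // Returns the first detected rectangle containing the point, or null.
    Rectangle2D containing(double x, double y) {
        for (Rectangle2D r : rectangles) {
            if (r.contains(x, y)) {
                return r;
            }
        }
        return null;
    }

    // Returns the rectangle described by the points, or null if they do not form one.
    static Rectangle2D asRectangle(List<Point2D> pts) {
        List<Point2D> p = new ArrayList<>(pts);
        // Some producers draw an explicit line back to the start before closing.
        if (p.size() == 5 && p.get(0).distance(p.get(4)) < EPS) {
            p.remove(4);
        }
        if (p.size() != 4) {
            return null;
        }
        double minX = Double.MAX_VALUE, minY = Double.MAX_VALUE;
        double maxX = -Double.MAX_VALUE, maxY = -Double.MAX_VALUE;
        for (Point2D q : p) {
            minX = Math.min(minX, q.getX());
            maxX = Math.max(maxX, q.getX());
            minY = Math.min(minY, q.getY());
            maxY = Math.max(maxY, q.getY());
        }
        if (maxX - minX < 2 * EPS || maxY - minY < 2 * EPS) {
            return null; // degenerate: a line or a point
        }
        int[] corner = new int[4];
        int seen = 0;
        for (int i = 0; i < 4; i++) {
            Point2D q = p.get(i);
            boolean left = Math.abs(q.getX() - minX) < EPS;
            boolean right = Math.abs(q.getX() - maxX) < EPS;
            boolean bottom = Math.abs(q.getY() - minY) < EPS;
            boolean top = Math.abs(q.getY() - maxY) < EPS;
            if (!(left || right) || !(bottom || top)) {
                return null; // point is not on a corner of the bounding box
            }
            corner[i] = (right ? 1 : 0) | (top ? 2 : 0);
            seen |= 1 << corner[i];
        }
        if (seen != 0b1111) {
            return null; // not all four corners were visited
        }
        for (int i = 0; i < 4; i++) {
            int diff = corner[i] ^ corner[(i + 1) % 4];
            if (diff != 1 && diff != 2) {
                return null; // diagonal edge, e.g. a bow-tie shape
            }
        }
        return new Rectangle2D.Double(minX, minY, maxX - minX, maxY - minY);
    }

    public static void main(String[] args) {
        PathRectDetector d = new PathRectDetector();
        d.moveTo(100, 700);
        d.lineTo(300, 700);
        d.lineTo(300, 750);
        d.lineTo(100, 750);
        d.closePath();
        System.out.println(d.containing(150, 720));
    }
}
```

One caveat when wiring this up: the sketch assumes the text start point and the path points are in the same coordinate system. The text coordinates reported by PDFBox may use a different y-axis orientation than the path coordinates, so a conversion may be needed before calling containing.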