I would say that the approach that fixes the most items w/o breaking those
that work is the better approach.

What I have done is run the regression tests before committing.  If, though
I've fixed something, I've broken a regression test, I haven't committed
that change.

We do have a regression test for the image rendering now (have since May,
IIRC) and 1 or 2 of those tests involve rotation.

Daniel Wilson

On 10/31/08, Brian Carrier (JIRA) <[EMAIL PROTECTED]> wrote:
>
>
>     [
> https://issues.apache.org/jira/browse/PDFBOX-363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12644414#action_12644414]
>
>
> Brian Carrier commented on PDFBOX-363:
> --------------------------------------
>
>
> What additional information is needed for a commiter to make a decision on
> which approach will be used?  There are some additional text rotation
> problems that are not addressed by either this patch or the one associated
> with PDFBOX-374 and the fix requires taking the text matrix into account.
> The current fix that I am working on will be impacted by which approach is
> used.
>
> FWIW, the approach of not  rotating the text in PDFTextStripper and taking
> the rotation into account in TextPositionComparator will be easiest for the
> next fix as well (which will allow page rotation as well as having text on a
> page going in a different direction - such as on the y-axis of a graph).
>
>
>
> > Fixed Page rotation
> > -------------------
> >
> >                 Key: PDFBOX-363
> >                 URL: https://issues.apache.org/jira/browse/PDFBOX-363
> >             Project: PDFBox
> >          Issue Type: Improvement
> >            Reporter: Jukka Zitting
> >
> > [Issue from SourceForge]
> >
> http://sourceforge.net/tracker/index.php?func=detail&aid=1977429&group_id=78314&atid=552834
> > Hi all,
> > Daniel asked me for my patch for the rotation-issue described in
> > https://sourceforge.net/forum/message.php?msg_id=4992032
> > Attention, I didn't apply the newest patches to the classes
> PDFStreamEngine
> > and PageDrawer.
> > There are 4 more probably affected classes calling the page.findRotation
> > method which I didn't change, because I'm didn't have to use them (until
> > now).
> > org.pdfbox.util.operator.pagedrawer.Invoke
> > org.pdfbox.util.TextPositionComparator
> > org.pdfbox.examples.pdmodel.PrintURLs
> > org.pdfbox.examples.util.PrintImageLocations
> > I've attached a pdf in DINA4-landscape. The text is missplaced whenever I
> > try to print or display (using the pdfbox-PDFReader and convertToImage
> > within my application) it with pdfbox. The acrobat reader has no problems
> > with my documents.
> > After my patch everything works fine. Perhaps it is a point of
> discussion,
> > if the convertToImage method has to rotate the image or if the user has
> to
> > do it. The PDFPagePanel didn't do it (yet).
> > Andreas
> >
> http://sourceforge.net/tracker/download.php?group_id=78314&atid=552834&file_id=279404&aid=1977429
> > [Comment from SourceForge]
> > Date: 2008-05-29 12:42
> > Sender: danielwilson
> > Logged In: YES
> > user_id=1737686
> > Originator: NO
> > I've just tried your sample PDF w/ the latest code -- prior to
> application
> > of your patch.  It doesn't work.
> > I'll work on incorporating your change for a full regression test in the
> > next hour or so.
> > [Comment from SourceForge]
> > Date: 2008-05-29 15:16
> > Sender: lehmialk
> > Logged In: YES
> > user_id=2069622
> > Originator: YES
> > Hi Daniel,
> > I've just added my patch to the newest sources you send me earlier this
> > day. I guess it works. During testing I've found another problem
> concernign
> > graphics within landscape-docs. I found the solution in patching the
> class
> > org.pdfbox.util.operator.pagedrawer.Invoke in the same way I've patched
> the
> > others. And consequently to be strict I've also patched the new methods
> in
> > org.pdfbox.pdfviewer.PageDrawer
> > For my everthings works fine inlc. the 4PP-pdf.
> > I've attached the patched files and another testpdf with a embedded
> > graphic.
> > Andreas
> > File Added: pdfbox_rotation_patch_2.zip
> >
> http://sourceforge.net/tracker/download.php?group_id=78314&atid=552834&file_id=279471&aid=1977429
> > [Comment from SourceForge]
> > Date: 2008-05-29 18:12
> > Sender: danielwilson
> > Logged In: YES
> > user_id=1737686
> > Originator: NO
> > Your code works w/ the 4PP test ... and with the other rendering stuff
> > I've tried so far.
> > However ... the text extraction test fails with it.  I can't figure that
> > one out ... ideas?
> > [Comment from SourceForge]
> > Date: 2008-05-29 18:19
> > Sender: lehmialk
> > Logged In: YES
> > user_id=2069622
> > Originator: YES
> > Can you give me some more details? I never do any textextractions with
> > pdfbox. Perhaps you'll provide with the code for test program, or is it
> > part of pdfbox, so that I can find it in the cvs?
> > However, it has to wait until tomorrow
> > [Comment from SourceForge]
> > Date: 2008-05-29 18:39
> > Sender: danielwilson
> > Logged In: YES
> > user_id=1737686
> > Originator: NO
> > If you've got the whole project set up, try
> > ant testextract
> > I'll see if I can narrow it down some.
> > [Comment from SourceForge]
> > Date: 2008-05-29 21:00
> > Sender: danielwilson
> > Logged In: YES
> > user_id=1737686
> > Originator: NO
> > The extraction problem seems to have to do w/ the changes to
> > PDFStreamEngine.
> > If I revert that file, extraction succeeds.  Unfortunately ... with that
> > reverted but your other changes in place, image rendering hangs.
> > Will work on it more ... probably tomorrow.
> > [Comment from SourceForge]
> > Date: 2008-05-29 21:12
> > Sender: danielwilson
> > Logged In: YES
> > user_id=1737686
> > Originator: NO
> > Correction ... it doesn't hang ... it's just slow on the first PDF to
> > render ... maybe just due to the first one I'm sending it.
> > Will look more tomorrow.
> > [Comment from SourceForge]
> > Date: 2008-05-30 07:11
> > Sender: lehmialk
> > Logged In: YES
> > user_id=2069622
> > Originator: YES
> > I've found one bug. While deleting the if rules for the rotation, I've
> > deleted line 394 which is still needed.
> > I've attached the corrected file
> > File Added: PDFStreamEngine.java
> >
> http://sourceforge.net/tracker/download.php?group_id=78314&atid=552834&file_id=279559&aid=1977429
> > [Comment from SourceForge]
> > Date: 2008-05-30 07:43
> > Sender: lehmialk
> > Logged In: YES
> > user_id=2069622
> > Originator: YES
> > I forgot to mention that I can't run the test suite. When I try to get
> the
> > whole project, I realized that I'm behind a firewall here in my office.
> > Consequently my cvs-client doesn't work. I've to do it from home. :-(
> > I've only tested one file: 601501018.pdf
> > There are additional blanks and they disapper after adding the missing
> > line. But starting at page 21, when the document orientation changes from
> > portrait to landscape, there are additional cr or lf. Hmmmm ??
> > [Comment from SourceForge]
> > Date: 2008-05-30 08:25
> > Sender: lehmialk
> > Logged In: YES
> > user_id=2069622
> > Originator: YES
> > I've continued testing and I guess the problem is somewhere starting in
> > org.pdfbox.util.PDFTextStripper.showCharacter(..). Obviously it handles
> the
> > coordinates for rotated pages somehow in an other way than the
> > implementation of the showCharacter() in org.pdfbox.pdfviewer.PageDrawer.
> > But for the moment I don't understand what's happening in the
> > TextStripper, perhaps I'll find out later.
> > I hope this hint helps ...
> > [Comment from SourceForge]
> > Date: 2008-05-30 16:20
> > Sender: danielwilson
> > Logged In: YES
> > user_id=1737686
> > Originator: NO
> > I've put a couple more hours into this, and I don't know the answer.
> > I do know the text extraction is the more mature side of this library.
> > For the moment, I'll be skipping over your changes to PDFStreamEngine.
> > Thanks for the other changes!
> > [Comment from SourceForge]
> > Date: 2008-06-02 09:21
> > Sender: lehmialk
> > Logged In: YES
> > user_id=2069622
> > Originator: YES
> > Hi Daniel,
> > I guess I've solved the problem. The textposition-handling has to be
> > adjusted within the method PDFTextStripper.flushText(). Of course my
> former
> > changes to the class PDFStreamEngine are needed. During debugging I found
> a
> > bug in the class TextPositionComparator (line 82). I solved it by
> removing
> > the rotation if-clauses. Whenever you compare two Textpositions, it is
> > needless to look at the rotation because they are on the same page so
> that
> > the comparison is independent of the rotation.
> > Furthermore my PDFTextStripper-patch seems to correct some minor
> problems,
> > which are described in
> > https://sourceforge.net/forum/message.php?msg_id=4976730.
> > I've tested the following cases:
> > Garcia2003b__Correlative_exploration_of_EEG_Signals.pdf works 100%
> > test_rotate_270.txt doesn't work 100%, but my patch corrected a bug in
> > lines 251-257, 278/279, 502/503, 574/575 and the other differences are
> some
> > kind of special-character-issues. I guess you have to correct the input
> at
> > first.
> > I've attached my changes based on the newest versions of both classes.
> > [Comment from SourceForge]
> > Date: 2008-06-02 09:22
> > Sender: lehmialk
> > Logged In: YES
> > user_id=2069622
> > Originator: YES
> > File Added: pdfbox_rotation_patch_3.zip
> >
> http://sourceforge.net/tracker/download.php?group_id=78314&atid=552834&file_id=279842&aid=1977429
>
> --
> This message is automatically generated by JIRA.
> -
> You can reply to this email to add a comment to the issue online.
>
>

Reply via email to