Allow changing the default TextPositionComparator in PDFTextStripper
--------------------------------------------------------------------
Key: PDFBOX-1163
URL: https://issues.apache.org/jira/browse/PDFBOX-1163
Project: PDFBox
Issue Type: New Feature
Components: Text extraction
Environment: All
Reporter: Sébastien Dailly
Priority: Minor
As mentioned on the mailing list, PDFTextStripper does not allow changing
the default TextPositionComparator used for ordering the characters on the page.
Here is a patch that adds a getter and a setter:
23a24
> import java.util.Comparator;
69c70,76
< //enable the ability to set the default indent/drop thresholds
---
> /**
> * This comparator is used for sorting the characters by their position
> * on the page.
> */
> private Comparator<TextPosition> textPositionComparator = new TextPositionComparator();
>
> //enable the ability to set the default indent/drop thresholds
550c557
< TextPositionComparator comparator = new TextPositionComparator();
---
> Comparator<TextPosition> comparator = getTextPositionComparator();
1312a1320,1338
> /**
> * Get the Comparator used when sortByPosition is true.
> * @return a Comparator
> */
> public final Comparator<TextPosition> getTextPositionComparator() {
> return textPositionComparator;
> }
>
>
> /**
> * Set the Comparator used when sortByPosition is true.
> * @param textPositionComparator the Comparator to use
> */
> public final void setTextPositionComparator(
> Comparator<TextPosition> textPositionComparator) {
> this.textPositionComparator = textPositionComparator;
> }
>
>
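To illustrate what the setter makes possible, here is a minimal, self-contained sketch of the same pattern. It does not use PDFBox: Glyph and Stripper are hypothetical stand-ins for TextPosition and PDFTextStripper, the default ordering (top to bottom, then left to right, with y growing downward) is an assumption made for the example, and only the getter/setter shape mirrors the patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class ComparatorSetterSketch {

    /** Hypothetical stand-in for TextPosition: a piece of text and its page coordinates. */
    static final class Glyph {
        final String text;
        final float x;
        final float y;

        Glyph(String text, float x, float y) {
            this.text = text;
            this.x = x;
            this.y = y;
        }
    }

    /** Hypothetical stand-in for the stripper, reduced to the sorting step. */
    static class Stripper {
        // Default order in this sketch: top to bottom, then left to right.
        private Comparator<Glyph> comparator =
                Comparator.<Glyph>comparingDouble(g -> g.y).thenComparingDouble(g -> g.x);

        public final Comparator<Glyph> getComparator() {
            return comparator;
        }

        public final void setComparator(Comparator<Glyph> comparator) {
            this.comparator = comparator;
        }

        /** Sorts a copy of the glyphs with the current comparator and joins their text. */
        String extract(List<Glyph> glyphs) {
            List<Glyph> sorted = new ArrayList<>(glyphs);
            sorted.sort(getComparator());
            StringBuilder out = new StringBuilder();
            for (Glyph g : sorted) {
                out.append(g.text);
            }
            return out.toString();
        }
    }

    /** Extracts a 2x2 grid of glyphs, optionally after swapping in a column-first order. */
    static String demo(boolean columnFirst) {
        List<Glyph> page = Arrays.asList(
                new Glyph("D", 10f, 10f), new Glyph("B", 10f, 0f),
                new Glyph("C", 0f, 10f), new Glyph("A", 0f, 0f));
        Stripper stripper = new Stripper();
        if (columnFirst) {
            // Caller-supplied order: left to right, then top to bottom.
            stripper.setComparator(
                    Comparator.<Glyph>comparingDouble(g -> g.x).thenComparingDouble(g -> g.y));
        }
        return stripper.extract(page);
    }

    public static void main(String[] args) {
        System.out.println(demo(false)); // prints ABCD
        System.out.println(demo(true));  // prints ACBD
    }
}
```

With the patch applied, a caller would presumably do the same against the real class by passing its own Comparator<TextPosition> to setTextPositionComparator before extracting text with sortByPosition enabled.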