Author: tilman
Date: Thu Jan 1 14:38:39 2026
New Revision: 1931039
Log:
PDFBOX-5660: improve javadoc, as suggested by Valery Bokov; closes #384
Modified:
pdfbox/trunk/pdfbox/src/main/java/org/apache/pdfbox/text/PDFTextStripper.java
Modified:
pdfbox/trunk/pdfbox/src/main/java/org/apache/pdfbox/text/PDFTextStripper.java
==============================================================================
--- pdfbox/trunk/pdfbox/src/main/java/org/apache/pdfbox/text/PDFTextStripper.java Thu Jan 1 14:38:35 2026 (r1931038)
+++ pdfbox/trunk/pdfbox/src/main/java/org/apache/pdfbox/text/PDFTextStripper.java Thu Jan 1 14:38:39 2026 (r1931039)
@@ -167,14 +167,14 @@ public class PDFTextStripper extends Leg
* The charactersByArticle is used to extract text by article divisions. For example a PDF that has two columns like
* a newspaper, we want to extract the first column and then the second column. In this example the PDF would have 2
* beads(or articles), one for each column. The size of the charactersByArticle would be 5, because not all text on
- * the screen will fall into one of the articles. The five divisions are shown below
- *
- * Text before first article
- * first article text
- * text between first article and second article
- * second article text
- * text after second article
- *
+ * the screen will fall into one of the articles. The five divisions are shown below:
+ * <ol>
+ * <li>Text before first article</li>
+ * <li>first article text</li>
+ * <li>text between first article and second article</li>
+ * <li>second article text</li>
+ * <li>text after second article</li>
+ * </ol>
* Most PDFs won't have any beads, so charactersByArticle will contain a single entry.
*/
protected ArrayList<List<TextPosition>> charactersByArticle = new ArrayList<>();
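
The division count described in the Javadoc above can be illustrated with a small standalone sketch. It is not PDFBox code: the class `ArticleDivisions` and its methods are hypothetical, `String` stands in for `TextPosition`, and the generalization from the two documented cases (2 beads give 5 divisions, no beads give a single entry) to `2 * beads + 1` is an assumption rather than something this commit states.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of PDFBox: models how a page with a given
// number of beads is split into article divisions.
final class ArticleDivisions
{
    private ArticleDivisions()
    {
    }

    // Assumed rule: one division before the first bead, then for each bead
    // one division for its own text and one for the text that follows it.
    static int divisionCount(int beadCount)
    {
        if (beadCount < 0)
        {
            throw new IllegalArgumentException("beadCount must not be negative");
        }
        return 2 * beadCount + 1;
    }

    // Builds an empty list of lists with one slot per division, mirroring the
    // shape of charactersByArticle (String is a stand-in for TextPosition).
    static List<List<String>> emptyDivisions(int beadCount)
    {
        int count = divisionCount(beadCount);
        List<List<String>> divisions = new ArrayList<>(count);
        for (int i = 0; i < count; i++)
        {
            divisions.add(new ArrayList<>());
        }
        return divisions;
    }
}
```

Under this assumption, `ArticleDivisions.emptyDivisions(2)` has five slots, matching the two-column newspaper example, and `ArticleDivisions.emptyDivisions(0)` has one, matching the case of a PDF without beads.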