On 09.01.2017 at 13:23, [email protected] wrote:
Hi,
thanks for the reply, but I only have the string that the pattern matches.
In the PDF there is a table like this:
10110  Paperbox  3,49
30220N  Scissors    7,99

It seems that what you are saying is that text extraction doesn't return the whole line.

Did you try the sort option?

Re attachments: upload your file to a file-sharing host.

Tilman

My pattern only matches the first column. To get the description and price, I need 
the whole line.
So does PDFBox have a command to get the complete current line?
thx
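
One way to get the description and price together with the article number is to extend the pattern so that it covers the whole row. The following is only a minimal sketch in plain Java. It assumes the extracted text keeps each table row on a single line, that columns are separated by spaces or tabs, and that prices use a decimal comma as in the sample above. The names TableRows, Row and parseRows are hypothetical and not part of PDFBox.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TableRows {
    // Hypothetical holder for one table row.
    static final class Row {
        final String articleNo;
        final String description;
        final BigDecimal price;

        Row(String articleNo, String description, BigDecimal price) {
            this.articleNo = articleNo;
            this.description = description;
            this.price = price;
        }
    }

    // MULTILINE makes ^ and $ match at line boundaries.
    // [ \t]+ is used instead of \s+ so that a match cannot span two lines.
    private static final Pattern ROW = Pattern.compile(
            "^([0-9]{5}[A-Z]?)[ \\t]+(.+?)[ \\t]+([0-9]+,[0-9]{2})[ \\t]*$",
            Pattern.MULTILINE);

    static List<Row> parseRows(String extractedText) {
        List<Row> rows = new ArrayList<>();
        Matcher m = ROW.matcher(extractedText);
        while (m.find()) {
            // Assumption: decimal comma, so "3,49" becomes 3.49.
            BigDecimal price = new BigDecimal(m.group(3).replace(',', '.'));
            rows.add(new Row(m.group(1), m.group(2), price));
        }
        return rows;
    }

    public static void main(String[] args) {
        String text = "10110  Paperbox  3,49\n30220N  Scissors    7,99\n";
        for (Row r : parseRows(text)) {
            System.out.println(r.articleNo + " | " + r.description + " | " + r.price);
        }
    }
}
```

If the extracted text splits a row across several lines, this pattern will not match that row, which is where the sort option or a table-aware tool becomes relevant.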



Sent: Monday, 09 January 2017 at 07:25
From: "Tilman Hausherr" <[email protected]>
To: [email protected]
Subject: Re: PDFBox and Pattern
On 09.01.2017 at 01:12, [email protected] wrote:
Dear all,

I have a pdf with tables (and some other stuff).

I used
Pattern p = Pattern.compile("([0-9]{5})[A-Z]?");
to get the lines with 5 digits and an optional letter.

How do I read the whole line if a match was found?
You already have the line if you checked whether it matches.
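
The point above can be sketched as follows: if the pattern is applied to one line at a time, the line that matched is already in hand. This is only an illustration in plain Java. It assumes the text has already been extracted into a single String (for example with PDFBox's PDFTextStripper), and the names MatchingLines and matchingLines are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class MatchingLines {
    // Pattern from the question: five digits plus an optional uppercase letter.
    private static final Pattern ARTICLE = Pattern.compile("([0-9]{5})[A-Z]?");

    // Returns every line of the extracted text that contains an article number.
    static List<String> matchingLines(String extractedText) {
        List<String> hits = new ArrayList<>();
        // \R matches any line break sequence (\n, \r\n, ...).
        for (String line : extractedText.split("\\R")) {
            if (ARTICLE.matcher(line).find()) {
                hits.add(line); // keep the whole line, not just the matched group
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String text = "Article list\n10110  Paperbox  3,49\n30220N  Scissors    7,99\n";
        matchingLines(text).forEach(System.out::println);
    }
}
```

Note that find() accepts five digits anywhere in the line, so unrelated lines containing such a number (a postal code, for instance) would also be returned. Anchoring the pattern with ^ is one way to tighten it.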

Otherwise: Is there a good alternative to read in the data as a table directly?
Try Tabula.

Other topic: Is it not possible to send attachments here?
No

Tilman

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]