Hi,

I have also been looking for some time for a way to interpret the text
content of a PDF invoice, but I don't think there is an easy solution
yet. Deep learning and neural networks are too complex a tool for
quickly categorizing the contents of an invoice. Cloud solutions such as
Rossum <https://rossum.ai/> do this quite well, but all data is sent to
AWS first, which is questionable for business data...
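
For a single, fixed layout like the sample invoice quoted below, a few
regular expressions over the extracted text may already be enough. This
is only a rough sketch under that assumption: the field names and
patterns are hypothetical, chosen to fit this one sample, and would not
carry over to invoices from other vendors.

```python
import re

# Text as an extractor might return it (abridged from the quoted sample).
SAMPLE = """\
Invoice Number INV-3337
Order Number 12345
Invoice Date January 25, 2016
Due Date January 31, 2016
Total Due $93.50
"""

# Hypothetical field names mapped to patterns; they fit this layout only.
FIELD_PATTERNS = {
    "invoice_number": r"^Invoice Number\s+(\S+)",
    "order_number": r"^Order Number\s+(\S+)",
    "invoice_date": r"^Invoice Date\s+(.+)$",
    "due_date": r"^Due Date\s+(.+)$",
    "total_due": r"^Total Due\s+\$([\d.,]+)",
}


def extract_fields(text):
    """Return the fields found in text; missing fields are omitted."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = re.search(pattern, text, flags=re.MULTILINE)
        if match:
            fields[name] = match.group(1).strip()
    return fields


if __name__ == "__main__":
    print(extract_fields(SAMPLE))
```

Because the lookup is keyed on labels rather than on position, the order
in which the text blocks come out of the parser does not matter for
these fields.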


===
Ralph

On 11.07.19 19:26, Chris Mattmann wrote:
>
> Tabula PDF is something I have been looking at for this as well as doing
> like Deep Neural Nets…
>
> *From: *Sergey Beryozkin <[email protected]>
> *Reply-To: *"[email protected]" <[email protected]>
> *Date: *Thursday, July 11, 2019 at 10:25 AM
> *To: *"[email protected]" <[email protected]>
> *Subject: *[EXTERNAL] How to parse PDF more effectively
>
>  
>
> Hi
>
>  
>
> I've used Tika to parse this invoice PDF:
>
>  
>
> https://slicedinvoices.com/pdf/wordpress-pdf-invoice-plugin-sample.pdf
>
>  
>
> (AutoDetectParser, ToTextContentHandler), see below what is returned.
>
> The numbers like (1), (2) are added by myself, this is the preferred
> order (approximately).
>
>  
>
> Is it possible to hint somehow to Tika how to report the content ?
>
>  
>
> Thanks Sergey
>
>  
>
> PDF Invoice Example
> Invoice
>
> (5)Payment is due within 30 days from date of invoice. Late payment is
> subject to fees of 5% per month.
>
> Thanks for choosing DEMO - Sliced Invoices | [email protected]
> <mailto:[email protected]>
>
> Page 1/1
>
> (2)From:
>
> DEMO - Sliced Invoices
>
> Suite 5A-1204
>
> 123 Somewhere Street
>
> Your City AZ 12345
>
> [email protected] <mailto:[email protected]>
>
> (1)Invoice Number INV-3337
>
> Order Number 12345
>
> Invoice Date January 25, 2016
>
> Due Date January 31, 2016
>
> Total Due $93.50
>
> (3)To:
>
> Test Business
>
> 123 Somewhere St
>
> Melbourne, VIC 3000
>
> [email protected] <mailto:[email protected]>
>
> (4) Hrs/Qty Service Rate/Price Adjust Sub Total
>
> 1.00
> Web Design
> This is a sample description...
>
> $85.00 0.00% $85.00
>
> Sub Total $85.00
>
> Tax $8.50
>
> Total $93.50
>
> (5) ANZ Bank
>
> ACC # 1234 1234
>
> BSB # 4321 432 Paid
>