Hi Andreas,

I'm not familiar with the format of those invoices, and I'm not aware of any projects using OpenNLP for that exact task.
Depending on the structure of the documents, it may be possible to use OpenNLP's document categorizer [1] to train a model to categorize the invoices. If you have a link to a sample invoice I can take a look and give you a better answer.

[1] https://opennlp.apache.org/docs/1.7.2/manual/opennlp.html#tools.doccat

Thanks,
Jeff

On Thu, Mar 17, 2022 at 4:40 PM Andreas Røsdal <andreas.ros...@gmail.com> wrote:

> Hello!
> I am interested in using OpenNLP to classify PEPPOL and EHF invoices.
> The idea is to use the categorizer and machine learning functionality of
> OpenNLP to automatically categorize invoices according to government
> standard accounting codes.
> Are there any ongoing open source project using OpenNLP in this way?
>
> Regards,
> Andreas R.
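
P.S. In case it helps while you look into this: the document categorizer is trained from a plain text file with one sample per line, the category first and then the text, separated by whitespace. Below is a rough sketch in Python of preparing such a file. It assumes you have already extracted a short description text from each invoice by whatever means fits the format. The function name `to_doccat_line`, the labels `ACCT_A` and `ACCT_B`, and the sample texts are all made up for illustration and are not real accounting codes.

```python
import re


def to_doccat_line(category: str, text: str) -> str:
    """Format one training sample as '<category> <text>' on a single line.

    The category must be a single token, so internal whitespace is replaced
    with underscores. Line breaks and repeated spaces in the text are
    collapsed so that each sample stays on one line.
    """
    label = re.sub(r"\s+", "_", category.strip())
    body = " ".join(text.split())
    if not label or not body:
        raise ValueError("category and text must both be non-empty")
    return f"{label} {body}"


# Hypothetical samples: placeholder labels, invented invoice descriptions.
samples = [
    ("ACCT_A", "Office supplies: printer paper and toner"),
    ("ACCT_B", "Train ticket\n  Oslo - Bergen"),
]

lines = [to_doccat_line(label, text) for label, text in samples]
print("\n".join(lines))
```

The resulting lines can be written to a UTF-8 text file and used as training data for the categorizer described in [1]. Whether the model learns anything useful will depend on how much descriptive text the invoices actually carry per line item.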