> From: Glen Batchelor
> I hope that your PO number is visually unique, because 
> it will be difficult to locate and match by text in 
> any OCR output unless it:
> 
> 1) is in the same physical place on every invoice
> 2) always starts or ends with the same unique string
> 
> Many newer OCR+PDF converter applications support 
> barcode detection, but the most I've done with it is 
> make the barcode show up better in the final PDF. 
> I haven't used it to route paperwork yet.

The responses so far explain my confusion.  There are email text
invoices, emails with attachments in various formats, fax, hard
copy, EDI, XML according to OASIS or proprietary standards, web
services...  The data acquisition will be unique for each medium,
so the notion of having one software package to do it all seemed
a bit magical.

Ignoring the electronic EDI/XML types, I think all of the other
document formats can be scanned physically if they're not already
in digital form, and then, as others suggest, OCR'd.  For the
process to be automated, though, the software needs to be
"trainable": every vendor formats its documents differently, so
you will find the invoice number in a different place on each.
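
To make the "trainable" idea concrete, here is a minimal Python
sketch of per-vendor rules applied to text that has already been
OCR'd.  The vendor names, the patterns, and the function name are
all hypothetical; this is not how Kofax, LeadTools, or any other
product does it, just an illustration of why each vendor needs
its own rule.

```python
import re

# Hypothetical per-vendor rules: each vendor gets its own pattern
# describing how the invoice number appears in the OCR'd text.
VENDOR_PATTERNS = {
    # e.g. "Invoice No: A-10023" or "Invoice # A-10023"
    "acme": re.compile(
        r"Invoice\s*(?:No\.?|#)\s*[:\-]?\s*([A-Z0-9\-]+)", re.IGNORECASE
    ),
    # e.g. "INV-004512": a fixed unique prefix followed by six digits
    "globex": re.compile(r"\bINV-(\d{6})\b"),
}


def find_invoice_number(vendor, ocr_text):
    """Return the invoice number for a known vendor, or None."""
    pattern = VENDOR_PATTERNS.get(vendor)
    if pattern is None:
        return None  # unknown vendor: leave it for manual review
    match = pattern.search(ocr_text)
    return match.group(1) if match else None
```

"Training" here just means adding a pattern for each new vendor;
anything that returns None would fall back to a person keying in
the number.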

As to a solution that does the optical scanning part,
http://kofax.com/ has various offerings.  We researched something
like this for a client, and the best development tools we found
were on this site:
http://www.leadtools.com/home2/VertMkts/LTProdOvrvw.htm.  Since
those are components, a solution would need to be written around
them.  If you're looking at spending tens of thousands of dollars
on this (and that's not unusual in this area) then you might
prefer a less expensive DIY solution.  While I offered to write a
solution using LeadTools, our client never took it past the
investigation stage, so I can't comment as to whether making or
buying is more effective.

One thing came through loud and clear from that research: don't
skimp on the scanning tools.  A low-quality or low-resolution
scanner on the front end will create a need for lots of manual
intervention to resolve errors later.  Get good equipment
up front so the OCR-related software has good bits to work with.
And before purchasing a solution, do a trial with a wide variety
of your documents so that you can be sure it is fast enough and
accurate enough for your purpose.

As to document management with indexing after you have some
metadata, there are many products like http://docuxplorer.com/
which can be integrated with UniVerse, and http://www.1mage.com/
specializes in the MV market.

It may help to separate out the tasks in order to create a better
definition of what you're looking for.  These include data
acquisition, digital scanning, data scanning, indexing, and
retrieval.  Any solution should provide an API, which would make
the indexing and retrieval part with UniVerse fairly trivial
compared to the rest of the package.
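
As a rough illustration of that separation, the Python sketch
below chains independent stages over one document.  Everything in
it is an assumption made for the example: the Document fields,
the stage names, the "PO-" prefix, and the in-memory index that
stands in for whatever the real system (UniVerse, a document
manager) would expose through its API.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Document:
    source: str                      # e.g. "email", "fax", "hardcopy"
    image: bytes = b""               # scanned or received content
    text: str = ""                   # filled in by the OCR stage
    metadata: Dict[str, str] = field(default_factory=dict)


def ocr_stub(doc):
    # Placeholder for a real OCR engine: just decodes the bytes.
    doc.text = doc.image.decode("ascii", errors="ignore")
    return doc


def extract_po(doc):
    # Assumes PO numbers carry a unique "PO-" prefix (hypothetical).
    match = re.search(r"\bPO-(\d+)\b", doc.text)
    if match:
        doc.metadata["po_number"] = match.group(1)
    return doc


INDEX: Dict[str, List[Document]] = {}  # stand-in for the real index


def index_doc(doc):
    po = doc.metadata.get("po_number")
    if po:
        INDEX.setdefault(po, []).append(doc)
    return doc


def retrieve(po_number):
    return INDEX.get(po_number, [])


def run_pipeline(doc, stages):
    # Each stage takes a Document and returns it, so any stage can
    # be swapped out without touching the others.
    for stage in stages:
        doc = stage(doc)
    return doc
```

With the stages separated like this, the OCR piece can be bought
or built on its own, and the indexing and retrieval pieces stay
small.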

I hope that helps.

Tony Gravagno
Nebula Research and Development
TG@ remove.pleaseNebula-RnD.com
Nebula R&D sells mv.NET and other Pick/MultiValue products
worldwide, and provides related development and training services

(Nebula R&D does not sell any of the offerings mentioned and has
no affiliation with any of the companies.)
-------
u2-users mailing list
u2-users@listserver.u2ug.org
To unsubscribe please visit http://listserver.u2ug.org/
