Hi Dave,
I wouldn't say that my requirements for OCR are totally aligned with those
of business users. For me, the DMS is mainly a (chaotic) store for all the
digital information I collect in my personal and business life. I am trying
to get rid of paper as much as possible, but I am not looking to use any
workflows, for example to manage the bills I receive. I use the full-text index to
find everything, and that usually gives quick results when I know only a
few search terms, such as account numbers, keywords ("bill", "food",
"account statement", ...), names and so on. And it does not matter at all
if I spend two minutes instead of a few seconds searching because I have to
try a few different terms.
So for me, three, maybe four types of "information containers" are
relevant:
- digital content I have created, like office docs, emails and stuff (not
photos, they are managed separately) - no OCR necessary
- PDFs I receive - no OCR necessary
- PDFs from scanned paper - always OCRed. I don't aim for 100% accuracy
(though I would say the results come very close), but sometimes there are
complex documents that get some "manual attention", for example because
they are bilingual.
You see, everything is about the full-text index content, so I do not care
much about other metadata. But if I have invested some time in better OCR
results, then of course I wouldn't want to see that work wasted by having
it overwritten - if your question is aimed at my initial requests here. And
of course in this case it is relevant to know how this information is
treated by the DMS.
On Thursday, May 4, 2017 at 00:14:31 UTC+2, Dave S wrote:
>
>
> Hi Andre,
>
> I am new here, though I have years of experience with supporting and
> developing a commercial Enterprise Content Management system. I am
> installing Mayan now, so I apologize if I am missing something that will
> become obvious upon use, but I am curious about the need for OCR for the
> majority of documents. Would the inclusion of Document Type appropriate
> (manually entered) Metadata allow you to find the information you are
> searching for?
>
> OCR is a wonderful thing and something that I enjoy working with, though
> there can be challenges in getting the OCR'ed data (accurately) and then
> being able to use that information in a meaningful manner. Generally,
> unless I need to have that information and can consistently assign (some of
> the discrete) data to the Metadata - and I can afford the processing
> time/expense - manual indexing or reading barcodes (a whole other
> discussion! :-) ) meets 90+% of my needs.
>
> Perhaps once I start playing my question will answer itself, and I
> certainly don't mean any offense, but I am interested in how people are
> using the OCR'ed information (and related, has it been found to be accurate
> the vast majority - 95+% - of the time).
>
> Thanks!
>
> dave
>
>