Hi Dave,

I wouldn't say that my requirements for OCR are totally aligned with those 
of business users. For me, the DMS is mainly a (chaotic) store for all the 
digital information I collect in my personal and business life. I want to 
get rid of paper as much as possible, but I am not looking to use any 
workflows, for example to manage the bills I receive. I use the full-text 
index to find everything, and that usually gives quick results when I know 
only a few search terms, such as account numbers, keywords ("bill", "food", 
"account statement", ...), names and so on. And it doesn't matter at all if 
I spend two minutes instead of a few seconds searching because I have to 
try a few different terms.

So for me, three, maybe four types of "information containers" are 
relevant:

- digital content I have created, like office docs, emails and so on (not 
photos, they are managed separately) - no OCR necessary
- PDFs I receive - no OCR necessary
- PDFs from scanned paper - always OCRed. I don't aim for 100% accuracy 
(though I would say that the results are very close), but sometimes there 
are complex documents which get some "manual attention", for example 
because they are bilingual.

As you can see, everything is about the full-text index content, so I do 
not care much about other metadata. But if I have invested some time in 
better OCR results, then of course I wouldn't want to see it go to waste by 
having them overwritten, if your question is aimed at my initial requests 
here. And of course in this case it's relevant to know how this information 
is treated by the DMS.




On Thursday, 4 May 2017 00:14:31 UTC+2, Dave S wrote:
>
>
> Hi Andre,
>
> I am new here, though have years of experience with supporting and 
> developing a commercial Enterprise Content Management system.  I am 
> installing Mayan now, so I apologize if I am missing something that will 
> become obvious upon use, but I am curious about the need for OCR for the 
> majority of documents.  Would the inclusion of Document Type appropriate 
> (manually entered) Metadata allow you to find the information you are 
> searching for?  
>
> OCR is a wonderful thing and something that I enjoy working with, though 
> there can be challenges in getting the OCR'ed data (accurately) and then 
> being able to use that information in a meaningful manner.  Generally, 
> unless I need to have that information and can consistently assign (some of 
> the discrete) data to the Metadata - and I can afford the processing 
> time/expense - manual indexing or reading barcodes (a whole other 
> discussion! :-) ) meets 90+% of my needs.
>
> Perhaps once I start playing my question will answer itself, and I 
> certainly don't mean any offense, but I am interested in how people are 
> using the OCR'ed information (and, relatedly, whether it has been found to 
> be accurate the vast majority - 95+% - of the time).
>
> Thanks!
>
> dave
>
>
