Hey all,

I finally have time to experiment with Mayan-EDMS some more. So I'm back at 
trying to get https://gitlab.com/startmat/document_analyzer working the way 
I want.

Unfortunately, I can't seem to figure it out. 

I'm currently testing on a vagrant instance. See: 
https://gitlab.com/mayan-edms/mayan-edms-vagrant

I ended up copying the document_analyzer app into the apps directory to get 
it loading. 

I am using an Albertsons receipt to test with. The first two lines of OCR 
look like:

4S Albertsons
> It's just better.
>

I made an analyzer and assigned the 'receipt' document type to it. (That's 
the document type I created, and the Albertsons receipt's properties page 
shows it as that type.)

Parameter: 
first;(?ims)(?P<albertsons>(.*Albertsons.*))


This should cause document_analyzer to add an "albertsons" field to either 
the metadata or properties of the document. Am I wrong?

I also made an analyzer based on the document_analyzer's README.

Parameter:
first;(?i)(?P<Creator>Tele2|Apple|Microsoft|Billa|Albertsons)

I just added "Albertsons" to the list of words to look for.


This should cause document_analyzer to add a "Creator" field to either the 
metadata or properties of the document. Am I wrong?
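
As a sanity check outside Mayan, here is a minimal sketch that runs the 
README-style pattern against the two OCR lines quoted above using Python's 
re module. It only shows what the regex itself captures. Whether 
document_analyzer applies the pattern with re.search, and how it treats the 
"first;" prefix, is an assumption I have not verified.

```python
import re

# The first lines of OCR text from the receipt, as quoted in this post.
ocr_text = "4S Albertsons\n> It's just better.\n>"

# Regex part of the analyzer parameter (everything after "first;").
pattern = r"(?i)(?P<Creator>Tele2|Apple|Microsoft|Billa|Albertsons)"

# Assumption: the analyzer searches anywhere in the text (re.search).
creator_match = re.search(pattern, ocr_text)
creator_fields = creator_match.groupdict() if creator_match else {}

print(creator_fields)  # {'Creator': 'Albertsons'}
```

So the pattern itself does match and yields a "Creator" group.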


I used the menu item "Submit to analyze" 
http://localhost:8080/document_analyzer/analyzer/1/submit/ to run 
document_analyzer.


All I can see in the logs is that I clicked that menu item. Nothing is added 
to either the metadata or the properties of the document.


If I test:


(?ims).*albertsons.*


on http://www.pyregex.com/ with the first two lines of the document, it 
reports a success.
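
The same check can be done locally with the full parameter, named group 
included. This is a minimal sketch: splitting the parameter on the first 
semicolon into a mode and a regex is my assumption about the format, not 
something confirmed from the document_analyzer code.

```python
import re

# The first lines of OCR text from the receipt, as quoted in this post.
ocr_text = "4S Albertsons\n> It's just better.\n>"

# The analyzer parameter exactly as entered.
parameter = "first;(?ims)(?P<albertsons>(.*Albertsons.*))"

# Assumption: "<mode>;<regex>" with the split on the first semicolon only.
mode, pattern = parameter.split(";", 1)

match = re.search(pattern, ocr_text)
fields = match.groupdict() if match else {}

print(mode)    # first
print(fields)  # the "albertsons" group holds the whole text, because
               # (?s) lets .* run across the line breaks
```

Because of the (?s) flag, the named group captures the entire OCR text and 
not just the line containing "Albertsons".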


/usr/share/mayan-edms/mayan/settings/local.py looks like:

from __future__ import absolute_import, unicode_literals

from .base import *

SECRET_KEY = '5(kv&ow31r2m9e^#c65v%ppiwiv9epu-hxa*1jsa1#m5bi!g7+'

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': 'mayan_edms',
        'USER': 'mayan',
        'PASSWORD': 'test123',
        'HOST': 'localhost',
        'PORT': '5432',
    }
}
INSTALLED_APPS += (
    'document_analyzer',
)

BROKER_URL = 'redis://127.0.0.1:6379/0'
CELERY_RESULT_BACKEND = 'redis://127.0.0.1:6379/0'

LOGGING = {
    'version': 1,
    'disable_existing_loggers': True,
    'formatters': {
        'verbose': {
            'format': '%(levelname)s %(asctime)s %(name)s %(process)d %(thread)d %(message)s'
        },
        'intermediate': {
            'format': '%(name)s <%(process)d> [%(levelname)s] "%(funcName)s() %(message)s"'
        },
        'simple': {
            'format': '%(levelname)s %(message)s'
        },
    },
    'handlers': {
        'console':{
            'level':'DEBUG',
            'class':'logging.StreamHandler',
            'formatter': 'intermediate'
        }
    },
    'loggers': {
        #'documents': {
        #    'handlers':['console'],
        #    'propagate': True,
        #    'level':'DEBUG',
        #},
        #'common': {
        #    'handlers':['console'],
        #    'propagate': True,
        #    'level':'DEBUG',
        #},
        'document_analyzer': {
            'handlers':['console'],
            'propagate': True,
            'level':'DEBUG',
        },

    }
}
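
To rule out the logging block itself, here is a minimal standalone sketch 
that loads a trimmed copy of it with logging.config.dictConfig and emits a 
DEBUG message from a child logger. The logger name 
'document_analyzer.tasks' is hypothetical, and the handler writes to an 
in-memory stream only so the output can be inspected. It says nothing about 
which process (web server or worker) would actually emit the app's messages.

```python
import io
import logging
import logging.config

# In-memory stream standing in for the console, only for this check.
stream = io.StringIO()

LOGGING = {
    'version': 1,
    'disable_existing_loggers': True,
    'formatters': {
        'intermediate': {
            'format': '%(name)s <%(process)d> [%(levelname)s] "%(funcName)s() %(message)s"'
        },
    },
    'handlers': {
        'console': {
            'level': 'DEBUG',
            'class': 'logging.StreamHandler',
            'formatter': 'intermediate',
            'stream': stream,  # not in local.py; added to capture output
        }
    },
    'loggers': {
        'document_analyzer': {
            'handlers': ['console'],
            'propagate': True,
            'level': 'DEBUG',
        },
    },
}

logging.config.dictConfig(LOGGING)

# Hypothetical module-level logger inside the app.
logger = logging.getLogger('document_analyzer.tasks')
logger.debug('analyzer test message')

output = stream.getvalue()
print(output)
```

If this prints a line tagged [DEBUG], the configuration does route 
document_analyzer messages at DEBUG level to the handler.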


Does anyone have any tips? Am I missing a step somewhere? 
