Hey all,

I finally have time to experiment with Mayan-EDMS some more. So I'm back at 
trying to get https://gitlab.com/startmat/document_analyzer working the way 
I want.

Unfortunately, I can't seem to figure it out. 

I'm currently testing on a vagrant instance. See: 
https://gitlab.com/mayan-edms/mayan-edms-vagrant

I ended up copying the document_analyzer app into the apps directory to get 
it loading. 

I am using an Albertsons receipt to test with. The first two lines of OCR 
look like:

4S Albertsons
> It's just better.
>

I made an analyzer and assigned the 'receipt' document type to it. (That's 
the document type I created, and the Albertsons receipt's properties page 
shows it as the document's type.)

Parameter: 
first;(?ims)(?P<albertsons>(.*Albertsons.*))


This should cause document_analyzer to add an "albertsons" field to either 
the metadata or properties of the document. Am I wrong?

I also made an analyzer based on the document_analyzer's README.

Parameter:
first;(?i)(?P<Creator>Tele2|Apple|Microsoft|Billa|Albertsons)

I just added "Albertsons" to the list of words to look for.


This should cause document_analyzer to add a "Creator" field to either the 
metadata or properties of the document. Am I wrong?


I used the menu item "Submit to analyze" 
http://localhost:8080/document_analyzer/analyzer/1/submit/ to run 
document_analyzer.


All I can see in the logs is that I clicked that menu item. Nothing is 
added to either the metadata or the properties of the document.


If I test:


(?ims).*albertsons.*


on http://www.pyregex.com/ with the first two lines of the document, it 
reports a success.
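
The same check can be reproduced locally with Python's re module, using 
both analyzer parameters from above. This is only a minimal sketch: the 
helper name extract_fields is made up for illustration, and splitting the 
parameter on ';' and ignoring the "first" part is my assumption about the 
format, not something taken from document_analyzer's code.

```python
import re

# First two OCR lines of the receipt, as quoted earlier in this post.
OCR_TEXT = "4S Albertsons\n> It's just better.\n>"


def extract_fields(parameter, text):
    """Return the named groups matched by an analyzer parameter.

    Hypothetical helper: assumes the parameter has the form
    '<mode>;<regex>' and ignores the mode ('first').
    """
    mode, _, pattern = parameter.partition(';')
    match = re.search(pattern, text)
    return match.groupdict() if match else {}


albertsons_fields = extract_fields(
    'first;(?ims)(?P<albertsons>(.*Albertsons.*))', OCR_TEXT
)
creator_fields = extract_fields(
    'first;(?i)(?P<Creator>Tele2|Apple|Microsoft|Billa|Albertsons)', OCR_TEXT
)

print(albertsons_fields)
print(creator_fields)
```

If both dictionaries come back non-empty, the regular expressions 
themselves match the OCR text, so the problem is more likely in how or 
whether the analyzer runs than in the patterns.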


/usr/share/mayan-edms/mayan/settings/local.py looks like:

from __future__ import absolute_import, unicode_literals

from .base import *

SECRET_KEY = '5(kv&ow31r2m9e^#c65v%ppiwiv9epu-hxa*1jsa1#m5bi!g7+'

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': 'mayan_edms',
        'USER': 'mayan',
        'PASSWORD': 'test123',
        'HOST': 'localhost',
        'PORT': '5432',
    }
}
INSTALLED_APPS += (
    'document_analyzer',
)

BROKER_URL = 'redis://127.0.0.1:6379/0'
CELERY_RESULT_BACKEND = 'redis://127.0.0.1:6379/0'

LOGGING = {
    'version': 1,
    'disable_existing_loggers': True,
    'formatters': {
        'verbose': {
            'format': '%(levelname)s %(asctime)s %(name)s %(process)d %(thread)d %(message)s'
        },
        'intermediate': {
            'format': '%(name)s <%(process)d> [%(levelname)s] "%(funcName)s() %(message)s"'
        },
        'simple': {
            'format': '%(levelname)s %(message)s'
        },
    },
    'handlers': {
        'console':{
            'level':'DEBUG',
            'class':'logging.StreamHandler',
            'formatter': 'intermediate'
        }
    },
    'loggers': {
        #'documents': {
        #    'handlers':['console'],
        #    'propagate': True,
        #    'level':'DEBUG',
        #},
        #'common': {
        #    'handlers':['console'],
        #    'propagate': True,
        #    'level':'DEBUG',
        #},
        'document_analyzer': {
            'handlers':['console'],
            'propagate': True,
            'level':'DEBUG',
        },

    }
}
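
To rule out the logging dict itself, the relevant part of it can be loaded 
outside Django with logging.config.dictConfig. This is a rough sketch, not 
how Mayan loads its settings; the child logger name 
'document_analyzer.tasks' is hypothetical and only stands in for whatever 
module names the app really logs under.

```python
import logging
import logging.config

# Trimmed copy of the LOGGING dict above: only the parts that affect
# the 'document_analyzer' logger.
LOGGING = {
    'version': 1,
    'disable_existing_loggers': True,
    'formatters': {
        'intermediate': {
            'format': '%(name)s <%(process)d> [%(levelname)s] "%(funcName)s() %(message)s"'
        },
    },
    'handlers': {
        'console': {
            'level': 'DEBUG',
            'class': 'logging.StreamHandler',
            'formatter': 'intermediate',
        },
    },
    'loggers': {
        'document_analyzer': {
            'handlers': ['console'],
            'propagate': True,
            'level': 'DEBUG',
        },
    },
}

logging.config.dictConfig(LOGGING)

# Hypothetical child logger; it inherits level and handlers from
# the 'document_analyzer' logger configured above.
logger = logging.getLogger('document_analyzer.tasks')
debug_enabled = logger.isEnabledFor(logging.DEBUG)
handler_names = [
    type(h).__name__ for h in logging.getLogger('document_analyzer').handlers
]

# Should be written to the console (stderr) if the config is sound.
logger.debug('test message from the document_analyzer logger')
```

This only checks that the dict routes DEBUG messages from that logger to 
the console. It says nothing about whether document_analyzer actually logs 
anything, or in which process it would do so.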


Does anyone have any tips? Am I missing a step somewhere? 
