Just to clarify, I am working on an ECM solution.

We are using Pytesser to run OCR on large documents (50,000+ words), and
that is working very well!

So we need to make those results available, in almost real time, on a
search page. More than 200 companies, each with many users, will have
access to that search page, and searches are made by person or company
names.

Example: I want to know how many times my name appears in those documents,
and in which documents it appears.
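
To make that example concrete, here is a minimal sketch of the count in
plain Python, written in a map/reduce shape. The sample documents, the
function names (map_doc, reduce_counts, search) and the case-insensitive
whole-word matching are all assumptions for illustration, not our real
code; the real text would come from the databases instead of a dict.

```python
import re
from collections import defaultdict

# Hypothetical sample data: document path -> OCR text.
documents = {
    "/docs/a.txt": "Bruno Rocha signed. Later, bruno rocha approved.",
    "/docs/b.txt": "No match here.",
    "/docs/c.txt": "Contract for Bruno Rocha.",
}


def map_doc(path, text, name):
    """Map step: emit (path, count) if the name occurs in this document."""
    # Assumption: case-insensitive, whole-word matching of the full name.
    pattern = re.compile(r"\b" + re.escape(name) + r"\b", re.IGNORECASE)
    count = len(pattern.findall(text))
    if count:
        yield path, count


def reduce_counts(pairs):
    """Reduce step: sum counts per document and compute the grand total."""
    per_doc = defaultdict(int)
    for path, count in pairs:
        per_doc[path] += count
    return dict(per_doc), sum(per_doc.values())


def search(name, docs):
    """Run the map step over every document, then reduce the results."""
    pairs = (
        pair
        for path, text in docs.items()
        for pair in map_doc(path, text, name)
    )
    return reduce_counts(pairs)


per_doc, total = search("Bruno Rocha", documents)
print(per_doc)  # which documents contain the name, with per-document counts
print(total)    # total number of occurrences
```

Because the map step only looks at one document at a time, it is the part
that could be spread over several workers; the reduce step just merges the
small (path, count) pairs.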

The document paths and the full text content are stored in 5 different
PostgreSQL databases, web2py is the base of the front end, and I am now
working on the search engine pages.

I am testing Mincemeat and Disco. Does anybody know of other options?


2010/9/14 Bruno Rocha <[email protected]>

>
> I don't know if this was discussed here before,
> but anyway, I'll leave the tip.
>
> Map Reduce in Python (single module, less than 13 KB)
>
> http://remembersaurus.com/mincemeatpy/
>
> I am testing it, and so far it works very well in my search engine.
>
> Maybe it could be better documented, or integrated into web2py contrib.
>
> It would be great to have it at the web2py API level.
>



-- 

http://rochacbruno.com.br
