[Zope] Full-text search in Office/PDF

2005-04-12 Thread Robert Sösemann
Hello,

I am using a ZOPE-based application on top of the ZOPE APE 
(http://opensource.ca.com/projects/zopeape) persistence mechanism (a sort of 
binary storage), so the binaries are in the normal Unix file system. I now need 
to extend my application to allow full-text search inside binary documents in 
the Office Word/Excel/PPT and PDF formats.

As a first step, I would be happy to find a tool or Python/Zope module that just 
tells me whether a certain string occurs in a document, without telling me the 
exact place of the occurrence.

Have you solved a similar problem, or do you know of any tool to implement this 
functionality? I am looking forward to your questions.

Robert
PS: I have seen similar questions on other ZOPE lists, but they never had 
meaningful answers.
Concept Development


Gölz  Schwarz GmbH 
Waltherstr. 29, 80337 München, Germany 
phone: + 49 - (0)89 / 54 46 70 - 0 
fax: +49 - (0)89 / 54 46 70 - 10 
e-mail: [EMAIL PROTECTED] 
web: http://www.goelz.com 

___
Zope maillist  -  Zope@zope.org
http://mail.zope.org/mailman/listinfo/zope
**   No cross posts or HTML encoding!  **
(Related lists -
 http://mail.zope.org/mailman/listinfo/zope-announce
 http://mail.zope.org/mailman/listinfo/zope-dev )


Re: [Zope] Full-text search in Office/PDF

2005-04-12 Thread Andreas Jung

--On Tuesday, 12 April 2005 9:16 +0200 Robert Sösemann 
[EMAIL PROTECTED] wrote:

> Hello,
> I am using a ZOPE-based application on top of the ZOPE APE
> (http://opensource.ca.com/projects/zopeape) persistence mechanism (sort of
> binary storage). So the binaries are in the normal unix file system. I
> now need to extend my application to allow full-text search inside
> binary documents of the Office Word/Excel/PPT and PDF format.

Look at TextIndexNG 2.
-aj




Re: [Zope] Full-text search in Office/PDF

2005-04-12 Thread Marco Bizzarri
You could do that using the excellent TextIndexNG inside a ZCatalog. 
This lets you index all your documents with TextIndexNG and then 
search them through the ZCatalog.
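
To illustrate the index-then-search idea, here is a minimal plain-Python 
sketch of an inverted text index. It is not TextIndexNG or ZCatalog code; the 
class and method names (TinyTextIndex, index_document, search) are made up 
for this example, and a real index also handles stemming, ranking and 
persistence.

```python
import re
from collections import defaultdict


class TinyTextIndex:
    """Hypothetical toy inverted index: maps each word to the documents containing it."""

    def __init__(self):
        self._postings = defaultdict(set)

    def index_document(self, doc_id, text):
        # Split the already-extracted plain text into lowercase words.
        for word in re.findall(r"\w+", text.lower()):
            self._postings[word].add(doc_id)

    def search(self, query):
        # Return the ids of documents that contain every word of the query.
        words = re.findall(r"\w+", query.lower())
        if not words:
            return set()
        result = set(self._postings.get(words[0], set()))
        for word in words[1:]:
            result &= self._postings.get(word, set())
        return result


index = TinyTextIndex()
index.index_document("a.pdf", "Annual report for the Munich office")
index.index_document("b.doc", "Budget report")
matches = index.search("annual report")
```

With the two sample documents above, only "a.pdf" contains both query words, 
which is the yes/no answer the original question asks for.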

TextIndexNG supports a number of converter plugins that turn MS Office and 
PDF files into plain text, which is then indexed.
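
The convert-to-text step can be sketched roughly as follows, assuming 
command-line converters are installed on the server. The tool choices 
(pdftotext, antiword, xls2csv, catppt) and the names CONVERTERS, extract_text 
and contains_text are assumptions made for this example, not necessarily what 
TextIndexNG uses internally.

```python
import os
import subprocess

# Assumed mapping from file extension to an external converter command.
# "{path}" is replaced by the document path; each tool is expected to
# write plain text to standard output.
CONVERTERS = {
    ".pdf": ["pdftotext", "{path}", "-"],
    ".doc": ["antiword", "{path}"],
    ".xls": ["xls2csv", "{path}"],
    ".ppt": ["catppt", "{path}"],
}


def extract_text(path, converters=CONVERTERS):
    """Run the converter registered for the file's extension and return its output."""
    ext = os.path.splitext(path)[1].lower()
    command = converters.get(ext)
    if command is None:
        raise ValueError("no converter registered for %r" % ext)
    argv = [arg.replace("{path}", path) for arg in command]
    # check=True raises CalledProcessError if the converter exits with an error.
    result = subprocess.run(argv, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE, check=True)
    return result.stdout.decode("utf-8", errors="replace")


def contains_text(path, needle, converters=CONVERTERS):
    """Case-insensitive check whether needle occurs anywhere in the document text."""
    return needle.lower() in extract_text(path, converters).lower()
```

A call such as contains_text("/some/path/report.pdf", "invoice") then answers 
whether the string occurs at all, without reporting where.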

We're using the tool inside our project (PAFlow) and we think it is an 
excellent product.

Regards
Marco

Robert Sösemann wrote:
> Hello,
> I am using a ZOPE-based application on top of the ZOPE APE 
> (http://opensource.ca.com/projects/zopeape) persistence mechanism (sort of 
> binary storage). So the binaries are in the normal unix file system. I now 
> need to extend my application to allow full-text search inside binary 
> documents of the Office Word/Excel/PPT and PDF format.
> In the first step, I would be happy to find a tool/python/zope module that 
> just tells me that a certain string is in a document, without telling me the 
> exact place of that occurrence.
>
> Have you solved a similar problem or know any tool to implement this 
> functionality? I am looking forward to your questions.
> Robert
> PS: I have seen similar questions on other ZOPE lists, but they never had 
> meaningful answers.