Re: [Zope] Indexing files
Sune Christiansen wrote at 2006-1-24 18:56 +0100:
>when you say external PDF converter, do you mean the pdf converter I
>created the pdf file with? I have tried to index a microsoft word file
>also, but the result is the same: an empty index.

You need converters from the media format (e.g. PDF, MS-Word, ...) to text (or maybe better named: text extraction utilities). These are separate programs, not the application that created the file.

The standard PDF converter is "XPDF" (which contains "pdftotext" (or similarly)).
The standard Word converter is "wvware".

--
Dieter
___
Zope maillist - Zope@zope.org
http://mail.zope.org/mailman/listinfo/zope
** No cross posts or HTML encoding! **
(Related lists -
 http://mail.zope.org/mailman/listinfo/zope-announce
 http://mail.zope.org/mailman/listinfo/zope-dev )
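To make the idea of an external text extraction utility concrete, here is a minimal sketch in Python of how such a tool could be called and checked for. It is not how TextIndexNG wires up its converters; the CONVERTERS table and the function names are made up for illustration. The only thing assumed about the real tool is that "pdftotext FILE -" writes the extracted text to standard output.

```python
import shutil
import subprocess

# Hypothetical table mapping a content type to an external command line.
# "{path}" marks where the file name goes; the trailing "-" asks pdftotext
# to write the extracted text to standard output.
CONVERTERS = {
    "application/pdf": ["pdftotext", "{path}", "-"],
}


def build_command(content_type, path):
    """Return the command line for this content type, or None if unknown."""
    template = CONVERTERS.get(content_type)
    if template is None:
        return None
    return [path if part == "{path}" else part for part in template]


def converter_installed(content_type):
    """True if a converter is configured and its program is on the PATH."""
    template = CONVERTERS.get(content_type)
    return template is not None and shutil.which(template[0]) is not None


def extract_text(content_type, path):
    """Run the external converter; return '' if it is missing or fails."""
    command = build_command(content_type, path)
    if command is None or shutil.which(command[0]) is None:
        return ""
    result = subprocess.run(command, capture_output=True, check=False)
    if result.returncode != 0:
        return ""
    return result.stdout.decode("utf-8", errors="replace")
```

Calling converter_installed("application/pdf") on the Zope host is a quick way to see whether the program is reachable at all. A converter that is missing from the PATH yields no text, which would show up as an empty index.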
Re: [Zope] Indexing files
When you say external PDF converter, do you mean the PDF converter I created the PDF file with? I have tried to index a Microsoft Word file also, but the result is the same: an empty index.

- Sune

> --On 24 January 2006 16:58:52 +0100 Sune Christiansen <[EMAIL PROTECTED]>
> wrote:
>
>> Hi again.
>>
>> I have installed TextIndexNG and indexed my Zope DTML Methods objects
>> and Zope Files objects, and enabled "Document converters (PDF, Word etc.)"
>> As indexed attributes I use
>> SearchableText,PrincipiaSearchSource,getFile,
>> but the indexes related to the pdf files are still empty.
>> Is it correct to upload my pdf document as a Zope File object?
>
> Is your external PDF converter installed _properly_?
>
> -aj
Re: [Zope] Indexing files
--On 24 January 2006 16:58:52 +0100 Sune Christiansen <[EMAIL PROTECTED]> wrote:

> Hi again.
>
> I have installed TextIndexNG and indexed my Zope DTML Methods objects and
> Zope Files objects, and enabled "Document converters (PDF, Word etc.)"
> As indexed attributes I use SearchableText,PrincipiaSearchSource,getFile,
> but the indexes related to the pdf files are still empty.
> Is it correct to upload my pdf document as a Zope File object?

Is your external PDF converter installed _properly_?

-aj
Re: [Zope] Indexing files
Hi again.

I have installed TextIndexNG and indexed my Zope DTML Methods objects and Zope Files objects, and enabled "Document converters (PDF, Word etc.)"
As indexed attributes I use SearchableText,PrincipiaSearchSource,getFile, but the indexes related to the pdf files are still empty.
Is it correct to upload my pdf document as a Zope File object?

Thanks,
Sune

> On 21 Jan 2006, at 13:02, Sune Christiansen wrote:
>
>> Hi all.
>>
>> I have the following problem:
>> I am building up a ZCatalog and indexing my DTML methods. I use the
>> index type ZCTextIndex and the object function PrincipiaSearchSource.
>> It works fine.
>> But when I try to index my Files (type File) with index type
>> ZCTextIndex and the object function SearchableText it finds no words
>> and the index is empty. Am I using the wrong object function?
>
> Zope File objects do not support indexing their textual content. You
> will need to implement your own text retrieval or use some of the
> other indices out there like Andreas Jung's TextIndexNG which come
> with suitable modules that can pull text out of various file formats.
>
> jens
Re: [Zope] Indexing files
Jens Vagelpohl wrote:
>
> On 21 Jan 2006, at 13:02, Sune Christiansen wrote:
>
>> Hi all.
>>
>> I have the following problem:
>> I am building up a ZCatalog and indexing my DTML methods. I use the
>> index type ZCTextIndex and the object function PrincipiaSearchSource.
>> It works fine.
>> But when I try to index my Files (type File) with index type ZCTextIndex
>> and the object function SearchableText it finds no words and the
>> index is empty. Am I using the wrong object function?
>
> Zope File objects do not support indexing their textual content. You
> will need to implement your own text retrieval or use some of the other
> indices out there like Andreas Jung's TextIndexNG which come with
> suitable modules that can pull text out of various file formats.
>

Newer Zopes have file objects indexable via PrincipiaSearchSource if their content type is text/*.

OFS/Image.py, 423ff:

    def PrincipiaSearchSource(self):
        """ Allow file objects to be searched.
        """
        if self.content_type.startswith('text/'):
            return str(self.data)
        return ''

HTH
tino
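The effect of that content-type check can be sketched outside Zope. FileStandIn below is a made-up stand-in, not the real OFS.Image.File class, and it decodes bytes explicitly so the sketch runs on a current Python. It only illustrates why a File holding a PDF hands an empty string to the index while a text/plain File does not.

```python
class FileStandIn:
    """Hypothetical stand-in for a Zope File object (not the real OFS class)."""

    def __init__(self, content_type, data):
        self.content_type = content_type
        self.data = data  # raw bytes, as uploaded

    def PrincipiaSearchSource(self):
        # Same rule as the method quoted above: only text/* content is exposed.
        if self.content_type.startswith("text/"):
            return self.data.decode("utf-8", errors="replace")
        return ""


notes = FileStandIn("text/plain", b"indexing notes")
report = FileStandIn("application/pdf", b"%PDF-1.4 ...")

print(repr(notes.PrincipiaSearchSource()))   # text is returned for text/*
print(repr(report.PrincipiaSearchSource()))  # empty string for a PDF
```

So with a plain ZCTextIndex on PrincipiaSearchSource, binary formats such as PDF or Word contribute nothing to the index, which matches the empty index reported earlier in the thread.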
Re: [Zope] Indexing files
On 21 Jan 2006, at 13:02, Sune Christiansen wrote:

> Hei All.
>
> I have the following problem:
> I am building up a ZCatalog and indexing my DTML methods. I use the index
> type ZCTextIndex and the object function PrincipiaSearchSource. It works
> fine.
> But when I try to index my Files (type File) with index type ZCTextIndex
> and the object function SearchableText it finds no words and the index is
> empty. Am I using the wrong object function?

Zope File objects do not support indexing their textual content. You will need to implement your own text retrieval or use some of the other indices out there like Andreas Jung's TextIndexNG which come with suitable modules that can pull text out of various file formats.

jens
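One possible shape for "your own text retrieval" is an object that offers the method the index is told to call, here SearchableText, and fills it from a per-format extractor. This is only a sketch under assumptions: SearchableFile, plain_text and the extractors table are hypothetical names, and nothing here is part of Zope or TextIndexNG.

```python
class SearchableFile:
    """Hypothetical wrapper exposing SearchableText for a text index."""

    def __init__(self, content_type, data, extractors):
        self.content_type = content_type
        self.data = data                # raw bytes of the uploaded file
        self._extractors = extractors   # content type -> callable(bytes) -> str

    def SearchableText(self):
        # Look up an extractor for this format; index nothing if none exists.
        extractor = self._extractors.get(self.content_type)
        if extractor is None:
            return ""
        return extractor(self.data)


def plain_text(data):
    """Trivial extractor for formats that already are text."""
    return data.decode("utf-8", errors="replace")


extractors = {"text/plain": plain_text}
doc = SearchableFile("text/plain", b"hello catalog", extractors)
pdf = SearchableFile("application/pdf", b"%PDF-1.4 ...", extractors)
```

An extractor for PDF or Word would be registered the same way, typically by handing the bytes to an external conversion program. Until one is registered, SearchableText returns an empty string for that format and the index stays empty for those files.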