Hi,

On Wed, May 10, 2006 at 03:29:34PM -0500, Sreeram Raghav wrote:
[snip]
> Initially the only files being indexed were "ZPT pages", but after writing
> the adapter even text files were being indexed.
> However the problem is that when I try to add a PDF of Word documents, the
> files are not being indexed and showing an error that cannot decode files.

This adapter was just a demonstration of how to index a content object
containing a text field. It assumes that context.data contains just a
plain string. To index PDF files, you'll have to somehow convert the
PDF data to plain text:

    from ModuleYouHaveToWrite import MagicPdfToText

    class SearchableTextAdapter(object):
        [...]
        def getSearchableText(self):
            text = MagicPdfToText(self.context.pdfdata)
            return (text,)

I don't know whether there's a pure Python solution for extracting text
from PDF files, but you might consider calling an external program like
'pdftotext' to do the job. Either way, it's your adapter's responsibility
to behave as defined by the interface, and 'ISearchableText' says the
adapter must provide plain, indexable text.

Regards,
Frank

_______________________________________________
Zope3-users mailing list
Zope3-users@zope.org
http://mail.zope.org/mailman/listinfo/zope3-users
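
P.S. The external-program route mentioned above could look roughly like the sketch below. It is only an illustration under a few assumptions: the 'pdftotext' command-line tool (shipped with xpdf and poppler) is installed and on the PATH, the content object keeps the raw PDF bytes in an attribute called pdfdata as in the example above, and the adapter stores its context as self.context. The names pdf_to_text and to_text are made up for this sketch; they are not part of Zope.

```python
import os
import subprocess
import tempfile


def pdf_to_text(pdf_bytes, program="pdftotext"):
    """Convert PDF bytes to plain text by calling an external converter.

    Hypothetical helper: assumes `program` accepts
    `<program> -enc UTF-8 <input.pdf> -` and writes the text to stdout,
    which is how the pdftotext tool behaves.
    """
    # The converter reads from a file path, so dump the bytes to a temp file.
    fd, path = tempfile.mkstemp(suffix=".pdf")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(pdf_bytes)
        # "-" as the output file name sends the extracted text to stdout.
        result = subprocess.run(
            [program, "-enc", "UTF-8", path, "-"],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            check=True,  # raise CalledProcessError if the conversion fails
        )
    finally:
        os.remove(path)
    return result.stdout.decode("utf-8", "replace")


class SearchableTextAdapter(object):
    """Sketch of an adapter that hands plain text to the indexer."""

    def __init__(self, context, to_text=pdf_to_text):
        self.context = context
        # The converter is injectable so it can be swapped or faked in tests.
        self.to_text = to_text

    def getSearchableText(self):
        # Return a tuple containing one plain-text string, as in the
        # original example.
        return (self.to_text(self.context.pdfdata),)
```

Word documents would need a different converter, but the adapter itself would not change: whatever is passed as to_text only has to return plain text.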