Re: [Zope] ZCatalog attachments?
> Simon Coles writes:
> > We have binary files stored in Zope, for example Word documents (but could be any of a variety of document types). We would like to be able to index and search the contents of these files using ZCatalog. So if a Word file contains the word "Fred", then any search for "Fred" would include that file in the list of documents returned.
>
> Someone else already told you that you must create a parameterless method (it need not necessarily be named "PrincipiaSearchSource") that returns the file's content.
>
> You may not need to keep the rendered version around but may be able to extract the plain text on demand. I think there is a "word.dll" that provides access to MS Word from applications. Alternatively, you could control Word via COM.

There is a Perl (I know, I know...) script to convert Word DOC files into HTML. That should work well enough to make the stuff searchable (I would use doc2html.pl | lynx -d to get a pure ASCII version, though). It is probably fast enough to just render on the fly (i.e., upon indexing).

HTH,
Jan

___
Zope maillist - [EMAIL PROTECTED]
http://lists.zope.org/mailman/listinfo/zope
** No cross posts or HTML encoding! **
(Related lists -
 http://lists.zope.org/mailman/listinfo/zope-announce
 http://lists.zope.org/mailman/listinfo/zope-dev )
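The render-on-the-fly idea above could be sketched roughly as follows. This is only an illustration under assumptions: `run_pipeline`, `word_to_text`, and `WORD_TO_TEXT` are hypothetical names, the way `doc2html.pl` is invoked (DOC on stdin, HTML on stdout) is a guess, and lynx is called with its `-dump` and `-stdin` options rather than the abbreviated `-d` quoted in the message.

```python
import subprocess


def run_pipeline(data, commands):
    """Feed bytes through each command in turn, like a shell pipe."""
    for argv in commands:
        result = subprocess.run(
            argv, input=data, stdout=subprocess.PIPE, check=True
        )
        data = result.stdout
    return data


# Assumed command lines; adjust to how the tools are actually installed.
WORD_TO_TEXT = [
    ["perl", "doc2html.pl"],      # assumption: reads DOC on stdin, writes HTML
    ["lynx", "-dump", "-stdin"],  # renders HTML from stdin as plain text
]


def word_to_text(doc_bytes, commands=WORD_TO_TEXT):
    """Return a plain-text rendering of a Word document for indexing."""
    text = run_pipeline(doc_bytes, commands)
    # Non-ASCII bytes are replaced rather than raising an error.
    return text.decode("ascii", errors="replace")
```

A search-source method on the document object could then simply return `word_to_text(self.data)` each time the catalog indexes it, so no rendered copy has to be stored.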
Re: [Zope] ZCatalog attachments?
Simon Coles writes:
> We have binary files stored in Zope, for example Word documents (but could be any of a variety of document types). We would like to be able to index and search the contents of these files using ZCatalog. So if a Word file contains the word "Fred", then any search for "Fred" would include that file in the list of documents returned.

Someone else already told you that you must create a parameterless method (it need not necessarily be named "PrincipiaSearchSource") that returns the file's content.

You may not need to keep the rendered version around but may be able to extract the plain text on demand. I think there is a "word.dll" that provides access to MS Word from applications. Alternatively, you could control Word via COM.

Dieter
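To illustrate why the method name is free to choose, here is a toy sketch of a text index that looks up an attribute by a configurable name and calls it if it is callable, which is the behaviour the reply relies on. This is not ZCatalog code: `ToyTextIndex`, `WordFile`, `searchable_text`, and `fake_extractor` are all hypothetical names, and the extractor is a stand-in for a real Word-to-text converter.

```python
import re


def index_value(obj, name):
    """Fetch obj.<name>; if it is a parameterless method, call it."""
    value = getattr(obj, name, None)
    if callable(value):
        value = value()
    return value


class ToyTextIndex:
    """Minimal word index keyed on an attribute or method name."""

    def __init__(self, attr_name):
        self.attr_name = attr_name
        self.words = {}  # lowercased word -> set of document ids

    def index_object(self, doc_id, obj):
        text = index_value(obj, self.attr_name) or ""
        for word in re.findall(r"\w+", text.lower()):
            self.words.setdefault(word, set()).add(doc_id)

    def search(self, word):
        return sorted(self.words.get(word.lower(), set()))


class WordFile:
    """Binary document that extracts its plain text on demand."""

    def __init__(self, data, extractor):
        self.data = data
        self._extractor = extractor

    def searchable_text(self):
        # Nothing is cached; the text is extracted at indexing time.
        return self._extractor(self.data)


def fake_extractor(data):
    # Stand-in for a real converter (COM automation, external tool, ...).
    return data.decode("latin-1")
```

With an index created as `ToyTextIndex("searchable_text")`, indexing a `WordFile` whose extracted text contains "Fred" makes a search for "Fred" return that document's id.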
[Zope] ZCatalog attachments?
Hi,

We have binary files stored in Zope, for example Word documents (but could be any of a variety of document types). We would like to be able to index and search the contents of these files using ZCatalog. So if a Word file contains the word "Fred", then any search for "Fred" would include that file in the list of documents returned.

Is anyone doing something like this? If so, how?

Simon

--
- My opinions are my own, NIP's opinions are theirs --
Simon J. Coles              Email: [EMAIL PROTECTED]
New Information Paradigms   Work Phone: +44 1344 753703
http://www.nipltd.com/      Work Fax: +44 1344 753742
=== Life is too precious to take seriously ===
Re: [Zope] ZCatalog attachments?
On Fri, 4 Aug 2000, Simon Coles wrote:
> We have binary files stored in Zope, for example Word documents (but could be any of a variety of document types). We would like to be able to index and search the contents of these files using ZCatalog. So if a Word file contains the word "Fred", then any search for "Fred" would include that file in the list of documents returned.
>
> Is anyone doing something like this? If so, how?

A simple search of the binary data of course won't do it, because of the complex format of Word documents. So:

Try to keep beside every document its 'rendered' version, converted to plain text (created by saving it with Word in plain text format). Then create a class representing your document. This class should provide a parameterless method 'PrincipiaSearchSource' returning the rendered version of the document.

However, this is untested, but it seems to be a step in the right direction ;)

[EMAIL PROTECTED]
/--\
| `long long long' is too long for GCC |
\--/
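The suggestion above might look roughly like this in plain Python. It is a minimal sketch, not a working Zope product: `IndexedDocument` and its attribute names are hypothetical, and a real class would also need the usual Zope base classes and registration, which are left out here.

```python
class IndexedDocument:
    """Pairs a binary document with a stored plain-text rendering."""

    def __init__(self, doc_id, raw_bytes, rendered_text):
        self.id = doc_id
        self.data = raw_bytes                # the original Word file
        self.rendered_text = rendered_text   # e.g. Word's "save as text" output

    def PrincipiaSearchSource(self):
        # Parameterless method a text index can call to get searchable text.
        return self.rendered_text

    def update(self, raw_bytes, rendered_text):
        # Keep both versions in step whenever the document is replaced.
        self.data = raw_bytes
        self.rendered_text = rendered_text
```

The trade-off of this design is that the rendered copy must be regenerated whenever the binary changes, in exchange for not needing a converter at indexing time.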