Re: [Zope] Indexing files

2006-01-25 Thread Dieter Maurer
Sune Christiansen wrote at 2006-1-24 18:56 +0100:
>When you say external PDF converter, do you mean the PDF converter I
>created the PDF file with? I have also tried to index a Microsoft Word
>file, but the result is the same: an empty index.

You need converters from the media format (e.g. PDF, MS Word, ...)
to text (perhaps better named text extraction utilities).

The standard PDF converter is "XPDF" (which includes "pdftotext", or
a similarly named tool). The standard Word converter is "wvWare".
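
A quick way to check whether these utilities are reachable from the
account that runs Zope is to look them up on the PATH. The following is
only a sketch in modern Python (shutil.which needs Python 3.3 or later,
so it is meant to be run from a shell, not inside a 2006-era Zope). The
helper name find_converters is made up for illustration, and the
executable names "pdftotext" and "wvWare" are assumptions based on the
tools named above; your packages may install them under other names.

```python
import shutil


def find_converters(names=("pdftotext", "wvWare")):
    """Map each executable name to its full path, or None if not on PATH.

    The default names are assumptions, not a list taken from TextIndexNG.
    """
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_converters().items():
        print("%-10s %s" % (name, path or "NOT FOUND"))
```

If a tool is reported as not found, the indexer most likely cannot call
it either, which would be consistent with an empty index.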



-- 
Dieter
___
Zope maillist  -  Zope@zope.org
http://mail.zope.org/mailman/listinfo/zope
**   No cross posts or HTML encoding!  **
(Related lists - 
 http://mail.zope.org/mailman/listinfo/zope-announce
 http://mail.zope.org/mailman/listinfo/zope-dev )


Re: [Zope] Indexing files

2006-01-24 Thread Sune Christiansen
When you say external PDF converter, do you mean the PDF converter I
created the PDF file with? I have also tried to index a Microsoft Word
file, but the result is the same: an empty index.

- Sune

>
>
> --On 24 January 2006 16:58:52 +0100 Sune Christiansen <[EMAIL PROTECTED]>
> wrote:
>
>> Hi again.
>>
>> I have installed TextIndexNG and indexed my Zope DTML Method objects
>> and Zope File objects, and enabled "Document converters (PDF, Word etc.)".
>> As indexed attributes I use SearchableText,PrincipiaSearchSource,getFile,
>> but the indexes related to the PDF files are still empty.
>> Is it correct to upload my PDF document as a Zope File object?
>>
>
> Is your external PDF converter installed _properly_?
>
> -aj
>
>




Re: [Zope] Indexing files

2006-01-24 Thread Andreas Jung



--On 24 January 2006 16:58:52 +0100 Sune Christiansen <[EMAIL PROTECTED]>
wrote:

> Hi again.
>
> I have installed TextIndexNG and indexed my Zope DTML Method objects and
> Zope File objects, and enabled "Document converters (PDF, Word etc.)".
> As indexed attributes I use SearchableText,PrincipiaSearchSource,getFile,
> but the indexes related to the PDF files are still empty.
> Is it correct to upload my PDF document as a Zope File object?



Is your external PDF converter installed _properly_?

-aj





Re: [Zope] Indexing files

2006-01-24 Thread Sune Christiansen
Hi again.

I have installed TextIndexNG and indexed my Zope DTML Method objects and
Zope File objects, and enabled "Document converters (PDF, Word etc.)".
As indexed attributes I use SearchableText,PrincipiaSearchSource,getFile,
but the indexes related to the PDF files are still empty.
Is it correct to upload my PDF document as a Zope File object?

Thanks,

Sune

>
> On 21 Jan 2006, at 13:02, Sune Christiansen wrote:
>
>> Hi all.
>>
>> I have the following problem:
>> I am building a ZCatalog and indexing my DTML Methods. I use the index
>> type ZCTextIndex and the object function PrincipiaSearchSource. It works
>> fine.
>> But when I try to index my Files (type File) with index type ZCTextIndex
>> and the object function SearchableText, it finds no words and the index
>> is empty. Am I using the wrong object function?
>
> Zope File objects do not support indexing their textual content. You
> will need to implement your own text retrieval or use one of the
> other indexes out there, such as Andreas Jung's TextIndexNG, which comes
> with suitable modules that can pull text out of various file formats.
>
> jens
>


Re: [Zope] Indexing files

2006-01-21 Thread Tino Wildenhain
Jens Vagelpohl wrote:
>
> On 21 Jan 2006, at 13:02, Sune Christiansen wrote:
>
>> Hi all.
>>
>> I have the following problem:
>> I am building a ZCatalog and indexing my DTML Methods. I use the index
>> type ZCTextIndex and the object function PrincipiaSearchSource. It works
>> fine.
>> But when I try to index my Files (type File) with index type ZCTextIndex
>> and the object function SearchableText, it finds no words and the index
>> is empty. Am I using the wrong object function?
>
> Zope File objects do not support indexing their textual content. You
> will need to implement your own text retrieval or use one of the
> other indexes out there, such as Andreas Jung's TextIndexNG, which comes
> with suitable modules that can pull text out of various file formats.
> 

Newer Zope versions make File objects indexable via PrincipiaSearchSource
if their content type is text/*.

OFS/Image.py, 423ff:

    def PrincipiaSearchSource(self):
        """ Allow file objects to be searched.
        """
        if self.content_type.startswith('text/'):
            return str(self.data)
        return ''
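
To make the limitation concrete: the method above only returns text for
text/* content, so a PDF or Word file contributes nothing to the index.
A standalone sketch of how the same idea could be extended with
per-content-type extractors follows. It is not Zope or TextIndexNG code;
searchable_text and the extractors mapping are hypothetical names, and
the extractor callables stand in for real converters.

```python
def searchable_text(content_type, data, extractors=None):
    """Return indexable text for a file body, or '' if none is available.

    extractors maps a content type (e.g. 'application/pdf') to a callable
    that takes the raw data and returns text. Hypothetical helper.
    """
    extractors = extractors or {}
    if content_type.startswith("text/"):
        # Same rule as the File method above: text/* is indexed directly.
        if isinstance(data, bytes):
            return data.decode("utf-8", "replace")
        return str(data)
    extractor = extractors.get(content_type)
    if extractor is None:
        # No converter registered: nothing to index, as with a plain File.
        return ""
    return extractor(data)
```

With no extractor registered for application/pdf the function returns an
empty string, which mirrors the empty index described in this thread.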


HTH
tino


Re: [Zope] Indexing files

2006-01-21 Thread Jens Vagelpohl


On 21 Jan 2006, at 13:02, Sune Christiansen wrote:


> Hi all.
>
> I have the following problem:
> I am building a ZCatalog and indexing my DTML Methods. I use the index
> type ZCTextIndex and the object function PrincipiaSearchSource. It works
> fine.
> But when I try to index my Files (type File) with index type ZCTextIndex
> and the object function SearchableText, it finds no words and the index
> is empty. Am I using the wrong object function?

Zope File objects do not support indexing their textual content. You
will need to implement your own text retrieval or use one of the
other indexes out there, such as Andreas Jung's TextIndexNG, which comes
with suitable modules that can pull text out of various file formats.
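
As a rough illustration of what "your own text retrieval" could look
like for PDF, here is a minimal sketch in modern Python (3.5 or later)
that shells out to the pdftotext command-line tool. It assumes pdftotext
is installed and on the PATH; the function name pdf_to_text and the
injectable run parameter are made up for this example, and nothing here
is taken from TextIndexNG.

```python
import os
import subprocess
import tempfile


def pdf_to_text(pdf_bytes, run=subprocess.run):
    """Extract text from PDF data by calling the external pdftotext tool.

    run is injectable so the function can be exercised without the tool
    installed. Hypothetical helper, assuming 'pdftotext' is on the PATH.
    """
    fd, path = tempfile.mkstemp(suffix=".pdf")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(pdf_bytes)
        # "-" as the output file asks pdftotext to write to stdout.
        result = run(["pdftotext", path, "-"],
                     stdout=subprocess.PIPE, check=True)
        return result.stdout.decode("utf-8", "replace")
    finally:
        # Always clean up the temporary copy of the PDF.
        os.remove(path)
```

The returned string is what an indexed attribute would then hand to the
text index; errors from the converter surface as exceptions instead of
silently producing an empty index.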


jens
