Of course there is, otherwise I wouldn't dare call the system a CMS
:-)

If you search with the binaries folder (or the root) as the target, you
can just put the search term in the <d:contains> element of the DASL
query.
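
As a rough illustration, a DASL basicsearch request using <d:contains>
could look like the sketch below. The scope path is borrowed from the
extractor examples in this mail, and the selected property and the
search term are placeholders; adjust all three to your own project.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: scope href, selected property and search term are assumptions -->
<d:searchrequest xmlns:d="DAV:">
  <d:basicsearch>
    <d:select>
      <d:prop>
        <d:displayname/>
      </d:prop>
    </d:select>
    <d:from>
      <d:scope>
        <!-- target folder to search in: the binaries folder (or the root) -->
        <d:href>/files/yourproject.preview/binaries</d:href>
        <d:depth>infinity</d:depth>
      </d:scope>
    </d:from>
    <d:where>
      <!-- full-text match against the content indexed by the extractors -->
      <d:contains>your search term</d:contains>
    </d:where>
  </d:basicsearch>
</d:searchrequest>
```

This only returns hits inside PDF or Office files once the matching
extractors are configured and the content has been indexed.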

In the repository you need to configure an extractor, if that has not
been done already. See [1] for the available extractors.

That page shows how to configure extractors for, for example, MS Word,
Excel, PowerPoint and PDF. If you add extractors, the repository has to
be reindexed. The simple way to do that is to remove the Lucene index,
but obviously be careful with this in a production environment.

Examples:

<extractor classname="nl.hippo.slide.extractor.ImagePropertyExtractor"
uri="/files/yourproject.preview/binaries"/>

<extractor classname="nl.hippo.slide.extractor.OfficeExtractor"
uri="/files/yourproject.preview/binaries"
content-type="application/vnd.ms-excel">
    <configuration>
      <instruction property="author"
        namespace="http://hippo.nl/cms/1.0" summary-information="4"/>
      <instruction property="application"
        namespace="http://hippo.nl/cms/1.0" summary-information="18"/>
      <instruction property="date"
        namespace="http://hippo.nl/cms/1.0" date-format="yyyyMMdd"
        summary-information="13"/>
      <instruction property="creationdate"
        namespace="http://hippo.nl/cms/1.0" date-format="yyyyMMdd"
        summary-information="12"/>
      <instruction property="caption"
        namespace="http://hippo.nl/cms/1.0" summary-information="2"/>
    </configuration>
</extractor>

<extractor classname="org.apache.slide.extractor.PDFExtractor"
uri="/files/yourproject.preview/binaries"
content-type="application/pdf"/>
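
For Word documents a content extractor can probably be configured the
same way as the PDF one. The sketch below is an assumption, not
something taken from a working setup: it supposes that your Slide build
ships a class named org.apache.slide.extractor.MSWordExtractor, so
please check the class name against [1] before using it.

```xml
<!-- Assumed class name; verify it against the extractor list in [1] -->
<extractor classname="org.apache.slide.extractor.MSWordExtractor"
uri="/files/yourproject.preview/binaries"
content-type="application/msword"/>
```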
  
-Ard

[1]
http://www.hippocms.org/display/CMS/4.+Hippo+Repository+Configure+Extractors

> Hello,
>
> Is it possible (using a DASL query) to search for some text inside a
> PDF or Word document?
>
> Thanks in advance
>
> ********************************************
> Hippocms-dev: Hippo CMS development public mailinglist
> 
> Searchable archives can be found at:
> MarkMail: http://hippocms-dev.markmail.org
> Nabble: http://www.nabble.com/Hippo-CMS-f26633.html