Hi Abdeslam,

Did you replace /files/yourproject.preview/binaries with the correct pathname for your project?
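
For illustration only: if the project were called myproject (a made-up name; use whatever path your binaries actually live under in the repository), the PDF extractor entry would look roughly like this:

```xml
<!-- "myproject" is a hypothetical project name; replace the uri with the
     real path of the binaries folder in your repository. -->
<extractor classname="org.apache.slide.extractor.PDFExtractor"
           uri="/files/myproject.preview/binaries"
           content-type="application/pdf"/>
```

The uri should cover the location where the PDF files are stored. If the placeholder path from the documentation is left in place, I would not expect the extractor to match your documents.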

Ard: is removing the Lucene index enough? Shouldn't the files be touched as well?

Jasha Joachimsthal

www.onehippo.com
Amsterdam - Hippo B.V. Oosteinde 11 1017 WT Amsterdam +31(0)20-5224466
San Francisco - Hippo USA Inc. 101 H Street, suite Q Petaluma CA 94952-3329 +1 (707) 773-4646

> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of
> Irahhoten, Abdeslam
> Sent: Tuesday, 15 July 2008 12:10
> To: Hippo CMS development public mailinglist
> Subject: RE: [HippoCMS-dev] search voor some text in pdf or word document
>
> Hello Ard,
>
> I have tried the following, but I still can't search for text
> inside the PDF documents.
>
> Maybe I am still missing some configuration; can you tell me
> what exactly the problem is?
>
> I have added this extractor:
>
> <extractor classname="org.apache.slide.extractor.PDFExtractor"
>            uri="/files/yourproject.preview/binaries"
>            content-type="application/pdf"/>
>
> and then I have used the following in my DASL query:
>
> <d:contains>${param.zoekwoorden}</d:contains>
>
> When I search for ${param.zoekwoorden} I get no results.
>
> Thanks in advance
>
> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Ard Schrijvers
> Sent: Monday, July 14, 2008 4:55 PM
> To: Hippo CMS development public mailinglist
> Subject: RE: [HippoCMS-dev] search voor some text in pdf or word document
>
> Of course there is, otherwise I wouldn't dare call the system a CMS :-)
>
> If you search with target binaries (or root), you can just type the
> search term in the <d:contains> element.
>
> In the repository you need to configure an extractor, if not already
> done. See [1] for possible extractors.
>
> There you can see how to configure extractors for, for example, MS Word,
> Excel, PowerPoint, PDF, etc. If you add them, reindexing has to be done.
> This is simply done by removing the Lucene index, but be careful with
> this in a production environment, obviously.
>
> Examples:
>
> <extractor classname="nl.hippo.slide.extractor.ImagePropertyExtractor"
>            uri="/files/yourproject.preview/binaries"/>
>
> <extractor classname="nl.hippo.slide.extractor.OfficeExtractor"
>            uri="/files/yourproject.preview/binaries"
>            content-type="application/vnd.ms-excel">
>   <configuration>
>     <instruction property="author"
>                  namespace="http://hippo.nl/cms/1.0"
>                  summary-information="4"/>
>     <instruction property="application"
>                  namespace="http://hippo.nl/cms/1.0"
>                  summary-information="18"/>
>     <instruction property="date"
>                  namespace="http://hippo.nl/cms/1.0"
>                  date-format="yyyyMMdd"
>                  summary-information="13"/>
>     <instruction property="creationdate"
>                  namespace="http://hippo.nl/cms/1.0"
>                  date-format="yyyyMMdd"
>                  summary-information="12"/>
>     <instruction property="caption"
>                  namespace="http://hippo.nl/cms/1.0"
>                  summary-information="2"/>
>   </configuration>
> </extractor>
>
> <extractor classname="org.apache.slide.extractor.PDFExtractor"
>            uri="/files/yourproject.preview/binaries"
>            content-type="application/pdf"/>
>
> -Ard
>
> [1] http://www.hippocms.org/display/CMS/4.+Hippo+Repository+Configure+Extractors
>
> > Hello,
> >
> > Is it possible (using a DASL query) to search for text inside a
> > PDF or Word document?
> >
> > Thanks in advance
> >
> > Disclaimer
> >
> > This e-mail and any attachments are confidential and are solely
> > intended for the addressee. If you are not the intended recipient,
> > please notify the sender and delete and/or destroy this message and
> > any attachments immediately.
> > It is prohibited to copy, to distribute, to disclose or to use this
> > e-mail and any attachments in any other way. Ordina N.V. and/or its
> > group companies do not accept any responsibility nor liability for
> > any damage resulting from the content of and/or the transmission of
> > this message.

********************************************
Hippocms-dev: Hippo CMS development public mailinglist

Searchable archives can be found at:
MarkMail: http://hippocms-dev.markmail.org
Nabble: http://www.nabble.com/Hippo-CMS-f26633.html
