> Hi Abdeslam,
>
> Did you replace /files/yourproject.preview/binaries with the
> correct pathname for your project?
>
> Ard: is removing the lucene index enough? Shouldn't the files
> be touched?
Removing is enough. Touching is only needed when extractors run that set a property.

Ard

>
> Jasha Joachimsthal
>
> www.onehippo.com
> Amsterdam - Hippo B.V. Oosteinde 11 1017 WT Amsterdam
> +31(0)20-5224466
> San Francisco - Hippo USA Inc. 101 H Street, suite Q Petaluma CA
> 94952-3329 +1 (707) 773-4646
>
> > -----Original Message-----
> > From: [EMAIL PROTECTED]
> > [mailto:[EMAIL PROTECTED] On Behalf Of
> > Irahhoten, Abdeslam
> > Sent: Tuesday 15 July 2008 12:10
> > To: Hippo CMS development public mailinglist
> > Subject: RE: [HippoCMS-dev] search voor some text in pdf or
> > word document
> >
> > Hello Ard,
> >
> > I have tried the following, but I still can't search for text
> > inside the PDF documents.
> >
> > Maybe I am still missing some configuration; can you tell me
> > what exactly the problem is?
> >
> > I have added this extractor:
> >
> > <extractor classname="org.apache.slide.extractor.PDFExtractor"
> >            uri="/files/yourproject.preview/binaries"
> >            content-type="application/pdf"/>
> >
> > and then I have used the following in my DASL query:
> >
> > <d:contains>${param.zoekwoorden}</d:contains>
> >
> > When I search for ${param.zoekwoorden} I see nothing.
> >
> > Thanks in advance
> >
> > -----Original Message-----
> > From: [EMAIL PROTECTED]
> > [mailto:[EMAIL PROTECTED] On Behalf Of Ard Schrijvers
> > Sent: Monday, July 14, 2008 4:55 PM
> > To: Hippo CMS development public mailinglist
> > Subject: RE: [HippoCMS-dev] search voor some text in pdf or
> > word document
> >
> > Of course there is, otherwise I wouldn't dare call the system
> > a CMS :-)
> >
> > If you search with target binaries (or root) you can just
> > type the search term in the <d:contains> element.
> >
> > In the repository you need to configure an extractor if not
> > already done. See [1] for possible extractors.
> >
> > There you can see how to configure, for example, MS Word, Excel,
> > PowerPoint, PDF, etc. If you add them, reindexing has to be done.
> > This is simply done by removing the Lucene index, but be careful
> > with this in a production environment, obviously.
> >
> > Examples:
> >
> > <extractor classname="nl.hippo.slide.extractor.ImagePropertyExtractor"
> >            uri="/files/yourproject.preview/binaries"/>
> >
> > <extractor classname="nl.hippo.slide.extractor.OfficeExtractor"
> >            uri="/files/yourproject.preview/binaries"
> >            content-type="application/vnd.ms-excel">
> >   <configuration>
> >     <instruction property="author"
> >                  namespace="http://hippo.nl/cms/1.0"
> >                  summary-information="4"/>
> >     <instruction property="application"
> >                  namespace="http://hippo.nl/cms/1.0"
> >                  summary-information="18"/>
> >     <instruction property="date"
> >                  namespace="http://hippo.nl/cms/1.0"
> >                  date-format="yyyyMMdd"
> >                  summary-information="13"/>
> >     <instruction property="creationdate"
> >                  namespace="http://hippo.nl/cms/1.0"
> >                  date-format="yyyyMMdd"
> >                  summary-information="12"/>
> >     <instruction property="caption"
> >                  namespace="http://hippo.nl/cms/1.0"
> >                  summary-information="2"/>
> >   </configuration>
> > </extractor>
> >
> > <extractor classname="org.apache.slide.extractor.PDFExtractor"
> >            uri="/files/yourproject.preview/binaries"
> >            content-type="application/pdf"/>
> >
> > -Ard
> >
> > [1] http://www.hippocms.org/display/CMS/4.+Hippo+Repository+Configure+Extractors
> >
> > > Hello,
> > >
> > > Is it perhaps possible (using a DASL query) to search for some
> > > text inside a PDF or Word document?
> > >
> > > Thanks in advance
> > >
> > > Disclaimer
> > >
> > > This e-mail and any attachments are confidential and are solely
> > > intended for the addressee. If you are not the intended recipient,
> > > please notify the sender and delete and/or destroy this message and
> > > any attachments immediately.
> > > It is prohibited to copy, to distribute, to disclose or to use this
> > > e-mail and any attachments in any other way. Ordina N.V. and/or its
> > > group companies do not accept any responsibility nor liability for any
> > > damage resulting from the content of and/or the transmission of this
> > > message.
> > > ********************************************
> > > Hippocms-dev: Hippo CMS development public mailinglist
> > >
> > > Searchable archives can be found at:
> > > MarkMail: http://hippocms-dev.markmail.org
> > > Nabble: http://www.nabble.com/Hippo-CMS-f26633.html
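
For readers following the thread, here is a minimal sketch of a complete DASL request wrapped around the <d:contains> element discussed above. The outer elements (searchrequest, basicsearch, select, from, where) follow the standard WebDAV SEARCH grammar. The selected property (displayname), the depth of infinity, and the literal search term "invoice" are assumptions for illustration only. The scope path is the placeholder from the thread and must be replaced with the pathname of your own project. The thread does not confirm that the repository expects exactly this envelope, so treat it as a starting point rather than a verified configuration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<d:searchrequest xmlns:d="DAV:">
  <d:basicsearch>
    <!-- Assumption: return only the display name of each match -->
    <d:select>
      <d:prop>
        <d:displayname/>
      </d:prop>
    </d:select>
    <!-- Placeholder path from the thread; replace with your project's binaries folder -->
    <d:from>
      <d:scope>
        <d:href>/files/yourproject.preview/binaries</d:href>
        <d:depth>infinity</d:depth>
      </d:scope>
    </d:from>
    <!-- Full-text match; "invoice" is an illustrative term standing in for ${param.zoekwoorden} -->
    <d:where>
      <d:contains>invoice</d:contains>
    </d:where>
  </d:basicsearch>
</d:searchrequest>
```

If a query like this returns nothing for a term that is known to occur in a PDF, the two things the thread points to are the uri attribute of the extractor (it must match the real binaries path) and the need to remove the Lucene index so that existing documents are indexed again with the new extractor.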
