> But maybe I don't understand exactly what you mean by removing the
> Lucene index. Where can I do that in the configuration (in which file
> can I do that)?

You stop the repository, go to the location where the Lucene indexes are
stored (this is configured in your repository), and delete that directory
(it is a directory containing files like .cfs, .del, segments).

Ard

> Abdeslam
>
> -----Original message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On behalf of Ard Schrijvers
> Sent: Tuesday, July 15, 2008 12:58 PM
> To: Hippo CMS development public mailinglist
> Subject: RE: [HippoCMS-dev] search voor some text in pdf or word
> document
>
> > Hi Abdeslam,
> >
> > Did you replace /files/yourproject.preview/binaries with the correct
> > pathname for your project?
> >
> > Ard: is removing the lucene index enough? Shouldn't the files be
> > touched?
>
> Removing is enough. Touching is only needed when extractors run that
> set a property.
>
> Ard
>
> > Jasha Joachimsthal
> >
> > www.onehippo.com
> > Amsterdam - Hippo B.V. Oosteinde 11 1017 WT Amsterdam
> > +31(0)20-5224466
> > San Francisco - Hippo USA Inc. 101 H Street, suite Q Petaluma CA
> > 94952-3329 +1 (707) 773-4646
> >
> > > -----Original Message-----
> > > From: [EMAIL PROTECTED]
> > > [mailto:[EMAIL PROTECTED] On Behalf Of Irahhoten, Abdeslam
> > > Sent: Tuesday, July 15, 2008 12:10
> > > To: Hippo CMS development public mailinglist
> > > Subject: RE: [HippoCMS-dev] search voor some text in pdf or word
> > > document
> > >
> > > Hello Ard,
> > >
> > > I have tried the following, but I still can't search for text
> > > inside the PDF documents.
> > >
> > > Maybe I am still missing some configuration; can you tell me what
> > > exactly the problem is?
> > >
> > > I have added this extractor:
> > >
> > > <extractor classname="org.apache.slide.extractor.PDFExtractor"
> > >   uri="/files/yourproject.preview/binaries"
> > >   content-type="application/pdf"/>
> > >
> > > and then I have used the following in my DASL query:
> > >
> > > <d:contains>${param.zoekwoorden}</d:contains>
> > >
> > > When I search for ${param.zoekwoorden}, I see nothing.
> > >
> > > Thanks in advance
> > >
> > > -----Original message-----
> > > From: [EMAIL PROTECTED]
> > > [mailto:[EMAIL PROTECTED] On behalf of Ard Schrijvers
> > > Sent: Monday, July 14, 2008 4:55 PM
> > > To: Hippo CMS development public mailinglist
> > > Subject: RE: [HippoCMS-dev] search voor some text in pdf or word
> > > document
> > >
> > > Of course there is, otherwise I wouldn't dare call the system a
> > > CMS :-)
> > >
> > > If you search with target binaries (or root), you can just type the
> > > search term in the <d:contains> element.
> > >
> > > In the repository you need to configure an extractor, if that has
> > > not already been done. See [1] for possible extractors.
> > >
> > > There you can see how to configure extractors for MS Word, Excel,
> > > PowerPoint, PDF, etc. If you add them, reindexing has to be done.
> > > This is simply done by removing the Lucene index, but obviously be
> > > careful with this in a production environment.
> > >
> > > Examples:
> > >
> > > <extractor
> > >   classname="nl.hippo.slide.extractor.ImagePropertyExtractor"
> > >   uri="/files/yourproject.preview/binaries"/>
> > >
> > > <extractor classname="nl.hippo.slide.extractor.OfficeExtractor"
> > >   uri="/files/yourproject.preview/binaries"
> > >   content-type="application/vnd.ms-excel">
> > >   <configuration>
> > >     <instruction property="author"
> > >       namespace="http://hippo.nl/cms/1.0"
> > >       summary-information="4"/>
> > >     <instruction property="application"
> > >       namespace="http://hippo.nl/cms/1.0"
> > >       summary-information="18"/>
> > >     <instruction property="date"
> > >       namespace="http://hippo.nl/cms/1.0"
> > >       date-format="yyyyMMdd" summary-information="13"/>
> > >     <instruction property="creationdate"
> > >       namespace="http://hippo.nl/cms/1.0"
> > >       date-format="yyyyMMdd" summary-information="12"/>
> > >     <instruction property="caption"
> > >       namespace="http://hippo.nl/cms/1.0"
> > >       summary-information="2"/>
> > >   </configuration>
> > > </extractor>
> > >
> > > <extractor classname="org.apache.slide.extractor.PDFExtractor"
> > >   uri="/files/yourproject.preview/binaries"
> > >   content-type="application/pdf"/>
> > >
> > > -Ard
> > >
> > > [1] http://www.hippocms.org/display/CMS/4.+Hippo+Repository+Configure+Extractors
> > >
> > > > Hello,
> > > >
> > > > Is it perhaps possible (using a DASL query) to search for some
> > > > text inside a PDF or Word document?
> > > >
> > > > Thanks in advance
> > > >
> > > > Disclaimer
> > > >
> > > > This e-mail and any attachments are confidential and are solely
> > > > intended for the addressee. If you are not the intended
> > > > recipient, please notify the sender and delete and/or destroy
> > > > this message and any attachments immediately.
> > > > It is prohibited to copy, to distribute, to disclose or to use
> > > > this e-mail and any attachments in any other way. Ordina N.V.
> > > > and/or its group companies do not accept any responsibility nor
> > > > liability for any damage resulting from the content of and/or the
> > > > transmission of this message.
> > > > ********************************************
> > > > Hippocms-dev: Hippo CMS development public mailinglist
> > > >
> > > > Searchable archives can be found at:
> > > > MarkMail: http://hippocms-dev.markmail.org
> > > > Nabble: http://www.nabble.com/Hippo-CMS-f26633.html
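
For reference, the <d:contains> search discussed in this thread might look roughly like the request below. This is only a minimal sketch following the WebDAV SEARCH basicsearch grammar (RFC 5323), not the query actually used in the project. The selected property (d:displayname) and the scope href, which simply reuses the placeholder path /files/yourproject.preview/binaries from the extractor examples, are assumptions; ${param.zoekwoorden} is the request parameter mentioned earlier in the thread.

```xml
<!-- Sketch of a DASL basicsearch request; adapt before use. -->
<d:searchrequest xmlns:d="DAV:">
  <d:basicsearch>
    <d:select>
      <!-- Assumption: return only the display name of each hit. -->
      <d:prop>
        <d:displayname/>
      </d:prop>
    </d:select>
    <d:from>
      <d:scope>
        <!-- Assumption: placeholder path; replace it with the binaries
             location of your own project. -->
        <d:href>/files/yourproject.preview/binaries</d:href>
        <d:depth>infinity</d:depth>
      </d:scope>
    </d:from>
    <d:where>
      <!-- Full-text match on the search term supplied by the user. -->
      <d:contains>${param.zoekwoorden}</d:contains>
    </d:where>
  </d:basicsearch>
</d:searchrequest>
```

As noted in the thread, text inside PDF or Office files is only matched once the corresponding extractor is configured for that path and the Lucene index has been removed so that the repository reindexes.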
