The ConstantExtractor is not the one responsible for text indexing.
I must admit it is a little confusing, but if you only want the Dutch
part to be text indexed, you need to change the XMLContentExtractor
configuration instead. Take a look at [1]; it is all explained there.

Also note that for just reindexing, you do not need to touch all
documents again. That is only needed when you have changed an extractor
that actually sets a property (the ConstantExtractor does, by the way,
but the XMLContentExtractor does not; this is confusing, I know).

But what you should do, AFAICS, is keep the extractors the way they
are and just change the target/scope you are searching in from the
frontend: the frontend knows whether to search in the 'en' or 'nl' part.
Just adjust the search scope, and you need to change nothing in the
extractors.
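
To illustrate what "adjusting the search scope" could look like, here is a
rough sketch. It assumes the frontend queries the repository with a WebDAV
SEARCH (DASL) DAV:basicsearch request; the function name, the default base
path and the search term are hypothetical, so adapt them to whatever your
frontend actually sends.

```python
import xml.etree.ElementTree as ET

DAV = "DAV:"  # WebDAV namespace URI


def build_search_request(language, text,
                         base="/default/files/default.preview/content"):
    """Build a DAV:basicsearch body scoped to a single language branch.

    `base` is an assumption taken from the paths mentioned in this thread.
    """
    ET.register_namespace("D", DAV)
    root = ET.Element(f"{{{DAV}}}searchrequest")
    basic = ET.SubElement(root, f"{{{DAV}}}basicsearch")

    # Ask for all properties of the matching documents.
    select = ET.SubElement(basic, f"{{{DAV}}}select")
    ET.SubElement(select, f"{{{DAV}}}allprop")

    # The scope is the only part that differs per language.
    from_el = ET.SubElement(basic, f"{{{DAV}}}from")
    scope = ET.SubElement(from_el, f"{{{DAV}}}scope")
    ET.SubElement(scope, f"{{{DAV}}}href").text = f"{base}/{language}"
    ET.SubElement(scope, f"{{{DAV}}}depth").text = "infinity"

    # Full-text condition on the document content.
    where = ET.SubElement(basic, f"{{{DAV}}}where")
    ET.SubElement(where, f"{{{DAV}}}contains").text = text

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Search only below .../content/nl; the extractors stay untouched.
    print(build_search_request("nl", "fiets"))
```

Only the href inside the scope changes between 'nl' and 'en', which is why
the extractors and the index can stay exactly as they are.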

-Ard

> 
> Thank you for these useful tips.
> I've started using two different branches, one for English and one
> for Dutch.
> But I encountered the following problem.
> I changed the URI attribute in all my extractors, from
> 
>  <extractor classname="nl.hippo.slide.extractor.ConstantExtractor"
> uri="/files" content-type="text/xml">
> to
>  <extractor classname="nl.hippo.slide.extractor.ConstantExtractor"
> uri="/files/default.preview/content/nl" content-type="text/xml">
> 
> I removed the directory that contains "slide_index", then I
> restarted the repository and verified that all indexes were
> regenerated (I also used the RepoTouch tool). But when I search, I
> also find documents in the EN branch
> (/files/default.preview/content/en).
> Why?
> This is very strange, because I changed all my extractors correctly.
> How can I debug this situation?
> 
> thanks in advance,
> Alessandro
> 
> 
> 2008/5/5 Ard Schrijvers <[EMAIL PROTECTED]>:
> > Hello Alessandro,
> >
> >  If you distinguish your content hierarchically, why would you need
> >  different extractors? Normally, when we have multiple languages and
> >  they are separated by structure, all you need is to account for it
> >  in the frontend you are using.
> >
> >  If you have multiple languages within one document, you can take a 
> > look  at [1]
> >
> >  Regards Ard
> >
> >  [1]
> >  http://www.hippocms.org/display/CMS/Hippo+Repository+ConfigurableXMLContentExtractor
> >
> >
> >
> >  >
> >  > I'm adding multi-language in an existing site.
> >  > The site was built  with Hippo CMS and Hippo repository.
> >  >
> >  > I'm focusing on the repository:
> >  >
> >  > The original structure contains only this branch:
> >  >
> >  > /default/files/default.preview/content/nl
> >  >
> >  > and I've just added a new English branch
> >  >
> >  > /default/files/default.preview/content/en
> >  >
> >  > The current extractors index all the content, because they are
> >  > configured like this:
> >  > <extractor
> >  > classname="nl.hippo.slide.extractor.HippoSimpleXmlExtractor"
> >  >  uri="/files" content-type="text/xml">
> >  >
> >  > The attribute uri="/files" is too generic, therefore a
> >  > property like this:
> >  >
> >  > <instruction property="title" namespace="http://hippo.nl/cms/1.0"
> >  > xpath="/document/content/general-item/title"/>
> >  >
> >  > contains both English and Dutch elements.
> >  >
> >  > Moreover, I found in indexer.xml several <property> elements that use
> >  > org.apache.lucene.analysis.nl.DutchAnalyzer.
> >  >
> >  > Hope someone can help me with some tips on multi-language
> >  > repositories.
> >  >
> >  > Thanks in advance
> >  >
> >  > Alessandro
> >  > ********************************************
> >  > Hippocms-dev: Hippo CMS development public mailinglist
