[EMAIL PROTECTED] wrote:
Hi,
I am trying to get the new Lucene indexer to work, but so far without
success. I'm working with the Slide head/trunk.
I followed the steps in the Wiki to configure DASL/Lucene in Domain.xml,
and the indexes are created when the server starts up.
But the content is never updated; the index just contains a single segment file.
Is there something that I missed?
Daniel
BTW: Is there a way to search the mailing list archives? The layout changed
and there no longer seems to be a search function.
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
Try this configuration in your Domain.xml,
inside the store element:
<contentindexer classname="org.apache.slide.index.TextContentIndexer">
    <parameter name="indexpath">${filespath}index/content</parameter>
</contentindexer>
<propertiesindexer classname="org.apache.slide.index.lucene.LucenePropertiesIndexer">
    <parameter name="indexpath">${filespath}index/metadata</parameter>
    <configuration name="indexed-properties">
        <property name="author" namespace="DAV:">
            <text analyzer="org.apache.lucene.analysis.WhitespaceAnalyzer"/>
            <is-defined/>
        </property>
    </configuration>
</propertiesindexer>
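To see whether the indexes are actually being written after a change like this, it can help to list the files in the two index directories before and after uploading a document. Below is a minimal sketch in Python using only the standard library. The function name list_index_files is made up for illustration, and the directory you pass in is whatever ${filespath}index/content or ${filespath}index/metadata resolves to on your machine (that path is an assumption you need to fill in yourself). It only looks at the file system and does not talk to Slide or Lucene.

```python
import os

def list_index_files(index_dir):
    """Return (name, size_in_bytes, mtime) for each regular file in index_dir.

    index_dir is a hypothetical path: point it at the directory that the
    'indexpath' parameter resolves to on your installation.
    """
    entries = []
    for name in sorted(os.listdir(index_dir)):
        path = os.path.join(index_dir, name)
        if os.path.isfile(path):  # skip subdirectories
            info = os.stat(path)
            entries.append((name, info.st_size, info.st_mtime))
    return entries
```

Calling list_index_files on each index directory once before and once after a PUT, and comparing the two results, shows whether any file was added or any size or modification time changed. If nothing changes, the indexer is probably not being triggered at all.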
And configure the extractors:
<!-- Extractor configuration -->
<extractors>
    <extractor classname="org.apache.slide.extractor.SimpleXmlExtractor"
               uri="/files/articles/test.xml">
        <configuration>
            <instruction property="title" xpath="/article/title/text()" />
            <instruction property="summary" xpath="/article/summary/text()" />
        </configuration>
    </extractor>
    <extractor classname="org.apache.slide.extractor.OfficeExtractor"
               uri="/files/docs/">
        <configuration>
            <instruction property="author" id="SummaryInformation-0-4" />
            <instruction property="application" id="SummaryInformation-0-18" />
        </configuration>
    </extractor>
    <extractor classname="org.apache.slide.extractor.TextContentExtractor"
               uri="/files/spaces">
    </extractor>
    <extractor classname="org.apache.slide.extractor.XmlContentExtractor"
               uri="/files/spaces">
    </extractor>
    <extractor classname="org.apache.slide.extractor.MSWordExtractor"
               uri="/files/spaces">
    </extractor>
    <extractor classname="org.apache.slide.extractor.MSExcelExtractor"
               uri="/files/spaces">
    </extractor>
    <extractor classname="org.apache.slide.extractor.MSPowerPointExtractor"
               uri="/files/spaces">
    </extractor>
    <extractor classname="org.apache.slide.extractor.PDFExtractor"
               uri="/files/spaces">
    </extractor>
</extractors>
This should do the trick.
Regards,
Fabrice