Somewhere in the Slide CVS HEAD there is a file called Extractor-Domain.xml that contains settings for the indexers (property and content) and the extractors (PDF, Office, ...).
You could take a look at that. Here are my working settings (from my Domain.xml):


.... (at the end of my <store> definition)
<contentindexer classname="org.apache.slide.index.lucene.LuceneContentIndexer">
<!-- indexpath is the name of the folder that Lucene creates under your tomcat/bin/ folder (if you are using Tomcat). Unfortunately Slide does not support storing the Lucene index in the Slide repository :( so if you have multiple contexts running you must change this parameter so they don't collide -->
<parameter name="indexpath">store/index/content</parameter>
<!-- asynchron makes the indexing happen in its own thread; not really needed unless you need really rapid writes to the Slide store without simultaneous indexing -->
<parameter name="asynchron">true</parameter>
</contentindexer>
<propertiesindexer classname="org.apache.slide.index.lucene.LucenePropertiesIndexer">
<parameter name="indexpath">store/index/metadata</parameter>
<parameter name="asynchron">true</parameter>


<!-- Here you define any custom properties you want to index, i.e. those you add to a resource with propPatchMethod. We have one extra property. -->
<configuration name="indexed-properties">
<property name="ContentType" namespace="IW:">
<text/>
<is-defined/>
</property>
</configuration>
</propertiesindexer>
</store>
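
For reference, a custom property like the ContentType one above would typically be set on a resource with a WebDAV PROPPATCH request. Here is a minimal sketch of the request body, assuming the IW: namespace from my settings; the iw prefix and the value "article" are made up for illustration:

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Hypothetical PROPPATCH body: sets ContentType in the IW: namespace. -->
<!-- The prefix "iw" and the value "article" are illustrative assumptions. -->
<D:propertyupdate xmlns:D="DAV:" xmlns:iw="IW:">
  <D:set>
    <D:prop>
      <iw:ContentType>article</iw:ContentType>
    </D:prop>
  </D:set>
</D:propertyupdate>
```

The property name and namespace in the request need to match the name and namespace attributes of the <property> element in the indexed-properties configuration.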


Then, later in Domain.xml, I add the extractors and set them to the paths we want to index, in our case just everything under /files:
....
<parameter name="versioncontrol-exclude"/>
<parameter name="checkout-fork">forbidden</parameter>
<parameter name="checkin-fork">forbidden</parameter>



<!-- Extractor configuration -->
<extractors>
<!--XML extractors-->
<extractor classname="org.apache.slide.extractor.SimpleXmlExtractor" uri="/files">
<!-- BTW this extractor does NOT work if you have a namespace in your XML document. I have these settings here because someday it will... -->
<configuration>
<instruction namespace="http://xmlns.idega.com/block/article/xml" property="headline" xpath="/article/headline/text()" />
<instruction namespace="http://xmlns.idega.com/block/article/xml" property="teaser" xpath="/article/teaser/text()" />
<instruction namespace="http://xmlns.idega.com/block/article/xml" property="body" xpath="/article/body/text()" />
<instruction namespace="http://xmlns.idega.com/block/article/xml" property="author" xpath="/article/author/text()" />
<instruction namespace="http://xmlns.idega.com/block/article/xml" property="source" xpath="/article/source/text()" />
<instruction namespace="http://xmlns.idega.com/block/article/xml" property="comment" xpath="/article/comment/text()" />
</configuration>
</extractor>
<extractor classname="org.apache.slide.extractor.XmlContentExtractor" uri="/files"/>
<!--XML extractors-->


<!--PDF extractors-->
<extractor classname="org.apache.slide.extractor.PDFExtractor" uri="/files" />
<!--PDF extractors-->


<!--Text extractors-->
<extractor classname="org.apache.slide.extractor.TextContentExtractor" uri="/files" />
<!--Text extractors-->


<!--Office extractors-->
<extractor classname="org.apache.slide.extractor.OfficeExtractor" uri="/files">
<configuration>
<instruction property="author" id="SummaryInformation-0-4" />
<instruction property="application" id="SummaryInformation-0-18" />
</configuration>
</extractor>
<extractor classname="org.apache.slide.extractor.MSWordExtractor" uri="/files"/>
<extractor classname="org.apache.slide.extractor.MSExcelExtractor" uri="/files"/>
<extractor classname="org.apache.slide.extractor.MSPowerPointExtractor" uri="/files"/>
<!--Office extractors-->


    </extractors>

<!-- Event configuration -->
<events>
<event classname="org.apache.slide.webdav.event.WebdavEvent" enable="true" />
<event classname="org.apache.slide.event.ContentEvent" enable="true" />
...
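
Once the indexers and extractors are active, a DASL SEARCH request should be able to use the content index. Here is a minimal sketch of a request body using the standard DAV: basicsearch grammar. The search term is made up, and the form of the scope href (here /files) depends on how your Slide context is mapped, so treat both as assumptions:

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Hypothetical SEARCH body: full-text query over everything under /files. -->
<!-- The scope href and the search term "lucene" are illustrative assumptions. -->
<D:searchrequest xmlns:D="DAV:">
  <D:basicsearch>
    <D:select>
      <D:prop>
        <D:displayname/>
      </D:prop>
    </D:select>
    <D:from>
      <D:scope>
        <D:href>/files</D:href>
        <D:depth>infinity</D:depth>
      </D:scope>
    </D:from>
    <D:where>
      <D:contains>lucene</D:contains>
    </D:where>
  </D:basicsearch>
</D:searchrequest>
```

If this returns nothing for a document you know contains the term, check that the index folders named in indexpath were actually created.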


And that's how we shave!

Best Regards

Eirikur S. Hrafnsson, [EMAIL PROTECTED]
Chief Software Engineer
Idega Software
http://www.idega.com



On 13.4.2005, at 09:19, Edmund Urbani wrote:

Eirikur Hrafnsson wrote:
On 12.4.2005, at 13:19, Edmund Urbani wrote:
Bertrand Tignon wrote:

Thank u for replying Edmund.
Well, I'm using Slide 2.1
I didn't manage to get the Slide 2.2 via cvs. I read the "how-to" but I
don't see the 2.2 version, is it called "Slide_HEAD_PRE_MERGE", or
"SLIDE_HEAD_AFTER_EVENTS" or something like that ?
About the wiki "DASL Configuration", I don't know how to get the lucene
library needed (package org.apache.slide.index.*).
thanx for your help
Bertrand.

There is no 2.2 release, yet. The closest you get to 2.2 is the current
CVS HEAD.
That org.apache.slide.index package is in slide-stores-2.x.jar. It's there
even in 2.1, even though it apparently does not work.


Maybe I should ask a different question on this list:
Does the LuceneIndexer that is currently in CVS HEAD work?
It does, the version in 2.1 does not.
-Eiki
Thanks. That's good to hear. I was about to give up.

Now I'd like to go back to the question I had earlier:
Do I need to add anything to my Domain.xml other than the <contentindexer ..>
element (as explained in the Wiki) to make the lucene indexer work?


 Edmund
