Forgot some things:
On Wednesday, 25 February 2004 13:37, Daniel Florey wrote:
> Hi,
> I just checked in some classes for the extractor thing.
> I've implemented a very simple demo extractor that extracts data from xml
> documents by doing some configurable xpath queries. If you want to test
> this you have to enable the extractor trigger.
> This is done in the Domain.xml file in the event section:
>
> <listener classname="org.apache.slide.extractor.ExtractorTrigger">
> <configuration>
> <extractor
> classname="org.apache.slide.extractor.SimpleXmlExtractor"
> uri="/files/articles/test.xml">
You can match by exact URIs as well as by URI substrings (e.g.
uri="/files/articles/"), and you can also match on content type. If you
want to extract something from all Word documents, it would look something like this:
<listener classname="org.apache.slide.extractor.ExtractorTrigger">
<configuration>
<extractor classname="org.apache.slide.extractor.MSWordExtractor"
content-type="application/ms-word">
Use whatever the actual content type is. URI matching and content-type matching
can be combined.
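
To make the combined matching concrete, here is a rough sketch of how such a check could work. It is not the actual ExtractorTrigger code: the class and method names are made up for illustration, and I am assuming that "substring" means a plain contains() test and that an omitted attribute means "no restriction".

```java
class ExtractorMatchSketch {

    /** Hypothetical holder for the uri and content-type attributes of one extractor entry. */
    static final class Rule {
        final String uri;          // null means "no uri restriction" (assumption)
        final String contentType;  // null means "no content-type restriction" (assumption)

        Rule(String uri, String contentType) {
            this.uri = uri;
            this.contentType = contentType;
        }

        /** Returns true only if every configured restriction matches the resource. */
        boolean matches(String resourceUri, String resourceContentType) {
            // An exact uri is just a special case of a substring match.
            boolean uriOk = uri == null || resourceUri.contains(uri);
            boolean typeOk = contentType == null || contentType.equals(resourceContentType);
            return uriOk && typeOk;
        }
    }

    public static void main(String[] args) {
        Rule wordDocsInArticles = new Rule("/files/articles/", "application/ms-word");
        // Both restrictions match.
        System.out.println(wordDocsInArticles.matches("/files/articles/report.doc", "application/ms-word"));
        // The content type matches, but the uri does not.
        System.out.println(wordDocsInArticles.matches("/files/other/report.doc", "application/ms-word"));
    }
}
```

With both attributes set, a resource is only handed to the extractor when the uri and the content type both match; with only one attribute set, the other check is skipped.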
You can configure your extractor by implementing the Configuration interface.
Have a look at the demo extractor; it is quite simple.
Regards,
Daniel
> <configuration>
> <instruction property="title" xpath="/article/title/text()" />
> <instruction property="summary" xpath="/article/summary/text()" />
> </configuration>
> </extractor>
> </configuration>
> </listener>
>
> In this example, only the document with the uri /files/articles/test.xml will
> be processed. If its content is:
>
> <?xml version="1.0" encoding="UTF-8" ?>
> <article>
> <title>Title of article</title>
> <summary>The summary of this article</summary>
> </article>
>
> then two new properties (title, summary) become available, containing the
> extracted text. If an error occurs, the file cannot be uploaded; this is done by
> throwing an extractor exception.
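
For anyone who wants to see what the two instructions above evaluate to, here is a small standalone sketch using only the JDK's javax.xml.xpath API. It is not the SimpleXmlExtractor source; the class name and the extract() helper are hypothetical, and the map simply stands in for the property/xpath pairs from the configuration.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class XPathExtractionSketch {

    /** Evaluates each (property name, XPath) instruction against the XML text. */
    static Map<String, String> extract(String xml, Map<String, String> instructions)
            throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        Map<String, String> properties = new LinkedHashMap<>();
        for (Map.Entry<String, String> instruction : instructions.entrySet()) {
            // Use a fresh InputSource per query, since the reader is consumed by each evaluation.
            InputSource source = new InputSource(new StringReader(xml));
            properties.put(instruction.getKey(), xpath.evaluate(instruction.getValue(), source));
        }
        return properties;
    }

    public static void main(String[] args) throws XPathExpressionException {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\" ?>"
                + "<article>"
                + "<title>Title of article</title>"
                + "<summary>The summary of this article</summary>"
                + "</article>";

        // Same property/xpath pairs as in the configuration example.
        Map<String, String> instructions = new LinkedHashMap<>();
        instructions.put("title", "/article/title/text()");
        instructions.put("summary", "/article/summary/text()");

        extract(xml, instructions).forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

In this sketch a failed evaluation is reported as an XPathExpressionException, which plays roughly the role of the extractor exception mentioned above.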
> Any comments are welcome.
> Regards,
> Daniel
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]