OK to add a uri attribute to the document tag (in fact my new version
already uses this feature; the posted version is old, but I can update it
with the new one).
With the official LuceneIndexTransformer we can't specify a user-defined
field name or a field type, which is very important for the indexing process:
- text (tokenized, indexed)
- keyword (not tokenized, indexed)
- date (not tokenized, indexed, allows special date searches) (note: the
date type is not available in the official LuceneIndexTransformer)
So we can't really index XML data. Imagine we have to index an XML document
with this structure:
<name>Nicolas Maisonneuve</name>
<date>03/11/1979</date>
<keywords>
<keyword>keyword1</keyword>
<keyword>keyword3</keyword>
<keyword>keyword2</keyword>
</keywords>
<descriptions>
<description1>1qsd qsdqs dsqd</description1>
<description1>2qsd qsdqs dsqd</description1>
<description2> qsd qdsq dqs dsq </description2>
</descriptions>
With my version you can index this kind of document using XSL. The result
can be, for example:
....
<lucene:document uri="http://cocoon/mydocument.xml">
<lucene:field name="name" type="text">Nicolas Maisonneuve</lucene:field>
<lucene:field name="keyword" type="keyword">keyword1</lucene:field>
<lucene:field name="keyword" type="keyword">keyword2</lucene:field>
<lucene:field name="keyword" type="keyword">keyword3</lucene:field>
<lucene:field name="description1" type="keyword">1qsd qsdqs dsqd</lucene:field>
<lucene:field name="description2" type="keyword">2qsd qsdqs dsqd</lucene:field>
<lucene:field name="date" type="date" dateformat="MM/dd/yyyy">11/03/1979</lucene:field>
</lucene:document>
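
To make the three field types concrete, here is a rough Java sketch of the
terms each type might put in the index. It is only an illustration: the
names FieldSketch and FieldType are hypothetical and are not part of the
LuceneIndexTransformer or of Lucene, the whitespace/lower-case split is a
crude stand-in for a real analyzer, and normalising dates to yyyyMMdd (so
that string order matches date order) is my assumption about how a date
field could be made searchable.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not the transformer's actual code.
final class FieldSketch {
    private FieldSketch() {}

    // Returns the terms that would be indexed for one field value.
    // dateFormat is only used for DATE fields (the dateformat attribute).
    static List<String> terms(FieldType type, String value, String dateFormat) {
        switch (type) {
            case TEXT:
                // Tokenized: crude stand-in for an analyzer.
                return Arrays.asList(
                        value.trim().toLowerCase(Locale.ROOT).split("\\s+"));
            case DATE:
                // Not tokenized: parsed with the given pattern, then
                // normalised to yyyyMMdd (assumption, see lead-in).
                LocalDate date = LocalDate.parse(
                        value.trim(), DateTimeFormatter.ofPattern(dateFormat));
                return Collections.singletonList(
                        date.format(DateTimeFormatter.BASIC_ISO_DATE));
            default:
                // KEYWORD: not tokenized, the whole value is one term.
                return Collections.singletonList(value);
        }
    }

    public static void main(String[] args) {
        // prints [nicolas, maisonneuve]
        System.out.println(terms(FieldType.fromAttribute("text"),
                "Nicolas Maisonneuve", null));
        // prints [1qsd qsdqs dsqd]
        System.out.println(terms(FieldType.fromAttribute("keyword"),
                "1qsd qsdqs dsqd", null));
        // prints [19791103]
        System.out.println(terms(FieldType.fromAttribute("date"),
                "11/03/1979", "MM/dd/yyyy"));
    }
}

// Mirrors the values of the type attribute on <lucene:field>.
enum FieldType {
    TEXT, KEYWORD, DATE;

    static FieldType fromAttribute(String type) {
        return valueOf(type.toUpperCase(Locale.ROOT));
    }
}
```

The point of the sketch is the difference in matching: a text field can be
found by any of its words, while a keyword or date field only matches on
the complete value.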
Nicolas Maisonneuve
----- Original Message -----
From: "Conal Tuohy" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Thursday, November 04, 2004 02:55
Subject: RE: Extend Lucene sample to RDBMS?
Nicolas Maisonneuve wrote:
> See also a different version of LuceneIndexTransformer
> (indexes XML data, deletion available):
> http://marc.theaimsgroup.com/?l=xml-cocoon-dev&m=107821889332237&w=2
Indexing XML data is not new. The current LuceneIndexTransformer does this
already.
The current version doesn't do deletion quite the same way, but it will
over-write an existing record, which allows for "incremental update". This
is almost the same as the deletion feature in the variant version, given
that you can specify a <lucene:document> element with no content. It looks
to me as if with the variant version you can either delete a record, or add
it, but not update a record.
The big difference is that the variant version does not recognise documents
by URI - a document does not necessarily have a unique key - it simply has a
collection of fields. One or more of these fields may be a URI, but it may
not. So it will allow you to add 2 records with the same URI, whereas the
current version identifies each record (document) with a URI, and it
automatically deletes any record with a given URI before adding a new
record with that URI.
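
The difference can be sketched with two small in-memory classes. This is
only an illustration of the behaviour described above, not code from either
transformer: UriKeyedIndex and AppendOnlyIndex are hypothetical names, and
plain strings stand in for Lucene documents.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Stand-in for the current version: one record per URI.
final class UriKeyedIndex {
    private final Map<String, String> recordsByUri = new LinkedHashMap<>();

    // Deletes any record with this URI, then adds the new one.
    // Empty content models a <lucene:document> with no content,
    // so the record is deleted and nothing is added.
    void index(String uri, String content) {
        recordsByUri.remove(uri);
        if (content != null && !content.isEmpty()) {
            recordsByUri.put(uri, content);
        }
    }

    int size() {
        return recordsByUri.size();
    }

    String get(String uri) {
        return recordsByUri.get(uri);
    }
}

// Stand-in for the variant version: a record is just a set of fields,
// so nothing stops two records from carrying the same URI.
final class AppendOnlyIndex {
    private final List<String[]> records = new ArrayList<>();

    void add(String uri, String content) {
        records.add(new String[] {uri, content});
    }

    long count(String uri) {
        return records.stream().filter(r -> r[0].equals(uri)).count();
    }
}
```

Indexing the same URI twice leaves one record in UriKeyedIndex (the second
one) and two records in AppendOnlyIndex, which is the "incremental update"
versus "add only" distinction.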
However, the variant version does have some really good features (such as
the "boost" factor) which are certainly worth having.
I think it would be good to merge the two, actually!
Con
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]