We evaluated Verity for several weeks and had a consultant on site helping
us for a few days.  We were favorably impressed with the product but ended
up choosing Lucene after a week or so of comparing the two head-to-head.
Had our requirements been different, I can see Verity as being a superior
choice in many instances.  For starters, it does a whole lot more 'out of
the box'.  However, we have had great success with Lucene thus far - no
regrets.

We are indexing a large corpus of XML documents (~10M).  One thing that
Verity does with XML documents is that it indexes each XML tag as a zone.*
What's cool about it is that the zones are nested so that it mirrors the
schema of your XML document.  You can limit your search to any part of the
document by searching on specific zones.  A Verity zone is analogous to a
Lucene field.  Verity also has 'field' indexes - but these are a different
kind of index that Lucene does not have.  Verity fields allow you to index
various numeric types, date types etc. side-by-side with your textual index.


The edge that Verity zones have over Lucene fields is that they are nested.
However, nested fields can be simulated quite easily in Lucene by doing
redundant indexing: the same text is indexed once under a field for each
enclosing tag.  I have a hunch this is what Verity does anyway, because
their indexes are HUGE.
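
To make the redundant-indexing idea concrete, here is a minimal sketch of how
the field names and values could be derived from an XML document before they
are handed to the indexer.  It is not Lucene code and not what Verity does
internally; the function name nested_fields, the '/'-joined path naming, and
the sample document are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def nested_fields(xml_text):
    """Map each tag path (e.g. 'article/body/para') to all text beneath it.

    The text of a leaf element is repeated under every ancestor path,
    which is the 'redundant indexing' that simulates nested zones.
    """
    fields = defaultdict(list)

    def walk(elem, ancestors):
        path = ancestors + [elem.tag]
        # Collect all text under this element, descendants included.
        text = " ".join(t.strip() for t in elem.itertext() if t.strip())
        if text:
            fields["/".join(path)].append(text)
        for child in elem:
            walk(child, path)

    walk(ET.fromstring(xml_text), [])
    # One field per kind of tag path, not per instance of a tag.
    return {name: " ".join(chunks) for name, chunks in fields.items()}


# Hypothetical sample document, for illustration only.
doc = (
    "<article><title>Zones in Verity</title>"
    "<body><para>Lucene fields are flat.</para>"
    "<para>Zones are nested.</para></body></article>"
)
fields = nested_fields(doc)
for name in sorted(fields):
    print(name, "=>", fields[name])
```

Each resulting name/value pair would then be added to the Lucene document as
an ordinary field, so a search restricted to 'article/body' sees only body
text while a search on 'article' sees everything.  The cost is index size:
every word is stored once per nesting level.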

Verity Zones may mean different things for different kinds of indexed
documents.

Incidentally, we found that Lucene was much faster at indexing.  Verity's
K2Spider could spend days optimizing an index.  Verity seemed to be somewhat
faster for retrievals, but the two compared well.  We ran a lot of tests, but
in the end our results were sort of 'touchy feely'.  We decided that Lucene
was plenty fast for us.

Regards,
Philip

*not each instance of a tag, but rather a zone for each kind of tag.

-----Original Message-----
From: Joe Lerner [mailto:[EMAIL PROTECTED]]
Sent: Friday, January 25, 2002 1:51 PM
To: [EMAIL PROTECTED]
Subject: Zones



Hi,

We use Verity, a commercial vendor, for our Search, but were in serious
trouble with its performance, and looking for a solid, more economical,
open source alternative, like Lucene.

A prototype we built using Lucene compared favorably with Verity, but then
along came "zones".  Verity tech support helped us re-configure our indices
with "zones", giving us a fivefold increase in performance.  Note, "Zones"
are a separate, non-fielded, word list with addressing maps (each word
mapped to an address/document).

Is anyone familiar with Verity "zones"?  Does Lucene implement "zones" in
its own way?  How?


-Joe




--
To unsubscribe, e-mail:
<mailto:[EMAIL PROTECTED]>
For additional commands, e-mail:
<mailto:[EMAIL PROTECTED]>
