Marvin,
Several posts back on this thread, I described an
algorithm using impact-sorted posting lists for
conjunctive boolean queries. Your concerns about
impact-sorting in the boolean retrieval model are valid.
But practically, the approximation (as in my original
post) should work well enough for large corp
The idea of "impact" and "impact-sorted posting list"
can work in practice with the boolean model, by
approximation, in the following way:
(1) Index Structure
Inverted-Index : *
posting-list: + (sorted
by impact)
occurrence: position
(2) Retrieval Algorithm for boolean query "a AND b"
set an impa
the aggregated significance into
"impact". Then you can do away with fields in a
vector-space model of retrieval.
But there are usually some field semantics in a
boolean model and in semi-structured information
retrieval that you cannot get rid of.
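
As a rough illustration of folding per-field significance into a
single "impact" value: the thread does not give a formula, so the
weighted sum below, the class name FieldImpact, the field names and
the weights are all assumptions made only for this sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class FieldImpact {
    /**
     * Collapses the per-field term frequencies of one (term, document) pair
     * into a single impact value. A weighted sum is an assumed aggregation,
     * not something the thread specifies.
     */
    static double impact(Map<String, Integer> termFreqByField,
                         Map<String, Double> fieldWeights) {
        double impact = 0.0;
        for (Map.Entry<String, Integer> e : termFreqByField.entrySet()) {
            // Fields without an explicit weight default to 1.0 (an assumption).
            double weight = fieldWeights.getOrDefault(e.getKey(), 1.0);
            impact += weight * e.getValue();
        }
        return impact;
    }

    public static void main(String[] args) {
        // Hypothetical weights: a title occurrence counts three times a body one.
        Map<String, Double> weights = new LinkedHashMap<>();
        weights.put("title", 3.0);
        weights.put("body", 1.0);

        // One occurrence in the title, four in the body.
        Map<String, Integer> termFreq = new LinkedHashMap<>();
        termFreq.put("title", 1);
        termFreq.put("body", 4);

        // 3.0 * 1 + 1.0 * 4
        System.out.println(impact(termFreq, weights));
    }
}
```

Once the fields are collapsed like this, the posting only carries one
number per document, which is why the field semantics needed by a
boolean query (e.g. "term must occur in title") are no longer available.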
Michael
--- Ming Lei <[EMAIL PROT
Just my two cents,
I think what he meant by "single field" is the
following:
If the concept of "field" was introduced to
differentiate the significance of term occurrences in
different regions of a document (e.g., an occurrence
in the title is more important than one in the body), that
significance can b
I have a couple of questions about the original post
of the new index design:
(1) Question on the posting list
> > f. ,],...[docN, freq
> > > > ,])
What is the "impact" per posting list? I am under the
impression that "impact" or "frequency" is per pair of
doc and term.
And it seems that "impact
Can anyone help answer the question, or at least point
out whether the question is vague or should be directed
somewhere else?
Thanks
--- Ming Lei <[EMAIL PROTECTED]> wrote:
> Can I solely rely on RMI's remote object cleanup
> mechanism for this?
> It seems that RemoteSeac
Can I solely rely on RMI's remote object cleanup
mechanism for this?
It seems that RemoteSearchable.close() has to be called
separately. Should we add a finalize() to
RemoteSearchable to call close()?
Am I missing anything here?
Please shed some light on this.
Thanks
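
For reference, one alternative to finalize() is RMI's own
java.rmi.server.Unreferenced callback, which the RMI runtime invokes
on an exported remote object sometime after its list of remote
clients becomes empty. The sketch below is not Lucene code:
ClosingRemoteResource is a hypothetical stand-in for a
RemoteSearchable-like object, and it is not actually exported here,
so main() calls unreferenced() by hand to simulate the runtime.

```java
import java.rmi.server.Unreferenced;
import java.util.concurrent.atomic.AtomicBoolean;

/**
 * Hypothetical stand-in for a remote object that owns a closeable resource.
 * In a real server this class would also implement a Remote interface and be
 * exported; that part is omitted so the sketch runs without a registry.
 */
class ClosingRemoteResource implements Unreferenced, AutoCloseable {
    private final AtomicBoolean closed = new AtomicBoolean(false);

    /** Releases the underlying resource; safe to call more than once. */
    @Override
    public void close() {
        if (closed.compareAndSet(false, true)) {
            // Release the wrapped resource here (e.g. the underlying searcher).
        }
    }

    /** Invoked by the RMI runtime once no remote client references remain. */
    @Override
    public void unreferenced() {
        close();
    }

    boolean isClosed() {
        return closed.get();
    }

    public static void main(String[] args) {
        ClosingRemoteResource resource = new ClosingRemoteResource();
        System.out.println("closed before: " + resource.isClosed());
        resource.unreferenced(); // simulated callback; normally made by RMI
        System.out.println("closed after: " + resource.isClosed());
    }
}
```

The callback comes with no timing guarantee, so an explicit close()
on shutdown is still the safer path; this only acts as a safety net.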
Question 1:
A search server runs on an index that is periodically
refreshed with newer versions. For example, it starts
with c:/lucene/ind_dir_0, then later on the indexer
creates c:/lucene/ind_dir_1 and so on. I would like
the search server to automatically pick up the latest
version when it is a