I like your idea and think you are quite right. I see quite a few
people using Lucene to the extreme, such that relational database
functionality is replaced by Lucene.
However, storing everything in Lucene and using it as a relational type
of database would be kind of re-inventing the wheel.
Hi,
I was searching with Google and just found that there is a new
feature called Google Mini. Initially I thought it was another free
service for small companies. Then I realized that it costs quite a lot
of money ($4,995) for the hardware and software. (I guess the proprietary
software costs a
to text? Is it true that Lucene's index is about 500 times the
original text size (not including image size)? I don't have one
installed, so I cannot measure.
Best,
Sharon
jian chen [EMAIL PROTECTED] wrote:
Hi,
I was searching using google and just found that there was a new
Hi,
Just to continue this discussion: I think right now Lucene's retrieval
algorithm is based purely on the vector space model, which is simple and
efficient.
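
For readers unfamiliar with the vector space model, here is a minimal
sketch of tf-idf ranking with cosine similarity. It is only an
illustration of the general idea, not Lucene's actual scoring formula;
the function names (tf_idf_vectors, cosine, rank) and the idf variant
log(N/df) + 1 are assumptions made for this example.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Build one sparse tf-idf vector (term -> weight) per tokenized document."""
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    # One common idf variant (an assumption here, not Lucene's formula).
    idf = {term: math.log(n / count) + 1.0 for term, count in df.items()}
    vectors = [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, docs):
    """Return (score, doc_index) pairs, best match first."""
    vectors, idf = tf_idf_vectors(docs)
    # Query terms that never occur in the collection get weight 0.
    q = {t: tf * idf.get(t, 0.0) for t, tf in Counter(query).items()}
    scores = [(cosine(q, v), i) for i, v in enumerate(vectors)]
    return sorted(scores, key=lambda s: (-s[0], s[1]))


docs = [
    ["lucene", "index", "search"],
    ["relational", "database", "index"],
    ["google", "mini", "search", "appliance"],
]
ranked = rank(["lucene", "search"], docs)
```

A different ranking algorithm would replace cosine() and the weighting
in tf_idf_vectors() while keeping the same rank() interface.
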
However, there may be cases where folks like me want to use another set
of completely different ranking algorithms, those which do not even
[EMAIL PROTECTED] wrote:
jian chen [EMAIL PROTECTED] writes:
Just to continue this discussion. I think right now Lucene's retrieval
algorithm is based purely on Vector Space Model, which is simple and
efficient.
As I understand it, it's indeed a tf-idf vector space approach, except
Hi,
I think setting a boost for recent documents is tricky. There is no
clear-cut way, other than trial and error, to get the boost value right.
Could you let the user specify a date range and sort the documents
within that range by relevance? This way, the users get exactly what
they specified, and
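
A minimal sketch of the date-range idea, assuming each hit is a
(doc_id, score, doc_date) tuple that the search engine has already
produced. The tuple layout and the function name search_in_date_range
are hypothetical; this is not a Lucene API.

```python
from datetime import date


def search_in_date_range(hits, start, end):
    """Keep hits dated within [start, end], then sort them by relevance score."""
    in_range = [hit for hit in hits if start <= hit[2] <= end]
    # Highest score first; no date-based boost is involved.
    return sorted(in_range, key=lambda hit: hit[1], reverse=True)


hits = [
    ("a", 0.9, date(2004, 6, 1)),
    ("b", 0.4, date(2005, 1, 10)),
    ("c", 0.7, date(2005, 1, 20)),
]
recent = search_in_date_range(hits, date(2005, 1, 1), date(2005, 1, 31))
```
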
Hi,
If it is really the case that every 128th term is loaded into memory,
could you use a relational database or a B-tree to do the work of
indexing the terms instead?
Even if you create another level of indexing on top of the .tii file,
it is just a hack and would not scale well.
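
To make the "every 128th term" scheme concrete, here is a minimal
sketch of a sparse term index: a sample of every 128th term is kept in
memory, a binary search picks the block, and the block is then scanned.
The class SparseTermIndex is hypothetical and a plain Python list stands
in for the on-disk term dictionary; this is not Lucene's actual file
format or code.

```python
import bisect


class SparseTermIndex:
    def __init__(self, sorted_terms, interval=128):
        self.terms = sorted_terms                # stand-in for the full on-disk dictionary
        self.interval = interval
        self.sample = sorted_terms[::interval]   # the only part held "in memory"

    def lookup(self, term):
        """Return the position of term in the full dictionary, or -1 if absent."""
        # Binary search the in-memory sample for the last sampled term <= term.
        slot = bisect.bisect_right(self.sample, term) - 1
        if slot < 0:
            return -1
        # Scan at most `interval` entries of the full dictionary.
        start = slot * self.interval
        end = min(start + self.interval, len(self.terms))
        for pos in range(start, end):
            if self.terms[pos] == term:
                return pos
            if self.terms[pos] > term:
                break
        return -1


terms = [f"term{i:05d}" for i in range(1000)]
index = SparseTermIndex(terms)
position = index.lookup("term00300")
```

The memory cost grows with the number of terms divided by the interval,
which is the trade-off being questioned above.
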
I
Hi,
I am not sure. However I see that the book has an electronic version
you can buy online...
Cheers,
Jian
On Sun, 23 Jan 2005 10:30:24 +0800, ansi [EMAIL PROTECTED] wrote:
hi,all
Does anyone know how to buy Lucene in Action in China?
Ansi
Hi,
One thing to point out: I think Lucene is not using LSI as the
underlying retrieval model. It uses the vector space model and also
proximity-based retrieval.
Personally, I don't know much about LSI and I don't think the fancy
stuff like LSI is workable in industry. I believe we are far away from