Hi,
Does anyone know if it is possible to show related searches with Lucene? For
example, if someone searched for "car insurance", you could bring back the
results along with related searches like these:
Automobile Insurance
Car Insurance Quote
Car Insurance Quotes
Auto Insurance
Cheap Car Insurance
Car
Hi All,
We have a Lucene index of over 10,000,000 docs at this time.
When we try to run a search we get
java.lang.OutOfMemoryError: Java heap space
We have tried setting the -Xmx option to 1 GB, but to no avail (the box has
4 GB of memory available). Is there any guidance on handling memory or
Hi Leon,
I had a similar problem when doing a test import which I believe was
actually down to object churn in parsing the data to create the Documents.
I achieved a quick fix by calling System.gc() every thousand documents.
Cheers,
Nick
____
From: Leon Chaddoc
issue if
you have many indexed fields. Field norms take up one byte per doc per
indexed field -- even if a doc doesn't have a value for that field, it
still gets a norm for that field. There are options when indexing to
prevent norms from being calculated, which can save a lot of space.
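
To put the one-byte-per-doc-per-indexed-field figure in perspective, here is a rough back-of-the-envelope sketch. It is plain arithmetic, not Lucene code; the class name, the method name and the figure of 20 indexed fields are assumptions made up for illustration, while the 10,000,000 docs come from the question in this thread.

```java
// Rough estimate only: this is arithmetic, not a call into Lucene.
final class NormMemoryEstimate {

    // One norm byte per document per indexed field, as described above.
    static long normBytes(long docCount, int indexedFieldCount) {
        return docCount * indexedFieldCount;
    }

    public static void main(String[] args) {
        long docs = 10_000_000L; // index size mentioned in the thread
        int fields = 20;         // hypothetical number of indexed fields
        long bytes = normBytes(docs, fields);
        System.out.printf("norms: %d bytes (~%d MB)%n", bytes, bytes / (1024 * 1024));
    }
}
```

With those assumed numbers the norms alone come to about 190 MB, a noticeable share of a 1 GB heap, so dropping norms on fields that do not need them could matter.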
Hi,
We are having tremendous problems building a large Lucene index and querying
it.
The programmers are telling me that when the index file reaches 3.5 GB or 5
million docs, it can no longer grow any larger.
To rectify this they have built index files in multiple directories. Now
To:
Sent: Tuesday, February 14, 2006 6:38 PM
Subject: RE: Size + memory restrictions
Yes. We have the same problem. It is mainly because TermInfosReader.java
takes up memory to hold the *.tii file.
Eugene
-Original Message-
From: Leon Chaddock [mailto:[EMAIL PROTECTED]
Sent: Tues
you will need enough memory to store a full
sorting of your documents in memory. If you're trying
to sort on a string or anything other than an int or
float, this could require a lot of memory.
I've used indices much bigger than 5 mil. docs/3.5 GB
with less than 4GB of RAM and had no prob
tch (IOException e) {
log.error(ClassTool.getClassNameOnly(e) + ": " + e.getMessage(), e);
}
}
mSearcher = new MultiSearcher(srs);
changeTime = System.currentTimeMillis();
}
}
return mSearcher;
}
- Original Message -
From: "Leon Chaddock" <[EMAIL PROTECTED]>
iSearcher(srs);
: changeTime = System.currentTimeMillis();
:}
: }
: return mSearcher;
: }
: - Original Message -
: From: "Leon Chaddock" <[EMAIL PROTECTED]>
: To:
: Sent: Wednesday, February 15, 2006 9:28 AM
: Subject: Re: Size + memory restrictions
:
:
: > Hi G
Hi,
I am very interested in this as well, as I wish to display related searches
for users.
Does anyone know if this work is open source, and is there an API available?
Thanks
Leon
- Original Message -
From: "Pasha Bizhan" <[EMAIL PROTECTED]>
To:
Sent: Tuesday, March 07, 2006 12:39 PM
Hi,
At present Lucene seems to rank very short documents over longer documents
where the phrase occurs more often. For instance, with the search term
"cat"
"the cat went home"
ranks higher than
"the black cat went home past some other cats, on cat street"
Is there any way I can change lu
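
The ranking described in this question is what the classic Lucene scoring formula would be expected to produce: the term-frequency factor is the square root of the number of occurrences, and the length norm is 1 / sqrt(number of terms in the field). The sketch below is a simplified illustration under stated assumptions, not Lucene's actual code: it ignores idf, boosts, stop-word removal, stemming and the lossy one-byte norm encoding, and the class and method names are made up.

```java
// Simplified illustration of sqrt(tf) * 1/sqrt(fieldLength); not Lucene's scorer.
final class LengthNormSketch {

    // Splits on non-alphanumerics, counts exact matches of the term,
    // and applies the two factors described in the lead-in.
    static double score(String text, String term) {
        String[] tokens = text.toLowerCase().split("[^a-z0-9]+");
        int termFreq = 0;
        int fieldLength = 0;
        for (String token : tokens) {
            if (token.isEmpty()) {
                continue; // split can yield an empty leading token
            }
            fieldLength++;
            if (token.equals(term)) {
                termFreq++;
            }
        }
        if (fieldLength == 0) {
            return 0.0;
        }
        return Math.sqrt(termFreq) / Math.sqrt(fieldLength);
    }

    public static void main(String[] args) {
        String shortDoc = "the cat went home";
        String longDoc = "the black cat went home past some other cats, on cat street";
        System.out.println("short: " + score(shortDoc, "cat"));
        System.out.println("long:  " + score(longDoc, "cat"));
    }
}
```

Under these assumptions the short document gets sqrt(1)/sqrt(4) = 0.5, while the long one has 2 exact matches in 12 terms ("cats" does not match without stemming) and gets roughly 0.41, so the short document ranks first.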
Hi Chris,
You said:
"5 word occurrences in a 10 word document would probably score the same as
those 5 words in a 20 word document"
OK, so if I set this option, would this mean the number of occurrences becomes
a major factor, so that:
A phrase occurring 1 time in a 3 word document would rank lower than A
12 matches