Bill Janssen wrote:
I'm not sure this solution is very robust
I think I already sent an email with a better code...
Sergiu
Thanks to something Doug said when I first opened this discussion, I
went back and looked at my implementation. He said, "Can't we just do
this in getFieldQuery?"
Hi,
I'm curious about your strategy to backup indexes based on FSDirectory.
If I do a file based copy I suspect I will get corrupted data because of
concurrent write access.
My current favorite is to create an empty index and use
IndexWriter.addIndexes() to copy the current index state. But I'm
Christoph Kiehl wrote:
I'm curious about your strategy to backup indexes based on FSDirectory.
If I do a file based copy I suspect I will get corrupted data because of
concurrent write access.
My current favorite is to create an empty index and use
IndexWriter.addIndexes() to copy the current
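The addIndexes() approach described above can be sketched roughly as follows (Lucene 1.4-era API; the paths and method names in the helper class are my own illustration, not from the thread):

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class IndexBackup {
    // Copy the live index's current state into a fresh backup index.
    public static void backup(String livePath, String backupPath) throws Exception {
        Directory live = FSDirectory.getDirectory(livePath, false);
        // true = create a new, empty index at the backup location
        IndexWriter writer = new IndexWriter(backupPath, new StandardAnalyzer(), true);
        // addIndexes() reads the source through IndexReaders, so it should see
        // a consistent point-in-time view even while a writer is active
        writer.addIndexes(new Directory[] { live });
        writer.close();
        live.close();
    }
}
```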
Christiaan Fluit wrote:
I have no practical experience with backing up an online index, but I
would try to find out the details of the write lock mechanism used by
Lucene at the file level. You can then create a backup component that
write-locks the index and does a regular file copy of the
Hello,
I am working on Lucene and tried to understand the calculation of the score
value. As far as I understand it works as follows:
(1) idf = ln(numDocs/(docFreq+1))
(2) queryWeight = idf * boost
(3) sumOfSquaredWeights = queryWeight * queryWeight
(4) norm = 1/sqrt(sumOfSquaredWeights)
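Plugging illustrative numbers into steps (1)-(4) above (numDocs = 1000, docFreq = 9, boost = 1.0, all values invented for the example):

```java
public class ScoreCalc {
    static double idf(int numDocs, int docFreq) {
        return Math.log((double) numDocs / (docFreq + 1));       // (1)
    }

    static double queryNorm(double idf, double boost) {
        double queryWeight = idf * boost;                        // (2)
        double sumOfSquaredWeights = queryWeight * queryWeight;  // (3)
        return 1.0 / Math.sqrt(sumOfSquaredWeights);             // (4)
    }

    public static void main(String[] args) {
        double idf = idf(1000, 9);               // ln(1000/10) = ln(100)
        System.out.println(idf);                 // ≈ 4.6052
        System.out.println(queryNorm(idf, 1.0)); // 1/idf ≈ 0.2172
    }
}
```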
James,
How do you kick off your reindex? Could it be a session timeout?
cheers,
Aad
Hello,
I am a Java/Lucene/Tomcat newbie (I know that does not bode well as a
start to a post), but I really am in dire straits as far as Lucene goes,
so bear with me. I am working on indexing and replacing
Aad,
D'oh, forgot to mention that mildly important info. Rather than
re-index, I am just creating a new index each time; this makes things easier
to roll back etc. (which is what my boss wants). The command line is
something like java com.lucene.IndexHTML -create -index indexstore/ .. I
have
So, are you creating the indexes from inside the Tomcat runtime, or are you creating
them on the command line (which would be in a different runtime than Tomcat)?
What happens to Tomcat? Does it hang (still running but not responsive), or does it
crash?
If it hangs, maybe you are running
Wouldn't it make more sense if the IndexWriter constructor always created an
index when one doesn't exist, and the boolean parameter were clear (instead of
create)?
So instead of this (from javadoc):
IndexWriter
public IndexWriter(Directory d,
                   Analyzer a,
                   boolean create)
You could always modify your own local copy if you want to change the
behavior of the parameter.
or just do:
IndexWriter w = new IndexWriter(indexDirectory,
new StandardAnalyzer(),
http://www.peerfear.org/rss/permalink/2004/10/26/PoorLuceneRankingForShortText/
If you're
On Wednesday 27 October 2004 20:20, Kevin A. Burton wrote:
http://www.peerfear.org/rss/permalink/2004/10/26/PoorLuceneRankingForShortText/
(Kevin complains about shorter documents ranked higher)
This is something that can easily be fixed. Just use a Similarity
implementation that extends
Is there a way to include stopwords in an exact phrase search? For
example, when I search on Melbourne IT, Lucene only searches for
Melbourne, ignoring IT.
Thanks,
Ravi.
On Oct 27, 2004, at 3:36 PM, Ravi wrote:
Is there a way to include stopwords in an exact phrase search? For
example, when I search on Melbourne IT, Lucene only searches for
Melbourne, ignoring IT.
But you want stop words removed for general term queries?
Have a look at how Nutch does its thing - it
Your analyzer will have removed the stopword when you indexed your documents, so
Lucene won't be able to do this for you.
You will need to implement a second pass over the results returned by Lucene and
check to see if the stopword is included, perhaps with String.indexOf()
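A sketch of that second pass over the hits' stored text (retrieving the field from each Hit is omitted; the method and field contents here are illustrative):

```java
import java.util.ArrayList;
import java.util.List;

public class PhraseFilter {
    // Keep only hits whose stored text contains the exact phrase,
    // stopwords included, using a simple case-insensitive indexOf check.
    static List<String> filterByPhrase(String[] hitTexts, String phrase) {
        List<String> kept = new ArrayList<String>();
        String needle = phrase.toLowerCase();
        for (int i = 0; i < hitTexts.length; i++) {
            if (hitTexts[i].toLowerCase().indexOf(needle) >= 0) {
                kept.add(hitTexts[i]);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        String[] texts = { "Melbourne IT is a registrar", "Melbourne weather" };
        // Only the first document contains the full phrase "Melbourne IT"
        System.out.println(filterByPhrase(texts, "Melbourne IT"));
    }
}
```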
On Wed, 27 Oct 2004
Hello,
I'm trying to use the highlighter from the sandbox, and I've got a
problem with some of the results I get back from it.
Normally, when I search my index for e.g. motor, I get
circa 150 results, and these results are OK.
But when I use the highlighter, I get some results as
null values from the
Daniel Naber wrote:
(Kevin complains about shorter documents ranked higher)
This is something that can easily be fixed. Just use a Similarity
implementation that extends DefaultSimilarity and that overwrites
lengthNorm: just return 1.0f there. You need to use that Similarity for
indexing and
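As a standalone sketch of the norm function being described (DefaultSimilarity computes lengthNorm as 1/sqrt(numTerms); the override just returns a constant, shown here outside of Lucene's Similarity class for illustration):

```java
public class FlatLengthNorm {
    // DefaultSimilarity's length normalization: shorter docs get a
    // larger norm, which is why short documents rank higher.
    static float defaultLengthNorm(int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    // The override suggested above: ignore document length entirely.
    static float flatLengthNorm(int numTerms) {
        return 1.0f;
    }

    public static void main(String[] args) {
        System.out.println(defaultLengthNorm(4)); // 0.5
        System.out.println(flatLengthNorm(4));    // 1.0
    }
}
```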
I'm not sure this solution is very robust
Thanks, but I'm pretty sure it *is* robust. Can you please offer a
specific critique? Always happy to learn and improve :-).
I think I already sent an email with a better code...
Pretty vague. Can you send a URL for that message in the
Suggestions
[a]
Try invoking the VM w/ an option like -XX:CompileThreshold=100 or even
a smaller number. This encourages the hotspot VM to compile methods
sooner, thus the app will take less time to warm up.
http://java.sun.com/docs/hotspot/VMOptions.html#additional
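For example (the threshold value and the main class are illustrative):

```shell
# Ask HotSpot to JIT-compile methods after ~100 invocations
# instead of the default, so the app warms up sooner
java -XX:CompileThreshold=100 -cp myapp.jar com.example.Main
```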
You might want to search
Can I give weights on different indexes when I search against multiple
indexes. The final score of a document should be a linear combination of
the weights on each index and the individual score for that index. Is
this possible in Lucene?
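As far as I know there is no built-in per-index weighting when searching multiple indexes; one workaround is to search each index separately and combine the scores yourself. A minimal sketch of the linear combination itself (plain math, outside Lucene; all values are examples):

```java
public class WeightedCombine {
    // finalScore = sum_i weight_i * score_i  -- the linear combination asked about
    static float combine(float[] scores, float[] weights) {
        float total = 0f;
        for (int i = 0; i < scores.length; i++) {
            total += weights[i] * scores[i];
        }
        return total;
    }

    public static void main(String[] args) {
        float[] scores = { 0.8f, 0.3f };  // one document's score in two indexes
        float[] weights = { 0.7f, 0.3f }; // per-index weights
        System.out.println(combine(scores, weights)); // ≈ 0.65
    }
}
```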
Thanks
Ravi.
Hi,
I'm getting:
java.io.IOException: Lock obtain timed out
I have a writer service that opens the index to delete and add docs. I have a
reader service that opens the index for searching only.
This error occurs when the reader service opens the index (this takes about
500ms).
On Wednesday 27 October 2004 22:47, Kevin A. Burton wrote:
If the current behavior is all that happens this is fine... this way I
can just get this behavior for new documents that are added.
You'll have to try it out, I'm not sure what exactly will happen.
Also... why isn't this the default?
Hello
I wrote the following test programs:
I index 150,000 documents in Lucene and I build each document using
this method.
private Document buildDocument(String documentID, String body)
{
Document document = new Document();
document.add(Field.Keyword(docID, documentID));
WRT my blog post:
It seems the problem is that the distribution for lengthNorm() starts at
1 and moves down from there. 1.0f would work but HUGE documents would
be normalized and so would distort the results.
What would you think of using this implementation for lengthNorm:
public float
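The code is cut off above, so the following is only a guess at the kind of shape being described: keep the norm flat at 1.0 for ordinary documents, and fall back to the default 1/sqrt penalty only for huge ones. The cutoff value is invented:

```java
public class PlateauLengthNorm {
    static final int PLATEAU = 1000; // hypothetical cutoff, in terms

    static float lengthNorm(int numTerms) {
        if (numTerms <= PLATEAU) {
            return 1.0f; // short and medium docs: no length penalty
        }
        // huge docs: use the default 1/sqrt shape so they don't dominate
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    public static void main(String[] args) {
        System.out.println(lengthNorm(100));     // 1.0
        System.out.println(lengthNorm(1000000)); // 0.001
    }
}
```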