: Stored = as-is value stored in the Lucene index
:
: Tokenized = field is analyzed using the specified Analyzer - the tokens
: emitted are indexed
:
: Indexed = the text (either as-is with keyword fields, or the tokens
: from tokenized fields) is made searchable (aka inverted)
:
: Vectored =
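To make the three definitions above concrete, here is a minimal sketch in plain Python. It is not the Lucene API: the `Field` class, the `analyze` stand-in, and the dictionaries used as index and store are all hypothetical, chosen only to show how the stored, tokenized, and indexed flags interact.

```python
from dataclasses import dataclass


@dataclass
class Field:
    name: str
    value: str
    stored: bool      # keep the as-is value so it can be retrieved later
    indexed: bool     # make the terms searchable (inverted)
    tokenized: bool   # run the value through an analyzer before indexing


def analyze(text):
    # Stand-in analyzer (an assumption): lowercase and split on whitespace.
    return text.lower().split()


def index_document(doc_id, fields, inverted, stored_values):
    for f in fields:
        if f.stored:
            stored_values.setdefault(doc_id, {})[f.name] = f.value
        if f.indexed:
            # Keyword-style fields are indexed as one as-is term;
            # tokenized fields contribute one term per emitted token.
            terms = analyze(f.value) if f.tokenized else [f.value]
            for term in terms:
                inverted.setdefault((f.name, term), set()).add(doc_id)


inverted, stored_values = {}, {}
index_document(1, [
    Field("id", "Doc-1", stored=True, indexed=True, tokenized=False),
    Field("body", "Lucene in Action", stored=False, indexed=True, tokenized=True),
], inverted, stored_values)
```

In this toy model the `id` field can be both retrieved and searched as a single term, while `body` is searchable by token but its original text is not kept.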
Hello, Otis.
Interesting. Nutch doesn't use RemoteSearchable because RemoteSearchable is not
very useful? I mean, is it suitable for distributing the indexing process in
parallel across many services or not? Will it give us good performance?
We have RemoteSearchable in the sources, but nobody uses it.
There is no API for this, but I recall somebody
talking about adding support for this a few months
back
See
http://marc.theaimsgroup.com/?l=lucene-dev&m=109485996612177&w=2
This implementation was working on a version of Lucene
before compression was introduced so things may have
changed a
Thanks guys for the info!
After looking at the patch code I have two problems:
1) The patch implementation doesn't help with performance. It still
reads the data for every field in the document. Just not storing all
of them. So this implementation helps if there are memory
restrictions, but not
Hi Chuck:
Trying to follow up on this thread. Do you know if this feature
will be incorporated in the next Lucene release?
How would someone find out which patches will go into the next release?
Thanks
-John
On Mon, 15 Nov 2004 13:05:36 -0800, Chuck Williams [EMAIL PROTECTED]
It still reads the data for every field in the
document
No, not if your fields are positioned in the right
order. It stops reading fields after it has got what
is needed.
If your doc has fields in the order:
smallFrequentlyReadField, largeRarelyReadField
then the patch will not read
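A rough model of the behaviour described above, assuming stored fields are read sequentially in the order they were added. This is only a sketch in plain Python, not the patch's code; `read_fields` and its return shape are made up for illustration.

```python
def read_fields(stored_fields, wanted):
    """Read fields in stored order and stop once every wanted field is found.

    Returns the values found and the number of fields that had to be read.
    """
    found = {}
    reads = 0
    for name, value in stored_fields:
        reads += 1
        if name in wanted:
            found[name] = value
            if len(found) == len(wanted):
                break  # nothing else is needed, skip the remaining fields
    return found, reads


small_first = [
    ("smallFrequentlyReadField", "title"),
    ("largeRarelyReadField", "x" * 10000),
]
large_first = list(reversed(small_first))

found_a, reads_a = read_fields(small_first, {"smallFrequentlyReadField"})
found_b, reads_b = read_fields(large_first, {"smallFrequentlyReadField"})
```

With the small field first, only one field is read; with the large field first, both are read before the wanted one is reached.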
Hi,
I have an application where I know I will have duplicate IDs. When I search,
will it search the content of both files that share a duplicate ID?
For Example :
Id = Mahaveer, Content = Jain India
Id = Mahaveer, Content = Lucene Test
Now when I search for India Test will it return both the
On Jan 7, 2005, at 4:26 AM, John Wang wrote:
Trying to follow up on this thread. Do you know if this feature
will be incorporated in the next Lucene release?
How would someone find out which patches will go into the next
release?
CVS commit messages are sent to the lucene-dev e-mail
Interesting article:
http://www.javaworld.com/javaworld/jw-01-2005/jw-0103-search_p.html
I don't agree with the use of QueryParser for non-human-entered
queries, though, but otherwise it's a reasonable approach for a
light-weight object store.
Erik
Hello,
If you search for India OR Test, you will find both, if you use AND,
you will find none. Lucene can search any text, not just files. It
sounds like you are using Lucene's demo as a real application (not a
good practise). I suggest you take a look at the Resources page on the
Lucene Wiki
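The OR versus AND behaviour on the two example documents can be checked with a small sketch. It uses plain Python sets rather than Lucene, and the `terms`, `search_or`, and `search_and` helpers are hypothetical names; each document is matched on its own, regardless of the shared Id.

```python
docs = [
    {"id": "Mahaveer", "content": "Jain India"},
    {"id": "Mahaveer", "content": "Lucene Test"},
]


def terms(text):
    # Stand-in analysis (an assumption): lowercase and split on whitespace.
    return set(text.lower().split())


def search_or(docs, query):
    # A document matches if it contains at least one query term.
    q = terms(query)
    return [i for i, d in enumerate(docs) if q & terms(d["content"])]


def search_and(docs, query):
    # A document matches only if it contains every query term.
    q = terms(query)
    return [i for i, d in enumerate(docs) if q <= terms(d["content"])]


or_hits = search_or(docs, "India Test")
and_hits = search_and(docs, "India Test")
```

Here `or_hits` contains both documents and `and_hits` is empty, because no single document contains both terms.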
Hello Jac;
If you have verified that the index folder is indeed being created and there
is a segments file in it, check that the IndexSearcher in the demo is
pointing to that location. This is an easy error to make and would account
for the "no segments" error message.
Luke
Hi,
Probably this is a trivial question.
How can you enforce the order of the fields when you index them ?
Thanks,
Mariella
At 09:32 AM 1/7/2005 +, mark harwood wrote:
It still reads the data for every field in the
document
No, not if your fields are positioned in the right
order. It stops
On Jan 7, 2005, at 10:03 AM, Mariella Di Giacomo wrote:
Probably this is trivial question.
How can you enforce the order of the fields when you index them ?
By the order in which you add them to a document.
Erik
Thanks,
Mariella
At 09:32 AM 1/7/2005 +, mark harwood wrote:
It still
At 10:24 AM 1/7/2005 -0500, Erik Hatcher wrote:
On Jan 7, 2005, at 10:03 AM, Mariella Di Giacomo wrote:
Probably this is trivial question.
How can you enforce the order of the fields when you index them ?
By the order in which you add them to a document.
So when you do the following:
On Jan 7, 2005, at 10:34 AM, Mariella Di Giacomo wrote:
At 10:24 AM 1/7/2005 -0500, Erik Hatcher wrote:
On Jan 7, 2005, at 10:03 AM, Mariella Di Giacomo wrote:
Probably this is trivial question.
How can you enforce the order of the fields when you index them ?
By the order in which you add them to
On Fri, 2005-01-07 at 08:05, Erik Hatcher wrote:
Interesting article:
http://www.javaworld.com/javaworld/jw-01-2005/jw-0103-search_p.html
Sort of off-topic, but does this mean JavaWorld is publishing again? I
had read Bill Venners's post from back in January '04 that they shut
down.
Hi,
we are currently implementing a search engine for a news site. Our goal
is to have a search result that uses the publish date of the documents
to boost the score of the documents.
I took a look at nutch to see how it implements pagerank and it seems
like this is done at index time by
: we are currently implementing a search engine for a news site. Our goal
: is to have a search result that uses the publish date of the documents
: to boost the score of the documents.
: have to use something that boosts the scores at _search_ time.
1) There is a way to boost individual Query
Hello,
Lucene is great! I just have a question.
Is there a simple way to check and see if an index is already optimized?
What happens if optimize is called on an already optimized index - does
the call basically do a no-op? Or is it still an expensive call?
Regards,
Michael
My application for Lucene involves updating an existing index with a
mixture of new and revised documents. From what I've been able to
discern from reading I'm going to have to delete the old versions of the
revised documents before indexing them again. Since this indexing will
probably take
I've read as much as I could find on the highlighting that is now in the
sandbox. I didn't find the javadocs. I found a link to them, but it
redirected me to a CVS tree.
Do I assume that you have to store the content of the document for the
highlighting to work? Otherwise I don't see how it
Jim Lynch wrote:
I've read as much as I could find on the highlighting that is now in the
sandbox. I didn't find the javadocs.
I have a copy here:
http://www.searchmorph.com/pub/jakarta-lucene-sandbox/contributions/highlighter/build/docs/api/overview-summary.html
I found a link to them, but it
This may not be a simple way, but you could just do a quick check on the
folder to see if there is more than one file containing the name segment.
Luke
- Original Message -
From: Crump, Michael [EMAIL PROTECTED]
To: lucene-user@jakarta.apache.org
Sent: Friday, January 07, 2005 2:24 PM
Crump, Michael writes:
Is there a simple way to check and see if an index is already optimized?
What happens if optimize is called on an already optimized index - does
the call basically do a no-op? Or is it still an expensive call?
Why don't you just try that? E.g. using luke. Or three
On Fri, 2005-01-07 at 13:24, Crump, Michael wrote:
Is there a simple way to check and see if an index is already optimized?
What happens if optimize is called on an already optimized index - does
the call basically do a no-op? Or is it still an expensive call?
If an index has no deletions,
If an index has no deletions, it does not need to be optimized. You can
find out if it has deletions with IndexReader.hasDeletions.
Is that true? An index that has just been created (with no deletions)
can still have multiple segments that could be optimized. I'm not
sure your statement is
Based on the method sent earlier, it looks like Lucene first checks to
see if optimization is even necessary.
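To tie the points in this thread together, here is a toy model in plain Python. It assumes that "optimized" means a single segment with no pending deletions, which is my reading of the discussion rather than something confirmed here; `ToyIndex` and its methods are hypothetical and are not Lucene classes.

```python
class ToyIndex:
    def __init__(self, segments, deleted=0):
        self.segments = list(segments)  # live document count per segment
        self.deleted = deleted          # deleted documents not yet purged
        self.merges = 0                 # how many times real merge work ran

    def is_optimized(self):
        # Assumption: one segment (or none) and no deletions.
        return len(self.segments) <= 1 and self.deleted == 0

    def optimize(self):
        # Cheap check first, so a repeated call is effectively a no-op.
        if self.is_optimized():
            return False
        self.segments = [sum(self.segments)]
        self.deleted = 0
        self.merges += 1
        return True


fresh = ToyIndex([10, 5, 3])       # newly built, no deletions, three segments
needed_before = not fresh.is_optimized()
first_call = fresh.optimize()
second_call = fresh.optimize()
```

In this model a freshly created index with no deletions still needs optimizing because it has several segments, and the second `optimize()` call does no merge work.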
Hi,
I'm new to Lucene, so I apologize if this issue has been discussed
before (I'm sure it has), but I had a hard time finding an answer using
google. (Maybe this would be a good candidate for the FAQ!) :)
Is it possible to enable stem queries on a per-query basis? It doesn't
seem to be possible
OK, thanks. That clears things up. I'll play with it once I get
something indexed.
Jim.
David Spencer wrote:
Jim Lynch wrote:
I've read as much as I could find on the highlighting that is now in
the sandbox. I didn't find the javadocs.
I have a copy here:
From what I've read, if you want to have a choice, the easiest way is
to index the documents twice. Once with stemming on and once with it off
placing the results in two different indexes. Then at query time,
select which index you want to use based on whether you want stemming on
or off.
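A minimal sketch of the two-index approach described above, in plain Python rather than Lucene. The `naive_stem` function is a deliberately crude stand-in for a real stemmer, and `build_index` and `search` are hypothetical helpers; the point is only that the choice is made at query time by picking an index.

```python
def naive_stem(token):
    # Toy stemmer (an assumption): strips a few English suffixes.
    for suffix in ("ing", "es", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def build_index(docs, stem):
    index = {}
    for doc_id, text in docs.items():
        for token in text.lower().split():
            term = naive_stem(token) if stem else token
            index.setdefault(term, set()).add(doc_id)
    return index


docs = {1: "indexing documents", 2: "index document"}
stemmed = build_index(docs, stem=True)   # first pass, stemming on
exact = build_index(docs, stem=False)    # second pass, stemming off


def search(word, use_stemming):
    # Pick the index per query; the query term gets the same treatment
    # as the terms in the chosen index.
    index = stemmed if use_stemming else exact
    term = naive_stem(word.lower()) if use_stemming else word.lower()
    return sorted(index.get(term, set()))
```

For example, `search("documents", True)` matches both documents, while `search("documents", False)` matches only the first. The cost is roughly double the index size and indexing time.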
: Is it possible to enable stem queries on a per-query basis? It doesn't
: seem to be possible since the stem tokenizing is done during the
: indexing process. Are people basically stuck with having all their
: queries stemmed or none at all?
: From what I've read, if you want to have a choice,