Fernando,
On Thursday 12 August 2004 17:44, Wermus Fernando wrote:
Luceners
I have to search for a string in 30 fields. I know how to do it in a long
way. I want to know whether a shorter way exists.
String for searching: what's your name?
Long way: +firstname:what's your name? OR +lastname: what's
On Wednesday 07 July 2004 08:25, Ype Kingma wrote:
For a single term query, one can iterate through
IndexReader.termDocs(Term) and store the document numbers by
TermDocs.docFreq().
That should be TermDocs.freq()
Oops,
Ype
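The distinction behind this correction: freq() is the number of times the term occurs in the current document, while a document frequency is the number of documents that contain the term. A minimal sketch in plain Java, using a hypothetical in-memory array of documents instead of Lucene's IndexReader and TermDocs (the class and method names below are illustrative only, not Lucene API):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative stand-in for a posting list; this is not Lucene code.
class TermFreqSketch {
    // Maps document number -> how often `term` occurs in that document.
    static Map<Integer, Integer> freqByDoc(String[] docs, String term) {
        Map<Integer, Integer> result = new LinkedHashMap<>();
        for (int docNum = 0; docNum < docs.length; docNum++) {
            int freq = 0;
            for (String token : docs[docNum].split("\\s+")) {
                if (token.equals(term)) {
                    freq++;
                }
            }
            if (freq > 0) {
                result.put(docNum, freq); // only documents containing the term
            }
        }
        return result;
    }

    // Number of documents containing the term (the "document frequency").
    static int docFreq(String[] docs, String term) {
        return freqByDoc(docs, term).size();
    }

    public static void main(String[] args) {
        String[] docs = { "red shirt red button", "blue shirt", "green button" };
        System.out.println(freqByDoc(docs, "red"));
        System.out.println(docFreq(docs, "shirt"));
    }
}
```

Storing the per-document count against each document number is what the iteration described above needs; the document frequency is a single number for the whole term.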
On Thursday 03 June 2004 07:10, Karthik N S wrote:
Hey
Ype, the range query
+button +shirt +filename:[b10181_p100 TO b10181_p200]
did not work for me, but the other way around,
+(button OR shirt) +filename:[b10181_p100 TO b10181_p200]
gave me 2 hits with either one
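A side note on the filename ranges in these queries: range bounds are compared as strings, so the ordering is lexicographic and case-sensitive, not numeric. A minimal sketch in plain Java, assuming terms compare the way String.compareTo does; the helper names are hypothetical and this is not Lucene code:

```java
// Illustrates lexicographic range membership; not Lucene code.
class RangeBoundsSketch {
    // Returns the two bounds with the lexicographically smaller one first.
    static String[] orderedBounds(String a, String b) {
        return a.compareTo(b) <= 0 ? new String[] { a, b } : new String[] { b, a };
    }

    // True if term lies in the inclusive range [lower, upper] in string order.
    static boolean inRange(String term, String lower, String upper) {
        return term.compareTo(lower) >= 0 && term.compareTo(upper) <= 0;
    }

    public static void main(String[] args) {
        String lower = "b10181_p100";
        String upper = "b10181_p200";
        System.out.println(inRange("b10181_p150", lower, upper));
        // "p1000" sorts between "p100" and "p200" when compared as strings.
        System.out.println(inRange("b10181_p1000", lower, upper));
        // Upper-case 'B' sorts before lower-case 'b', so this falls outside.
        System.out.println(inRange("B10181_P150", lower, upper));
    }
}
```

If file names are meant to be compared numerically, padding the numeric part to a fixed width keeps string order and numeric order the same.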
On Wednesday 02 June 2004 14:46, Erik Hatcher wrote:
On Jun 2, 2004, at 6:20 AM, Karthik N S wrote:
...
I still have 3 small Questions.
1) While creating the range query, is it possible for Lucene to do
something similar to:
+(button AND shirt) +filename:[b10181_p100 TO b10181_p200]
Karthik,
On Monday 31 May 2004 06:12, Karthik N S wrote:
Hey Ype
...
My question now is, if I want to use a range query to get search hits
only between fileName B10181_P702 and B10181_P355, instead of all the
67 hits,
In this case there is no need to override range query, just use
On Monday 31 May 2004 11:09, Karthik N S wrote:
...
I re-indexed my folder 10181 [seems to be corrupted]
Was the index writer closed?
Now I am getting the hits as
D:\JAVA\lucene\src\demojava org.lucene.src.indexer.search.SearchFiles
Search Keyword : +button+filename:[B10181_P702 TO
Karthik,
On Monday 31 May 2004 13:47, Karthik N S wrote:
Hey Ype...
1) I switched off the multi-search scenario.
2) Changing the field type from Text to Keyword
will fail when I search on the field filename,
so I still kept it as Text
Just make sure the file name
-Original Message-
From: Ype Kingma [mailto:[EMAIL PROTECTED]
Sent: Thursday, May 27, 2004 11:03 PM
To: [EMAIL PROTECTED]
Subject: Re: Range Query Sombody HELP please
On Thursday 27 May 2004 09:37, Karthik N S wrote:
Hi
Lucene developers, my main intention was to
search for a word hit
On Friday 28 May 2004 10:54, Karthik N S wrote:
Hey ype
Thanks for the advice, but I still need to get the exact situation working.
1) I have a unique field [called filename] which is indexed as type Text.
It accepts the name of the HTML files as the indexing parameter.
Also there
On Thursday 27 May 2004 07:00, Karthik N S wrote:
Hi
Lucene developers
Is it possible to search and retrieve relevant information from the
indexed documents
within a specific range, similar to an
SQL query = select * from BOOKSHELF where book1 between 100
On Thursday 27 May 2004 09:37, Karthik N S wrote:
Hi
Lucene developers, my main intention was to
search for a word hit in a unique field between ranges, say
book100 - book200 indexed numbers.
It's something like creating a SUBSEARCH with in the SEARCHINDEX.
You don't need to
David,
On Tuesday 25 May 2004 03:05, you wrote:
I have an application using Lucene 1.3 final.
In this application, I am loading data where the main text for each
document is stored into a body field, a couple of other internal fields,
and basically some meta-data fields driven by the data
On Tuesday 18 May 2004 19:38, Claude Devarenne wrote:
Hi,
I have over 60,000 documents in my index which is slightly over a 1 GB
in size. The documents range from the late seventies up to now. I
have indexed dates as a keyword field using a string because the dates
are in MMDD format.
Alex, Otis,
On Friday 14 May 2004 13:58, Otis Gospodnetic wrote:
Moving to lucene-user list.
Hello,
Didn't I already answer these questions?
1. No :(
There is a bit more to say, see below.
...
--- Alex Aw Seat Kiong [EMAIL PROTECTED] wrote:
Hi!
Some questions about Lucene:
1. Are
Paul,
On Thursday 13 May 2004 22:03, Paul wrote:
Stephane James Vaucher wrote:
On Thu, 13 May 2004, Matt Quail wrote:
do you know of any method to reduce the memory consumption of lucene
when searching?
Avoid prefix queries and wildcards, since they can be rewritten into
large
On Tuesday 11 May 2004 17:26, Gerard Sychay wrote:
Eric Jain [EMAIL PROTECTED] 05/11/04 04:47AM
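// Collect the primary key of every hit (reads a stored field per document)
Hits hits = searcher.search(new TermQuery(new Term("text", "foo")));
Set hitPKs = new HashSet();
for (int i = 0; i < hits.length(); i++) {
  hitPKs.add(hits.doc(i).get("pk"));
}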
Retrieving even one custom field for every document
On Monday 10 May 2004 14:13, David Townsend wrote:
We have a number of small indices and also an uber-index made up of all the
smaller indices. We need to do a search across a number of the
sub-indices and get back a hit count from each. Currently we search each
index, we've also tried
On Thursday 06 May 2004 18:11, David Spencer wrote:
Otis Gospodnetic wrote:
Sure.
On click, get document Id (not internal docId, but something you use as
a surrogate primary key) of the clicked document. Retrieve the
document. Pull out the value of 'clickCount' field. +1 it. Delete
the
On Thursday 06 May 2004 23:26, Boris Goldowsky wrote:
On Thu, 2004-05-06 at 13:58, Ype Kingma wrote:
Changing the click count this way is ok, but along with that you could
change the (field) norm for the document to increase its score
in subsequent queries.
You can use Document.setBoost
On Thursday 29 April 2004 08:14, Nader S. Henein wrote:
Tricky, scoring has to do with the frequency of the occurrence of the word
as opposed to the amount of words in the file in general (Somebody correct
me if I'm wrong) , so short of an educated approximation, you could hack
Lucene uses two
On Thursday 29 April 2004 20:09, Matthew W. Bilotti wrote:
I can't help you with your first question about coordination
of disjunctions in conjunctions.
Actually, I would like to have the possibility to provide
all terms in an OR query with the same idf weight, eg. some
average of their IDFs,
Greg,
On Wednesday 28 April 2004 21:44, Greg Conway wrote:
Hello. Apologies if this has come up before, I'm new to the list and
didn't see anything in the archives that exactly matched my situation.
It has, but each situation is different. Try this:
Greg,
Yes, see RemoteSearchable and MultiSearcher in org.apache.lucene.search.
(See the javadoc on the website)
I meant ParallelMultiSearcher.
Good night,
Ype
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional
On Wednesday 14 April 2004 20:55, Armbrust, Daniel C. wrote:
I should have remembered that.
Here are the 3 explanations for the top 3 documents returned (contents
below)
3.3513687 = product of:
6.7027373 = weight(preferred_designation:renal calculus in 48270),
product of: 0.8114604 =
On Friday 09 April 2004 21:18, [EMAIL PROTECTED] wrote:
Hi!
I implemented a VLH pattern for Lucene's search hits but noticed that
hits.doc() is quite slow (3000+ hits took about 500ms).
So, I want to ask people here for a solution. I thought about something like
a wrapper for the VO
On Tuesday 23 March 2004 16:05, Joachim Schreiber wrote:
Hallo,
I ran into the following problem. Perhaps somebody can help me.
I have an index with different ids in the same field
something like
s
s45678565
s87854546
Situation: I have different documents with the entry s in
Joachim,
...
you think it's possible to order by e.g. date field without retrieving all
the values from the index??
Yes, the new sorting feature from CVS does that, see Doug's
last note on the subject. (It might have been on lucene-dev,
I didn't keep a copy).
Have fun,
Ype
Stefan,
I didn't provide the patch, I just remembered the code from
a recent reading.
I took another look whether there are more such cases
in the Term() method, but I couldn't find anything clear
in the .jj file. The generated .java file didn't help much either.
Could you provide a line number
On Tuesday 09 December 2003 17:58, Ype Kingma wrote:
Stefan,
I didn't provide the patch, I just remembered the code from
a recent reading.
I took another look whether there are more such cases
in the Term() method, but I couldn't find anything clear
in the .jj file. The generated .java
Stefan,
It's a bug, and there is a fix for this in the latest CVS
near the end of the QueryParser.jj file:
// avoid boosting null queries, such as those caused by stop words
if (q != null) {
  q.setBoost(f);
}
Kind regards,
Ype
On Monday 08 December 2003 20:20, Stefan
Ralph,
On Monday 01 December 2003 04:11, [EMAIL PROTECTED] wrote:
Hi,
is it possible to use a real boolean model in lucene for searching. When
one is using the Queryparser with a boolean query (i.e. dog AND horse)
one does get a list of documents from the Hits object. However these
On Monday 01 December 2003 05:38, Ralph wrote:
Hi,
does somebody have an example of how to use another similarity class
implementation for searching? Assuming I have implemented MySimilarity
class MySimilarity implements Similarity{
how do I have to plug it in to actually use it for a
Kent, Erik,
On Saturday 29 November 2003 17:20, Erik Hatcher wrote:
I enjoy at least attempting to answer questions here, even if I'm half
wrong, so by all means correct me if I misspeak
Me too, :)
On Saturday, November 29, 2003, at 06:37 PM, Kent Gibson wrote:
All I would like to
Erik,
On Sunday 23 November 2003 12:51, Erik Hatcher wrote:
On Saturday, November 22, 2003, at 06:33 PM, Dion Almaer wrote:
3. I have some fields such as title, owner, etc. as well as the content
blob which I index and use as
the default search field. Is there an easy way to extend the
on your data, so
you might experiment a bit. You might e.g. index all fields
separately, and also index a default concatenated field.
Kind regards,
Ype Kingma
Hui,
On Tuesday 07 October 2003 19:31, hui wrote:
Hi,
When I use the multiple index search on one large index and one small index,
it looks like sometimes the documents from the small index get a higher score
compared to the documents from the big index. But when I look at the score
formula, this
On Tuesday 23 September 2003 00:12, Chris Hennen wrote:
Hi,
what is the purpose of tf_q * idf_t / norm_q in Lucene's scoring
algorithm:
score_d = sum_t( tf_q * idf_t / norm_q * tf_d * idf_t / norm_d_t)
I don't understand why the score has to be higher when the frequency of a
term in the
fields
anyway, this doesn't hurt performance.
Kind regards,
Ype Kingma
Reece,
On Friday 11 July 2003 16:05, Wilton, Reece wrote:
Hi,
I'm having a bit of trouble figuring out the logic for deleting
documents from an index. Any advice is appreciated!
snip 75% of the experiments
4) I created an index with an IndexWriter and then optimized it and
closed it.
Claes,
On Thursday 03 July 2003 05:36, Claes Holmerson wrote:
Hi,
In my job, I have become the new maintainer of a search feature that
uses Lucene. I am trying to understand how it works by examining the
index it produces.
When I list index fields by opening an IndexReader, looping over
.
Lucene gives you the balance in your hands.
Kind regards,
Ype Kingma
On Thursday 05 June 2003 14:12, Jim Hargrave wrote:
Our application is a string similarity searcher where the query is an input
string and we want to find all fuzzy variants of the input string in the
DB. The score is basically Dice's coefficient: 2C/(Q+D), where C is the
number of terms
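For reference, Dice's coefficient over two sets is twice the size of their intersection divided by the sum of their sizes. A minimal sketch in plain Java, assuming C counts the terms shared by query and document and that Q and D are the term counts of each (the snippet above is cut off before it says so); the class name is hypothetical and this is not Lucene code:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Dice's coefficient over term sets; an illustration, not Lucene scoring.
class DiceSketch {
    static double dice(Set<String> queryTerms, Set<String> docTerms) {
        if (queryTerms.isEmpty() && docTerms.isEmpty()) {
            return 0.0; // avoid dividing zero by zero
        }
        Set<String> common = new HashSet<String>(queryTerms);
        common.retainAll(docTerms); // C: terms present in both sets
        return 2.0 * common.size() / (queryTerms.size() + docTerms.size());
    }

    public static void main(String[] args) {
        Set<String> query = new HashSet<String>(Arrays.asList("renal", "calculus"));
        Set<String> doc = new HashSet<String>(Arrays.asList("renal", "failure"));
        System.out.println(dice(query, doc));
    }
}
```

The parentheses matter: without them the expression would read as (2C/Q) + D.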
because you have to retrieve the stored field(s)
for each document only once. However, it's not
as flexible.
Kind regards,
Ype Kingma
Morus,
On Wednesday 19 March 2003 00:44, Morus Walter wrote:
Hi,
we are currently evaluating lucene.
The data we'd like to index consists of ~ 80 collections of documents
(a few hundred up to 20 documents per collection, ~ 1.5 million
documents total; medium document size is in the
On Friday 14 February 2003 15:10, you wrote:
Hi,
I am using Lucene right now to index several semi-structured documents. I
recently had to implement a method 'getFrequencyVector()' to simply return
a mapping of keyword - frequency from the information already in the
lucene index.
I
On Tuesday 04 February 2003 09:12, you wrote:
Hi all,
I'm trying to gather information about my non-searched (ie not used for
the search) fields.
Let's take an index with 2 fields: 'artist' (for the artist name) and
'type' (for his type of music).
I need to perform a search on the 'artist'
On Monday 03 February 2003 22:35, you wrote:
Is there an existing API that allows you to conduct a search such that only
hits with a score greater than X are returned?
Not directly, but it's straightforward to compose from
Searcher.search(query, hitcollector)
and a hitcollector that implements
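The general shape of such a score cutoff can be sketched in plain Java. The Collector interface below is a hypothetical stand-in that mirrors a collect(doc, score) callback, and fakeSearch only simulates a search over an array of scores; neither is Lucene's own class:

```java
import java.util.ArrayList;
import java.util.List;

// Keeps only hits whose score exceeds a threshold; an illustration only.
class ThresholdSketch {
    // Hypothetical callback, invoked once per matching document.
    interface Collector {
        void collect(int doc, float score);
    }

    static class ThresholdCollector implements Collector {
        private final float minScore;
        final List<Integer> docs = new ArrayList<Integer>();

        ThresholdCollector(float minScore) {
            this.minScore = minScore;
        }

        public void collect(int doc, float score) {
            if (score > minScore) {
                docs.add(doc); // remember documents above the cutoff
            }
        }
    }

    // Simulated search: reports every document with its score.
    static void fakeSearch(float[] scores, Collector collector) {
        for (int doc = 0; doc < scores.length; doc++) {
            collector.collect(doc, scores[doc]);
        }
    }

    public static void main(String[] args) {
        ThresholdCollector collector = new ThresholdCollector(0.4f);
        fakeSearch(new float[] { 0.9f, 0.2f, 0.5f }, collector);
        System.out.println(collector.docs);
    }
}
```

The filtering happens inside the callback, so documents below the cutoff are never stored or retrieved.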
William,
On Wednesday 20 November 2002 21:14, you wrote:
I would like to buy a book about Lucene.
Who could write it ? : )
AFAIK there is no book, but some articles might help:
http://citeseer.nj.nec.com/cs?q=doug+cutting&submit=Search+Documents&cs=1
Optimizations for Dynamic Inverted Index
On Friday 15 November 2002 14:40, Rob Outar wrote:
That is exactly what is happening, I was using the QueryParser class
because I wanted to do stuff like this:
field1 = value and field2 = value2 or field2 = value3
But from what you are telling me I cannot use the Query Parser class
because
On Thursday 14 November 2002 19:36, you wrote:
Hello all,
I am storing the field in this fashion:
doc.add(new Field(releaseability, releaseability, true, true,
false));
so it is indexed and stored but not tokenized.
The value is Test Releaseability;
On Tuesday 12 November 2002 18:58, Rob Outar wrote:
Enumeration fields()
Returns an Enumeration of all the fields in a document.
Yes, but it seems there is no such enumerator for a complete index.
Regards,
Ype
Thanks,
Rob
-Original Message-
From: Christoph Kiehl
On Friday 01 November 2002 15:05, Rob Outar wrote:
All,
I have what I think is an interesting problem. I am working on a
distributed system where all repositories on each node have to be kept in
sync. I am using Lucene on each node to index the data. Users are allowed
to associate
On Thursday 31 October 2002 17:21, Felipe Schnack wrote:
What do you mean by Jython? Isn't Lucene Java?
Lucene is written in Java, and Jython is also written in Java.
Jython is an implementation of the Python scripting language
that allows very easy access to Java and to Lucene.
Jython ideal
On Thursday 31 October 2002 18:45, [EMAIL PROTECTED] wrote:
Hi,
My application requires a facility to have security built into the
documents so that when I search for a given word, the results are filtered
depending on the security credentials stored in a field in the document.
Now the
On Tuesday 15 October 2002 08:50, you wrote:
I want to write a function countIndexEntries(key) to find out how many
entries are there in the index database for a key. I read the faq entry
about counting number of hits, but somehow it doesn't work as expected,
please help:
I create entries
On Sunday 13 October 2002 04:18, you wrote:
What is the cleanest way in Lucene to add documents to
an index, if the entire document is not readily
available at one time?
E.g., I want to index the text as well as the
anchor-text of a stream of html pages, where the
anchor-text terms get
Eoin,
Get the cvs version and have a look at:
org/apache/lucene/search/PhrasePrefixQuery.java
It says:
/**
* PhrasePrefixQuery is a generalized version of PhraseQuery, with an added
* method {@link #add(Term[])}.
* To use this class, to search for the phrase Microsoft app* first use
*
Hello,
I just downloaded the lucene-1.2-src jar but to my surprise it only contains
the analysis and queryParser packages in org/apache/lucene.
Is the source jar incomplete or am I looking in the wrong place?
Regards,
Ype
-1.2-src.jar file is included?
I looked for an explanation, but couldn't find one.
Regards,
Ype
--- Ype Kingma [EMAIL PROTECTED] wrote:
Hello,
I just downloaded the lucene-1.2-src jar but to my surprise it only
contains
the analysis and queryParser packages in org/apache/lucene
,
Ype
Many Thanks,
Fanny
From: Ype Kingma [EMAIL PROTECTED]
Reply-To: Lucene Users List [EMAIL PROTECTED]
To: Lucene Users List [EMAIL PROTECTED]
Subject: Re: Sorting
Date: Wed, 19 Jun 2002 19:40:59 +0100
Fanny,
I want to implement search function using Lucene. As I need to sort the result
Laura
Hi all,
I'm using Jobo for spidering web sites and Lucene for indexing. The
problem is that I'd like to spider only Italian web sites.
How can I discover the country of a web site?
Do you know some method that you can suggest to me?
The best method I know is using n-grams of
Aruna,
Hi,
I am looking for ways to cancel a search in response to a cancel from a user
interface. I don't see anything like a timeout on the Searcher.search()
method. Is there a way to terminate a search request?
You can use the low level search api with a collector that checks for
cancelling
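One way to picture a collector that checks for cancellation, sketched in plain Java: the collector looks at a shared flag on every callback and aborts by throwing an unchecked exception that the caller catches. Collector, SearchCancelled and search below are hypothetical names for illustration, not Lucene classes:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Aborts a simulated search when a cancel flag is set; an illustration only.
class CancelSketch {
    // Unchecked, so it can propagate out of the collect callback.
    static class SearchCancelled extends RuntimeException {
    }

    // Hypothetical callback, invoked once per matching document.
    interface Collector {
        void collect(int doc, float score);
    }

    static class CancellableCollector implements Collector {
        private final AtomicBoolean cancelled;
        int collected = 0;

        CancellableCollector(AtomicBoolean cancelled) {
            this.cancelled = cancelled;
        }

        public void collect(int doc, float score) {
            if (cancelled.get()) {
                throw new SearchCancelled(); // stop as soon as the flag is seen
            }
            collected++;
        }
    }

    // Simulated search; returns false if it was cancelled part way through.
    static boolean search(int numDocs, Collector collector) {
        try {
            for (int doc = 0; doc < numDocs; doc++) {
                collector.collect(doc, 1.0f);
            }
            return true;
        } catch (SearchCancelled e) {
            return false;
        }
    }

    public static void main(String[] args) {
        AtomicBoolean flag = new AtomicBoolean(false);
        System.out.println(search(5, new CancellableCollector(flag)));
        flag.set(true); // in a real application the user interface thread sets this
        System.out.println(search(5, new CancellableCollector(flag)));
    }
}
```

The flag is only checked when a document matches, so a query with few matches may run to completion before the cancel is noticed.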
Joe,
Hi,
I am using Lucene for indexing a relatively large article-based system where articles
change from time to time, so I have to reindex them. Reindexing had the effect that a
query would return the hit for a file multiple times (according to the number of
updates).
The only solution to
of docs on the queue can be limited by eg. the total size of the docs.
I assumed you need to delete old docs while adding new ones. In case
you don't need to delete old docs, you might not need an
index reader at all.
Ype
Regards,
Kelvin
- Original Message -
From: Ype Kingma [EMAIL
Otis,
You can remove the .lock file and try re-indexing or continuing
indexing where you left off.
I am not sure about the corrupt index. I have never seen it happen,
and I believe I recall reading some messages from Doug Cutting saying
that index should never be left in an inconsistent
Kelvin,
I've got a little problem with indexing that I'd like to throw to everyone.
My objects have a unique identifier. When indexing, before I create a new
document, I'd like to check if a document has already been created with this
identifier. If so, I'd like to retrieve the document
Grim,
I am looking at using lucene to index a large set of documents. In
order to be able to search a subset of documents, I've added a
path-field to each document (indexed, not stored, not tokenized).
Using a prefix-query seems to work fine.
My problem: Our documents can have several
Philipp,
Hi! I was trying the lucene web-app (lucene-1.2-rc5-dev.jar). I've created
and indexed a simple html document with both english and russian words. it
was ANSI encoded, if I check _3.fdt from created index, I can see my
document indexed and both russian and english terms indexed (it
Kelvin,
In the case where indexing takes a non-trivial amount of time, what is the expected
behaviour when a search is performed while indexing is still going on?
Would it be a good solution to index in a temporary location, then copy the index
files over to the final location when done?
terms.
Thanks in advance,
Ype Kingma
Paula,
I came across a tutorial which had some details on the static factory Field
methods. But none of the factory methods return a Field object with the
following settings:
Store = false
Index = true
Tokenize = false
I'm beginning to think this is a bug - that this combination is handled