Any opinion on this?
- Original Message -
From: "Ganesh"
To:
Sent: Wednesday, December 17, 2008 4:28 PM
Subject: IndexReader delete
When I perform a delete, I get the following exception:
org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out:
SimpleFSLock
: In-Reply-To: <13158.43731...@web45306.mail.sp1.yahoo.com>
: Subject: What are the best document edit options?
http://people.apache.org/~hossman/#threadhijack
Thread Hijacking on Mailing Lists
When starting a new discussion on a mailing list, please do not reply to
an existing message, instead
: Let me expound more on the question. Will the q1 be run on the
: BooleanQuery q2 and append the results that are not equal to the result
: of the first query of q2?
i really have no idea what you mean by: "q1 be run on the BooleanQuery q2"
the query structure suggested will ensure that you o
Steve,
Thanks for the helpful information, the addition of the new document
methods makes things much better.
One more question, is there JSON support in Lucene? JSON is more fat-
free compared to XML and would be preferred. Digester works well for
indexing XML but something along the same
Geoff Hendrey wrote:
((POINameType)name).getText().split("\\s"); //tokenize manually. (gosh,
I thought the analyser would do this)
The analyser does do this... but related to this, the Right Way to do it
in your case would be to write your own analyser specifically for that
field, and do all
Thanks Erick and Michael.
I will try out these suggestions and post my findings.
~preetham
Erick Erickson wrote:
Well, maybe if I'd read the original post more carefully I'd have figured
that out,
sorry 'bout that.
I *think* I remember reading somewhere on the email lists that your indexing
sp
Apache commons codec library has double metaphone algorithm. I tried a
series of experiments around storing the double metaphone
representations of strings in the index itself, and searching using
doublemetaphone version of search terms when the field I am searching
against is stored as double meta
Hi,
I would like to have a Phrase Query in which the Terms are matched using
the DoubleMetaphone algorithm. I found this link:
http://www.tropo.com/techno/java/lucene/metaphone.html
Which describes a DoubleMetaphoneQuery, and indeed this query works
amazingly well for misspellings, but only for
The javadocs state
"This requires ... and the upper bound* of those segment doc counts not exceed
maxMergeDocs."
Can one of the gurus please explain what that means and what needs to be done to
find out whether an index being merged meets that criterion?
Thanks
Antony
Are you measuring only the time to execute the searcher.search line or
are you measuring the time it takes to iterate the Hits object? The reason
I ask is that something like
for (int idx = 0; idx < hits.length(); ++idx) {
}
will re-execute the query every 100 documents examined or so. For
ex
Just an FYI in case anyone runs into something similar.
Essentially I had indexes that I have been searching from a java stored
procedure in Oracle without issue for awhile. All of a sudden, I started
getting the error I alluded to above when there were more than a certain
number of terms (4,5, o
Are you warming the searcher first, and then testing the sort
performance? (The first query is slow because it populates the
FieldCache, internally, which is then reused for subsequent queries as
long as you don't close that reader/searcher).
Mike
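To make the warming point concrete, here is a minimal plain-Java sketch of the caching behaviour described above: the first sorted query pays to load the per-document values, and later queries reuse them for as long as the same reader stays open. WarmCacheSketch, sortValues and loadFromIndex are hypothetical names used for illustration only, and the returned values are made up; this is not how Lucene's FieldCache is implemented.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java illustration only. WarmCacheSketch is a hypothetical stand-in
// for "one open reader plus its sort cache"; it is not a Lucene class.
final class WarmCacheSketch {
    private final Map<String, String[]> cache = new HashMap<>();
    private int loads = 0;

    // Returns the per-document sort values for a field, loading them once.
    String[] sortValues(String field) {
        String[] values = cache.get(field);
        if (values == null) {
            values = loadFromIndex(field); // slow path, first query only
            cache.put(field, values);
        }
        return values;
    }

    int loadCount() {
        return loads;
    }

    // Stand-in for the expensive pass over every term of the field.
    // The values returned here are invented sample data.
    private String[] loadFromIndex(String field) {
        loads++;
        return new String[] {"2008-12-16", "2008-12-17"};
    }
}
```

Creating a new WarmCacheSketch corresponds to reopening the reader: the cache starts empty and the first sorted query is slow again, which is why timing should start after a warm-up query.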
Chris Salem wrote:
Hello,
I have an i
Hello,
I have an index with ~400 documents and some 200 fields. Searching without
sorting takes around 300 - 500 ms, when sorting on dates (formatted as
'-mm-dd') searching takes on average 15 seconds. Here's the code that
does the search:
hits = searcher.search(query, new Sort(new
Hi Thomas,
On 12/17/2008 at 11:52 AM, Thomas J. Buhr wrote:
> Where can I see how IndexWriter.updateDocument works without getting
> into Lucene all over again until this important issue is resolved?
> Is there a sample of its usage for updating specific fields in a
> given document?
The updateDo
Hi:
This solution has a problem.
The results are sorted by the year criterion, but after sorting by year
I need them sorted by the scoring criterion too.
How can I do this?
I hope you can help me.
Greetings
Ariel
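The ordering asked for here (year first, then score within each year) is a two-key sort. Below is a minimal plain-Java sketch of that ordering using a chained Comparator. Hit and YearThenScoreSketch are hypothetical names, and ascending year with highest score first is an assumption, since the message does not say which direction is wanted. In Lucene itself the equivalent idea would be a Sort built from more than one SortField, with the year field first and the score second; check the Sort and SortField javadocs for your version before relying on that.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Plain-Java illustration only; Hit is a hypothetical holder, not a Lucene class.
final class YearThenScoreSketch {

    static final class Hit {
        final String id;
        final int year;
        final float score;

        Hit(String id, int year, float score) {
            this.id = id;
            this.year = year;
            this.score = score;
        }
    }

    // Primary key: year, ascending (assumed). Secondary key: score, highest first.
    static final Comparator<Hit> BY_YEAR_THEN_SCORE =
        Comparator.comparingInt((Hit h) -> h.year)
                  .thenComparing(
                      Comparator.comparingDouble((Hit h) -> h.score).reversed());

    // Returns a sorted copy and leaves the input list untouched.
    static List<Hit> sorted(List<Hit> hits) {
        List<Hit> copy = new ArrayList<>(hits);
        copy.sort(BY_YEAR_THEN_SCORE);
        return copy;
    }
}
```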
On Wed, Nov 19, 2008 at 5:28 PM, Erick Erickson wrote:
> Well, MultiSearch
On Wed, Dec 17, 2008 at 12:49 PM, Annette Tisdale
wrote:
> I've noticed in our lucene app that subsequent identical searches are faster
> than the first search. So if I search for "things you know" the first
> response time will be 160ms, the second will be 23ms. Then if I search for
> "something
I'll leave those details to the experts who are up to speed.
On Wed, Dec 17, 2008 at 11:52 AM, Thomas J. Buhr wrote:
> Erick,
>
> Thanks for the good news, my question was still lingering from months ago
> when I initially looked at an older Lucene.
>
> Now I need a bit more specific info, since
Well, maybe if I'd read the original post more carefully I'd have figured
that out,
sorry 'bout that.
I *think* I remember reading somewhere on the email lists that your indexing
speed goes up pretty linearly as the number of indexing tasks approaches
the number of CPUs. Are you, perhaps, on a dua
I've noticed in our lucene app that subsequent identical searches are faster
than the first search. So if I search for "things you know" the first
response time will be 160ms, the second will be 23ms. Then if I search for
"something else" the first response time will be 133ms and the second will
b
On Wed, Dec 17, 2008 at 10:32 AM, Patrick Johnstone
wrote:
> As I said in the original email, my issue is that I don't
> think Lucene is returning the fields in the original order
> anymore.
Hmmm, you're right.
http://wiki.apache.org/jakarta-lucene/LuceneFAQ
states "
What is the order of field
Erick,
Thanks for the good news, my question was still lingering from months
ago when I initially looked at an older Lucene.
Now I need a bit more specific info, since much in my architecture
rests on this ability to modify document fields dynamically. Where
can I see how IndexWriter.upda
Have you tested your indexing throughput with two threads sharing one
IndexWriter (one index)?
Mike
Preetham Kajekar wrote:
Hi Erick,
Thanks for the response. Replies inline.
Erick Erickson wrote:
The very first question is always "are you opening a new searcher
each time you query"? But
Hi Erick,
Thanks for the response. Replies inline.
Erick Erickson wrote:
The very first question is always "are you opening a new searcher
each time you query"? But you've looked at the Wiki so I assume not.
This question is closely tied to what kind of latency you can tolerate.
A few more deta
> -Original Message-
> From: Yonik Seeley [mailto:ysee...@gmail.com]
> Sent: Wednesday, December 17, 2008 10:07 AM
> To: java-user@lucene.apache.org
> Subject: Re: Order of fields returned by Document.getFields()
>
> Lucene guarantees the order of all stored fields returned.
> Solr gua
On Dec 17, 2008, at 9:26 AM, Rajiv2 wrote:
Because the search term is provided by a user, and that user would
explicitly have to put quotes around "marietta ga", when I believe the
search text as it is (fleming roofing inc., marietta ga) should score
higher for "marietta ga"
Just
The very first question is always "are you opening a new searcher
each time you query"? But you've looked at the Wiki so I assume not.
This question is closely tied to what kind of latency you can tolerate.
A few more details, please. What's slow? Queries? Indexing?
How slow? 100ms? 100s? What ar
Thanks Danil - I'd missed that.
Danil ŢORIN wrote:
According to
http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/search/TopDocCollector.html
it does.
After the search, simply retrieve the TopDocs and read the documents you need:
List result = new ArrayList(10);
for( ScoreDoc sDoc :collector.topDo
Well, you could also do a simple test of removing IDF from the scoring
equation and seeing if the query then reacts the way you want it to.
Simply write your own custom similarity that does this, and test out to
see how it works.
Handily enough, I've already done this, so here's some code you
Lucene guarantees the order of all stored fields returned.
Solr guarantees the order of all values in a *specific* field, but not
the fields themselves.
-Yonik
On Tue, Dec 16, 2008 at 10:00 AM, Patrick Johnstone
wrote:
>
> I'm using Lucene via Solr and recently upgraded from an early Summer nig
> >
> > I'm using Lucene via Solr and recently upgraded from an
> early Summer
> > nightly build to the released version of Solr 1.3 (which
> seems to use
> > something in the neighborhood of Lucene 2.3). I'm posting
> this here
> > because I believe that my issue is with Lucene, not Solr
Hi Grant,
Thanks for your response. Replies inline.
Grant Ingersoll wrote:
On Dec 17, 2008, at 12:57 AM, Preetham Kajekar wrote:
Hi,
I am new to Lucene. I am not using it as a pure text indexer.
I am trying to index a Java object which has about 10 fields (like
id, time, srcIp, dstIp) - most of
Because the search term is provided by a user, and that user would explicitly
have to put quotes around "marietta ga", when I believe the search text as it
is (fleming roofing inc., marietta ga) should score higher for "marietta
ga"
rajiv
Grant Ingersoll-6 wrote:
>
>
> On Dec 16, 2008, at
What version of Lucene are you using? The more recent ones have
IndexWriter.updateDocument..
Best
Erick
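For readers who have not met updateDocument before: it does not edit a document in place, it removes whatever matches a key and then adds the replacement. The plain-Java sketch below models only that delete-then-add behaviour with an in-memory list. UpdateSketch and its methods are hypothetical names, documents are modelled as simple field maps, and none of this is Lucene code; see the IndexWriter javadocs for the real signature.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration only; UpdateSketch is hypothetical, not a Lucene class.
final class UpdateSketch {
    private final List<Map<String, String>> docs = new ArrayList<>();

    void addDocument(Map<String, String> doc) {
        docs.add(new HashMap<>(doc));
    }

    // Removes every document whose field equals value, then adds the
    // replacement. The whole document is swapped, not a single field.
    void updateDocument(String field, String value, Map<String, String> replacement) {
        docs.removeIf(d -> value.equals(d.get(field)));
        docs.add(new HashMap<>(replacement));
    }

    // Returns every stored document whose field equals value.
    List<Map<String, String>> find(String field, String value) {
        List<Map<String, String>> out = new ArrayList<>();
        for (Map<String, String> d : docs) {
            if (value.equals(d.get(field))) {
                out.add(d);
            }
        }
        return out;
    }

    int size() {
        return docs.size();
    }
}
```

The practical consequence is that the replacement has to carry every field you want to keep, so changing one field still means rebuilding the whole document.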
On Wed, Dec 17, 2008 at 2:20 AM, Thomas J. Buhr wrote:
> Hello Lucene,
>
> Looking at the document object it seems like each time I want to edit its
> contents I need to do the following:
On Dec 17, 2008, at 12:57 AM, Preetham Kajekar wrote:
Hi,
I am new to Lucene. I am not using it as a pure text indexer.
I am trying to index a Java object which has about 10 fields (like
id, time, srcIp, dstIp) - most of them being numerical values.
In order to speed up indexing, I figured t
You could also think about a filter. Just run q1 as a regular query.
Use one of the Collector methods to create a Filter. At the end,
invert the Filter and use it as a parameter for your second query.
Best
Erick
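The filter idea above boils down to set arithmetic on document ids: record what q1 matched, invert that set, and only let q2 hits through if they are in the inverted set. Here is a minimal plain-Java sketch using java.util.BitSet. InvertedFilterSketch and q2ExcludingQ1 are hypothetical names, the bit sets stand in for whatever a collector or filter would produce, and all document ids are assumed to be below maxDoc.

```java
import java.util.BitSet;

// Plain-Java illustration only; not Lucene code. Bit i set means "doc i matched".
final class InvertedFilterSketch {

    // Returns the documents that match q2 but were not matched by q1.
    // Assumes every set bit in both inputs is below maxDoc.
    static BitSet q2ExcludingQ1(BitSet q1Matches, BitSet q2Matches, int maxDoc) {
        BitSet allowed = (BitSet) q1Matches.clone(); // leave the caller's set intact
        allowed.flip(0, maxDoc);                     // invert: docs q1 did NOT match
        allowed.and(q2Matches);                      // keep only docs q2 matched
        return allowed;
    }
}
```

For example, if q1 matched docs 1 and 3 and q2 matched docs 0 through 4, the result holds docs 0, 2 and 4, which can then be appended after the q1 results.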
On Wed, Dec 17, 2008 at 12:23 AM, Jay Malaluan wrote:
>
> Hi,
>
> Anyone knowledgeab
On Dec 16, 2008, at 10:00 AM, Patrick Johnstone wrote:
I'm using Lucene via Solr and recently upgraded from an early Summer
nightly
build to the released version of Solr 1.3 (which seems to use
something in
the neighborhood of Lucene 2.3). I'm posting this here because I
believe
that m
It might be faster to use FieldCache.DEFAULT.getStrings(reader,
"empid"), assuming empid is indexed but is not analyzed (or always
analyzes to one token).
Though, that then persists the resulting array in the FieldCache.
We are wanting to create "column stride fields" (LUCENE-1231) to make
Right, it returns the best 10 documents by score (not the first 10
docs it sees).
You could also simply use the search(Query, int) method (which
just creates the TopDocCollector under the hood).
Mike
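To show why a collector of size 10 yields the best 10 rather than the first 10, here is a minimal plain-Java sketch of top-N selection with a bounded min-heap. Every hit is offered, and the weakest retained hit is evicted whenever the heap grows past n. TopNSketch and ScoredDoc are hypothetical names; this illustrates the general technique and is not a copy of Lucene's TopDocCollector.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Plain-Java illustration only; TopNSketch and ScoredDoc are hypothetical
// names, not Lucene classes.
final class TopNSketch {

    static final class ScoredDoc {
        final int doc;
        final float score;

        ScoredDoc(int doc, float score) {
            this.doc = doc;
            this.score = score;
        }
    }

    // Keeps the n highest-scoring hits seen so far in a min-heap, so the
    // weakest retained hit is always at the head and cheap to evict.
    static List<ScoredDoc> topN(List<ScoredDoc> hits, int n) {
        PriorityQueue<ScoredDoc> heap = new PriorityQueue<>(
            Comparator.comparingDouble((ScoredDoc d) -> d.score));
        for (ScoredDoc hit : hits) {
            heap.offer(hit);
            if (heap.size() > n) {
                heap.poll(); // drop the current lowest score
            }
        }
        // Return the survivors with the highest score first.
        List<ScoredDoc> best = new ArrayList<>(heap);
        best.sort(Comparator.comparingDouble((ScoredDoc d) -> d.score).reversed());
        return best;
    }
}
```

Because every hit passes through the heap, the order in which documents are encountered does not affect which n are returned (apart from ties in score).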
Danil ŢORIN wrote:
According to
http://lucene.apache.org/java/2_4_0/api/org/apach
On Dec 16, 2008, at 8:19 PM, Rajiv2 wrote:
Hello,
I'm using the default lucene Queryparser on the search text : fleming
roofing inc., marietta ga
Also, I don't want to modify the search text by putting quotes around
"marietta ga" which forces the query parser to make a phrase query.
Why no
According to
http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/search/TopDocCollector.html
it does.
After the search, simply retrieve the TopDocs and read the documents you need:
List result = new ArrayList(10);
for( ScoreDoc sDoc :collector.topDocs().scoreDocs) {
result.add(contentSearcher.doc(s
Thanks-
Yes in my use-case there are never any deleted documents when the search
is run- (deletion takes place in a pre-processing stage)
Toke Eskildsen wrote on 12/17/2008 08:16:31 AM:
> On Mon, 2008-12-08 at 15:17 +0100, Donna L Gresh wrote:
> > public Vector getIndexIds() throws Exce
I have ported the recently committed Java version of the Arabic analyzer to
Lucene.Net.
Has any work been done on a Farsi (Persian language) analyzer?
Thanks,
Ian
On Mon, 2008-12-08 at 15:17 +0100, Donna L Gresh wrote:
> public Vector getIndexIds() throws Exception {
>
> Vector vec = new Vector();
> IndexReader ireader = IndexReader.open(directoryName);
> int numdocs = ireader.numDocs();
>
Hi
In a search I am doing, there may be thousands of hits, of which I only
want the 10 with the highest score. Will the following code do this for
me, or will it simply return the first 10 it finds?
TopDocCollector collector = new TopDocCollector(10);
contentSearcher.search(q, collector);
If
When I perform a delete, I get the following exception:
org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out:
SimpleFSLock@/
org.apache.lucene.store.Lock.obtain(Lock.java:85)
org.apache.lucene.index.DirectoryIndexReader.acquireWriteLock(DirectoryIndexR
Well, you could use the queryparser wildcard searches (flash*), but
it doesn't use stemming logic, it just returns all the words that
start with that string.
You must be aware that the queryparser rewrites the query with every
term that matches the wildcard, so if your prefix is short it's easy to
g