I'm pretty sure it doesn't solve the problem in general (it certainly isn't a
thread-safe solution: you mentioned the memory barrier, and I'd add
compiler optimizations). If it works, it must be for some
application-specific reason; maybe synchronization isn't really needed there,
or you just don't do an
> We've been using this in production for a while and it fixed the
> extremely slow searches when there are deleted documents.
Who was the caller of isDeleted()? There may be an opportunity for an easy
optimization to grab the BitVector and reuse it instead of repeatedly
calling isDeleted() on the
I'm not sure that looks like a safe patch.
Synchronization does more than help prevent races... it also introduces
memory barriers.
Removing synchronization from objects that can change is very tricky business
(witness the double-checked locking antipattern).
-Yonik
Now hiring -- http://tinyurl.com/
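As a minimal sketch of the "grab the bits once and reuse them" idea raised in this thread: the class below is a hypothetical stand-in written in plain Java, not Lucene's SegmentReader, and the names Reader, snapshotDeleted and buildDocMap are assumptions made for illustration. The copy is taken inside a synchronized method, so the memory barrier is kept while the per-document lock traffic goes away.

```java
import java.util.BitSet;

public class DeletedDocsSnapshot {
    // Hypothetical stand-in for a reader whose isDeleted() is synchronized.
    static class Reader {
        private final BitSet deleted = new BitSet();
        private final int maxDoc;

        Reader(int maxDoc) { this.maxDoc = maxDoc; }

        synchronized void delete(int doc) { deleted.set(doc); }

        // Per-document check: one lock acquisition per call.
        synchronized boolean isDeleted(int doc) { return deleted.get(doc); }

        // Copy the bits under the lock once; the caller then reads its
        // private copy without any further synchronization.
        synchronized BitSet snapshotDeleted() { return (BitSet) deleted.clone(); }

        int maxDoc() { return maxDoc; }
    }

    // Map old doc numbers to new ones, with -1 marking deleted docs.
    static int[] buildDocMap(Reader reader) {
        BitSet snapshot = reader.snapshotDeleted(); // single lock acquisition
        int[] docMap = new int[reader.maxDoc()];
        int next = 0;
        for (int i = 0; i < docMap.length; i++) {
            docMap[i] = snapshot.get(i) ? -1 : next++;
        }
        return docMap;
    }
}
```

The snapshot reflects deletions only up to the moment it was taken, so this fits callers that can tolerate a slightly stale view for the duration of one loop.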
Thanks for the reply.
Here is what happens...
I have two boxes, A and B, and the indices are created on machine C.
The index directory is mounted on both machines A and B.
We have Quartz using JDBCJobStore, so index creation runs on either one of the
boxes. So when the index creation job is
Hi Peter,
I observed the same issue on a multiprocessor machine. I included a
small fix for this in the NIO patch (against the 1.9 trunk) here:
http://issues.apache.org/jira/browse/LUCENE-414#action_12322523
The change amounts to the following methods in SegmentReader.java, to
remove the need s
On Wednesday 12 October 2005 00:15, Robert Watkins wrote:
> Wonderful! But what about wildcards? I realised after I had sent the
> last message that my pattern should have been written:
Have a look at the test cases: you need to expand the terms yourself, i.e.
it doesn't matter if there's a prefi
> If the index is in 'search/read-only' mode, is there a way around this
bottleneck?
The obvious answer (to answer my own question) is to optimize the index.
But the question remains: why is the docMap created and never used?
Peter
Wonderful! But what about wildcards? I realised after I had sent the
last message that my pattern should have been written:
( term | term as prefix | wildcard term )+
-- Robert
On Tue, 11 Oct 2005, Daniel Naber wrote:
On Tuesday 11 October 2005 22:53, Robert Watkins wrote:
I was under th
On a multi-cpu system, this loop to build the docMap array can cause severe
thread thrashing because of the synchronized method 'isDeleted'. I have
observed this on an index with over 1 million documents (which contains a
few thousand deleted docs) when multiple threads perform a search with
either
On Tuesday 11 October 2005 22:53, Robert Watkins wrote:
> I was under the impression that PhrasePrefixQuery only worked in the
> special case of the term that would otherwise be used in a PrefixQuery
> coming at the end of the sequence of terms, as in:
No, the test cases show that the prefix ter
I was under the impression that PhrasePrefixQuery only worked in the
special case of the term that would otherwise be used in a PrefixQuery
coming at the end of the sequence of terms, as in:
( term )+ ( term as prefix )
but not where either a WildcardQuery or a PrefixQuery occurs anywhere
in t
1) A FileNotFoundException isn't a Lucene issue so much as a file system
issue. Which file is "not found"? What's in the logs?
2) As for simultaneous indexing on two separate indices, there should be
absolutely no problem; we simultaneously index 10 parallel indices using Quartz
and it'
thanks again!
Doug Cutting wrote:
Marc Hadfield wrote:
In the SpanNear (or for that matter PhraseQuery), one can set a slop
value where 0 (zero) means one following after the other.
How can one differentiate between Terms at the **same** position vs.
one after the other?
The followi
Just use the Sort option in the searcher
http://lucene.apache.org/java/docs/api/org/apache/lucene/search/Searcher
.html#search(org.apache.lucene.search.Query,%20org.apache.lucene.search.
Sort)
Aviran
http://www.aviransplace.com
-Original Message-
From: Daniel Cortes [mailto:[EMAIL PROTECT
Marc Hadfield wrote:
In the SpanNear (or for that matter PhraseQuery), one can set a slop
value where 0 (zero) means one following after the other.
How can one differentiate between Terms at the **same** position vs. one
after the other?
The following queries only match "x" and "y" at the sa
Hi everybody, I have a query that finds all the documents added to my
index in the last few days. It works well, but I want to show the results
sorted. What do I have to do?
My code is this:
private RangeQuery findINTODates(int days) {
Term from;
Term to;
Calendar calendar =
Well, there isn't really much difference. If you have a large amount of
data then I would suggest two indexes, but if not then one index will work
too.
HTH
Aviran
http://www.aviransplace.com
-Original Message-
From: Sharma, Siddharth [mailto:[EMAIL PROTECTED]
Sent: Tuesday, October 11, 2005 2:
Hiya
Given that I have two high level business entities, catalog (containing
product information) and contract (containing filter criteria about which
products are available for sale and which are not), what is a better
approach?
1. To have two different indices and query them separately.
OR
2. H
On Tuesday 11 October 2005 15:32, Robert Watkins wrote:
> The only idea that comes to mind is to try to combine a PhraseQuery and
> a PrefixQuery
Yes, PhrasePrefixQuery already supports that.
Regards
Daniel
--
http://www.danielnaber.de
---
Hello -
a quick follow-up to my previous post.
In the SpanNear (or for that matter PhraseQuery), one can set a slop
value where 0 (zero) means one following after the other.
How can one differentiate between Terms at the **same** position vs. one
after the other?
ie:
(Token)/Position
(A)
You might want to check your analyzer; it might trim or ignore these
names.
Aviran
http://www.aviransplace.com
-Original Message-
From: Dan Quaroni [mailto:[EMAIL PROTECTED]
Sent: Tuesday, October 11, 2005 2:22 PM
To: java-user@lucene.apache.org
Subject: Can't find record when I'm sure
Try Luke, see exactly what is indexed for these companies.
: Date: Tue, 11 Oct 2005 14:21:54 -0400
: From: Dan Quaroni <[EMAIL PROTECTED]>
: Reply-To: java-user@lucene.apache.org
: To: java-user@lucene.apache.org
: Subject: Can't find record when I'm sure I should
:
: I have a set of indexes co
I have a set of indexes containing business information (name, address, phone,
etc). There are a couple particular companies that don't come up when people
search for them. I've used our debugging app that allows lucene queries to be
executed directly, and I have confirmed this.
I can find t
On Oct 11, 2005, at 10:04 AM, Hugo Lafayette wrote:
First of all, and maybe I'm making a false assumption here, but if you
strip leading "j'", "t'" and so on, that means that if you make a search
like:
+text:"il m'aime"
you will get documents with the sentence "il m'aime" (french for "he
lov
Marvin Humphrey wrote:
> I'm curious: are there any cases in French where a string with an
> apostrophe in it ought to be split into two searchable tokens? I
> know of no such cases in English: you never want to search for the ll
> in you'll, or the O in O'Reilly, etc.
First of all, and ma
On Oct 11, 2005, at 7:52 AM, Hugo Lafayette wrote:
Why not include that in the FrenchStemFilter "next()" method
itself?
Would that be bad design?
I agree with your assessment. Conceptually, this is a stemming
problem. By extension, it's not a tokenizing problem, and the
behavior o
Hi,
I have the indexing process running in a Quartz environment (on two clustered
boxes).
I made sure that the indexing doesn't run simultaneously on both boxes.
But suddenly I have started getting "FileNotFoundException" in the indexing
process. From that point on the indexes are of no use.
On Oct 11, 2005, at 10:52 AM, Hugo Lafayette wrote:
Erik Hatcher wrote:
Rather than changing StandardAnalyzer, you could create a custom
Analyzer that is something along the lines of StandardTokenizer ->
custom apostrophe splitting filter -> ISOLatinFilter.
Why not include that in the
Erik Hatcher wrote:
> Rather than changing StandardAnalyzer, you could create a custom
> Analyzer that is something along the lines of StandardTokenizer ->
> custom apostrophe splitting filter -> ISOLatinFilter.
Why not include that in the FrenchStemFilter "next()" method itself?
It wil
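The apostrophe-handling step discussed in this thread can be pictured with a small sketch. It works on plain strings rather than inside a real Lucene TokenFilter, and the class name, method name, and list of elided prefixes are all assumptions made for illustration, not anything taken from the French analyzer.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class ElisionSketch {
    // Assumed list of French elided prefixes (illustrative, not exhaustive).
    private static final Set<String> ELISIONS =
        new HashSet<>(Arrays.asList("l", "d", "j", "t", "m", "n", "s", "c", "qu"));

    // Drop a leading elided article or pronoun such as "l'" or "j'".
    // Tokens whose prefix is not in the list are returned unchanged.
    static String stripElision(String token) {
        int apos = token.indexOf('\'');
        if (apos > 0 && ELISIONS.contains(token.substring(0, apos).toLowerCase())) {
            return token.substring(apos + 1);
        }
        return token;
    }
}
```

Whether a step like this belongs in a separate filter or inside the stemmer is exactly the design question the messages above are debating; the sketch only shows the string handling.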
On Oct 11, 2005, at 9:22 AM, Hugo Lafayette wrote:
- accented characters: The French analyzer keeps accents, which could
be useful, but may also become annoying. I just have to add the
ISOLatinFilter.java to correct that, but maybe an option to keep
them or not could be useful.
- ap
I've been trying to figure out the best way to support queries of the ilk:
"going to he* in a hand-basket"
such that it's almost a PhraseQuery, except that the third term (in this
case) is a PrefixQuery.
The only idea that comes to mind is to try to combine a PhraseQuery and
a PrefixQuery (or
Hi there,
I just tested the French analyzer, which works well for the most part
(the stemmer particularly). But at the moment I see two unexpected behaviors
with the default configuration:
- accented characters: The French analyzer keeps accents, which could
be useful, but may also become annoying. I just have
How about searching with one query, using a filter that's a query-filter?
paul
On 11 Oct 05, at 13:36, Trond Aksel Myklebust wrote:
I need to be able to intersect the result of two queries based on a
field
"ID". So if I do a search:
Content2 = "something totally" and a search: Content1 = "something" I
wan
I need to be able to intersect the result of two queries based on a field
"ID". So if I do a search:
Content2 = "something totally" and a search: Content1 = "something" I want
to return only Document 2 based on the field ID being the same.
Any tip on how to do this in Lucene, or should I go for
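Independent of any Lucene API, the intersection being asked for can be sketched as a plain set operation on the ID values collected from each result list. The class and method names here are hypothetical, and how the IDs are read out of each search result is deliberately left out.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

public class IdIntersection {
    // Keep only the IDs that occur in both result lists,
    // preserving the order of the first list.
    static Set<String> intersectIds(Collection<String> first, Collection<String> second) {
        Set<String> result = new LinkedHashSet<>(first);
        result.retainAll(new HashSet<>(second));
        return result;
    }
}
```

For large result sets this means materializing every matching ID from both searches, so it is only a reasonable approach when the result lists are modest in size.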
Yes, sometimes it creates a lock file in Tomcat.
But now I am not able to index even after deleting the lock files.
I checked Tomcat's temp folder and java.io.tmpdir; nothing is there.
Even if I am not closing the index, it should index after deleting the lock
files (correct me if I am wrong
Do you keep an IndexReader, IndexWriter or IndexSearcher open? Try closing them
during shutdown.
On 9/29/05, M å n i s h <[EMAIL PROTECTED]> wrote:
>
>
> Hi,
> I am having trouble indexing files sometimes,
> My application is deployed in tomcat and some times when I try to stop and
> restart indexing
Paul,
Thank you very much for your explanation.
However, in case you have different experience, we'd like to know.
I don't have it. I'm just curious.
Thank you again,
Koji
- Original Message -
From: "Paul Elschot" <[EMAIL PROTECTED]>
To:
Sent: Tuesday, October 11, 2005 4:16 PM
Koji,
On Sunday 09 October 2005 14:12, Koji Sekiguchi wrote:
> Hello,
>
> What is MMapDirectory?
>
> I've searched mailing list archive, but cannot find it.
> I could find the following explanation at Lucene 1.9 CHANGES.txt:
>
> 8. Add MMapDirectory, which uses nio to mmap input files. This i