--
View this message in context:
http://www.nabble.com/Error-Tolerant-tf3524057.html#a9831495
Sent from the Lucene - Java Users mailing list archive at Nabble.com.
Sorry for the double post. I inadvertently submitted the form before writing
the body :)
Has anyone written an error-tolerant query parser for Lucene? How do
websites handle advanced searching with Lucene?
Dear all,
My problem is a little bit strange. Instead of passing the content of the
document to the indexer, I am adding tokens one by one. Here is a piece of my
code:
Document doc = new Document();
doc.add(Field.Text("Features", "blue"));
doc.add(Field.Text("Features", "beautiful"));
don't index the city names with the zip codes.
indexed text - Stored Value
---
94941 - 94941 Mill Vallley
94114 - 94114 Mill Vallley
Mill Vallley - Mill Vallley
29715 - 29715 Fort Mill
29708 - 29708 Fort
Hi
I wonder why there is no setter method for the lazy member variable in the
Field class. Does that mean the property is nominal and setting it does not
have any effect, or am I missing something?
Anyway, is there any way to tell Lucene that a field is to be lazy-loaded,
from the very beginning
On 4/4/07, jafarim [EMAIL PROTECTED] wrote:
Anyway, is there any way to tell Lucene that a field is to be lazy-loaded,
from the very beginning of field construction?
No, that data is not stored in the index.
Lazy field loading is specified only when retrieving the stored fields
of a document,
Lazy loading is handled through the FieldSelector interface on
IndexReader.doc() and some variations. There is nothing special that
need be done during indexing to mark a field as lazy. The isLazy
method merely lets you know later, after loading a Document, if the
field is, indeed, lazy.
So, what is the use of this property in the Field class?
On 4/4/07, Yonik Seeley [EMAIL PROTECTED] wrote:
On 4/4/07, jafarim [EMAIL PROTECTED] wrote:
Anyway, is there any way to tell Lucene that a field is to be
lazy-loaded,
from the very beginning of field construction?
No, that data is
hello,
you can try this code :
IndexReader ISer = IndexReader.open("C:/Testindex");
TermEnum te = ISer.terms(new Term("Features", "blue"));
Term te1 = te.term();
System.out.println("Frequency of blue " + ISer.docFreq(te1));
regards,
-LM
On 4/4/07, Sengly Heng [EMAIL
The default operator for QueryParser is OR, so what you may really
be getting is hits on Mill, and Valley is irrelevant.
But this is just a guess, it'd be way more helpful if you told us what
your index structure was and what query you actually submitted,
for which query.toString is really
See below
On 4/4/07, Sengly Heng [EMAIL PROTECTED] wrote:
Dear all,
My problem is a little bit strange. Instead of passing the content of the
document to the indexer, I am adding tokens one by one. Here is a piece of my
code:
Document doc = new Document();
doc.add(Field.Text("Features", "blue"));
Kvailis [EMAIL PROTECTED] wrote:
I'm pretty new to Lucene (2.0.0) and am having an issue with the
IndexWriter: if I set the boolean argument to 'true' it goes ahead and
writes indexes that turn out to be perfectly usable; taking the exact same
code and switching the boolean to 'false'
I'm hoping someone can offer some insight into the FunctionQuery. I've just
discovered this, and I think it's exactly what I've been looking for, but I'm
having some trouble getting it to work. I can create and execute the query, but
if I try to see the Explanation, I get an
Thanks so much for your explanation. One thing I want to confirm: if I add
the same token to the same field more than once, is it stored redundantly
internally?
And if I have many fields, what is the best way to list the
frequency of all the tokens from different
Thank you.
But I found that the result is always 1, even when I input a token that is
not in the doc at all. What happened?
Best,
Sengly
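One possible explanation, not confirmed in this thread: IndexReader.terms(Term) positions the enumeration at the first term at or after the requested one, so te.term() may be a different term than the one asked for, and docFreq then reports that neighbour's count. The plain-Java model below mimics this with a TreeMap. It is not Lucene code; the map contents and the names DocFreqSketch, docFreqViaSeek and docFreqDirect are made up for illustration.

```java
import java.util.TreeMap;

// Plain-Java model of seek-then-lookup versus direct lookup (not Lucene code).
class DocFreqSketch {
    // Made-up index: term text -> number of documents containing it.
    static TreeMap<String, Integer> sampleIndex() {
        TreeMap<String, Integer> index = new TreeMap<String, Integer>();
        index.put("beautiful", 1);
        index.put("blue", 1);
        return index;
    }

    // Mimics the posted code: seek to the first term >= text, then look that term up.
    static int docFreqViaSeek(TreeMap<String, Integer> index, String text) {
        String found = index.ceilingKey(text);   // may be a different term than text
        return found == null ? 0 : index.get(found);
    }

    // Looks up exactly the requested term; absent terms give 0.
    static int docFreqDirect(TreeMap<String, Integer> index, String text) {
        Integer n = index.get(text);
        return n == null ? 0 : n;
    }
}
```

In Lucene terms, the direct lookup corresponds to calling docFreq(new Term(...)) without going through terms() first. Note also that docFreq counts documents, not occurrences, so a token added twice to one document still yields 1.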
On 4/4/07, Laxmilal Menaria [EMAIL PROTECTED] wrote:
hello,
you can try this code :
IndexReader ISer = IndexReader.open("C:/Testindex");
Hm, error tolerant query parser? How do you want to handle queries with
invalid syntax?
Here is one way:
try {
QueryParser qp = new QueryParser(.);
Query q = qp.parse();
} catch (Throwable t) {
// tolerate any exception
}
;)
Bad but quite tolerant.
Otis
. . . . . . . . . . . .
I am using an RMI architecture for calling a remote service which uses an
IndexSearcher in its own JVM. I am starting the service with the following
provisions for memory allocation and garbage collection: java -server -Xmx1024m
-XX:+UseConcMarkSweepGC -XX:+UseParNewGC
After about 1000 search
No reason that I can think of. What makes you think the problem is with the
IndexSearcher? Maybe it's something else in your code, for instance.
Make sure you have the same version of Java on both ends of the call. Also,
Java 6 made our RMI calls a lot more stable than even 1.5.
Otis
. . .
On Wednesday 04 April 2007 01:32, Erick Erickson wrote:
I thought you could simply add a ConstantScoreQuery (whose
constructor takes a Filter) to a BooleanQuery. It seems that doing
this at the very top level with a MUST would do the trick.
I have not tried this myself, but indeed this
I'm looking for some advice on dealing with malformed queries.
If a user searches for yow! then I get an exception from the query
parser. I can get round this by using QueryParser.escape(query) first
but then that prevents them from searching using other bits of the
query syntax such as
About all you can do is roll your own. I suspect a decent regular
expression would work, or you could let Lucene escape the
query and then re-replace all \: with :
Erick
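The escape-then-restore idea above can be sketched in plain Java as follows. This is only an illustration under stated assumptions: escapeAll is a hypothetical stand-in for QueryParser.escape (its list of special characters is an assumption and may not match your Lucene version), and SelectiveEscaper and escapeKeepingFieldSyntax are made-up names.

```java
// Sketch only: escapeAll is a hypothetical stand-in for QueryParser.escape.
class SelectiveEscaper {
    // Assumed set of query-syntax characters; check your Lucene version.
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?";

    // Backslash-escape every character listed in SPECIAL.
    static String escapeAll(String query) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < query.length(); i++) {
            char c = query.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // Escape everything, then turn "\:" back into ":" so that
    // field:value queries keep working.
    static String escapeKeepingFieldSyntax(String query) {
        return escapeAll(query).replace("\\:", ":");
    }
}
```

With this sketch, escapeKeepingFieldSyntax("title:yow!") leaves the colon alone and escapes only the exclamation mark. The trade-off is that every colon is treated as field syntax, including ones the user meant literally.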
On 4/4/07, Simon Wistow [EMAIL PROTECTED] wrote:
I'm looking for some advice on dealing with malformed queries.
If a user
Is there an efficient way to know how many distinct terms there are
for a given field name?
I know I can walk through a TermEnum and put them into a hash, but it
would be useful to know beforehand if you are going to get 4 distinct
values or 40,000
I don't need to know what the terms are, just
On 4/4/07, Ryan McKinley [EMAIL PROTECTED] wrote:
Is there an efficient way to know how many distinct terms there are
for a given field name?
I know I can walk through a TermEnum and put them into a hash
No hash needed... just walk through the TermEnum and count.
-Yonik
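The counting approach can be illustrated without the Lucene jar. The sketch below is a plain-Java model, not Lucene code: it assumes, as with TermEnum, that terms arrive sorted by field and then by text with no duplicates, so counting a field's distinct terms is a single pass that stops when the field changes. FieldTerm is a hypothetical two-string class, and countTermsInField and sample are made-up names.

```java
import java.util.Arrays;
import java.util.List;

// Plain-Java model of a sorted, duplicate-free term enumeration (not Lucene code).
class TermCountSketch {
    // Hypothetical stand-in for a (field, text) term.
    static final class FieldTerm {
        final String field;
        final String text;
        FieldTerm(String field, String text) {
            this.field = field;
            this.text = text;
        }
    }

    // Counts the terms of one field; assumes sortedTerms is ordered by field,
    // then text, and contains each term exactly once.
    static int countTermsInField(List<FieldTerm> sortedTerms, String field) {
        int count = 0;
        boolean inField = false;
        for (FieldTerm t : sortedTerms) {
            if (t.field.equals(field)) {
                inField = true;
                count++;        // each term appears once, so no hash is needed
            } else if (inField) {
                break;          // sorted order: this field's run of terms has ended
            }
        }
        return count;
    }

    // Small made-up term list used for illustration.
    static List<FieldTerm> sample() {
        return Arrays.asList(
            new FieldTerm("city", "fort mill"),
            new FieldTerm("city", "mill valley"),
            new FieldTerm("zip", "29708"),
            new FieldTerm("zip", "29715"),
            new FieldTerm("zip", "94941"));
    }
}
```

Against a real index the loop would presumably start from IndexReader.terms(new Term(field, "")) rather than from the first term overall, so that terms of earlier fields are skipped.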
Sorry if this is a double post, but my last attempt failed..
Not that I know of, but I think you'll be surprised how fast TermEnum
will walk the list of terms.
I think you misunderstand TermEnum. It will NOT enumerate a term
twice, so there's no need for a hash, just a simple increment of
Andy,
MemoryCachedRangeFilter looks nice, can't wait for it to be
included with other goodies in the next 2.x point release!
Numeric range search questions come up often for Lucene,
best practices probably include working with BitSets directly
(which I have been unable to grok), using queries
Hi Doron,
Yes, this was great help, thanks! I've got my:
1. MatchTask (just like ReadTask, but with searcher.match(Query, new
MatchCollector() ))
2. SearchMatchTask (just like SearchTask, but extends MatchTask), so I was able
to use SearchMatch in the alg file where Search was before.
I
TermEnum works like a charm, no need to optimize (yet).
Enjoy the Merlot!
On 4/4/07, Erick Erickson [EMAIL PROTECTED] wrote:
Sorry if this is a double post, but my last attempt failed..
Not that I know of, but I think you'll be surprised how fast TermEnum
will walk the list of terms.
I