Has anyone dealt with the problem of constructing sub-queries given a
multi-word query?
Here is an example to illustrate what I mean:
The user queries for: A B C D
Right now I change that query to "A B C D" A B C D to give phrase
matches a higher weight.
What might happen though, is that the user
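One way to picture the sub-query problem is to enumerate every contiguous sub-phrase of the query, longest first, so that each one could later become a phrase clause with its own weight. This is only a sketch in plain Java, not tied to any Lucene API; the class name `SubPhrases` and the method `contiguous` are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Lucene.
class SubPhrases {

    /** Returns all contiguous sub-phrases of a whitespace-separated query, longest first. */
    static List<String> contiguous(String query) {
        List<String> result = new ArrayList<>();
        String trimmed = query.trim();
        if (trimmed.isEmpty()) {
            return result; // nothing to split
        }
        String[] words = trimmed.split("\\s+");
        // Walk from the full phrase down to single words.
        for (int len = words.length; len >= 1; len--) {
            for (int start = 0; start + len <= words.length; start++) {
                String[] slice = Arrays.copyOfRange(words, start, start + len);
                result.add(String.join(" ", slice));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (String phrase : contiguous("A B C D")) {
            System.out.println(phrase);
        }
    }
}
```

For a four-word query this yields ten sub-phrases, starting with the full phrase and ending with the single words. How each one is weighted is left to the application.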
Doron Cohen wrote:
Hi Antony, you cannot instruct the query parser to do that. Note that an
Thanks, I suspected as much. I've changed it to make the field tokenized.
field name. This is an application logic to know that a certain query is
not to be tokenized. In this case you could create
On Oct 16, 2006, at 2:44 AM, Antony Bowesman wrote:
Doron Cohen wrote:
Hi Antony, you cannot instruct the query parser to do that. Note
that an
Thanks, I suspected as much. I've changed it to make the field
tokenized.
field name. This is an application logic to know that a certain
Hi,
I know that I can index pdf-files (using a third-party library).
Is it possible to search the index for a phrase, getting not only the
document, but also the page number in the (pdf-)document?
Or is it even possible to get a bookmark, leading to this page?
I am thankful for any information
Hello All,
If I am not mistaken, because the index is locked by different
objects such as IndexReader or IndexWriter, theoretically only one thread
can access the index at a time.
When we search the index, it creates a commit lock so that another
thread does not modify the index, so
Michael McCandless wrote:
Supriya Kumar Shyamal wrote:
If I am not mistaken, because the index is locked by different
objects such as IndexReader or IndexWriter, theoretically only one
thread can access the index at a time.
Actually, only one writer can write to the index at once.
Hi Jong,
Jong Kim wrote:
I'm looking for a stemmer that is capable of returning all morphological
variants of a query term (to be used for high-recall search). For example,
given a query term of 'cares', I would like to be able to generate 'cares',
'care', 'cared', and 'caring'.
To
Hi Bill,
Bill Taylor wrote:
On Oct 16, 2006, at 5:44 AM, Christoph Pächter wrote:
I know that I can index pdf-files (using a third-party library).
Could you please tell me where to find this library?
There are several PDF extraction packages listed here (look under the
Lucene Document
Hi - Can someone explain the reason why I'm getting the TooManyClauses
exception? I have a general understanding of the issue based on my
reading, but I don't understand the mechanics of it. Specifically,
how is my query being expanded to cause this problem? How am I
exceeding the default
RangeQueries expand to a boolean query containing all terms in the range,
so it doesn't matter if you search on a coarse-grained range: if you store
the dates with high granularity, the number of terms will be high.
This wiki page discusses some of the merits of using multiple date fields
with
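To make the arithmetic concrete: the worst-case number of clauses is the number of distinct indexed terms that fall inside the range, and that depends on how finely the dates were stored, not on how coarse the query looks. A rough sketch in plain Java with no Lucene calls; the class `RangeTermEstimate` and its methods are hypothetical, and the limit of 1024 is the default maximum clause count for BooleanQuery.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

// Hypothetical estimator, not part of Lucene.
class RangeTermEstimate {

    // Default BooleanQuery clause limit.
    static final int DEFAULT_MAX_CLAUSES = 1024;

    /**
     * Worst-case count of distinct terms in [from, to] (inclusive), assuming
     * every slot at the given storage granularity holds at least one document.
     * A real index usually has fewer distinct terms than this upper bound.
     */
    static long worstCaseTerms(LocalDate from, LocalDate to, ChronoUnit granularity) {
        long days = ChronoUnit.DAYS.between(from, to) + 1; // inclusive range
        switch (granularity) {
            case DAYS:
                return days;
            case HOURS:
                return days * 24;
            case MINUTES:
                return days * 24 * 60;
            case SECONDS:
                return days * 24 * 60 * 60;
            default:
                throw new IllegalArgumentException("unsupported granularity: " + granularity);
        }
    }

    static boolean exceedsDefaultLimit(long terms) {
        return terms > DEFAULT_MAX_CLAUSES;
    }

    public static void main(String[] args) {
        LocalDate from = LocalDate.of(2006, 1, 1);
        LocalDate to = LocalDate.of(2006, 12, 31);
        for (ChronoUnit unit : new ChronoUnit[] {ChronoUnit.DAYS, ChronoUnit.HOURS}) {
            long terms = worstCaseTerms(from, to, unit);
            System.out.println(unit + ": " + terms + " terms, over limit: " + exceedsDefaultLimit(terms));
        }
    }
}
```

Under these assumptions a one-year range stored at day granularity stays under the limit, while the same range stored at hour granularity does not.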
Turns out I needed a seek method.
I ended up modeling it after the RAMDirectory.
I turned the RAMFile into an @Entity.
The directory accesses the EntityManager.
And I am using JBossCache.
Preliminary testing shows comparable response times.
I have a few questions regarding writing a custom analyzer.
My situation is that I would like to use the StandardAnalyzer but
with some data-specific rules. I was wondering if there was a way of
telling the StandardAnalyzer to treat a string of text, that would
normally be tokenized into
Hi Ryan,
StandardAnalyzer should already be smart about keeping email addresses as a
single token:
// email addresses
| <EMAIL: <ALPHANUM> (("."|"-"|"_") <ALPHANUM>)* "@" <ALPHANUM> (("."|"-") <ALPHANUM>)+ >
(this is from StandardAnalyzer.jj)
As for changing the text you feed to Lucene, that's all up to you.
It is not THAT hard to write a custom analyzer; that is what I did. I
found that there is a bug in the setup, however, in that there are two
incompatible definitions of Token. The generated file
xxTokenizer.java refers to the wrong definition of Token, so I have to
patch it before it will
Sorry, I wasn't really concerned with email addresses - I was just
using that as an example. How would I tell the StandardAnalyzer that
I want a certain phrase to be tokenized as a single token? Surround it with
quotes, or ..? Also, how would you recommend manipulating the Reader
object? You said
Otis Gospodnetic wrote on 16/10/2006 14:32:13:
Hi Ryan,
StandardAnalyzer should already be smart about keeping email
addresses as a single token:
// email addresses
| <EMAIL: <ALPHANUM> (("."|"-"|"_") <ALPHANUM>)* "@" <ALPHANUM> (("."|"-") <ALPHANUM>)+ >
(this is from StandardAnalyzer.jj)
Hi,
I have multiple fields that I need to search on. All these fields need to
support wildcard search. I am ANDing these search fields using BooleanQuery.
There is no need for scoring in my search.
How do I implement this? I have seen PrefixFilter and it sounds promising. But
then how do
Hi Vasu, how about using ChainedFilter(yourPrefixFilters[],
ChainedFilter.AND)?
vasu shah wrote on 16/10/2006 17:50:27:
Hi,
I have multiple fields that I need to search on. All these
fields need to support wildcard search. I am ANDing these search
fields using
Well, depending on what you mean by wildcard, a PrefixFilter isn't
necessarily what you want. If wildcard means abc*, then PrefixFilter is
right. If it means ab*cd?fg, a PrefixFilter isn't useful unless you want to
do some fancy indexing.
Think about writing your own filter. Wrap it in a
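The distinction drawn in this reply can be checked mechanically before choosing a filter. A small sketch in plain Java, assuming `*` and `?` are the only wildcard characters; `WildcardShape`, `isPurePrefix`, and `prefixOf` are made-up names for illustration, not Lucene classes or methods.

```java
// Hypothetical helper, not part of Lucene.
class WildcardShape {

    /** True when the pattern is a non-empty literal followed by one trailing '*', e.g. "abc*". */
    static boolean isPurePrefix(String pattern) {
        int star = pattern.indexOf('*');
        return star > 0                          // at least one literal character first
                && star == pattern.length() - 1  // the first '*' is also the last character
                && pattern.indexOf('?') < 0;     // no single-character wildcards
    }

    /** Returns the literal prefix of a pure prefix pattern. */
    static String prefixOf(String pattern) {
        if (!isPurePrefix(pattern)) {
            throw new IllegalArgumentException("not a pure prefix pattern: " + pattern);
        }
        return pattern.substring(0, pattern.length() - 1);
    }

    public static void main(String[] args) {
        for (String p : new String[] {"abc*", "ab*cd?fg"}) {
            System.out.println(p + " -> pure prefix: " + isPurePrefix(p));
        }
    }
}
```

A pattern that passes this check is a candidate for a prefix-based filter; anything else needs a different approach, such as a custom filter.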
Hi Guys
How do you reload an index? I have a webapp which might need to be
redeployed, but whenever I test FSDirectory.list(), nothing is returned. The
segments and .cfs files are in the directory but those aren't recognized
either.
--
talk trash and carry a small stick.
PAUL KRUGMAN (NYT)
Can someone tell me how to read an index into memory, or how to open an
existing index for reading?
--
talk trash and carry a small stick.
PAUL KRUGMAN (NYT)
Read
org.apache.lucene.index.IndexReader and
org.apache.lucene.search.IndexSearcher.
There are descriptions available in these docs.
On 10/17/06, EDMOND KEMOKAI wrote:
Can someone tell me how to read an index into memory, or how to open an
existing index for reading?