On Thursday 28 September 2006 10:12, Stuart Grimshaw wrote:
We have an existing lucene based search, and a recent change to the way we
organise our products has caused a bit of a problem for search results.
Our products are arranged into subcategories, categories and stores. A
product can only
Can't you just add several values to the Store field?
I.e.:
doc.add(Field.Text(STOREFIELD, val1));
doc.add(Field.Text(STOREFIELD, val2));
-Original message-
From: Stuart Grimshaw [mailto:[EMAIL PROTECTED]
Sent: 2 October 2006 10:09
To: java-user@lucene.apache.org
I want to modify the PrefixQuery so that, instead of throwing the
TooManyBooleanClause exception, it picks out the N most frequent terms
matching the prefix and only searches for those. Is this possible?
/
Regards
Marcus
Hello!
I've indexed HTML pages and stored the HTML code as UN_TOKENIZED fields. Now I
need to search for specific tags in those documents,
for example: option name=test
Do I need to write some custom analyzer or something like that?
Please help me!
I'm using DateTools with Resolution.DAY.
I know that dates internally are converted to GMT.
Converting dates 2006-10-01 00:00 and 2006-10-01 15:00 from
Etc/GMT-2 timezone will give us
20060930 and 20061001 respectively.
But these dates are identical with day resolution.
Is this a bug or am I
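The conversion described above can be reproduced without Lucene. This is only a minimal sketch using java.time, not DateTools itself; the class and method names (DayResolutionSketch, utcDay) are made up for illustration. Note that Etc/GMT-2 is a POSIX-style name, so it is two hours ahead of GMT.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

public class DayResolutionSketch {
    // Day-resolution strings in the question have the form yyyyMMdd.
    private static final DateTimeFormatter DAY = DateTimeFormatter.ofPattern("yyyyMMdd");

    /** Interprets a wall-clock time in the given zone and formats the GMT/UTC day. */
    static String utcDay(LocalDateTime local, String zoneId) {
        return local.atZone(ZoneId.of(zoneId))
                    .withZoneSameInstant(ZoneOffset.UTC)
                    .format(DAY);
    }

    public static void main(String[] args) {
        String zone = "Etc/GMT-2"; // two hours ahead of GMT
        // Midnight local time is still the previous day in GMT.
        System.out.println(utcDay(LocalDateTime.of(2006, 10, 1, 0, 0), zone));  // 20060930
        // Mid-afternoon local time falls on the same calendar day in GMT.
        System.out.println(utcDay(LocalDateTime.of(2006, 10, 1, 15, 0), zone)); // 20061001
    }
}
```

So two timestamps from the same local day can end up under different day terms once they are normalised to GMT.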
Volodymyr Bychkoviak wrote:
I'm using DateTools with Resolution.DAY.
I know that dates internally are converted to GMT.
Converting dates 2006-10-01 00:00 and 2006-10-01 15:00 from
Etc/GMT-2 timezone will give us
20060930 and 20061001 respectively.
But these dates are identical with day
John Haxby wrote:
I ran across the problem with DateTools not using UTC when I tried to
use an index created in California from the UK: I was looking for
documents with a particular date stamp but I found documents with a
date stamp from the wrong day. Even more interesting and bizarre
: I have a custom-built Analyzer where I tokenize all non-whitespace
: characters as well available in the field TERM (which is the only
: field being tokenised).
: If I now query my index file for a term 6/12 for instance, I get back
: only ONE result
: instead of TWO. There is another token in
: This should solve most of my heartache.
: What's the suggested way to use this? Copy a solr jar? Or just copy
: the code for this 1 query?
that's entirely up to you, it depends on what kind of source management
you want to have -- the suggested way to use it is to run Solr and use it
via the
: Is my only option here really going to be to add some more columns? I've slept
: on it over the weekend, and not had any more bright ideas ... ?
I have to admit, I don't really understand your problem ... you speak of
Products and Stores and Categories and Primary Categories and wondering
how
: I want to modify the PrefixQuery so that, instead of throwing the
: TooManyBooleanClause exception, it picks out the N most frequent terms
: matching the prefix and only searches for those. Is this possible?
It should be ... look at the rewrite method of PrefixQuery and the docFreq
method of
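The selection step being asked about (keep only the N most frequent terms for a prefix) can be sketched in plain Java. This is not Lucene code: the map of term to document frequency stands in for whatever the index reports, and the names TopPrefixTerms and topTerms are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

public class TopPrefixTerms {
    /** Returns up to n terms that start with prefix, most frequent first. */
    static List<String> topTerms(Map<String, Integer> docFreqs, String prefix, int n) {
        // Min-heap on frequency, so the least frequent kept term is evicted first.
        PriorityQueue<Map.Entry<String, Integer>> heap =
            new PriorityQueue<>((a, b) -> Integer.compare(a.getValue(), b.getValue()));
        for (Map.Entry<String, Integer> e : docFreqs.entrySet()) {
            if (!e.getKey().startsWith(prefix)) {
                continue; // term does not match the prefix
            }
            heap.add(e);
            if (heap.size() > n) {
                heap.poll(); // drop the currently least frequent term
            }
        }
        List<String> result = new ArrayList<>();
        while (!heap.isEmpty()) {
            result.add(heap.poll().getKey()); // ascending frequency
        }
        Collections.reverse(result); // most frequent first
        return result;
    }
}
```

The heap never holds more than n + 1 entries, so the number of clauses built from the result stays bounded no matter how many terms share the prefix.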
On Oct 2, 2006, at 2:08 PM, Los Morales wrote:
I'm new to Lucene and IR in general. I'm a bit confused on the
concept of fields. From what I've read, a field does not have to
be indexed but its value can be stored in an index. Likewise a
field can be indexed but its value is not stored
SSN actually is a common situation.
Assume you have a (relational) database with a table of products with three
columns :
- SSN, which is also a primary key for that table,
- DESCRIPTION, which has free text (i.e. unformatted text) describing the
product.
- OTHER - additional info.
Also assume
Another Erick (note the correct spelling G). See below..
On 10/2/06, Los Morales [EMAIL PROTECTED] wrote:
Hi Erik,
Thanks for the response.
Consider the index in the back of a book. You could tear that out and
still use it to tell what page something is on, but you have no actual
content
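The back-of-the-book analogy can be sketched as a toy structure in plain Java. This is only an illustration of "indexed" versus "stored", not how Lucene lays out its files; the class BackOfBookIndex and its methods are made up for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

public class BackOfBookIndex {
    // "Indexed": word -> pages it appears on (the index at the back of the book).
    private final Map<String, SortedSet<Integer>> postings = new HashMap<>();
    // "Stored": page -> original text (the pages themselves), kept only on request.
    private final Map<Integer, String> stored = new HashMap<>();

    void addPage(int page, String text, boolean store) {
        for (String word : text.toLowerCase().split("\\W+")) {
            if (word.isEmpty()) {
                continue;
            }
            postings.computeIfAbsent(word, k -> new TreeSet<Integer>()).add(page);
        }
        if (store) {
            stored.put(page, text);
        }
    }

    /** Pages containing the word; works even if no page text was stored. */
    SortedSet<Integer> pagesFor(String word) {
        return postings.getOrDefault(word.toLowerCase(), new TreeSet<Integer>());
    }

    /** Original text of a page, or null if it was indexed but not stored. */
    String contentOf(int page) {
        return stored.get(page);
    }
}
```

A page added with store set to false can still be found by pagesFor, but contentOf returns null for it: searchable, yet not retrievable.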
I guess the thundering silence is rooted in the problem statement. I have a
hard time understanding how this index is used. By storing things this way,
you'll force the user to know the *exact* format of anything she's looking
for. That is, it's hard to search for option name=test value=32 and
I have an existing index which was created with DefaultSimilarity. I
want to update the index to use my own Similarity class (need to change
the lengthNorm). I wrote a quick script which creates a new index,
calls setSimilarity(new MySimilarity()) for that index's IndexWriter, and
then calls
Hi,
can anybody be so kind to tell me if it is possible to search a Term by its
position?
I search for a term (for example soccer) and get back the DocIds and
positions as follows:
TermPositions termPos = reader.termPositions(new Term("contents", "soccer"));
while(termPos.next()){
int
: Initially, I had anticipated that doing this would update the
: Similarity as part of the add process. But after running some tests,
: this does not appear to be the case.
fieldNorms are computed when the document is added to the index ...
merging indexes doesn't affect them.
: Is there
You can store TermVectors with position info, but I don't think this would
be enough for what you are asking, because it is not meant for direct
access to a term by its position, and because TermVectors store tokens,
i.e. the indexed form of the word, which I am not sure is what you need.
It
I am indexing individual pages of books.
I get no results from the query
accurate AND book:first title
Each lucene document which represents one page of one book gets a field
book which is indexed, stored, and not tokenized to store the title
of the book.
The word accurate appears on page
The problem stems from using the query parser for searching a non tokenized
field (book).
You can either create a term query for searching in that field, like this:
new TermQuery(new Term("book", "first title"));
Or tokenize the field book and keep using QueryParser.
Decision is based on how you
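Why the two options behave differently can be shown with a rough model in plain Java. It deliberately ignores how QueryParser and analyzers really work and only assumes that an untokenized field is indexed as one whole term while a tokenized one is split on whitespace; all names here are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

public class UntokenizedFieldSketch {
    /** An untokenized field contributes its whole value as a single term. */
    static Set<String> untokenizedTerms(String value) {
        return Collections.singleton(value);
    }

    /** A tokenized field contributes one term per whitespace-separated word. */
    static Set<String> tokenizedTerms(String value) {
        return new HashSet<>(Arrays.asList(value.split("\\s+")));
    }

    /** True if every query term is among the indexed terms. */
    static boolean matchesAll(Set<String> indexed, Set<String> queryTerms) {
        return indexed.containsAll(queryTerms);
    }

    public static void main(String[] args) {
        Set<String> splitQuery = tokenizedTerms("first title");   // what a parser-style split yields
        Set<String> exactQuery = untokenizedTerms("first title"); // what a single Term carries

        Set<String> untokenizedField = untokenizedTerms("first title");
        System.out.println(matchesAll(untokenizedField, splitQuery)); // false
        System.out.println(matchesAll(untokenizedField, exactQuery)); // true

        Set<String> tokenizedField = tokenizedTerms("first title");
        System.out.println(matchesAll(tokenizedField, splitQuery));   // true
    }
}
```

Either side can be changed so the terms line up: query with the exact whole value, or index the field as separate words.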
Hi all,
In some situations, index files may throw a read past EOF exception, so that
the index cannot be used any more. I wonder how to recover the index files
in such a situation?
--
Thanks,
Jiang
Hi,
I have a question about ParallelMultiSearcher performance.
I want to search documents on about 10 gigabytes of index.
(The index has 10,000,000 documents.)
I get very slow performance using IndexSearcher with ONE index normally.
Then I tried to use ParallelMultiSearcher with 10 servers of
Hi,
I have a multi-threaded indexing application that indexes documents into a set
of Lucene index databases (I have millions of documents to index, hence the
split DB) . When a thread gets an index request, it determines the index DB to
index the data in. It grabs the IndexWriter for that