I would be tempted to index the text fields but not store them. Since
Lucene returns everything, as Otis pointed out, it's inefficient to keep
rarely used data as stored content in the index. Put the text fields in a
database or a file tree somewhere and keep a pointer to them as a field in
the index.
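A rough sketch of that layout, for illustration only. This is plain Java and deliberately not Lucene's API: the class PointerIndexSketch, its methods, and the in-memory map standing in for the database or file tree are all hypothetical names. The point is only that the search structure holds terms and pointers, while the large text lives elsewhere and is fetched on demand.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration, not Lucene code: the text is searchable through
// the postings map, but only a pointer to it is kept alongside the terms.
class PointerIndexSketch {
    // term -> pointers of the documents containing it ("indexed, not stored")
    private final Map<String, Set<String>> postings = new HashMap<>();
    // stand-in for the external database or file tree (an assumption)
    private final Map<String, String> externalStore = new HashMap<>();

    void add(String pointer, String text) {
        externalStore.put(pointer, text); // the large text lives outside the index
        for (String term : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, k -> new TreeSet<>()).add(pointer);
            }
        }
    }

    // Searching touches only terms and pointers, never the full text.
    List<String> search(String term) {
        Set<String> hits = postings.get(term.toLowerCase(Locale.ROOT));
        return hits == null ? new ArrayList<>() : new ArrayList<>(hits);
    }

    // The full text is loaded only for the rare hit that is actually displayed.
    String fetch(String pointer) {
        return externalStore.get(pointer);
    }
}
```

With this split, a search returns only small pointer values, and the cost of reading the large text is paid once per document actually opened.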
Use one index; working with a single index is simpler. Also, once you
pull a Document from a Hits object, all Fields are read off the disk.
There was some discussion about selective Field reading about a week
ago; check the list archives. Also keep in mind Field compression is
now possible (onl
Hello,
If I have large text fields that are rarely retrieved but need to be
searched often, is it better to create two indices, one for searching and
one for retrieval, or just one index with everything in it?
Or are there other recommendations?
Regards,
Michael
Based on the nature of our documents, we sometimes
experience extremely long response times when executing
NEAR operations against a document (sometimes well over
minutes - even though the operation is restricted
to a single document).
Our analysis of the code indicates (we think):
It looks up
List
Subject: Re: Performance question
Dror Matalon wrote:
>On Wed, Jan 07, 2004 at 07:24:22PM -0700, Scott Smith wrote:
>
>
>>After two rather frustrating days, I find I need to apologize to
>>Lucene. My last run of 225 messages averaged around 25 milliseconds
>>per
On Wednesday 07 January 2004 20:48, Dror Matalon wrote:
> On Wed, Jan 07, 2004 at 07:24:22PM -0700, Scott Smith wrote:
...
> > Thanks for the suggestions. I wonder how much faster I can go if I
> > implement some of those?
>
> 25 msecs to insert a document is on the high side, but it depends of
>
Dror Matalon wrote:
On Wed, Jan 07, 2004 at 07:24:22PM -0700, Scott Smith wrote:
After two rather frustrating days, I find I need to apologize to Lucene. My
last run of 225 messages averaged around 25 milliseconds per message--that's
parsing the xml, creating the Document, and putting it in th
talon" <[EMAIL PROTECTED]>
To: "Lucene Users List" <[EMAIL PROTECTED]>
Sent: Wednesday, January 07, 2004 10:48 PM
Subject: Re: Performance question
> On Wed, Jan 07, 2004 at 07:24:22PM -0700, Scott Smith wrote:
> > After two rather frustrating days, I find I need t
I believe that there are other parsers
that are faster than xerces, you might want to look at these. You might
want to look at http://dom4j.org/.
Dror
>
> Regards
>
> Scott
>
> -----Original Message-----
> From: Terry Steichen [mailto:[EMAIL PROTECTED]
> Sent: Tuesday,
of those?
Regards
Scott
-----Original Message-----
From: Terry Steichen [mailto:[EMAIL PROTECTED]
Sent: Tuesday, January 06, 2004 5:48 AM
To: Lucene Users List
Subject: Re: Performance question
Scott,
Here are some figures to use for comparison. Using the latest Lucene
release, I index ab
ssage -
From: "Scott Smith" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Monday, January 05, 2004 10:26 PM
Subject: Performance question
> I have an application that is reading in XML files and indexing them. Each
> XML file is 3K-6K bytes. This application prel
--- Scott Smith <[EMAIL PROTECTED]> wrote:
> I have an application that is reading in XML files and indexing them. Each
> XML file is 3K-6K bytes. This application preloads a database that I will
> add to "on the fly" later. However, all I want it to do initially is take
> some existing f
I have an application that is reading in XML files and indexing them. Each
XML file is 3K-6K bytes. This application preloads a database that I will
add to "on the fly" later. However, all I want it to do initially is take
some existing files and create the initial index as quickly as I can.
Si
Is it possible that there's some combo of:
- the index of your data set being small relative to the Solaris disk
cache/RAM
- stringA being rare
such that it would explain some of your results?
Harry Foxwell wrote:
I have a project for which I want to characterize Lucene query performance
on di
-4536
Web : www.nexusedge.com
> -----Original Message-----
> From: Harry Foxwell [mailto:[EMAIL PROTECTED]
> Sent: Sunday, March 02, 2003 10:49 AM
> To: Lucene Users List
> Subject: lucene performance question
>
>
> I have a project for which I want to characterize Lucene que
Lucene is not doing any caching, but maybe your OS is.
Otis
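One way to check whether OS-level caching is skewing the numbers is to time the same operation several times and compare the first (cold) run with the later (warm) ones. The harness below is only a sketch under that assumption: WarmupTimingSketch and its methods are hypothetical names, and the Runnable stands in for whatever query is being measured.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical timing harness: runs a task repeatedly so the first (possibly
// cold-cache) run can be compared against the later (possibly warm) runs.
class WarmupTimingSketch {
    // Returns the elapsed nanoseconds of each run, in execution order.
    static List<Long> timeRuns(Runnable task, int runs) {
        List<Long> nanos = new ArrayList<>();
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            nanos.add(System.nanoTime() - start);
        }
        return nanos;
    }

    // Median of every run except the first; needs at least two runs.
    static long medianOfLater(List<Long> nanos) {
        if (nanos.size() < 2) {
            throw new IllegalArgumentException("need at least two runs");
        }
        List<Long> later = new ArrayList<>(nanos.subList(1, nanos.size()));
        Collections.sort(later);
        return later.get(later.size() / 2);
    }
}
```

If the first run is much slower than the median of the rest, the disk cache (or JIT warm-up) is a likely factor, and results across index sizes should be compared with that in mind.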
--- Harry Foxwell <[EMAIL PROTECTED]> wrote:
> I have a project for which I want to characterize Lucene query performance
> on different size archives of my XML files. I have created archives
> and indices of 1000, 2000, 4000, 8000, a
I have a project for which I want to characterize Lucene query performance
on different size archives of my XML files. I have created archives
and indices of 1000, 2000, 4000, 8000, and 16000 XML files (average
file size about 10K) generated from my DTD and containing mostly random
string content
17 matches