On Fri, 2011-05-13 at 12:11 +0200, Samarendra Pratap wrote:
> Comparison between: single index vs. 21 indexes
> Total size: 18 GB
> Queries run: 500
> Improvement: roughly 18%
I was expecting a lot more. Could you test whether this is an I/O issue
by selecting a slow query and performing the exact same query again, to
see whether it gets much faster once the data is in the disk cache?
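A minimal sketch of such a test (Lucene 3.x; the index path and the example
query are placeholders, substitute one of your real slow queries):

import java.io.File;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.store.FSDirectory;

public class ColdVsWarm {
  public static void main(String[] args) throws Exception {
    // Placeholder path - point this at one of your indexes
    IndexReader reader = IndexReader.open(FSDirectory.open(new File("/path/to/index")));
    IndexSearcher searcher = new IndexSearcher(reader);

    // Placeholder query - use one of the known-slow queries instead
    Query q = new TermQuery(new Term("body", "anything"));

    long t1 = System.nanoTime();
    searcher.search(q, 10);                        // first run: caches cold
    long cold = (System.nanoTime() - t1) / 1000000;

    long t2 = System.nanoTime();
    searcher.search(q, 10);                        // second run: data now in the OS cache
    long warm = (System.nanoTime() - t2) / 1000000;

    System.out.println("cold: " + cold + " ms, warm: " + warm + " ms");
    searcher.close();
    reader.close();
  }
}

If the second run is drastically faster, the slow queries are dominated by
disk seeks rather than CPU (to get truly cold numbers you would have to drop
the OS cache or restart the machine between runs).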
Hi Tom,
Thanks for pointing me to something important (phrase queries) which I
hadn't been thinking of.
We are using synonyms, which get expanded at run time. I'll have to give it
some thought.
We are not using synonyms at indexing time because of the lack of flexibility
in changing the list. We are not using …
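Conceptually, query-time expansion like ours boils down to OR-ing the
synonyms into the query. A minimal sketch of that idea (not our actual code;
the field name and synonym map are invented for illustration):

import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;

public class SynonymExpansion {
  // Invented example map; in practice this comes from the editable synonym list
  static final Map<String, List<String>> SYNONYMS = new HashMap<String, List<String>>();
  static {
    SYNONYMS.put("job", Arrays.asList("job", "vacancy", "opening"));
  }

  // Expand one user term into an OR over its synonyms on the given field
  static Query expand(String field, String term) {
    List<String> variants = SYNONYMS.get(term);
    if (variants == null) {
      return new TermQuery(new Term(field, term));
    }
    BooleanQuery bq = new BooleanQuery();
    for (String v : variants) {
      bq.add(new TermQuery(new Term(field, v)), BooleanClause.Occur.SHOULD);
    }
    return bq;
  }
}

Presumably this per-term OR is also where the phrase-query issue you pointed
out comes in, since the expansion does not carry over cleanly to phrases.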
Hi Samar,
Have you looked at top or iostat or other monitoring utilities to see whether you
are CPU bound or I/O bound?
With 225-term queries, it's possible that you are I/O bound.
I suspect you need to think about seek time and caching. For each unique
field:term combination Lucene has to look up the term in the term dictionary and
read its postings, which can mean several disk seeks when that data is not
already in the cache.
I'm sure that you should try building one large index and converting to
NumericField wherever you can. I'm convinced that will be faster,
but as ever, the proof will be in the numbers.
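To be concrete about the NumericField part, the change is roughly this
(Lucene 3.x API; the "price" field and the values are just an example):

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.NumericField;
import org.apache.lucene.search.NumericRangeQuery;
import org.apache.lucene.search.Query;

public class NumericFieldSketch {
  // Indexing: store the value trie-encoded instead of as a plain string term
  static Document makeDoc(int price) {
    Document doc = new Document();
    doc.add(new Field("title", "example document", Field.Store.YES, Field.Index.ANALYZED));
    doc.add(new NumericField("price", Field.Store.YES, true).setIntValue(price));
    return doc;
  }

  // Searching: a range query touches a handful of trie terms instead of
  // enumerating every distinct value in the range
  static Query priceBetween(int min, int max) {
    return NumericRangeQuery.newIntRange("price", min, max, true, true);
  }
}

Fewer distinct terms for things like prices, dates and IDs also means a
smaller term dictionary, which helps the seek situation described elsewhere
in this thread.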
On repeated terms, I believe that Lucene will search multiple times.
If so, I'd guess it is just something that has not been considered worth
optimizing.
Hi Tom,
The more responses I get in this thread, the more I feel that our
application needs optimization.
350 GB and less than 2 seconds!!! That's far beyond my expectations :-)
(in our current scenario).
*Characteristics of slow queries:*
There are a few reasons for the longer search times …
… in GBs. A small addition or deletion to the file will not cause more I/O, as it
has to skip those bytes and write at the end of the file.
Regards
Ganesh
- Original Message -
From: "Burton-West, Tom"
To:
Sent: Tuesday, May 10, 2011 9:46 PM
Subject: RE: Sharding Techniques
Hi Samar,
>>Normal queries go fine under 500 ms, but when people start searching for
>>"anything" some queries take more than 100 seconds. Don't you think
>>distributing smaller indexes on different machines would reduce the average
>>search time? (Although I have a feeling that search time for smaller …
Hi Mike,
*"I think the usual approach is to create multiple mirrored copies (slaves)
rather than sharding"*
This is the part that caught my attention.
We do have mirrors, and in fact a good number of them. 6 servers are being
used for serving regular queries (2 are for specific queries that do take
time) and …
Down to basics, Lucene searches work by locating terms and resolving
documents from them. For standard term queries, a term is located by a
process akin to binary search. That means that it uses log(n) seeks to
get the term. Let's say you have 10M terms in your corpus. If you stored
that in a single index, locating one term would take about log2(10M) ≈ 23
of those lookups.
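Back-of-the-envelope, here is what that means for the 21-index layout from
your test and a worst-case 225-term query, as mentioned earlier in the thread
(pure arithmetic under the assumptions above, no Lucene calls involved):

public class SeekEstimate {
  // Rough cost of locating one term: binary-search depth of the term dictionary
  static long lookups(long termCount) {
    return (long) Math.ceil(Math.log(termCount) / Math.log(2));
  }

  public static void main(String[] args) {
    long terms = 10000000L;  // 10M unique terms, as in the example above
    int shards = 21;
    int queryTerms = 225;    // the worst-case query size mentioned earlier

    long single = queryTerms * lookups(terms);                    // ~225 * 24
    long sharded = queryTerms * shards * lookups(terms / shards); // ~225 * 21 * 19

    System.out.println("single index : " + single + " lookups");
    System.out.println("21 indexes   : " + sharded + " lookups");
    // Each shard's dictionary is only ~5 levels shallower (log2 shrinks slowly),
    // but every shard has to be consulted for every term, so the sharded layout
    // ends up doing far more lookups. In practice the upper levels of the
    // dictionary are cached, so these are upper bounds, but the relative
    // picture stays the same.
  }
}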
Thanks
to Johannes - I am looking into katta. Seems promising.
to Toke - Great explanation. That's what I was looking for.
I'll come back and share my experience.
Thank you very much.
On Tue, May 10, 2011 at 1:31 PM, Toke Eskildsen wrote:
> On Mon, 2011-05-09 at 13:56 +0200, Samarendra Pratap wrote:
On Mon, 2011-05-09 at 13:56 +0200, Samarendra Pratap wrote:
> We have an index directory of 30 GB which is divided into 3 subdirectories
> (idx1, idx2, idx3) which are again divided into 21 sub-subdirectories
> (idx1-1, idx1-2, ..., idx2-1, ..., idx3-1, ..., idx3-21).
So each part is about ½ GB?
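I am guessing you currently search those parts through something like a
MultiReader (or a MultiSearcher). Just to make the single-vs-many comparison
concrete, a sketch of that setup (Lucene 3.x; the paths are invented):

import java.io.File;
import java.util.ArrayList;
import java.util.List;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.MultiReader;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.store.FSDirectory;

public class OneSearcherOverAllParts {
  public static void main(String[] args) throws Exception {
    // Open every sub-subdirectory (idx1-1 ... idx3-21); paths are invented
    List<IndexReader> parts = new ArrayList<IndexReader>();
    for (int i = 1; i <= 3; i++) {
      for (int j = 1; j <= 21; j++) {
        parts.add(IndexReader.open(
            FSDirectory.open(new File("/indexes/idx" + i + "-" + j))));
      }
    }
    // One logical view over all 63 parts: each query term is still looked up
    // once per part, which is where the extra seeks come from
    MultiReader all = new MultiReader(parts.toArray(new IndexReader[parts.size()]));
    IndexSearcher searcher = new IndexSearcher(all);

    // ... run your query set against this searcher and against a searcher on
    // a single merged index, and compare the timings ...

    searcher.close();
    all.close(); // also closes the sub readers
  }
}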
> ...
> 1. I've not tested my application with a single index, as initially (a few
> years back) we thought that the smaller the index size (7 indexes for the
> default 80% of searches), the faster the search time would be ...
Possibly. Maybe it will be acceptable to make some searches a bit
slower in order to make the common case faster.
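If you want to test the single-index route without reworking your indexing
pipeline first, a rough way is to merge the existing sub-indexes into one
directory and benchmark against that (Lucene 3.x; paths are placeholders):

import java.io.File;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class MergeShards {
  public static void main(String[] args) throws Exception {
    // Destination for the combined index (placeholder path)
    FSDirectory target = FSDirectory.open(new File("/indexes/combined"));
    IndexWriter writer = new IndexWriter(target,
        new IndexWriterConfig(Version.LUCENE_31, new StandardAnalyzer(Version.LUCENE_31)));

    // Pull in every existing sub-index (idx1-1 ... idx3-21); paths are placeholders
    for (int i = 1; i <= 3; i++) {
      for (int j = 1; j <= 21; j++) {
        IndexReader sub = IndexReader.open(
            FSDirectory.open(new File("/indexes/idx" + i + "-" + j)));
        writer.addIndexes(sub);
        sub.close();
      }
    }
    writer.close();
  }
}

The one-off merge takes a while for 30 GB, but afterwards you can run the same
query set against both layouts and let the numbers decide.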
… search time of more than 1 second is
considered bad (is it really bad?) as per the business requirement.
For the past few months we have been experiencing issues (load and search time) on
our search servers, which is why I am looking into sharding techniques. Can
someone guide me or give me pointers on where I can read more and test?