subject:"Sharding Techniques"

Re: Sharding Techniques

2011-05-13 Thread Toke Eskildsen

On Fri, 2011-05-13 at 12:11 +0200, Samarendra Pratap wrote: > Comparison between - single index Vs 21 indexes > Total Size - 18 GB > Queries run - 500 > % improvement - roughly 18% I was expecting a lot more. Could you test whether this is an IO-issue by selecting a slow query and performing the e

Re: Sharding Techniques

2011-05-13 Thread Samarendra Pratap

Hi Tom, Thanks for pointing me to something important (phrase queries) which I wasn't thinking of. We are using synonyms which gets expanded at run time. I'll have to give it a thought. We are not using synonyms at indexing time due to lack of flexibility of changing the list. We are not using

RE: Sharding Techniques

2011-05-12 Thread Burton-West, Tom

Hi Samar, Have you looked at top or iostat or other monitoring utilities to see if you are cpu bound vs I/O bound? With 225 term queries, it's possible that you are I/O bound. I suspect you need to think about seek time and caching. For each unique field:term combination lucene has to look up

Re: Sharding Techniques

2011-05-11 Thread Ian Lea

I'm sure that you should try building one large index and convert to NumericField wherever you can. I'm convinced that will be faster - but as ever, the proof will be in the numbers. On repeated terms, I believe that lucene will search multiple times. If so, I'd guess it is just something that ha

Re: Sharding Techniques

2011-05-11 Thread Ian Lea

ion to the file will not cause more IO as it has to skip > those bytes and write it at the end of file. > > Regards > Ganesh > > - Original Message - > From: "Burton-West, Tom" > To: > Sent: Tuesday, May 10, 2011 9:46 PM > Subject: RE: Sharding T

Re: Sharding Techniques

2011-05-11 Thread Samarendra Pratap

Hi Tom, the more i am getting responses in this thread the more i feel that our application needs optimization. 350 GB and less than 2 seconds!!! That's much more than my expectation :-) (in current scenario). *characteristics of slow queries:* there are a few reasons for greater search time

Re: Sharding Techniques

2011-05-10 Thread Ganesh

in GB's. Small addition or deletion to the file will not cause more IO as it has to skip those bytes and write it at the end of file. Regards Ganesh - Original Message - From: "Burton-West, Tom" To: Sent: Tuesday, May 10, 2011 9:46 PM Subject: RE: Sharding Techni

RE: Sharding Techniques

2011-05-10 Thread Burton-West, Tom

Hi Samar, >>Normal queries go fine under 500 ms but when people start searching >>"anything" some queries take up to > 100 seconds. Don't you think >>distributing smaller indexes on different machines would reduce the average >>.search time. (Although I have a feeling that search time for smaller

Re: Sharding Techniques

2011-05-10 Thread Samarendra Pratap

Hi Mike, *"I think the usual approach is to create multiple mirrored copies (slaves) rather than sharding"* This is where my eyes stuck. We do have mirrors and in-fact a good number of those. 6 servers are being used for serving regular queries (2 are for specific queries that do take time) and e

Re: Sharding Techniques

2011-05-10 Thread Mike Sokolov

Down to basics, Lucene searches work by locating terms and resolving documents from them. For standard term queries, a term is located by a process akin to binary search. That means that it uses log(n) seeks to get the term. Let's say you have 10M terms in your corpus. If you stored that in a si

Re: Sharding Techniques

2011-05-10 Thread Samarendra Pratap

Thanks to Johannes - I am looking into katta. Seems promising. to Toke - Great explanation. That's what I was looking for. I'll come back and share my experience. Thank you very much. On Tue, May 10, 2011 at 1:31 PM, Toke Eskildsen wrote: > On Mon, 2011-05-09 at 13:56 +0200, Samarendra Prata

Re: Sharding Techniques

2011-05-10 Thread Toke Eskildsen

On Mon, 2011-05-09 at 13:56 +0200, Samarendra Pratap wrote: > We have an index directory of 30 GB which is divided into 3 subdirectories > (idx1, idx2, idx3) which are again divided into 21 sub-subdirectories > (idx1-1, idx1-2, , idx2-1, , idx3-1, , idx3-21). So each part is about ½ G

Re: Sharding Techniques

2011-05-10 Thread Johannes Zillmann

and write it at the end of >> file. >> >> Regards >> Ganesh >> >> >> >> - Original Message - >> From: "Samarendra Pratap" >> To: >> Sent: Monday, May 09, 2011 5:26 PM >> Subject: Sharding Techniques >>

Re: Sharding Techniques

2011-05-10 Thread Samarendra Pratap

end of > file. > > Regards > Ganesh > > > > - Original Message - > From: "Samarendra Pratap" > To: > Sent: Monday, May 09, 2011 5:26 PM > Subject: Sharding Techniques > > > > Hi list, > > We have an index directory of 30 GB whic

Re: Sharding Techniques

2011-05-09 Thread Ganesh

ginal Message - From: "Samarendra Pratap" To: Sent: Monday, May 09, 2011 5:26 PM Subject: Sharding Techniques > Hi list, > We have an index directory of 30 GB which is divided into 3 subdirectories > (idx1, idx2, idx3) which are again divided into 21 sub-subdirectories > (i

Re: Sharding Techniques

2011-05-09 Thread Ian Lea

> ... > 1. I've not tested my application with single index as initially (a few > years back) we thought smaller the index size (7 indexes for default 80% > searches) the faster the search time would be ... Possibly. Maybe it will be acceptable to make some searches a bit slower in order to make

Re: Sharding Techniques

2011-05-09 Thread Samarendra Pratap

and search time of more than 1 second is > > considered bad (is it really bad?) as per the business requirement. > > > > Since past few months we are experiencing issues (load and search time) > on > > our search servers, due to which I am looking for sharding techniques.

Re: Sharding Techniques

2011-05-09 Thread Ian Lea

it really bad?) as per the business requirement. > > Since past few months we are experiencing issues (load and search time) on > our search servers, due to which I am looking for sharding techniques. Can > someone guide or give me pointers where i can read more and test? > >

Sharding Techniques

2011-05-09 Thread Samarendra Pratap

second is considered bad (is it really bad?) as per the business requirement. Since past few months we are experiencing issues (load and search time) on our search servers, due to which I am looking for sharding techniques. Can someone guide or give me pointers where i can read more and test

Re: Sharding Techniques

Re: Sharding Techniques

RE: Sharding Techniques

Re: Sharding Techniques

Re: Sharding Techniques

Re: Sharding Techniques

Re: Sharding Techniques

RE: Sharding Techniques

Re: Sharding Techniques

Re: Sharding Techniques

Re: Sharding Techniques

Re: Sharding Techniques

Re: Sharding Techniques

Re: Sharding Techniques

Re: Sharding Techniques

Re: Sharding Techniques

Re: Sharding Techniques

Re: Sharding Techniques

Sharding Techniques

19 matches

Site Navigation

Mail list logo

Footer information