Hello Anshum,

My intention is to start a new index shard every 7 days (one week). After 30 days (about four weeks), the first DB can be deleted. At any point in time I will be maintaining 3 to 4 DBs.
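The rotation described above can be sketched in plain Java. This is only an illustration of the date arithmetic, not Lucene code: the class `WeeklyShards`, the `index-NNNN` directory naming, and the rule that a shard is dropped only once its newest document is older than 30 days are all assumptions made for the example.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: none of these names come from Lucene.
final class WeeklyShards {
    static final int DAYS_PER_SHARD = 7;
    static final int RETENTION_DAYS = 30;

    // Number of the shard that receives documents indexed on `day`,
    // counting 7-day windows from `start` (the first indexing day).
    static long shardIndex(LocalDate start, LocalDate day) {
        return ChronoUnit.DAYS.between(start, day) / DAYS_PER_SHARD;
    }

    // Directory name for a shard (the naming scheme is an assumption).
    static String shardName(long index) {
        return String.format("index-%04d", index);
    }

    // A shard may be deleted once its newest document is older
    // than the retention period.
    static boolean isExpired(LocalDate start, long index, LocalDate today) {
        LocalDate lastDay = start.plusDays((index + 1) * DAYS_PER_SHARD - 1);
        return ChronoUnit.DAYS.between(lastDay, today) > RETENTION_DAYS;
    }

    // Shards that still have to be searched on `today`.
    static List<String> liveShards(LocalDate start, LocalDate today) {
        List<String> live = new ArrayList<>();
        for (long i = 0; i <= shardIndex(start, today); i++) {
            if (!isExpired(start, i, today)) {
                live.add(shardName(i));
            }
        }
        return live;
    }
}
```

Under this stricter reading of the 30-day rule, five or six shards stay live at a time rather than 3 to 4, because a shard has to be kept until its last day's documents age out.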

I want to know the pros and cons of the MultiSearcher or Index sharding approach. Any web links would be helpful.

Regards
Ganesh

----- Original Message -----
From: "Anshum" <[EMAIL PROTECTED]>
To: <java-user@lucene.apache.org>
Sent: Tuesday, October 07, 2008 12:18 AM
Subject: Re: Single searcher vs Multi Searcher


Hi Ganesh,
About the memory consumption while sorting: it would end up using similar
amounts, perhaps even more, as is the case with regular parallel
programming algorithms (I am assuming that you intend to search using a
parallel multi searcher). Would you query only particular indexes for a
particular search, or would you search over all the indexes and then
merge the results (which the parallel multi searcher would do
efficiently)?
Also, I guess 30 indexes would be a little too many; I haven't really tried
that many indexes with a multisearcher.
As far as maintenance of DB is concerned, it might be easy as long as you
don't have any document updates, in which case you'd have to shift the
documents from one DB/index to another (which includes creating an entry in
the latest index/DB and deleting the record from the older DB).
I guess you'd have to pilot it. If memory is the issue in your case and
not speed, you could try a regular multisearcher instead of a parallel
multisearcher.
I guess when you say maintenance of the DB gets easier, you mean that the
data in each individual table is controlled (but remember there could be
other bigger hassles like the one mentioned above about moving data between
indexes/DB).
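The merge step mentioned above (search every index, then combine the results) can be illustrated with a small k-way merge. This is a sketch of the general idea in plain Java, not Lucene's implementation: the `ShardMerge` and `Hit` types are hypothetical, and it assumes each shard has already returned its own hits sorted in ascending order of a numeric sort key.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical types for illustration; not part of the Lucene API.
final class ShardMerge {
    // One search hit: its sort key and the shard it came from.
    static final class Hit {
        final long sortKey;
        final String shard;

        Hit(long sortKey, String shard) {
            this.sortKey = sortKey;
            this.shard = shard;
        }
    }

    // K-way merge: every input list must already be sorted ascending
    // by sortKey. Returns the first n hits of the combined ordering.
    static List<Hit> mergeTopN(List<List<Hit>> perShard, int n) {
        // Each queue entry is {shard number, position in that shard's list};
        // the queue orders entries by the sort key they point at.
        PriorityQueue<int[]> heads = new PriorityQueue<>(
            (a, b) -> Long.compare(perShard.get(a[0]).get(a[1]).sortKey,
                                   perShard.get(b[0]).get(b[1]).sortKey));
        for (int s = 0; s < perShard.size(); s++) {
            if (!perShard.get(s).isEmpty()) {
                heads.add(new int[] {s, 0});
            }
        }
        List<Hit> merged = new ArrayList<>();
        while (merged.size() < n && !heads.isEmpty()) {
            int[] head = heads.poll();
            List<Hit> hits = perShard.get(head[0]);
            merged.add(hits.get(head[1]));
            // Advance within the shard the smallest hit came from.
            if (head[1] + 1 < hits.size()) {
                heads.add(new int[] {head[0], head[1] + 1});
            }
        }
        return merged;
    }
}
```

The merge itself is cheap because it only touches the top hits of each shard. The expensive part remains the per-shard sorting, which is why querying every shard is unlikely to use less memory than one large index.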

--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com

The facts expressed here belong to everybody, the opinions to me. The
distinction is yours to draw............


On Mon, Oct 6, 2008 at 10:06 AM, Ganesh <[EMAIL PROTECTED]> wrote:

Hello Anshum,

My index is growing by 1 million documents per day. Initially I planned to
have a single database, but sorting on one or more fields consumes a lot of
RAM. Would sharding the index consume the same amount?

My application has to co-exist with the other applications of my product, and
my app can get 1 GB of RAM. Search speed is fine, but I need to display the
results in sorted order.

I thought of keeping 7 days of documents in one index and creating a new one
after every 7 days. After 30 days the first index may get deleted. I need to
keep the documents in the index DB for 30 days. My index DB is on HDD.

I want to know the pros and cons of sharding. I think maintenance of the DB
becomes easier.

It would be very helpful if you could share some of your thoughts.

Regards
Ganesh


----- Original Message -----
From: "Anshum" <[EMAIL PROTECTED]>
To: <java-user@lucene.apache.org>
Sent: Friday, October 03, 2008 9:48 PM
Subject: Re: Single searcher vs Multi Searcher



Hi Ganesh,

I have experimented with sharded indexes and they seem to benefit me (at
least in my case). I would like to know a few things before I answer your
question:
1. Do you have a reasonable (calculated) criterion for sharding the indexes?
2. How do you plan to split the index? Is it going to be document based
(which I guess it should be, as otherwise you would have to build a complete
distributed system)?
3. Do you plan to put your indexes in RAM or on (physically) separate HDDs?

All said and done, sharded indexes are a good approach if done the
right way.
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com

The facts expressed here belong to everybody, the opinions to me. The
distinction is yours to draw............


On Fri, Oct 3, 2008 at 3:01 PM, Ganesh <[EMAIL PROTECTED]> wrote:

Hello all,

My index is growing by 1 million records per day and the memory
consumption of the searcher object is quite high.

There are different opinions in the groups: some suggest using a single
database and others suggest sharding. My database has 10 million records now
and it might grow to 30 million or more. I plan to shard the index, but
will MultiSearcher give me any benefit?

Regards
Ganesh


Send instant messages to your online friends
http://in.messenger.yahoo.com
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



