Re: Lucene Replication Support

Glyn Darkin Thu, 26 Feb 2009 10:24:45 -0800

Great stuff, thank you for the thoughts.

Glyn


2009/2/26 Jokin Cuadrado <joki...@gmail.com>:
> We use FSDirectory. RAMDirectory could be better in some edge cases,
> especially when you have a small index or do a lot of inserts and updates to
> the index. If you have enough memory on the machine, Lucene does a good job
> of caching the term vectors and other relevant info, so after some time disk
> usage should be very low. RAMDirectory has a big penalty at load time,
> as it has to load all the info into memory, even info that will
> never be used. That is a big cost with such a big index. It also doesn't
> guarantee that the index won't be swapped to disk if the OS needs memory.
>
> On Thu, Feb 26, 2009 at 6:07 PM, Glyn Darkin <g...@darkinsystems.com> wrote:
>
>> Thanks for the quick response Jokin,
>>
>> We are currently building a search solution that has a 3.6 GB index.
>> Max concurrent reads at the moment is 54, but we are hoping that this
>> will increase significantly as the website traffic increases.
>> This will be a read-only index, with a new index deployed nightly.
>>
>> On another note,
>> We are considering running searches against the index using an
>> FSDirectory. Is this how you run your searches, or do you load the index
>> into a RAMDirectory?
>>
>> Cheers
>>
>> Glyn
>>
>>
>>
>>
>>
>>
>> 2009/2/26 Jokin Cuadrado <joki...@gmail.com>:
>> > I have a 3 GB index and another one of 800 MB. The update rate is low and
>> > the number of concurrent searches is also low right now, but I ran the
>> > synchronization during a stress test without any problem. You just have to
>> > copy the .cfs files before the segments files, and since most copy software
>> > lists directories in alphabetical order, that always happens (.cfs files
>> > start with an "_" character, so they are always the first to be copied).
>> > In my case this works well because I don't have strict requirements: for
>> > example, search results can differ between two machines for a short
>> > time, and the incremental updates are done every couple of hours.
>> >
>> > If you tell us your constraints, I can suggest more complex approaches.
>> >
>> >
>> >
>> >
>> > On Thu, Feb 26, 2009 at 4:44 PM, Glyn Darkin <g...@darkinsystems.com> wrote:
>> >
>> >> Jokin,
>> >>
>> >> What sort of index size are you dealing with and how many concurrent
>> >> searches would be running against your index when you Robocopy?
>> >>
>> >> Cheers
>> >>
>> >> Glyn
>> >>
>> >>
>> >> 2009/2/26 Jokin Cuadrado <joki...@gmail.com>:
>> >> > Robocopy /MIR works smoothly; just make sure that you copy the index
>> >> > files (.cfs) first and the segments.* ones afterwards.
>> >> >
>> >> > On Thu, Feb 26, 2009 at 5:35 AM, Nitin Shiralkar <nit...@coreobjects.com> wrote:
>> >> >
>> >> >> Hi All,
>> >> >>
>> >> >> Do we have any built-in replication support? Today, we are building the
>> >> >> index and generating a periodic backup through the same builder service.
>> >> >> This has been working fine for a couple of years, but I would like to
>> >> >> know if there are any better options within the Lucene library. We are
>> >> >> using Lucene.NET, but would also like to know if there is any support on
>> >> >> the Java side if not on .NET.
>> >> >>
>> >> >>
>> >> >> Thanks & regards,
>> >> >>
>> >> >> Nitin Shiralkar
>> >>
>> >
>> > --
>> > Jokin
>> >
>>
>>
>>
>> --
>> Glyn Darkin
>>
>> Darkin Systems Ltd
>> Mob: 07961815649
>> Fax: 08717145065
>> Web: www.darkinsystems.com
>>
>> Company No: 6173001
>> VAT No: 906350835
>>
>
>
>
> --
> Jokin
>



-- 
Glyn Darkin

Darkin Systems Ltd
Mob: 07961815649
Fax: 08717145065
Web: www.darkinsystems.com

Company No: 6173001
VAT No: 906350835
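
The copy order described in the quoted messages (index data files first,
segments.* files last) could be sketched roughly as follows. This is only an
illustration in Python, not what Robocopy does internally and not a Lucene or
Lucene.NET API. The function name replicate_index and both directory
arguments are made up for the example, and it assumes that the commit files
are exactly those whose names start with "segments".

```python
import shutil
from pathlib import Path
from typing import List


def replicate_index(source_dir: str, target_dir: str) -> List[str]:
    """Copy index files so that every segments* file is copied last.

    Hypothetical helper: returns the file names in the order they were copied.
    """
    source = Path(source_dir)
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)

    files = sorted(p for p in source.iterdir() if p.is_file())

    # Assumption: anything named "segments..." is a commit file; everything
    # else (for example _0.cfs) is index data that must arrive first.
    data_files = [p for p in files if not p.name.startswith("segments")]
    commit_files = [p for p in files if p.name.startswith("segments")]

    copied = []
    for path in data_files + commit_files:
        shutil.copy2(path, target / path.name)
        copied.append(path.name)
    return copied
```

Splitting the file list explicitly means the order does not depend on how the
copy tool happens to list the directory. Unlike Robocopy /MIR, this sketch
does not delete files on the target that no longer exist on the source.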
