Master/Dormant Master and NFS. SimpleFSLockFactory?

2011-06-27 Thread Parker Johnson

I am trying to run through a few failure scenarios for a dual-master
approach that uses NFS as shared storage to hold the master's indexes.
My goal is to be able to bring up a secondary master if the primary
master fails.  I have several slaves using replication to pull indexes
from the master.

I am NOT trying to do an active/active master.  I will be failing traffic
over from master to dormant master using an F5 vip.  But it does beg the
question...has anyone here done an active/active master with shared
storage?

So assuming for a moment I am doing active/dormant:

From a few quick Google searches it looks like I need to configure both
masters to use SimpleFSLockFactory and to set "unlockOnStartup" to
true in solrconfig.xml.  For those who have done this before, are there
any other settings I should be aware of?  What are the downsides to the
SimpleFSLockFactory?  Are most folks here keeping solr up and running on
both hosts at the same time, or rather just starting solr manually on the
dormant host once the primary dies?
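
For reference, here is roughly the stanza I am planning to put in
solrconfig.xml on both masters.  This is a sketch based on the 3.x example
config as I understand it, so please correct me if the element names or
their placement under <mainIndex> are off:

  <mainIndex>
    <!-- other index settings unchanged -->
    <!-- "simple" selects SimpleFSLockFactory: the lock is a plain file
         on the shared NFS volume, visible to both masters. -->
    <lockType>simple</lockType>
    <!-- If the primary dies and leaves a stale lock file behind, let the
         dormant master clear it at startup. -->
    <unlockOnStartup>true</unlockOnStartup>
  </mainIndex>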

Thanks,
Parker





Re: Indexes in ramdisk don't show performance improvement?

2011-06-02 Thread Parker Johnson

That's just the thing.  Even the initial queries have similar response
times as the later ones.  WEIRD!

I was considering running from /dev/shm in production, but for slaves only
(master remains on disk).  At this point though, I'm not seeing a benefit
to ramdisk so I think I'm going back to traditional disk so the indexes
stay intact after a power cycle.

Has anyone else seen indexes served from disk perform similarly to
indexes served from ramdisk?

-Park

On 6/2/11 4:15 PM, "Erick Erickson"  wrote:

>What I expect is happening is that the Solr caches are effectively making
>the two tests identical, using memory to hold the vital parts of the code
>in both cases (after disk warming on the instance using the local disk).
>I suspect if you measured the first few queries (assuming no auto-warming)
>you'd see the local disk version be slower.
>
>Were you running these tests for curiosity or is running from /dev/shm
>something you're considering for production?
>
>Best
>Erick
>
>On Thu, Jun 2, 2011 at 5:47 PM, Parker Johnson 
>wrote:
>>
>> Hey everyone.
>>
>> Been doing some load testing over the past few days. I've been throwing
>> a good bit of load at an instance of solr and have been measuring
>> response time.  We're running a variety of different keyword searches
>> to keep solr's cache on its toes.
>>
>> I'm running two identical load testing scenarios: one with indexes
>> residing in /dev/shm and another from local disk.  The indexes are
>> about 4.5GB in size.
>>
>> On both tests the response times are the same.  I wasn't expecting that.
>> I do see the java heap size grow when indexes are served from disk
>> (which is expected).  When the indexes are served out of /dev/shm, the
>> java heap stays small.
>>
>> So in general is this consistent behavior?  I don't really see the
>> advantage of serving indexes from /dev/shm.  When the indexes are being
>> served out of ramdisk, is the linux kernel or the memory mapper doing
>> something tricky behind the scenes to use ramdisk in lieu of the java
>> heap?
>>
>> For what it is worth, we are running x86_64 RHEL 5.4 on a 12-core
>> 2.27GHz Xeon system with 48GB of RAM.
>>
>> Thoughts?
>>
>> -Park
>>
>>
>>
>




Indexes in ramdisk don't show performance improvement?

2011-06-02 Thread Parker Johnson

Hey everyone.

Been doing some load testing over the past few days. I've been throwing a
good bit of load at an instance of solr and have been measuring response
time.  We're running a variety of different keyword searches to keep
solr's cache on its toes.

I'm running two identical load testing scenarios: one with indexes
residing in /dev/shm and another from local disk.  The indexes are about
4.5GB in size.

On both tests the response times are the same.  I wasn't expecting that.
I do see the java heap size grow when indexes are served from disk (which
is expected).  When the indexes are served out of /dev/shm, the java heap
stays small.

So in general is this consistent behavior?  I don't really see the
advantage of serving indexes from /dev/shm.  When the indexes are being
served out of ramdisk, is the linux kernel or the memory mapper doing
something tricky behind the scenes to use ramdisk in lieu of the java heap?
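
In case it helps frame the question, the knob I assume is relevant here is
the directoryFactory setting in solrconfig.xml.  A sketch of what I mean,
with the factory class name taken from what I believe recent 3.x releases
ship (so treat it as unverified):

  <!-- Memory-map index files from disk; the hot parts of the index then
       live in the OS page cache rather than on the Java heap. -->
  <directoryFactory name="DirectoryFactory"
                    class="solr.MMapDirectoryFactory"/>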

For what it is worth, we are running x86_64 RHEL 5.4 on a 12-core 2.27GHz
Xeon system with 48GB of RAM.

Thoughts?

-Park




Re: Vetting Our Architecture: 2 Repeaters and Slaves.

2011-04-14 Thread Parker Johnson

Otis and Erick,

Thanks for the responses and for thinking over my potential scenarios.

The big draw of the 2-repeater idea for me is that I can:

1. Maximize my hardware.  I don't need a standby master.  Instead, I can
use the "second" repeater to field customer requests.
2. After a primary repeater failure, I neither need to fumble with
multiple solrconfig.xml edits (we're also using cores) nor worry about
manually replicating or copying indexes around.

In a sense, although perhaps not by design, a repeater solves those
problems.

We considered centralized storage and a standby master with access to
shared filesystem, but what are you using for a shared filesystem? (NFS?
Egh...)

-Parker

On 4/12/11 6:19 PM, "Erick Erickson"  wrote:

>I think the repeaters are misleading you a bit here. The purpose of a
>repeater is usually to replicate across a slow network, say in a remote
>data center, so that slaves at that center can get more timely updates.
>I don't think they add anything to your disaster recovery scenario.
>
>So I'll ignore repeaters for a bit here. The only difference between a
>master and a slave is a bit of configuration, and usually you'll allocate,
>say, memory differently on the two machines when you start the JVM. You
>might disable caches on the master (since they're used for searching).
>You may...
>
>Let's say I have master M, and slaves S1, S2, S3. The slaves have an
>up-to-date index as of the last replication (just like your repeater
>would have). If any slave goes down, you can simply bring up another
>machine as a slave, point it at your master, wait for replication on that
>slave and then let your load balancer know it's there. This is the
>HOST2-4 failure you outlined.
>
>Should the master fail you have two choices, depending upon how long you
>can wait for *new* content to be searchable. Let's say you can wait half
>a day in this situation. Spin up a new machine, copy the index over from
>one of the slaves (via a simple copy or by replicating). Point your
>indexing process at the new master, point your slaves at it for
>replication and you're done.
>
>Let's say you can't wait very long at all (and remember this had better
>be quite a rare event). Then you could take a slave (let's say S1) out of
>the loop that serves searches. Copy in the configuration files you use
>for your masters, point the indexer and searchers at it and you're done.
>Now spin up a new slave as above and your old configuration is back.
>
>Note that in two of these cases, you temporarily have 2 slaves doing the
>work that 3 used to, so a bit of over-capacity may be in order.
>
>But a really good question here is how to be sure all your data is in
>your index. After all, the slaves (and repeater for that matter) are only
>current up to the last replication. The simplest thing to do is simply
>re-index everything from the last known commit point. Assuming you have a
><uniqueKey> defined, if you index documents that are already in the index,
>they'll just be replaced, no harm done.
>So let's say your replication interval is 10 minutes (picking a number
>from thin air). When your system is back and you restart your indexer,
>restart indexing from, say, one hour before the time you noticed your
>master went down. You can be more deterministic than this by examining
>the log on the machine you're using to replace the master, noting the
>last replication time and subtracting your hour (or whatever) from that.
>
>Anyway, hope I haven't confused you unduly! The take-away is that a slave
>can be made into a master as fast as a repeater can, the replication
>process is the same, and I just don't see what a repeater buys you in the
>scenario you described.
>
>Best
>Erick
>
>
>On Tue, Apr 12, 2011 at 6:33 PM, Parker Johnson
>wrote:
>
>>
>>
>> I am hoping to get some feedback on the architecture I've been planning
>> for a medium to high volume site.  This is my first time working
>> with Solr, so I want to be sure what I'm planning isn't totally weird,
>> unsupported, etc.
>>
>> We've got a pair of F5 loadbalancers and 4 hosts.  2 of those hosts
>> will be repeaters (master+slave), and 2 of those hosts will be pure
>> slaves.  One of the F5 vips, "Index-vip", will have members HOST1 and
>> HOST2, but HOST2 will be "downed" and not taking traffic from that vip.
>> The second vip, "Search-vip", will have 3 members: HOST2, HOST3, a

Vetting Our Architecture: 2 Repeaters and Slaves.

2011-04-12 Thread Parker Johnson


I am hoping to get some feedback on the architecture I've been planning
for a medium to high volume site.  This is my first time working
with Solr, so I want to be sure what I'm planning isn't totally weird,
unsupported, etc.

We've got a pair of F5 loadbalancers and 4 hosts.  2 of those hosts will
be repeaters (master+slave), and 2 of those hosts will be pure slaves. One
of the F5 vips, "Index-vip" will have members HOST1 and HOST2, but HOST2
will be "downed" and not taking traffic from that vip.  The second vip,
"Search-vip" will have 3 members: HOST2, HOST3, and HOST4.  The
"Index-vip" is intended to be used to post and commit index changes.  The
"Search-vip" is intended to be customer facing.

Here is some ASCII art.  The line with the "X"s through it denotes a
"downed" member of a vip, one that isn't taking any traffic.  The "M:"
denotes the value in solrconfig.xml that the host uses as the master.


     Index-vip        Search-vip
       /   X           /   |  \
     /       X     /       |      \
   /          \  /         |          \
 HOST1       HOST2       HOST3       HOST4
 REPEATER    REPEATER    SLAVE       SLAVE
 M:Index-vip M:Index-vip M:Index-vip M:Index-vip
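
For completeness, here is roughly the replication handler stanza I have in
mind for the two repeaters.  It is a sketch modeled on the repeater setup
described on the Solr wiki as I understand it; the Index-vip hostname,
port, and core path below are placeholders for our real values:

  <requestHandler name="/replication" class="solr.ReplicationHandler">
    <!-- Master half: publish a new index version after each commit and
         ship config files along with it. -->
    <lst name="master">
      <str name="replicateAfter">commit</str>
      <str name="confFiles">schema.xml,stopwords.txt</str>
    </lst>
    <!-- Slave half: poll the Index-vip, which resolves to whichever
         repeater is currently active. -->
    <lst name="slave">
      <str name="masterUrl">http://index-vip:8983/solr/core0/replication</str>
      <str name="pollInterval">00:00:60</str>
    </lst>
  </requestHandler>

HOST3 and HOST4 would carry only the slave block, pointed at the same
Index-vip.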


I've been working through a couple failure scenarios.  Recovering from a
failure of HOST2, HOST3, or HOST4 is pretty straightforward.  Losing
HOST1 is my major concern.  My plan for recovering from a failure of HOST1
is as follows: Enable HOST2 as a member of the Index-vip, while disabling
member HOST1.  HOST2 effectively becomes the Master.  HOST2, 3, and 4
continue fielding customer requests and pulling indexes from "Index-vip."
Since HOST2 is now in charge of crunching indexes and fielding customer
requests, I assume load will increase on that box.

When we recover HOST1, we will simply make sure it has replicated against
"Index-vip" and then re-enable HOST1 as a member of the Index-vip and
disable HOST2.

Hopefully this makes sense.  If all goes correctly, I've managed to keep
all services up and running without losing any index data.

So, I have a few questions:

1. Has anyone else tried this dual repeater approach?
2. Am I going to have any semaphore/blocking issues if a repeater is
pulling index data from itself?
3. Is there a better way to do this?


Thanks,
Parker








Re: Will Slaves Pileup Replication Requests?

2011-04-11 Thread Parker Johnson

Thanks Larry.

-Parker

On 4/11/11 12:14 PM, "Green, Larry (CMG - Digital)"
 wrote:

>Yes. It will wait whatever the replication interval is after the most
>recent replication completes before attempting again.
>
>On Apr 11, 2011, at 2:42 PM, Parker Johnson wrote:
>
>> 
>> What is the slave replication behavior if a replication request to pull
>> indexes takes longer than the replication interval itself?
>> 
>> In other words, if my replication interval is set to be every 30 seconds,
>> and my indexes are significantly large enough to take longer than 30
>> seconds to transfer, is the slave smart enough to not send another
>> replication request if one is already in progress?
>> 
>> 
>> -Parker
>> 
>> 
>
>




Will Slaves Pileup Replication Requests?

2011-04-11 Thread Parker Johnson

What is the slave replication behavior if a replication request to pull
indexes takes longer than the replication interval itself?

In other words, if my replication interval is set to be every 30 seconds,
and my indexes are significantly large enough to take longer than 30
seconds to transfer, is the slave smart enough to not send another
replication request if one is already in progress?
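
For concreteness, the setting I am asking about is the slave's
pollInterval.  A rough sketch, with the masterUrl as a placeholder:

  <lst name="slave">
    <str name="masterUrl">http://master-host:8983/solr/replication</str>
    <!-- format HH:MM:SS; ask the master for a newer index version
         every 30 seconds -->
    <str name="pollInterval">00:00:30</str>
  </lst>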


-Parker




Re: Trying to Post. Emails rejected as spam.

2011-04-08 Thread Parker Johnson

I have tried changing to plain-text format and rewording my question
several times.  Weird and annoying.  Here is my question; maybe it'll
somehow go through this time:

In my master/slave setup, my slaves are polling the master every minute.
My indexes are getting large, to the point where it might take more than
a minute to pull a fresh index over the wire.  What is the behavior of a
slave if it takes more than 1 minute to fetch the indexes from the master?
Is the slave smart enough to know a previous replication request is being
serviced and to not start another request?

-Parker



- Original Message 
From: Paul Rogers 
To: solr-user@lucene.apache.org
Sent: Thu, April 7, 2011 12:34:25 PM
Subject: Re: Trying to Post. Emails rejected as spam.

Hi Park

I had the same problem.  I noticed one of the issues with the blocked
messages is that they are HTML/Rich Text:

(FREEMAIL_ENVFROM_END_DIGIT,FREEMAIL_FROM,FS_REPLICA,HTML_MESSAGE <-,
RCVD_IN_DNSWL_NONE,RFC_ABUSE_POST,SPF_PASS,T_TO_NO_BRKTS_FREEMAIL)

In GMail I can switch to plain text.  This fixed the problem for me.
If you can do the same in Yahoo you should find it reduces the spam
score sufficiently to allow the messages through.

Regards

Paul

On 7 April 2011 20:21, Ezequiel Calderara  wrote:
>
> Happened to me a couple of times, couldn't find a workaround...
>
> On Thu, Apr 7, 2011 at 4:14 PM, Parker Johnson  wrote:
>
> >
> > Hello everyone.  Does anyone else have problems posting to the list?  My
> > messages keep getting rejected with this response below.  I'll be surprised
> > if
> > this one makes it through :)
> >
> > -Park
> >
> > Sorry, we were unable to deliver your message to the following address.
> >
> > :
> > Remote  host said: 552 spam score (8.0) exceeded threshold
> >
> > 
>(FREEMAIL_ENVFROM_END_DIGIT,FREEMAIL_FROM,FS_REPLICA,HTML_MESSAGE,RCVD_IN_DNSWL_NONE,RFC_ABUSE_POST,SPF_PASS,T_TO_NO_BRKTS_FREEMAIL
>
>
> >  ) [BODY]
> >
> > --- Below this line is a copy of the message.
> >
>
>
>
> --
> __
> Ezequiel.
>
> Http://www.ironicnet.com



Trying to Post. Emails rejected as spam.

2011-04-07 Thread Parker Johnson

Hello everyone.  Does anyone else have problems posting to the list?  My 
messages keep getting rejected with this response below.  I'll be surprised if 
this one makes it through :)

-Park

Sorry, we were unable to deliver your message to the following address.

:
Remote  host said: 552 spam score (8.0) exceeded threshold  
(FREEMAIL_ENVFROM_END_DIGIT,FREEMAIL_FROM,FS_REPLICA,HTML_MESSAGE,RCVD_IN_DNSWL_NONE,RFC_ABUSE_POST,SPF_PASS,T_TO_NO_BRKTS_FREEMAIL
  ) [BODY]

--- Below this line is a copy of the message.