Master/Dormant Master and NFS. SimpleFSLockFactory?
I am trying to run through a few failure scenarios using a dual-master approach, with NFS as a shared-storage solution to hold the master's indexes. My goal is to be able to bring up a secondary master if the primary master fails. I have several slaves using replication to pull indexes from the master. I am NOT trying to do an active/active master; I will be failing traffic over from the master to the dormant master using an F5 vip. But it does beg the question... has anyone here done an active/active master with shared storage?

So, assuming for a moment that I am doing active/dormant: from a few quick Google searches, it looks like I need to configure both masters to use the SimpleFSLockFactory and to set "unlockOnStartup" to true in solrconfig.xml. For those that have done this before, are there any other settings I should be aware of? What are the downsides to the SimpleFSLockFactory? Are most folks here keeping Solr up and running on both hosts at the same time, or just starting Solr manually on the dormant host once the primary dies?

Thanks,
Parker
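[For reference, here is a sketch of the two settings mentioned above as they would sit in solrconfig.xml. Element placement assumes a Solr 1.4/3.x-style <mainIndex> section; verify the section name against your version.]

```xml
<mainIndex>
  <!-- SimpleFSLockFactory writes a plain lock file alongside the index,
       avoiding the native OS locking that NFS handles poorly -->
  <lockType>simple</lockType>
  <!-- Clear a stale lock file left behind by a crashed primary on startup -->
  <unlockOnStartup>true</unlockOnStartup>
</mainIndex>
```

The trade-off to be aware of: simple lock files are not released automatically when a JVM dies, which is exactly why unlockOnStartup is needed here, and unlockOnStartup is itself unsafe if the other master really is still alive and writing to the shared index.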
Re: Indexes in ramdisk don't show performance improvement?
That's just the thing. Even the initial queries have similar response times as the later ones. WEIRD!

I was considering running from /dev/shm in production, but for slaves only (the master remains on disk). At this point, though, I'm not seeing a benefit to ramdisk, so I think I'm going back to traditional disk so the indexes stay intact after a power cycle.

Has anyone else seen indexes served from disk perform similarly to indexes served from ramdisk?

-Park

On 6/2/11 4:15 PM, "Erick Erickson" wrote:

>What I expect is happening is that the Solr caches are effectively making the
>two tests identical, using memory to hold the vital parts of the index in both
>cases (after disk warming on the instance using the local disk). I suspect if
>you measured the first few queries (assuming no auto-warming) you'd see the
>local disk version be slower.
>
>Were you running these tests for curiosity, or is running from /dev/shm
>something you're considering for production?
>
>Best
>Erick
>
>On Thu, Jun 2, 2011 at 5:47 PM, Parker Johnson wrote:
>>
>> Hey everyone.
>>
>> Been doing some load testing over the past few days. I've been throwing a
>> good bit of load at an instance of solr and have been measuring response
>> time. We're running a variety of different keyword searches to keep
>> solr's cache on its toes.
>>
>> I'm running two identical load-testing scenarios: one with indexes
>> residing in /dev/shm and another from local disk. The indexes are about
>> 4.5GB in size.
>>
>> On both tests the response times are the same. I wasn't expecting that.
>> I do see the java heap size grow when indexes are served from disk (which
>> is expected). When the indexes are served out of /dev/shm, the java heap
>> stays small.
>>
>> So in general is this consistent behavior? I don't really see the
>> advantage of serving indexes from /dev/shm. When the indexes are being
>> served out of ramdisk, is the linux kernel or the memory mapper doing
>> something tricky behind the scenes to use ramdisk in lieu of the java heap?
>>
>> For what it is worth, we are running x86_64 RHEL 5.4 on a 12-core 2.27GHz
>> Xeon system with 48GB RAM.
>>
>> Thoughts?
>>
>> -Park
Indexes in ramdisk don't show performance improvement?
Hey everyone.

Been doing some load testing over the past few days. I've been throwing a good bit of load at an instance of solr and have been measuring response time. We're running a variety of different keyword searches to keep solr's cache on its toes.

I'm running two identical load-testing scenarios: one with indexes residing in /dev/shm and another from local disk. The indexes are about 4.5GB in size.

On both tests the response times are the same. I wasn't expecting that. I do see the java heap size grow when indexes are served from disk (which is expected). When the indexes are served out of /dev/shm, the java heap stays small.

So in general is this consistent behavior? I don't really see the advantage of serving indexes from /dev/shm. When the indexes are being served out of ramdisk, is the linux kernel or the memory mapper doing something tricky behind the scenes to use ramdisk in lieu of the java heap?

For what it is worth, we are running x86_64 RHEL 5.4 on a 12-core 2.27GHz Xeon system with 48GB RAM.

Thoughts?

-Park
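[One configuration-level way to look at the identical numbers, offered as a sketch: with Lucene's NIO/MMap directory implementations, on-disk index files are read through the kernel's page cache, so a warm index on local disk is effectively served from RAM anyway, without counting against the Java heap. The directory implementation is chosen in solrconfig.xml; the class name below assumes a Solr 3.x-era release, so check what your version ships with.]

```xml
<!-- solrconfig.xml: with MMapDirectoryFactory the index files are
     memory-mapped, so the OS page cache (not the Java heap) holds the
     hot data. This is consistent with /dev/shm showing little benefit
     once the cache is warm on a box with 48GB of RAM. -->
<directoryFactory name="DirectoryFactory"
                  class="solr.MMapDirectoryFactory"/>
```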
Re: Vetting Our Architecture: 2 Repeaters and Slaves.
Otis and Erick,

Thanks for the responses and for thinking over my potential scenarios. The big draw for me on the 2-repeaters idea is that I can:

1. Maximize my hardware. I don't need a standby master. Instead, I can use the "second" repeater to field customer requests.

2. After a primary repeater failure, I neither need to fumble with multiple solrconfig.xml edits (we're also using cores) nor worry about manually replicating or copying indexes around.

In a sense, although perhaps not by design, a repeater solves those problems. We considered centralized storage and a standby master with access to a shared filesystem, but what are you using for a shared filesystem? (NFS? Egh...)

-Parker

On 4/12/11 6:19 PM, "Erick Erickson" wrote:

>I think the repeaters are misleading you a bit here. The purpose of a
>repeater is usually to replicate across a slow network, say to a remote
>data center; then slaves at that center can get more timely updates. I
>don't think they add anything to your disaster recovery scenario.
>
>So I'll ignore repeaters for a bit here. The only difference between a
>master and a slave is a bit of configuration, and usually you'll allocate,
>say, memory differently on the two machines when you start the JVM. You
>might disable caches on the master (since they're used for searching).
>You may..
>
>Let's say I have master M, and slaves S1, S2, S3. The slaves have an
>up-to-date index as of the last replication (just like your repeater
>would have). If any slave goes down, you can simply bring up another
>machine as a slave, point it at your master, wait for replication on that
>slave, and then let your load balancer know it's there. This is the
>HOST2-4 failure you outlined.
>
>Should the master fail, you have two choices, depending upon how long you
>can wait for *new* content to be searchable. Let's say you can wait half
>a day in this situation. Spin up a new machine, copy the index over from
>one of the slaves (via a simple copy or by replicating). Point your
>indexing process at the master, point your slaves at the master for
>replication, and you're done.
>
>Let's say you can't wait very long at all (and remember, this had better
>be quite a rare event). Then you could take a slave (let's say S1) out of
>the loop that serves searches. Copy in the configuration files you use
>for your masters, point the indexer and searchers at it, and you're done.
>Now spin up a new slave as above and your old configuration is back.
>
>Note that in two of these cases, you temporarily have 2 slaves doing the
>work that 3 used to, so a bit of over-capacity may be in order.
>
>But a really good question here is how to be sure all your data is in
>your index. After all, the slaves (and repeater for that matter) are only
>current up to the last replication. The simplest thing to do is simply
>re-index everything from the last known commit point. Assuming you have a
>uniqueKey defined, if you index documents that are already in the index,
>they'll just be replaced, no harm done. So let's say your replication
>interval is 10 minutes (picking a number from thin air). When your system
>is back and you restart your indexer, restart indexing from, say, the
>time you noticed your master went down minus 1 hour. You can be more
>deterministic than this by examining the log on the machine you're using
>to replace the master, noting the last replication time, and subtracting
>your hour (or whatever) from that.
>
>Anyway, hope I haven't confused you unduly! The take-away is that a slave
>can be made into a master as fast as a repeater can; the replication
>process is the same, and I just don't see what a repeater buys you in the
>scenario you described.
>
>Best
>Erick
>
>On Tue, Apr 12, 2011 at 6:33 PM, Parker Johnson wrote:
>>
>> I am hoping to get some feedback on the architecture I've been planning
>> for a medium- to high-volume site. This is my first time working
>> with Solr, so I want to be sure what I'm planning isn't totally weird,
>> unsupported, etc.
>>
>> We've got a pair of F5 load balancers and 4 hosts. 2 of those hosts
>> will be repeaters (master+slave), and 2 of those hosts will be pure
>> slaves. One of the F5 vips, "Index-vip", will have members HOST1 and
>> HOST2, but HOST2 will be "downed" and not taking traffic from that vip.
>> The second vip, "Search-vip", will have 3 members: HOST2, HOST3, and
>> HOST4.
Vetting Our Architecture: 2 Repeaters and Slaves.
I am hoping to get some feedback on the architecture I've been planning for a medium- to high-volume site. This is my first time working with Solr, so I want to be sure what I'm planning isn't totally weird, unsupported, etc.

We've got a pair of F5 load balancers and 4 hosts. 2 of those hosts will be repeaters (master+slave), and 2 of those hosts will be pure slaves. One of the F5 vips, "Index-vip", will have members HOST1 and HOST2, but HOST2 will be "downed" and not taking traffic from that vip. The second vip, "Search-vip", will have 3 members: HOST2, HOST3, and HOST4. The "Index-vip" is intended to be used to post and commit index changes. The "Search-vip" is intended to be customer facing.

Here is some ASCII art. The line with the "X"s through it denotes a "downed" member of a vip, one that isn't taking any traffic (Search-vip's members are HOST2, HOST3, and HOST4). The "M:" denotes the value in the solrconfig.xml that the host uses as its master.

        Index-vip                  Search-vip
         /    X                   /    |    \
        /      X                 /     |     \
       /        X               /      |      \
   HOST1        HOST2        HOST3           HOST4
   REPEATER     REPEATER     SLAVE           SLAVE
   M:Index-vip  M:Index-vip  M:Index-vip     M:Index-vip

I've been working through a couple of failure scenarios. Recovering from a failure of HOST2, HOST3, or HOST4 is pretty straightforward. Losing HOST1 is my major concern. My plan for recovering from a failure of HOST1 is as follows: enable HOST2 as a member of the Index-vip, while disabling member HOST1. HOST2 effectively becomes the master. HOST2, 3, and 4 continue fielding customer requests and pulling indexes from "Index-vip." Since HOST2 is now in charge of crunching indexes and fielding customer requests, I assume load will increase on that box. When we recover HOST1, we will simply make sure it has replicated against "Index-vip" and then re-enable HOST1 as a member of the Index-vip and disable HOST2.

Hopefully this makes sense. If all goes correctly, I've managed to keep all services up and running without losing any index data.

So, I have a few questions:

1. Has anyone else tried this dual-repeater approach?
2. Am I going to have any semaphore/blocking issues if a repeater is pulling index data from itself?
3. Is there a better way to do this?

Thanks,
Parker
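[For anyone considering the same layout: a repeater is just a node whose replication handler carries both a master and a slave section. A sketch of what each repeater might run; the masterUrl host/port and confFiles values are illustrative, not from the original post.]

```xml
<!-- solrconfig.xml on each repeater: the master section serves the
     slaves; the slave section polls the Index-vip, i.e. whichever
     repeater is currently enabled on that vip -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="replicateAfter">commit</str>
    <str name="confFiles">schema.xml,stopwords.txt</str>
  </lst>
  <lst name="slave">
    <str name="masterUrl">http://Index-vip:8983/solr/replication</str>
    <str name="pollInterval">00:00:60</str>
  </lst>
</requestHandler>
```

On question 2 (a repeater pulling from itself): when the active repeater polls the vip, the vip resolves back to that same node, and since the index versions on "both" ends match, the poll should amount to a no-op rather than a blocking self-fetch. Worth verifying under load in your own setup, though.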
Re: Will Slaves Pileup Replication Requests?
Thanks Larry.

-Parker

On 4/11/11 12:14 PM, "Green, Larry (CMG - Digital)" wrote:

>Yes. It will wait whatever the replication interval is after the most
>recent replication completes before attempting again.
>
>On Apr 11, 2011, at 2:42 PM, Parker Johnson wrote:
>
>> What is the slave replication behavior if a replication request to pull
>> indexes takes longer than the replication interval itself?
>>
>> In other words, if my replication interval is set to every 30 seconds,
>> and my indexes are large enough to take longer than 30 seconds to
>> transfer, is the slave smart enough to not send another replication
>> request if one is already in progress?
>>
>> -Parker
Will Slaves Pileup Replication Requests?
What is the slave replication behavior if a replication request to pull indexes takes longer than the replication interval itself?

In other words, if my replication interval is set to every 30 seconds, and my indexes are large enough to take longer than 30 seconds to transfer, is the slave smart enough to not send another replication request if one is already in progress?

-Parker
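[For context, the interval being asked about is the slave's pollInterval in the replication handler. A sketch of the relevant fragment; the masterUrl host and port are assumptions.]

```xml
<!-- slave section of the /replication handler in solrconfig.xml.
     pollInterval is HH:mm:ss; the poll timer restarts only after the
     previous replication completes, so a slow transfer does not cause
     overlapping fetch requests to pile up. -->
<lst name="slave">
  <str name="masterUrl">http://master-host:8983/solr/replication</str>
  <str name="pollInterval">00:00:30</str>
</lst>
```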
Re: Trying to Post. Emails rejected as spam.
I have tried to change to plain-text format and reword my question several times. Weird and annoying. Here is my question; maybe it'll somehow go through this time:

In my master/slave setup, my slaves are polling the master every minute. My indexes are getting large, to the point where it might take more than a minute to pull a fresh index over the wire. What is the behavior of a slave if it takes more than 1 minute to fetch the indexes from the master? Is the slave smart enough to know a previous replication request is being serviced and to not start another request?

-Parker

----- Original Message -----
From: Paul Rogers
To: solr-user@lucene.apache.org
Sent: Thu, April 7, 2011 12:34:25 PM
Subject: Re: Trying to Post. Emails rejected as spam.

Hi Park

I had the same problem. I noticed one of the issues with the blocked messages is that they are HTML/Rich Text:

(FREEMAIL_ENVFROM_END_DIGIT,FREEMAIL_FROM,FS_REPLICA,HTML_MESSAGE <-,RCVD_IN_DNSWL_NONE,RFC_ABUSE_POST,SPF_PASS,T_TO_NO_BRKTS_FREEMAIL)

In GMail I can switch to plain text. This fixed the problem for me. If you can do the same in Yahoo, you should find it reduces the spam score sufficiently to allow the messages through.

Regards

Paul

On 7 April 2011 20:21, Ezequiel Calderara wrote:
>
> Happened to me a couple of times, couldn't find a workaround...
>
> On Thu, Apr 7, 2011 at 4:14 PM, Parker Johnson wrote:
> >
> > Hello everyone. Does anyone else have problems posting to the list? My
> > messages keep getting rejected with the response below. I'll be
> > surprised if this one makes it through :)
> >
> > -Park
> >
> > Sorry, we were unable to deliver your message to the following address.
> >
> > :
> > Remote host said: 552 spam score (8.0) exceeded threshold
> > (FREEMAIL_ENVFROM_END_DIGIT,FREEMAIL_FROM,FS_REPLICA,HTML_MESSAGE,RCVD_IN_DNSWL_NONE,RFC_ABUSE_POST,SPF_PASS,T_TO_NO_BRKTS_FREEMAIL) [BODY]
> >
> > --- Below this line is a copy of the message.
>
> --
> __
> Ezequiel.
> Http://www.ironicnet.com
Trying to Post. Emails rejected as spam.
Hello everyone. Does anyone else have problems posting to the list? My messages keep getting rejected with this response below. I'll be surprised if this one makes it through :)

-Park

Sorry, we were unable to deliver your message to the following address.

:
Remote host said: 552 spam score (8.0) exceeded threshold
(FREEMAIL_ENVFROM_END_DIGIT,FREEMAIL_FROM,FS_REPLICA,HTML_MESSAGE,RCVD_IN_DNSWL_NONE,RFC_ABUSE_POST,SPF_PASS,T_TO_NO_BRKTS_FREEMAIL) [BODY]

--- Below this line is a copy of the message.