Re: OverlappingFileLockException when using &lt;str name="replicateAfter"&gt;startup&lt;/str&gt;
Hi guys, I'm experiencing the same issue with a single war. I'm using a brand-new Solr war built from yesterday's trunk. I've got one master with 2 cores and one slave with a single core. One core on the master acts as the master of the second core (which is configured as a repeater), so that the slave's core can poll the repeater for index changes.

(I was using Solr 1.4, but ran into some replication issues. While rebuilding the index on the first master core, the new index was not replicated successfully to the other master core: files were copied over, but the final commit failed on the SnapPuller. Sometimes, after restarting the master, replication between the master cores would work fine, but then replication from the master to the slave core would fail. I had the same issue as described in https://issues.apache.org/jira/browse/SOLR-1769 , which seems to be fixed in trunk, so I moved to the trunk version of Solr to test the fix.)

This works better: replication between the master cores is now fine. But I see weird behavior on the slave. Index replication only succeeds on the slave's second attempt, and on every replication attempt the slave throws the exception below. There seems to be a concurrency issue, but I don't quite understand where the concurrency is really happening. Can you please help with this issue?
org.apache.solr.common.SolrException: Index fetch failed :
	at org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:329)
	at org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:264)
	at org.apache.solr.handler.SnapPuller$1.run(SnapPuller.java:159)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
	at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:181)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:205)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:619)
Caused by: java.nio.channels.OverlappingFileLockException
	at sun.nio.ch.FileChannelImpl$SharedFileLockTable.checkList(FileChannelImpl.java:1170)
	at sun.nio.ch.FileChannelImpl$SharedFileLockTable.add(FileChannelImpl.java:1072)
	at sun.nio.ch.FileChannelImpl.tryLock(FileChannelImpl.java:878)
	at java.nio.channels.FileChannel.tryLock(FileChannel.java:962)
	at org.apache.lucene.store.NativeFSLock.obtain(NativeFSLockFactory.java:260)
	at org.apache.lucene.store.Lock.obtain(Lock.java:72)
	at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:1061)
	at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:950)
	at org.apache.solr.update.SolrIndexWriter.init(SolrIndexWriter.java:192)
	at org.apache.solr.update.UpdateHandler.createMainIndexWriter(UpdateHandler.java:99)
	at org.apache.solr.update.DirectUpdateHandler2.openWriter(DirectUpdateHandler2.java:173)
	at org.apache.solr.update.DirectUpdateHandler2.forceOpenWriter(DirectUpdateHandler2.java:376)
	at org.apache.solr.handler.SnapPuller.doCommit(SnapPuller.java:471)
	at org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:319)
	... 11 more

--
View this message in context: http://lucene.472066.n3.nabble.com/OverlappingFileLockException-when-using-str-name-replicateAfter-startup-str-tp488686p870589.html
Sent from the Solr - User mailing list archive at Nabble.com.
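For reference, a repeater in Solr's Java replication is simply a core whose ReplicationHandler is configured as both master and slave at the same time. A minimal sketch of what the repeater core's solrconfig.xml might look like for the topology described above (host names, core names, and poll interval are illustrative assumptions, not taken from the thread):

```xml
<!-- Sketch: repeater core acts as a slave of the first master core
     and as a master for the downstream slave. Hosts/intervals are
     hypothetical. -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="replicateAfter">commit</str>
    <str name="replicateAfter">startup</str>
  </lst>
  <lst name="slave">
    <str name="masterUrl">http://master-host:8983/solr/core1/replication</str>
    <str name="pollInterval">00:00:60</str>
  </lst>
</requestHandler>
```

The downstream slave then points its own masterUrl at this repeater core instead of at the original master.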
Re: Solr Architecture discussion
Hi Chris, thanks for your insights. I totally understand your point about steps 4 and 5. I wanted to control the moment when the swap happens on the slave side, but as you say, there is no need for that; it only adds complexity on top of mechanisms that Solr already provides internally. On the replication side, I re-read the whole documentation and, with the light you shed on the topic, I realize that the only real problem here is the potentially huge amount of data sent over the wire, depending on which segments the indexing updates. As you say, optimizing can have a devastating effect on the replication phase since, if I understood you correctly, it can rewrite all of the index segments. OK, so if I rephrase it: the best strategy in my case is to limit the optimization phases in order to keep replication fast, and to optimize only when replication activity is less critical, so as not to degrade search performance. Thank you very much; that helps a lot.

--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-Architecture-discussion-tp825708p860767.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr Architecture discussion
Thinking twice about this architecture, I'm concerned about how I'm going to automate the following steps:

A- The slaves regularly poll Master-core1 for changes
B- A backup of the current index is created
C- Re-indexing happens on Master-core2
D- When indexing is done, we trigger a swap between Master-core1 and Master-core2
E- The slaves then poll and pick up the freshly updated index segments
F- and so on!

This seems simple when done manually, but I cannot just sit there and push a button to send the events. To reach that goal, I realized that one solution would be to have 2 cores on the master side, while the slaves would only have one core (as previously discussed). We would just need to configure the slave polling period (A, E) and send the right HTTP requests (B, C, D). Step A is automated natively, easy enough, using Solr's internal capabilities. But how can B, C, and D be automated? Doing them manually is not something my boss will pay for. So I imagine I should implement a process that automates the phases I would otherwise perform by hand: an external process, not based on Solr mechanisms. My questions are:

1/ Can I leverage some Solr mechanisms (that is, by configuration only) in order to reach that goal? I haven't found how to do it.
2/ Are there any issues when replicating a master's swapped index files? I've seen in the literature that there might be some.
3/ If a configuration-based solution does not exist, my first attempt would be a shell-based process that regularly triggers the events and waits for the end of each phase, by polling the current phase status, before triggering the next one. Does that sound good to you? Or is there a better and more elegant way to do the trick when indexing and replication should be beating at a high pace?

Thank you.
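The shell-based driver mentioned in question 3 could be sketched roughly as below, using the standard Solr HTTP APIs (ReplicationHandler's command=backup, DataImportHandler's command=full-import, and CoreAdmin's action=SWAP). The host, core names, and the assumption that indexing goes through DataImportHandler are all hypothetical; DRY_RUN=1 (the default here) only prints the requests instead of issuing them:

```shell
#!/bin/sh
# Sketch of an external driver for steps B-D. master-host and the
# core names are made up; adapt to the real deployment.
MASTER="http://master-host:8983/solr"
DRY_RUN="${DRY_RUN:-1}"

# Issue a GET, or just print it when dry-running.
call() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "GET $1"
  else
    curl -s "$1"
  fi
}

run_cycle() {
  # B- back up the current index on the live core
  call "$MASTER/core1/replication?command=backup"
  # C- rebuild the index on the standby core (assuming DataImportHandler)
  call "$MASTER/core2/dataimport?command=full-import"
  # ... in a real run, poll "$MASTER/core2/dataimport?command=status"
  #     here until the import reports idle before moving on
  # D- swap the cores so slaves pick up the fresh index on their next poll
  call "$MASTER/admin/cores?action=SWAP&core=core1&other=core2"
}

run_cycle
```

Step E then happens on its own, driven by the slaves' pollInterval.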
--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-Architecture-discussion-tp825708p860942.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr Architecture discussion
Do you have any insights that could help me and other people who might be interested in this discussion? Thanks.

--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-Architecture-discussion-tp825708p828658.html
Sent from the Solr - User mailing list archive at Nabble.com.
Solr Architecture discussion
Hi, I'd like to get some architectural advice concerning the setup of a Solr (v1.4) platform in a production environment. I'll first describe my target architecture and then ask the questions related to that environment.

Here's briefly what I have achieved so far: I've already set up an environment which serves as a proof of concept. It is composed of a master instance on one host and a slave instance on a second host. The slave handles 2 Solr cores. In the final version of the architecture I would add one or more slave nodes, depending on the request load.

        request
           |
           v
[ MASTER [core] ] --- [ SLAVE [core1] --swap-- [core2] ]
           |
           v
   [index backup]

The goals of this architecture are:
* Isolate indexing from querying
* Enable index replication from master to slave
* Control the swap between newly replicated indexes (hence the dual core per slave)

Here's how the whole platform works when we need to renew the index (on the slaves):
1- Back up the index files on the master using Solr's backup capability (a backup is always welcome)
2- Launch index creation (I'm using the delta indexing capabilities in order to limit index generation time)
3- Trigger replication from the master core to slave core2, based on Solr's capabilities too
4- Trigger a swap between core1 and core2
5- At this point the slave index has been renewed; we can revert to the previous index if there were any issues with the new one.

As this is aimed to be a production environment, redundancy is one of the key elements, meaning that we will double (or more) the front Solr instances. If the slave instances are not on the same network as the master instance, our strategy will probably be to set up one of the slaves as a relay.

That said, here are my questions:
1/ I'd like some insight into the issues that may arise with this kind of architecture.
2/ My first concern is the size of the index that needs to be replicated. We need to index all day long (every 5 min) and replicate as soon as the index is built. As far as I know, replication copies over all the index files; I believe there is no delta replication (replicating only what changed). That's my assumption. But is there any way to do a delta replication, if that makes any sense?
3/ How can I improve this architecture based on your own experience? For example, should I use different network interfaces for Solr commands and queries?

Thank you for sharing.

--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-Architecture-discussion-tp825708p825708.html
Sent from the Solr - User mailing list archive at Nabble.com.
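For context, the master-to-slave wiring behind steps 2-3 above would be declared in each core's solrconfig.xml. A minimal sketch with illustrative host and core names (the 5-minute pollInterval mirrors the indexing cadence described in the question; none of these values come from the thread):

```xml
<!-- On the master core: publish the index after each commit. -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="replicateAfter">commit</str>
  </lst>
</requestHandler>

<!-- On the slave core: poll the master every 5 minutes. -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <str name="masterUrl">http://master-host:8983/solr/core/replication</str>
    <str name="pollInterval">00:05:00</str>
  </lst>
</requestHandler>
```

On each poll, the Java replication compares index generations and transfers only the files the slave does not already have, which is why an optimize (rewriting every segment) forces a full-index transfer.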