We build our indexes on a remote machine (which uses a slave version of our DB), then sftp the resulting index files to our web servers. Each web server runs its own TS instance and uses cron to send a SIGHUP that refreshes the search, similar to what it sounds like Josh is describing.
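
A minimal sketch of that refresh step in Python, assuming searchd writes its PID to a file. The path and the function name `refresh_searchd` are hypothetical, and this is an illustration, not the actual cron script:

```python
import os
import signal


def refresh_searchd(pid_file, kill=os.kill):
    """Send SIGHUP to the searchd process whose PID is stored in pid_file.

    `kill` is injectable so the function can be exercised without
    signalling a real process.
    """
    with open(pid_file) as handle:
        pid = int(handle.read().strip())
    # SIGHUP asks searchd to pick up the freshly transferred index files.
    kill(pid, signal.SIGHUP)
    return pid


# Example call (hypothetical path; use the pid_file from your own Sphinx config):
# refresh_searchd("/path/to/searchd.pid")
```

Cron would run something like this only after the sftp transfer has finished, so that searchd never reloads half-copied files.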
Two weeks ago, I spent a couple of days trying to update this configuration so we could use time-based delta indexing on that remote machine to rebuild our indexes more frequently. However, we ran into a number of instances where this broke search in a variety of interesting ways... everything from only parts of the search string being used, to partial results being returned (i.e., only items older than 3 months). Ultimately, we reverted to just doing full indexes and sftping them (as described in the first paragraph). I'm not entirely sure which aspect of the delta process is to blame for our troubles (was it the Sphinx merging? The Thinking Sphinx timestamp delta indexing? Or just our own code?), but we went through a lot of pain when we tried to combine delta indexing with multiple servers.

Seeing as how our indexing now takes almost two hours (and ideally our main site search would be updated once an hour or more), we'll surely have to revisit this before too much longer. I'll post the results if/when I manage to crack this nut.

Bill

On May 2, 4:16 am, Josh <[email protected]> wrote:
> Sorry, I neglected half of your question. In our case, we run both a
> daily full-index and a more frequent delta index on one machine.
> Regardless of which type of index we are running, we rename the
> resulting files and push them to each server that runs searchd, and
> send the SIGHUP signal to get the indexes refreshed.
>
> The downside of this is that we can't use thinking_sphinx's spiffy
> indexing tasks, but it does work well. Again, I'm not sure how easy
> this is under EC2, I don't have any experience there.
>
> - Josh
>
> On May 1, 10:26 am, agib <[email protected]> wrote:
>
> > Hi Josh, thank you for the response, but I still don't see how that
> > fixes the deltas issue...
>
> > Is anyone using sphinx's built-in distributed searching feature?
> > Wouldn't that be the best solution to this problem?
>
> > On May 1, 7:59 am, Josh <[email protected]> wrote:
>
> > > There are a few ways to do this, though I'm not sure what will work on
> > > EC2, check out this thread:
>
> > >http://groups.google.com/group/thinking-sphinx/browse_thread/thread/b...
>
> > > -Josh
>
> > > On May 1, 2:51 am, agib <[email protected]> wrote:
>
> > > > I'm not sure I understand how to get the deltas working on a 2+ server
> > > > environment... let's say I have server A (app + sphinx) and server B
> > > > (app). If a request to server B updates a model that has :delta =>
> > > > true, how does the sphinx index on server A get updated? Do I have to
> > > > set up some sort of shared filesystem? I'm on EC2 and I'm not sure
> > > > that's possible... I used to have A (app + sphinx) and B (app +
> > > > sphinx) but then I realized that it was possible for both servers to
> > > > return different results (i.e. I could refresh a search result page
> > > > and get alternating results). Is there any good solution for remote
> > > > delta indexes?
