We build our indexes on a remote machine (which uses a slave copy of
our DB), then sftp the resulting index files to our web servers.  Each
web server runs its own TS instance and uses cron to send a SIGHUP
that refreshes the search, which sounds similar to what Josh is
describing.
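
A minimal sketch of the receiving side of that workflow, assuming
searchd rotates in files that carry a ".new" infix when it gets a
SIGHUP (which is how Sphinx index rotation works).  The function
names, directory layout, index name, and pid file path are all
hypothetical, and this is not the script we actually run:

```python
import os
import shutil
import signal
from pathlib import Path


def stage_new_index_files(incoming_dir, index_dir, index_name):
    """Copy uploaded index files into the live index directory under
    ".new" names (e.g. main.spd -> main.new.spd) and return the new paths."""
    staged = []
    for src in sorted(Path(incoming_dir).glob(index_name + ".sp*")):
        dest = Path(index_dir) / f"{index_name}.new{src.suffix}"
        shutil.copy2(src, dest)
        staged.append(dest)
    return staged


def read_pid(pid_file):
    """Read the searchd process id from its pid file."""
    return int(Path(pid_file).read_text().strip())


def send_sighup(pid_file):
    """Ask searchd to rotate in the staged ".new" files."""
    os.kill(read_pid(pid_file), signal.SIGHUP)
```

From cron, you would call stage_new_index_files() once the upload has
finished, then send_sighup() with the path to searchd's pid file.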

Two weeks ago, I spent a couple of days trying to update this
configuration so we could use time-based delta indexing on that remote
machine to rebuild our indexes more frequently.  However, we ran
into a number of cases where this broke search in a variety of
interesting ways: everything from only parts of the search string
being used to partial results being returned (i.e., only items older
than 3 months).
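
For anyone unfamiliar with the idea, time-based delta indexing roughly
means the delta index only covers rows changed within a recent window,
while everything else comes from the last full index.  A toy sketch of
that selection, with a made-up "updated_at" field and a made-up
one-hour threshold (this is an illustration, not Thinking Sphinx's
code):

```python
from datetime import datetime, timedelta


def rows_for_delta(rows, now, threshold=timedelta(hours=1)):
    """Return only the rows whose updated_at falls inside the delta window."""
    cutoff = now - threshold
    return [row for row in rows if row["updated_at"] > cutoff]


# Example: one row changed 10 minutes ago, one changed 3 hours ago.
now = datetime(2009, 5, 2, 12, 0, 0)
rows = [
    {"id": 1, "updated_at": now - timedelta(minutes=10)},
    {"id": 2, "updated_at": now - timedelta(hours=3)},
]
recent = rows_for_delta(rows, now)  # only the row with id 1 qualifies
```

One consequence of this scheme: the threshold has to be at least as
long as the gap between delta runs, or rows changed in the gap never
make it into a delta index.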

Ultimately, we went back to just doing full indexes and sftping
them (as described in the first paragraph).  I'm not entirely sure which
aspect of the delta process is to blame for our troubles (was it the
Sphinx merging?  The Thinking Sphinx timestamp delta indexing?  Or
just our own code?), but we went through a lot of pain when we tried
to combine delta indexing with multiple servers.

Since a full index now takes almost two hours (and ideally our
main site search would be updated hourly or more often), we'll surely
have to revisit this before too long.  I'll post the results if and
when I manage to crack this nut.

Bill

On May 2, 4:16 am, Josh <[email protected]> wrote:
> Sorry, I neglected half of your question.  In our case, we run both a
> daily full-index and a more frequent delta index on one machine.
> Regardless of which type of index we are running, we rename the
> resulting files and push them to each server that runs searchd, and
> send the SIGHUP signal to get the indexes refreshed.
>
> The downside of this is that we can't use thinking_sphinx's spiffy
> indexing tasks, but it does work well.  Again, I'm not sure how easy
> this is under EC2, I don't have any experience there.
>
> - Josh
>
> On May 1, 10:26 am, agib <[email protected]> wrote:
>
> > Hi Josh, thank you for the response, but I still don't see how that
> > fixes the deltas issue...
>
> > Is anyone using sphinx's built-in distributed searching feature?
> > Wouldn't that be the best solution to this problem?
>
> > On May 1, 7:59 am, Josh <[email protected]> wrote:
>
> > > There are a few ways to do this, though I'm not sure what will work on
> > > EC2, check out this thread:
>
> > > > http://groups.google.com/group/thinking-sphinx/browse_thread/thread/b...
>
> > > -Josh
>
> > > On May 1, 2:51 am, agib <[email protected]> wrote:
>
> > > > I'm not sure I understand how to get the deltas working on a 2+ server
> > > > environment... let's say I have server A (app + sphinx) and server B
> > > > (app). If a request to server B updates a model that has :delta =>
> > > > true, how does the sphinx index on server A get updated? Do I have to
> > > > set up some sort of shared filesystem? I'm on EC2 and I'm not sure
> > > > that's possible... I used to have A (app + sphinx) and B (app +
> > > > sphinx) but then I realized that it was possible for both servers to
> > > > return different results (i.e. I could refresh a search result page
> > > > and get alternating results). Is there any good solution for remote
> > > > delta indexes?