Thanks! We'll give that a shot. It seems the `where` *is* being applied at
search time, just not at index time. We're actually considering indexing
everything with RT, because previously, when projects were unarchived or
accounts unfrozen, they'd get picked back up by the full reindexes. With
RT, it seems that wouldn't happen, because we don't run full reindexes.
Is there a way to tell TS to reindex in those situations if we use this
approach and exclude them from the index?

I'm trying to track down the specific content that's tripping up the
generation, but I'm not having much luck. I found the record that's printed
out when rake fails, and I manually inspected the records near it, but they
all check out. Is there a way/location for me to see the SQL query that's
being used at the moment the rake task fails?
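
A minimal sketch of one way to narrow this down without reading records by
eye, assuming the culprit is an invalid byte sequence or a stray control
character in a text column (that is only a guess at the cause). The names
`suspicious_text` and `scan_rows` are hypothetical helpers, not part of
Thinking Sphinx, and the sample rows are made up for illustration:

```ruby
# Hypothetical helper, not part of Thinking Sphinx: returns a symbol naming
# why a value looks risky, or nil if nothing suspicious was found.
def suspicious_text(value)
  text = value.to_s
  return :invalid_encoding unless text.valid_encoding?
  # Control characters other than tab, newline and carriage return.
  return :control_character if text =~ /[\x00-\x08\x0B\x0C\x0E-\x1F]/
  nil
end

# Returns [id, column, reason] triples for every suspicious value.
# `rows` is any enumerable of hashes with string keys; with ActiveRecord,
# each record's `attributes` hash would fit (assumption about the caller).
def scan_rows(rows, columns)
  rows.flat_map do |row|
    columns.map do |column|
      reason = suspicious_text(row[column])
      [row["id"], column, reason] if reason
    end.compact
  end
end

# Made-up sample data: one clean row, one bad byte, one NUL character.
rows = [
  { "id" => 1, "subject" => "Plain text" },
  { "id" => 2, "subject" => "Bad byte \xFF here" },
  { "id" => 3, "subject" => "Null \u0000 byte" }
]

p scan_rows(rows, ["subject"])
```

Running the same scan over only the text columns that the index uses should
keep it quick, and any ids it reports can be compared against the record
printed when the rake task aborts.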


On Fri, Feb 28, 2014 at 9:21 AM, Pat Allan <[email protected]> wrote:

> The `where` method doesn't apply for real-time indices - but try this
> instead in the index definition block:
>
>   scope { Issue.where "account_status_id IN (1,2) AND archived IS false" }
>
> That should ensure only the appropriate issues are indexed as part of the
> generate call.
>
> Beyond that, you may wish to add `includes` within that scope to cover all
> associations used within the index?
>
> Unrelated to any of this: the match_mode defaults to extended with TS v3
> (indeed, it can't be anything else).
>
> --
> Pat
>
> On 1 Mar 2014, at 2:09 am, Garrett Dimon <[email protected]> wrote:
>
> > Roger on the generation. Any high-level suggestions for optimization?
> >
> > I'll see if I can't figure out the exact record that's tripping up the
> > index generation. In the meantime, here's a gist of our index definition:
> > https://gist.github.com/garrettdimon/25f6c305541f30b3ce39
> >
> >
> > On Thu, Feb 27, 2014 at 9:21 PM, Pat Allan <[email protected]> wrote:
> > Hi Garrett
> >
> > Generation can be slow - at the end of the day, it really comes down to
> > how much data you're dealing with, and if you're using aggregation
> > methods, how quick they are. It's all going through your Rails app
> > (instead of just SQL queries), so optimising for that is different to
> > adding db indices and such.
> >
> > As for the error though... without having a copy of the database, it's a
> > little hard to debug, but it sounds like there's a bug in TS with
> > something in the data being passed through. Having a look at your app log
> > may help identify the record in question... also, what does the index
> > definition for that model look like?
> >
> > --
> > Pat
> >
> > On 28 Feb 2014, at 9:47 am, Garrett Dimon <[email protected]> wrote:
> >
> >> Howdy, Pat. We're in the process of upgrading from TS 2 with delayed
> >> deltas to TS 3 with real time indexing.
> >>
> >> We've been able to get everything up and running, but we've run into a
> >> couple of problems/questions around the indexing. These may ultimately
> >> be Sphinx questions rather than Thinking Sphinx questions, but I thought
> >> I'd start here since we're only changing Sphinx from 2.0 to 2.1.
> >>
> >> 1. Index Creation Performance
> >>
> >> Our production logs show about a 20 minute turnaround to do a complete
> >> reindex of our production data with TS 2. Running some local tests, TS 3
> >> generate is taking at least an hour for that data. (The generate is
> >> crashing, so it may ultimately take even longer.) Our indexing
> >> configuration is set up so that a large portion of content is excluded
> >> from the index. (Inactive accounts, archived projects, etc.) We've
> >> verified that searching is correctly excluding the relevant records, but
> >> it appears as if that's happening when the query is run rather than when
> >> the indexing occurs. Our only theory so far is that with TS 2 and
> >> traditional indexing, those weren't included in the index at all, but
> >> that with real time indexing, they're included in the index and filtered
> >> out when the query is run. Can you provide any insight about whether
> >> this sounds like normal behavior or whether we've likely screwed
> >> something up? :)
> >>
> >> 2. The Generate is crashing with "rake aborted! sphinxql: syntax error,
> >> unexpected $undefined, expecting CONST_INT (or 4 other tokens) near
> >> ''..." (where ... is content from our DB.)
> >>
> >> I've done some searching, but haven't had any luck. I've run the
> >> generate rake task on two separate occasions, and both times it failed
> >> with the same error message and content, so my gut is leading me to
> >> think that it's an encoding or unescaped quotation mark problem. Does
> >> that problem ring any bells?
> >>
> >> Thanks!
> >>
> >>
> >>
> >> --
> >> You received this message because you are subscribed to the Google
> Groups "Thinking Sphinx" group.
> >> To unsubscribe from this group and stop receiving emails from it, send
> an email to [email protected].
> >>
> >> To post to this group, send email to [email protected].
> >> Visit this group at http://groups.google.com/group/thinking-sphinx.
> >> For more options, visit https://groups.google.com/groups/opt_out.
> >
> >
> > --
> > You received this message because you are subscribed to a topic in the
> Google Groups "Thinking Sphinx" group.
> > To unsubscribe from this topic, visit
> https://groups.google.com/d/topic/thinking-sphinx/7llAB4zO4bw/unsubscribe.
> > To unsubscribe from this group and all its topics, send an email to
> [email protected].
> > To post to this group, send email to [email protected].
> > Visit this group at http://groups.google.com/group/thinking-sphinx.
> > For more options, visit https://groups.google.com/groups/opt_out.
