On 24/12/2010, at 1:13 PM, jenny w wrote:

>> On Thu, Dec 23, 2010 at 4:27 PM, Pat Allan <[email protected]> wrote:
>> The only issue with this is that you'll be issuing a search for potentially 
>> every post in the results, on the related comments.
>>  
> Thanks for the feedback! Yes, I know, my solution can require as much as an 
> extra Sphinx query for each post that's displayed. But that's versus 
> definitely doing an additional SQL query for every post in the result set (to 
> aggregate all associated comments). I would love to have better options to 
> choose between ... if you have any ideas, please let me know.
> 
> The most performant (with regard to search times) solution I can think of 
> would be to store an aggregate field in the DB. That would have an additional 
> cost (aggregating comments for each new comment or each edited/deleted 
> comment), but that could be done in the background. Ideally, though, I'd like 
> to avoid a migration. Although now that I realize there's HTML in posts and 
> comments, we may need to have dedicated search fields anyway (that, among 
> other things, strip out the HTML). In the meantime, I'm running excerpts 
> through Nokogiri to make sure they're valid HTML, which isn't perfect but 
> seems to be good enough for my purposes right now.

If I was having performance issues, I'd just pull the body column out of the 
comments table via SQL (code from within the scope of Post):

  Comment.connection.select_values("SELECT body FROM comments WHERE post_id = 
#{id}").join(' ')

Agree that caching that combined value in a single column is the best for 
performance, but definitely feels kludgy. I'd go with the above option unless 
you're finding the speed quite slowly.

Cheers

-- 
Pat

> 
> On 24/12/2010, at 6:28 AM, jenny w wrote:
> 
> > Thanks! For now I think just pulling out the first excerpt is the most 
> > important thing (to highlight in search results, but not on post page). To 
> > do this, I have the post aggregate all comments and index that. Then I have 
> > comments also indexed. When showing search results, if there is no excerpt 
> > for the post, then I ask Sphinx for matching comments and find the first 
> > one. It's not perfect (for example, we might not get any hits on comments 
> > if no single comment contains all the keywords), but it seems to be good 
> > enough for now. If we get a lot of blanks, then I may switch match_mode 
> > when searching for comments to :any.
> >
> > On Thu, Dec 23, 2010 at 12:19 AM, Pat Allan <[email protected]> 
> > wrote:
> > Hi Jenny
> >
> > The way I'd go about this is to add the comment bodies to the post index - 
> > as you've suggested. As for highlighting, Sphinx calls this excerpts. While 
> > not quite as elegant for data via associations, I'd recommend doing the 
> > following:
> >
> > * Create a method in your post model for the comment bodies:
> >
> >  def comment_bodies
> >    comments.collect(&:body).join(' ')
> >  end
> >
> > * Access excerpts for that method:
> >
> >  <%= post.excerpts.comment_bodies %>
> >
> > http://freelancing-god.github.com/ts/en/excerpts.html
> >
> > You may want to merge the post body in with the comment bodies for that 
> > excerpted text, unless they're going to be separate when displayed in the 
> > search results.
> >
> > And yes, it's not super performant either, but nothing better comes to mind 
> > at the moment.
> >
> > Cheers
> >
> > --
> > Pat
> >
> > On 22/12/2010, at 2:18 PM, jennyw wrote:
> >
> > > Hi, everyone! Kind of new to Sphinx and Thinking Sphinx. Seems to be
> > > pretty great so far, though! I do have a question about
> > > aggregation ... For example, let's say you're creating a blog
> > > application and would like people to be able to search on posts. Since
> > > there are often insightful comments, you want the search to return
> > > posts whose comments match the search term, even if the posts don't.
> > > And you wan to highlight the search terms. The thing that's come to
> > > mind is to define an index like so:
> > >
> > > class Post < ActiveRecord::Base
> > >  define_index do
> > >    indexes subject
> > >    indexes body
> > >    indexes comments(:body), :as => comments_text
> > >  end
> > > end
> > >
> > > Which would let us include the comments in the Post. Is there a better
> > > way? Such as indexing body on comments and somehow combining them in a
> > > query? I suspect not, but thought I'd ask. Also, I suspect that if a
> > > post has many long comments, we might run into some size issues
> > > (currently setting group_concat_max_len to 1 MB).
> > >
> > > Also, if you wanted to highlight found search terms, then ... I
> > > suppose there are several ways, but none of the ones I've come up with
> > > sound great. The simplest one (and possibly the least performant)
> > > would be to index body on comments and run the search against each of
> > > those to be able to get excerpts. Any better ideas?
> > >
> > > Thanks!
> > >
> > > --
> > > You received this message because you are subscribed to the Google Groups 
> > > "Thinking Sphinx" group.
> > > To post to this group, send email to [email protected].
> > > To unsubscribe from this group, send email to 
> > > [email protected].
> > > For more options, visit this group at 
> > > http://groups.google.com/group/thinking-sphinx?hl=en.
> > >
> >
> > --
> > You received this message because you are subscribed to the Google Groups 
> > "Thinking Sphinx" group.
> > To post to this group, send email to [email protected].
> > To unsubscribe from this group, send email to 
> > [email protected].
> > For more options, visit this group at 
> > http://groups.google.com/group/thinking-sphinx?hl=en.
> >
> >
> >
> > --
> > You received this message because you are subscribed to the Google Groups 
> > "Thinking Sphinx" group.
> > To post to this group, send email to [email protected].
> > To unsubscribe from this group, send email to 
> > [email protected].
> > For more options, visit this group at 
> > http://groups.google.com/group/thinking-sphinx?hl=en.
> 
> --
> You received this message because you are subscribed to the Google Groups 
> "Thinking Sphinx" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to 
> [email protected].
> For more options, visit this group at 
> http://groups.google.com/group/thinking-sphinx?hl=en.
> 
> 
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "Thinking Sphinx" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to 
> [email protected].
> For more options, visit this group at 
> http://groups.google.com/group/thinking-sphinx?hl=en.

-- 
You received this message because you are subscribed to the Google Groups 
"Thinking Sphinx" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/thinking-sphinx?hl=en.

Reply via email to