Hi Nick

There's nothing in Thinking Sphinx that does this... however, I think the 
--buildstops flag for indexer may get you the data you're after: a list of 
the most commonly indexed words.

  indexer --config path/to/config.conf --buildstops words.txt 1000

That'll grab the 1000 most commonly indexed words and write them to 
words.txt. Have a read of the docs for more info (the flag's name is slightly 
misleading, as it's not only useful for stop words) - search for buildstops:

  http://sphinxsearch.com/docs/manual-1.10.html
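
Once you have words.txt, something like the following could turn it into 
per-abstract keywords. This is only a rough sketch in Python, not anything 
Thinking Sphinx or Sphinx provides: I'm assuming the file has one word per 
line, most common first, optionally followed by a count (which I believe is 
what adding --buildfreqs gives you). The function names, the skip/limit 
parameters and the sample data are all made up for illustration.

```python
import re


def load_common_words(lines):
    """Parse lines of 'word' or 'word count' into an ordered list of words.

    Assumed format: one entry per line, most common first. Blank lines
    are ignored and any trailing count is dropped.
    """
    words = []
    for line in lines:
        parts = line.split()
        if parts:
            words.append(parts[0].lower())
    return words


def keywords_for(abstract, common_words, skip=0, limit=5):
    """Return up to `limit` corpus-common words that occur in one abstract.

    `skip` drops the very top of the list, which tends to be filler
    such as "the" and "of". Results keep the corpus frequency order.
    """
    tokens = set(re.findall(r"[a-z0-9]+", abstract.lower()))
    candidates = common_words[skip:]
    return [word for word in candidates if word in tokens][:limit]


# Hypothetical sample standing in for the contents of words.txt.
sample = ["the 9120", "of 8804", "protein 412", "sequence 377", "yeast 95"]
common = load_common_words(sample)
print(keywords_for("Sequence analysis of a yeast protein", common, skip=2))
```

In practice you'd read the real file with open("words.txt") and store each 
result in your abstract_keywords column. The most common words will mostly be 
filler, so you'd probably want a proper stop word list rather than a fixed 
skip.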

Hope this helps

-- 
Pat


On 01/03/2011, at 5:44 PM, Nicholas Faiz wrote:

> Hi,
> 
> I'd like to create some metadata from a large source of documents (which are 
> stored in a db). This is to build a strong set of facets for search. 
> 
> The first example I'm trying to solve is reducing a text column in the 
> database, called abstract, to a set of commonly occurring keywords - e.g. 
> abstract_keywords . The abstract is a lengthy document, and I'd like to find 
> a library which can scan all values of abstract for the most common keywords, 
> then store those results in the metadata column - abstract_keywords (in 
> another table, most likely). We then hope to use abstract_keywords as a facet 
> attribute in Thinking Sphinx.
> 
> Can anyone point me to a good starting place in Thinking Sphinx where I can 
> find this sort of scanning and aggregation of keywords functionality? Has 
> anyone done something similar?
> 
> Cheers,
> Nicholas
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "Thinking Sphinx" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to 
> [email protected].
> For more options, visit this group at 
> http://groups.google.com/group/thinking-sphinx?hl=en.
