Hi Walter,

I’m pretty sure Sphinx doesn’t index punctuation by default. If you want octothorps included, you’ll need to define a custom charset_table value (per environment in `config/thinking_sphinx.yml`) which includes that character. The Sphinx docs outline the default, so best to take that and then add in the octothorp (U+23).

http://sphinxsearch.com/docs/current.html#conf-charset-table
https://freelancing-gods.com/thinking-sphinx/v5/advanced_config.html#character-sets-and-tables
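As a rough sketch of what that could look like (not a tested configuration): the ranges before `U+23` below are assumed to match the Latin and Cyrillic default listed in the Sphinx docs, so please check them against the docs for the Sphinx version you are actually running before copying this.

```yaml
# config/thinking_sphinx.yml
# Sketch only. Everything before U+23 is an ASSUMED copy of the documented
# Sphinx default charset_table; verify it for your Sphinx version.
# U+23 is the octothorp (#), appended so it is indexed as a word character.
development:
  charset_table: "0..9, A..Z->a..z, _, a..z, U+410..U+42F->U+430..U+44F, U+430..U+44F, U+23"
test:
  charset_table: "0..9, A..Z->a..z, _, a..z, U+410..U+42F->U+430..U+44F, U+430..U+44F, U+23"
production:
  charset_table: "0..9, A..Z->a..z, _, a..z, U+410..U+42F->U+430..U+44F, U+430..U+44F, U+23"
```

The value is repeated under each environment because the setting is read per environment.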
Keep in mind that this will impact all uses of that character in all fields - there’s no way to have it apply to just some fields (or, in this case, words that only start with that character). Once you’ve added this configuration, a full rebuild will be required.

Cheers,

--
Pat

> On 23 Feb 2021, at 2:41 pm, Walter Lee Davis <[email protected]> wrote:
>
> I'm using GutenTag to apply tags to individual pages in a CMS. The Document
> model uses TS5 with Real-Time Indexing. I've set up my index thusly:
>
> # in the model
> def tags_for_indexing
>   tag_names.join ' '
> end
>
> # in the index
> ThinkingSphinx::Index.define :document, :with => :real_time do
>   scope { Document.where(id: Document.publicly.map{ |d|
>     [d.id].concat(d.descendants.published.map(&:id)) }.flatten) }
>
>   indexes title
>   indexes teaser
>   indexes body_html
>   indexes author_display
>   indexes tags_for_indexing
>
>   has created_at, type: :timestamp
>   has updated_at, type: :timestamp
> end
>
> I've tested the method, and confirm that it outputs a space-delimited string
> of words for the tags.
>
> I run rake ts:rt:rebuild and everything seems to go fine. But trying to
> search on some of these tag names is not returning the results I am
> imagining. The client has insisted on making some of these tags start with an
> octothorp, because she is writing about "hashtags" on Twitter. Most tags do
> not have punctuation in them. I am able to find other terms, even very
> obscure ones, when I don't use punctuation in the tag names.
>
> Does this sound like something that I can fix, or should I advise the client
> to lay off the octothorps?
>
> Walter
>
> --
> You received this message because you are subscribed to the Google Groups
> "Thinking Sphinx" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/thinking-sphinx/EA71574B-9EBF-484E-A5FA-BF7CD53A10BC%40wdstudio.com.
